Wget user agent

GNU Wget is a non-interactive command-line tool for downloading files and mirroring sites over HTTP, HTTPS, and FTP. Its default user agent contains a Wget product token with a version. The raw token is a long-standing signal of scripted downloading, though it is trivially changed with a flag.

Partially verified

What this means

GNU Wget is a widely available command-line utility for retrieving content over HTTP, HTTPS, and FTP. Because it is non-interactive, it runs unattended in scripts, cron jobs, and one-off commands. Its default user agent is a Wget product token followed by a version, of the form Wget/<version>; some builds append a platform comment.

Wget can fetch a single URL or recursively mirror a whole site. Recursive mode can generate many requests quickly, which is why aggressive Wget runs sometimes look like a crawl in your logs.

Identifying and overriding Wget

The default Wget token is easy to spot. However, Wget's --user-agent option (short form -U) sets an arbitrary user agent, so operators sometimes change it to mimic a browser or to provide a descriptive identifier. The raw token therefore indicates default, often casual, usage rather than the full population of Wget traffic.
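As a rough illustration of spotting the default token, the sketch below checks whether a user-agent string starts with a Wget product token and pulls out the version. The function name wget_version and the sample strings are hypothetical, and the pattern assumes the token comes first in the string; a match only says the client did not override its user agent.

```python
import re
from typing import Optional

# Assumption: a default Wget user agent begins with "Wget/<version>",
# optionally followed by a platform comment such as "(linux-gnu)".
WGET_TOKEN = re.compile(r"^Wget/(\d+(?:\.\d+)*)")


def wget_version(user_agent: str) -> Optional[str]:
    """Return the version from a default-looking Wget user agent, else None."""
    match = WGET_TOKEN.match(user_agent.strip())
    return match.group(1) if match else None
```

A None result does not rule out Wget, since an overridden user agent will not match.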

When the token is present, combine it with the request pattern: recursive depth, rapid sequential fetches of many URLs, no JavaScript execution, and few or no follow-up requests for page assets all point to scripted downloading rather than human browsing.
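One way to turn those pattern signals into a check is sketched below. It is a hypothetical heuristic, not a WebmasterID algorithm: the Request type, the thresholds, and the list of asset extensions are all assumptions chosen for illustration, and a mirroring run that also fetches page assets would slip past the asset test.

```python
import posixpath
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse

# Assumption: these extensions stand in for "page assets".
ASSET_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".svg", ".woff2", ".ico"}


@dataclass
class Request:
    timestamp: float  # seconds since epoch
    url: str


def looks_scripted(requests: List[Request], min_requests: int = 20,
                   max_median_gap: float = 0.5, max_asset_ratio: float = 0.1) -> bool:
    """Flag a client whose requests are many, closely spaced, and almost asset-free."""
    if len(requests) < min_requests:
        return False
    ordered = sorted(requests, key=lambda r: r.timestamp)
    # Median gap between consecutive requests, in seconds.
    gaps = sorted(b.timestamp - a.timestamp for a, b in zip(ordered, ordered[1:]))
    median_gap = gaps[len(gaps) // 2]
    # Share of requests that target asset-like paths.
    assets = sum(
        1 for r in ordered
        if posixpath.splitext(urlparse(r.url).path)[1].lower() in ASSET_EXTENSIONS
    )
    return median_gap <= max_median_gap and assets / len(ordered) <= max_asset_ratio
```

The thresholds would need tuning against real logs before being used for anything beyond triage.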

Allow, rate-limit, or block

Legitimate Wget use includes backups, mirroring with permission, and automation you control. For those, a descriptive custom user agent and rate-aware settings keep the activity transparent.
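For automation you control, a transparent invocation might be assembled as in the sketch below. It only builds the argument list and does not run anything. The helper name, the bot name example-backup-bot, and the default values are hypothetical; --recursive, --user-agent, --wait, --random-wait, and --limit-rate are standard Wget options.

```python
from typing import List


def build_wget_command(url: str, contact: str, wait_seconds: int = 2,
                       limit_rate: str = "200k") -> List[str]:
    """Build a Wget argument list with a descriptive user agent and polite pacing."""
    return [
        "wget",
        "--recursive",
        # Hypothetical bot name; the contact lets site operators reach the owner.
        f"--user-agent=example-backup-bot/1.0 (+{contact})",
        f"--wait={wait_seconds}",       # pause between retrievals
        "--random-wait",                # vary the pause around the --wait value
        f"--limit-rate={limit_rate}",   # cap download bandwidth
        url,
    ]
```

Passing the result to subprocess.run as a list avoids shell quoting problems with the parentheses in the user agent.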

For unwanted recursive mirroring, rate-limiting and robots.txt (which Wget honours by default during recursive retrieval, although the operator can switch that off) are more effective than blocking the exact token, which is trivially changed. Never rely on the user agent alone for access control.
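Rate-limiting that does not depend on the user agent could look like the sliding-window sketch below, keyed on whatever client identifier the server trusts (an IP address, for instance). The class name, the limits, and the choice of a sliding window over other schemes are assumptions for illustration; production setups usually do this in the web server or a proxy.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int = 30, window_seconds: float = 10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> timestamps of allowed requests

    def allow(self, client_key: str, now: float) -> bool:
        hits = self._hits[client_key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Because the key is the client rather than the user-agent string, changing the token does not reset the limit.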

How it appears in analytics and logs

A user agent containing a Wget token indicates the Wget command-line tool fetching content: automation, not a human visit. Aggressive recursive Wget mirroring can resemble a crawl; the token plus the request pattern usually makes the intent clear.

Diagnostic use case

Recognise command-line downloading and site-mirroring traffic by the Wget default token, while remembering the token can be overridden and is sometimes set to mimic a browser.

What WebmasterID can help detect

WebmasterID classifies Wget traffic as scripted automation, so command-line downloads and site mirrors appear in bot-intelligence rather than inflating human page-view counts.

Privacy and accuracy notes

Wget is identified from the user-agent token alone, so the classification describes a tool run by a script or operator, not a browsing person. WebmasterID records it as automation, separate from human analytics.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.