Wget user agent
GNU Wget is a non-interactive command-line tool for downloading files and mirroring sites over HTTP, HTTPS, and FTP. Its default user agent contains a Wget product token with a version. The raw token is a long-standing signal of scripted downloading, though it is trivially changed with a flag.
What this means
Wget is widely available and needs no user interaction, so it runs unattended in scripts, cron jobs, and one-off commands. Unless the operator changes it, every request carries the Wget product token and version in the User-Agent header.
Wget can fetch a single URL or recursively mirror a whole site. Recursive mode can generate many requests quickly, which is why aggressive Wget runs sometimes look like a crawl in your logs.
Identifying and overriding Wget
The default Wget token is easy to spot. However, Wget's --user-agent option (short form -U) sets an arbitrary user agent, so operators sometimes change it to mimic a browser or to provide a descriptive identifier. The raw token therefore indicates default, often casual, usage rather than the full population of Wget traffic.
When the token is present, combine it with the request pattern: deep recursive link-following, rapid sequential fetches of many URLs, and no JavaScript execution all point to scripted downloading rather than human browsing.
- Default token: Wget plus version
- User agent is overridable with a command-line flag
- Recursive mirroring can look like a crawl
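The token check described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete classifier: it assumes the default header value begins with "Wget/" followed by a version, and the function name wget_version is hypothetical.

```python
import re

# Assumption: the default header value starts with "Wget/" and a version,
# optionally followed by other text such as a platform note.
WGET_TOKEN = re.compile(r"^Wget/(\S+)")


def wget_version(user_agent):
    """Return the version from a default Wget token, or None if absent."""
    if not user_agent:
        return None  # empty or missing header: nothing to match
    match = WGET_TOKEN.match(user_agent.strip())
    return match.group(1) if match else None
```

A request sent with an overridden user agent returns None here, which is exactly the limitation noted above: the check finds default usage only, so it should be paired with behavioural signals.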
Allow, rate-limit, or block
Legitimate Wget use includes backups, mirroring with permission, and automation you control. For those, a descriptive custom user agent and rate-aware settings such as --wait and --limit-rate keep the activity transparent.
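For unwanted recursive mirroring, rate-limiting and robots.txt are more effective than blocking the exact token, which is trivially changed. Wget honours robots.txt by default during recursive retrieval, but the operator can switch that off, so treat robots.txt as a request rather than an enforcement mechanism. Never rely on the user agent alone for access control.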
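Rate-limiting the behaviour rather than the token could look like the sliding-window sketch below. It is an illustration under stated assumptions: clients are keyed by whatever identifier you choose (an IP address, for example), the class name SlidingWindowLimiter and its thresholds are hypothetical, and a production setup would more likely use the rate-limiting features of the web server or CDN.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> recent timestamps

    def allow(self, client_key, now):
        hits = self._hits[client_key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; rejected requests are not recorded
        hits.append(now)
        return True
```

The caller passes the current time explicitly (for instance time.monotonic() in a live server), which keeps the limiter easy to test. Because the decision never looks at the user agent, a mirroring run is throttled whether or not the token was overridden.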
How it appears in analytics and logs
A user agent containing a Wget token indicates the Wget command-line tool fetching content: automation, not a human visit. Aggressive recursive Wget mirroring can resemble a crawl; the token plus request pattern usually makes the intent clear.
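To see the token and the request pattern together, one option is to count Wget requests per client address in an access log. The sketch below assumes the log uses the Combined Log Format, with the user agent as the last quoted field; the function name count_wget_by_ip is hypothetical, and other log formats would need a different pattern.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, e.g.
# ip ident user [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)


def count_wget_by_ip(lines):
    """Count requests carrying the default Wget token, grouped by client IP."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Lines that do not parse, or whose UA lacks the token, are skipped.
        if match and match.group("ua").startswith("Wget/"):
            counts[match.group("ip")] += 1
    return counts
```

A single address with a large count over a short period suggests recursive mirroring, while a handful of scattered hits is more likely a one-off download.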
Diagnostic use case
Recognise command-line downloading and site-mirroring traffic by the Wget default token, while remembering the token can be overridden and is sometimes set to mimic a browser.
What WebmasterID can help detect
WebmasterID classifies Wget traffic as scripted automation, so command-line downloads and site mirrors appear in bot-intelligence rather than inflating human page-view counts.
Common mistakes
- Assuming all Wget traffic shows the default token — it is easily overridden.
- Counting recursive Wget mirroring as human page views.
- Blocking only the exact token instead of rate-limiting the behaviour.
Privacy and accuracy notes
Wget is identified from the user-agent token alone, and that token names a tool run by a script or operator, not a browsing person. WebmasterID records it as automation, separate from human analytics.
Related pages
- curl, wget and script user agents
Command-line and library HTTP clients send a default user agent that names the tool: curl/x.y, Wget, python-requests, Go-http-client, and similar. These are scripts, not browsers, and seeing them is normal. This page explains the patterns and how to treat them without over- or under-reacting.
- libwww-perl user agent
libwww-perl, commonly abbreviated LWP, is a long-established HTTP client library for Perl. Its default user agent contains a libwww-perl token with a version. Because it has been a default in many simple scripts and scrapers for decades, the raw token is widely treated as a sign of automated, non-browser traffic.
- Empty or missing user-agent strings
The User-Agent header is not mandatory, so some requests arrive with an empty string or no header at all. This usually points to a script, a misconfigured client, or an old device — not a specific identity. This page explains what a missing UA means and how to handle it without over-blocking.
- Bot intelligence
Categorise command-line downloaders separately from human browsers.
Sources and verification notes
- GNU Wget manual: default user agent contains a Wget token; overridable via flag.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.