Netcraft survey crawler
Netcraft is a security and internet-research company known for its long-running Web Server Survey, which measures the software, hosting, and configuration of public web servers across the internet. Its crawler fetches public endpoints to record server signals rather than to index page content for search. It appears in logs as periodic survey probes associated with Netcraft's research and anti-phishing operations.
What this means
Netcraft has run its Web Server Survey since the 1990s, measuring which server software and hosting providers run the world's public sites. The survey crawler fetches public endpoints to read server-identifying signals rather than to build a content search index.
Netcraft also operates security services such as anti-phishing and brand protection. Its crawling is research and security measurement, not consumer search, so treat it accordingly when categorising traffic.
How it identifies itself
Netcraft survey requests carry a user-agent that identifies Netcraft and references its survey or research activity. Match on the documented Netcraft identity rather than an exact version string, because the version changes between survey runs.
As always, a user-agent is a claim and can be copied. For requests where authenticity matters, corroborate with behaviour rather than relying solely on the string.
- Operator: Netcraft (internet research and security)
- Survey: long-running Web Server Survey measuring server software and hosting
- Purpose: infrastructure measurement, not page indexing
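The matching advice above can be sketched in Python. This is a minimal illustration, assuming the stable part of the identity is the case-insensitive substring `netcraft`; the function name `claims_netcraft` and the sample user-agent strings are hypothetical and are not taken from Netcraft's documentation, so check the documented token before relying on it.

```python
import re

# Assumption: the stable part of the identity is the substring "netcraft".
# Matching on it (not on a full string) tolerates version changes.
NETCRAFT_UA = re.compile(r"netcraft", re.IGNORECASE)


def claims_netcraft(user_agent):
    """Return True if the user-agent string claims a Netcraft identity.

    This only tests the claim. A user-agent can be copied by any client,
    so a True result is not proof of origin.
    """
    return bool(NETCRAFT_UA.search(user_agent or ""))


# Hypothetical sample strings, for demonstration only.
samples = [
    "Mozilla/5.0 (compatible; NetcraftSurveyAgent/1.0)",
    "Mozilla/5.0 (compatible; NetcraftSurveyAgent/2.3)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]
for ua in samples:
    print(claims_netcraft(ua), ua)
```

Both version variants match and the browser-style string does not, which is the point of matching on the identity rather than on one exact string.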
robots.txt considerations
Netcraft's survey is designed to measure public infrastructure. Where you wish to express a crawl preference, target the documented Netcraft user-agent token in robots.txt.
robots.txt is honoured by compliant crawlers and is not a security control. Restricting the survey crawler limits Netcraft's measurement of your server but does not change how your site responds to other clients.
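As a rough illustration of a crawl preference aimed at one token, the sketch below parses a small robots.txt with Python's standard `urllib.robotparser` and checks what a compliant crawler would conclude. The token `NetcraftSurveyAgent` is an assumption standing in for the documented Netcraft token, and `ExampleBrowser` and `example.com` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumption: "NetcraftSurveyAgent" stands in for the documented token.
# Confirm the real token in Netcraft's documentation before deploying.
ROBOTS_TXT = """\
User-agent: NetcraftSurveyAgent
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# What a compliant crawler using each token would be permitted to fetch.
survey_allowed = parser.can_fetch("NetcraftSurveyAgent", "https://example.com/")
other_allowed = parser.can_fetch("ExampleBrowser", "https://example.com/")

print(survey_allowed, other_allowed)  # False True
```

The rule affects only clients that choose to read and follow robots.txt; requests from any other client are served exactly as before.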
How it appears in analytics and logs
A Netcraft request usually means its internet survey reached your server to record software and hosting signals. It is measurement bot traffic, not a human visit and not a search-index crawl; it reflects Netcraft's periodic census of public web infrastructure.
Diagnostic use case
Recognise Netcraft survey probes in logs, distinguish internet-measurement crawling from search indexing and from security scanning, and read the probes as research-scale server profiling.
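One way to make that distinction concrete is to bucket access-log lines by the identity claimed in the user-agent. The sketch below assumes logs in Combined Log Format and uses a hypothetical substring-to-category table; the category names, the substrings, and the sample log lines are all illustrative and not an authoritative classification.

```python
import re
from collections import Counter

# Assumption: illustrative substring-to-category table, checked in order.
CATEGORIES = [
    ("netcraft", "internet-measurement"),
    ("googlebot", "search-crawler"),
    ("bingbot", "search-crawler"),
    ("censys", "security-scanner"),
]

# Combined Log Format ends with: "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ "(?P<referer>[^"]*)" "(?P<ua>[^"]*)"$'
)


def categorise(line):
    """Return a category for one log line based on the claimed user-agent."""
    match = LOG_LINE.search(line.strip())
    if not match:
        return "unparsed"
    ua = match.group("ua").lower()
    for needle, category in CATEGORIES:
        if needle in ua:
            return category
    return "unclassified"


# Hypothetical log lines using documentation-reserved IP ranges.
sample_log = [
    '203.0.113.7 - - [24/Jun/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (compatible; NetcraftSurveyAgent/1.0)"',
    '198.51.100.4 - - [24/Jun/2026:10:00:05 +0000] "GET /pricing HTTP/1.1" 200 2048 '
    '"https://example.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '192.0.2.9 - - [24/Jun/2026:10:00:09 +0000] "GET /robots.txt HTTP/1.1" 200 64 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]

counts = Counter(categorise(line) for line in sample_log)
print(counts)
```

An `unclassified` result is not evidence of a human visit; it only means no known bot identity was claimed, so those lines still need whatever human/bot filtering the analytics pipeline applies.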
What WebmasterID can help detect
WebmasterID classifies Netcraft survey traffic server-side as a research/monitoring bot and reports it on the bot-intelligence surface, so you can see infrastructure-survey activity separately from human analytics.
Common mistakes
- Mistaking a Netcraft survey probe for a hostile scanner or a search-engine crawl.
- Counting survey probes as human visits in analytics.
- Assuming the survey indexes your page content for a public search engine.
Privacy and accuracy notes
Identification uses only the request user-agent and survey context. No visitor identity is involved. WebmasterID records the probe as a bot event, separate from human analytics, and never attaches it to a profile.
Related pages
- Web intelligence and traffic crawlers — overview
Web-intelligence and traffic crawlers fetch public pages to build market-research, traffic-estimation, and internet-measurement datasets rather than to power consumer search. This overview explains how to recognise them, why they are distinct from search and SEO crawlers, and how to set policy. They build private analytics or research datasets, so their crawling reflects measurement coverage rather than audience.
- SimilarWeb crawler
SimilarWeb is a digital-intelligence company whose crawler fetches publicly accessible web pages as one input to its market-research, traffic-estimation, and competitive-analytics products. It is a data-collection crawler, not a search engine: it gathers signals about websites rather than building a public search index. SimilarWeb publishes a self-identifying crawler user-agent and a page describing the bot so operators can recognise and control it.
- Security scanners vs search crawlers
Security scanners (Censys, Shodan, BinaryEdge, Qualys and similar) probe hosts, ports, and application surface to assess exposure and find vulnerabilities. Search crawlers (Googlebot, Bingbot) fetch and index content to rank it. Confusing the two leads to wrong robots.txt decisions and misread logs: robots.txt governs content crawling, not port scanning, and scan traffic should never be counted as audience.
- Website observability
See research and monitoring bots reaching your site, recorded server-side.
Sources and verification notes
- Netcraft, Web Server Survey: long-running internet infrastructure survey; survey crawler documented.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.