Web Crawler & Traffic Intelligence Encyclopedia
A source-grounded reference encyclopedia for webmasters and SEO operators: AI crawlers, search bots, user agents, referrers, UTM tracking, robots.txt, crawl diagnostics, and privacy-safe geo traffic.
Every page is written to be independently useful: a clear explanation, a practical diagnostic angle, privacy-safe language, internal links, and source notes. Where a fact cannot be confirmed against a primary source, the page says so rather than inventing it.
Reference hubs
Eight categories, each a hub over a growing set of detail pages.
- AI crawlers
103 AI crawlers — Reference pages for AI and LLM crawlers — GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot and more. What each is, how it appears in logs, and robots.txt considerations.
- Search bots
134 search bots — Reference pages for search-engine and SEO crawlers — Googlebot, Bingbot, YandexBot, Baiduspider, AhrefsBot, SemrushBot and more. Purpose, verification caveats, and diagnostic use.
- User agents
121 user-agent families. How to interpret each one: browser vs bot, spoofing caveats, safe logging, and the WebmasterID diagnostic angle. Guidance is pattern-based; UA strings are never invented.
- Referrers
137 traffic sources — What each traffic source means, why referrers go missing, browser/privacy caveats, and the UTM recommendation for each — Reddit, X, LinkedIn, Google, newsletters, dark social and more.
- UTM tracking
120 campaign sources — Recommended UTM structure, worked examples, and common mistakes for every channel — Reddit, LinkedIn, X, newsletters, ads and more. Privacy-safe, no fabricated benchmarks.
- Robots & crawl control
125 robots topics — Safe, copy-pasteable robots.txt guidance: directives, blocking specific crawlers, crawl-delay, sitemap, meta robots vs X-Robots-Tag, AI crawler policy, and llms.txt basics.
- Crawl diagnostics
135 diagnostic topics — What each HTTP status means for crawlers, plus crawl-diagnostic playbooks — redirect chains, crawl-budget waste, blocked crawlers, bot spikes. Operator action checklists.
- Geo traffic
128 geo topics — How to interpret country and geo signals in analytics without overclaiming location: edge-header limits, CDN vs user country, VPN/proxy mismatch, bot vs human country, unknown country.
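As a small illustration of the robots and crawl-control hub listed above, a robots.txt policy that blocks one specific crawler can be checked locally before it is deployed. The sketch below uses Python's standard-library `urllib.robotparser`. The policy text, the `allowed` helper, and the `example.com` URLs are assumptions made up for illustration, not recommendations from the encyclopedia, and the parser's matching rules may differ in detail from how any given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block GPTBot site-wide, and keep /private/
# off-limits to every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this parser considers `url` fetchable by `user_agent`."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Googlebot"):
        for path in ("/page", "/private/report"):
            url = "https://example.com" + path
            print(agent, path, allowed(ROBOTS_TXT, agent, url))
```

A check like this only confirms what the file says. Whether a crawler actually honors it has to be confirmed from server logs.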
Cross-cutting topics
- Bot vs human traffic — how to separate automated traffic from real visitors, and why the line is fuzzier than it looks.
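One way to see why the line is fuzzy is a pattern-based first pass over the user-agent header. The sketch below is a hedged illustration, not WebmasterID's method: the token list is limited to crawler names mentioned on this page, and the `classify_user_agent` function and its labels are hypothetical. Because any client can send any user-agent string, the result records only what a request claims to be, never a verified identity.

```python
import re

# Illustrative tokens only, taken from crawler names mentioned on this page.
# A real list would be maintained against each vendor's documentation.
KNOWN_TOKENS = re.compile(
    r"(googlebot|bingbot|gptbot|claudebot|perplexitybot|ccbot|ahrefsbot|semrushbot)",
    re.IGNORECASE,
)

# Generic substrings that self-declared crawlers often include.
GENERIC_HINT = re.compile(r"(bot|crawl|spider)", re.IGNORECASE)


def classify_user_agent(ua: str) -> str:
    """Label a user-agent string by pattern; this is a claim, not a verification."""
    if not ua or not ua.strip():
        return "unknown"  # header missing or empty
    match = KNOWN_TOKENS.search(ua)
    if match:
        # The string claims a known crawler name; spoofing is still possible.
        return "claims:" + match.group(1).lower()
    if GENERIC_HINT.search(ua):
        return "generic-bot-pattern"
    # No bot pattern found; automated clients can still look like browsers.
    return "browser-like"
```

The "browser-like" label is deliberately weak: headless and scripted clients often send browser-style strings, so a pass like this is a starting point for diagnostics and should be combined with other signals.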
This encyclopedia is the reference layer behind WebmasterID’s product surfaces: Bot intelligence, AI referrals, AI visibility analytics, and the operator docs.