
AI crawler visibility for publishers and SEO sites

Why crawler signals matter, how WebmasterID separates AI/search bots from human traffic, and what you can decide with that data.

Most analytics tools were designed before AI assistants started reading the web on their own schedule. The default behavior — count everything as a page view, treat every user-agent as a human — gives you a misleading picture if your site is being indexed by GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot, Bingbot, and friends.

For a publisher or a content network, AI crawler activity is its own signal. It tells you which models are interested in your content, how often they re-crawl, and where they concentrate. That is information you can act on — robots policy, sitemap priority, editorial decisions about what to publish behind a login.

Two-track ingestion

WebmasterID classifies every incoming event at the edge of the ingest API. Requests with a recognised AI/search-bot user-agent are written to a separate bot_visits table; everything else goes to the main events table. Human aggregates stay clean by construction; the AI side is queryable as a first-class signal.
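The split above can be sketched as a small classifier at the ingest edge. The bot patterns below are illustrative only, not WebmasterID's actual detector list; the table names follow the two tables named in this section:

```typescript
// Minimal sketch of two-track classification at the ingest edge.
// The patterns here are illustrative; the real detector list lives
// in @webmasterid/ai-visibility.
const AI_BOT_PATTERNS: RegExp[] = [
  /GPTBot/i,
  /ClaudeBot/i,
  /PerplexityBot/i,
  /Google-Extended/i,
  /Applebot/i,
  /bingbot/i,
];

type Track = "bot_visits" | "events";

// Decide which table an incoming event is written to, based on user-agent.
function classifyTrack(userAgent: string): Track {
  return AI_BOT_PATTERNS.some((p) => p.test(userAgent))
    ? "bot_visits"
    : "events";
}
```

Because the routing happens before any aggregation, nothing downstream has to filter bots out of human metrics; they were never mixed in the first place.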

The current detector list lives in @webmasterid/ai-visibility. The full set is on the AI visibility page.

AI referrals are the other half

When a real human clicks a citation in ChatGPT, Claude, or Perplexity, the resulting visit looks like any other browser page view — except the referrer is recognisable. WebmasterID tags those events with traffic_category = ai_referral so you can size demand from each AI surface without losing the human dimension.
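A referrer-based tagger for this is a few lines. The hostnames below are assumptions about what each AI surface's referrer looks like, not a vetted list:

```typescript
// Sketch: tag human page views that arrived via an AI assistant's
// citation link. Hostnames are assumptions, not a maintained list.
const AI_REFERRER_HOSTS = new Set([
  "chatgpt.com",
  "chat.openai.com",
  "claude.ai",
  "perplexity.ai",
  "www.perplexity.ai",
]);

function trafficCategory(referrer: string | null): "ai_referral" | "other" {
  if (!referrer) return "other";
  try {
    const host = new URL(referrer).hostname;
    return AI_REFERRER_HOSTS.has(host) ? "ai_referral" : "other";
  } catch {
    return "other"; // malformed Referer header
  }
}
```

The event itself stays in the main events table; `traffic_category` is just an extra dimension, so AI-referred visits still count in human totals.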

What you can do with this data

Three concrete uses, each on a different timescale:

  • Editorial: which articles are being re-crawled by which AI surfaces, and which ones generate human AI referrals.
  • SEO/AEO operations: robots policy and sitemap priority informed by actual crawler behaviour, not guessed defaults.
  • Architecture: for AI-native sites, knowing which surfaces are reading you helps shape the structured-data and machine-readability investments that pay off.
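The editorial use case above boils down to a simple rollup over the bot-visit stream. A sketch, assuming a hypothetical row shape rather than WebmasterID's actual schema:

```typescript
// Sketch: roll bot_visits rows up into re-crawl counts per (path, bot) —
// the aggregate behind a "which AI surfaces are reading what" view.
// The BotVisit shape is an assumption for illustration.
interface BotVisit {
  path: string; // e.g. "/articles/pricing-teardown"
  bot: string;  // e.g. "GPTBot"
  ts: number;   // epoch ms
}

function recrawlCounts(visits: BotVisit[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const v of visits) {
    const key = `${v.path}::${v.bot}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

Sorting that map by count, or bucketing `ts` by day, gives the re-crawl cadence per article per surface that editorial and robots-policy decisions can lean on.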

For implementation details, see /architecture. To see who this matters most to, see /use-cases.