Most analytics tools were designed before AI assistants started reading the web on their own schedule. The default behavior — count everything as a page view, treat every user-agent as a human — gives you a misleading picture if your site is being indexed by GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot, Bingbot, and friends.
For a publisher or a content network, AI crawler activity is its own signal. It tells you which models are interested in your content, how often they re-crawl, and where they concentrate. That is information you can act on — robots policy, sitemap priority, editorial decisions about what to publish behind a login.
Two-track ingestion
WebmasterID classifies every incoming event at the edge of the ingest API. Requests with a recognised AI/search-bot user-agent are written to a separate bot_visits table; everything else goes to the main events table. Human aggregates stay clean by construction; the AI side is queryable as a first-class signal.
The detector list lives in @webmasterid/ai-visibility; the full set of recognised crawlers is documented on the AI visibility page.
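A minimal sketch of the two-track split at the ingest edge, assuming a naive substring detector stands in for the package's real list. The table writes are stubbed as in-memory arrays, and every name here is illustrative rather than the actual API:

```typescript
// Sketch only: the real detector lives in @webmasterid/ai-visibility and the
// real pipeline writes to bot_visits / events tables, not arrays.

type IngestEvent = {
  userAgent: string;
  url: string;
  referrer?: string;
  receivedAt: string; // ISO timestamp
};

// Illustrative stand-in for the detector list.
const AI_CRAWLER_PATTERNS = [
  "GPTBot",
  "ClaudeBot",
  "PerplexityBot",
  "Google-Extended",
  "Applebot",
  "Bingbot",
];

function detectAiCrawler(userAgent: string): string | null {
  return AI_CRAWLER_PATTERNS.find((p) => userAgent.includes(p)) ?? null;
}

// Stand-ins for the bot_visits and events tables.
const botVisits: Array<IngestEvent & { bot: string }> = [];
const events: IngestEvent[] = [];

export function ingest(event: IngestEvent): void {
  const bot = detectAiCrawler(event.userAgent);
  if (bot) {
    // AI/search crawler: kept out of human aggregates, stored as its own signal.
    botVisits.push({ ...event, bot });
  } else {
    events.push(event);
  }
}
```

The point of the split is that it happens once, at write time: human aggregates never need to filter bots out after the fact.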
AI referrals are the other half
When a real human clicks a citation in ChatGPT, Claude, or Perplexity, the resulting visit looks like any other browser page view — except the referrer is recognisable. WebmasterID tags those events with traffic_category = ai_referral so you can size demand from each AI surface without losing the human dimension.
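A sketch of that referrer-based tagging, assuming the referrer hostname is enough to attribute a visit to an AI surface. The hostname map and category names are illustrative:

```typescript
// Hypothetical referrer-to-surface map; the real list is maintained server-side.
const AI_REFERRER_HOSTS: Record<string, string> = {
  "chat.openai.com": "ChatGPT",
  "chatgpt.com": "ChatGPT",
  "claude.ai": "Claude",
  "perplexity.ai": "Perplexity",
  "www.perplexity.ai": "Perplexity",
};

type TrafficCategory = "ai_referral" | "organic" | "direct";

function classifyReferral(referrer: string | undefined): {
  trafficCategory: TrafficCategory;
  aiSurface?: string;
} {
  if (!referrer) return { trafficCategory: "direct" };
  try {
    const surface = AI_REFERRER_HOSTS[new URL(referrer).hostname];
    if (surface) return { trafficCategory: "ai_referral", aiSurface: surface };
  } catch {
    // Malformed referrer: fall through to the default bucket.
  }
  return { trafficCategory: "organic" };
}

// Example: a click on a Perplexity citation.
classifyReferral("https://www.perplexity.ai/search?q=example");
// => { trafficCategory: "ai_referral", aiSurface: "Perplexity" }
```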
What you can do with this data
Three concrete uses, each on a different timescale:
- Editorial: which articles are being re-crawled by which AI surfaces, and which ones generate human AI referrals (a roll-up sketch follows this list).
- SEO/AEO operations: robots policy and sitemap priority informed by actual crawler behaviour, not guessed defaults.
- Architecture: for AI-native sites, knowing which surfaces are reading you helps shape the structured-data and machine-readability investments that pay off.
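For the editorial case, the roll-up amounts to joining per-URL crawl counts with per-URL AI-referral counts. A sketch under assumed row shapes; the real tables carry more fields than shown here:

```typescript
// Illustrative row shapes for the two tracks described above.
type BotVisitRow = { url: string; bot: string };
type ReferralRow = { url: string; aiSurface: string };

type ArticleSignal = {
  url: string;
  crawlsByBot: Record<string, number>;
  referralsBySurface: Record<string, number>;
};

function editorialRollup(
  botVisits: BotVisitRow[],
  aiReferrals: ReferralRow[],
): ArticleSignal[] {
  const byUrl = new Map<string, ArticleSignal>();
  const row = (url: string): ArticleSignal => {
    let r = byUrl.get(url);
    if (!r) {
      r = { url, crawlsByBot: {}, referralsBySurface: {} };
      byUrl.set(url, r);
    }
    return r;
  };

  for (const v of botVisits) {
    const r = row(v.url);
    r.crawlsByBot[v.bot] = (r.crawlsByBot[v.bot] ?? 0) + 1;
  }
  for (const a of aiReferrals) {
    const r = row(a.url);
    r.referralsBySurface[a.aiSurface] = (r.referralsBySurface[a.aiSurface] ?? 0) + 1;
  }
  return [...byUrl.values()];
}
```

Articles with heavy re-crawling but no human referrals, or the reverse, are the interesting outliers.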
For implementation details, see /architecture. For who this matters most to, see /use-cases.