AI2Bot — Allen Institute for AI crawler

AI2Bot is the crawler operated by the Allen Institute for AI (AI2) to gather web data for its datasets and research. AI2 documents the crawler and its robots.txt token. Where AI2's documentation does not clearly cover a detail, this entry marks it as partially verified rather than guessing.

Partially verified

What this means

AI2Bot is the crawler the Allen Institute for AI uses to gather public web data for its datasets and research. It appears in logs as an automated fetcher carrying the AI2Bot token.

AI2 publishes guidance on the crawler. Where that guidance does not clearly cover a detail, this entry describes the stable identification pattern rather than asserting unsourced specifics.

How AI2Bot identifies itself

AI2Bot uses the robots.txt user-agent token AI2Bot. Its user-agent string contains that token together with a self-identifying URL. Match on the stable token rather than a full version string.

The user agent is a self-reported claim that any client can copy. Where authenticity matters, rely on AI2's published guidance, and do not assume IP ranges that AI2 has not published.
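The token match described above can be sketched in Python as follows. This is a minimal illustration, not AI2's or WebmasterID's implementation: the function name claims_ai2bot and the case-insensitive, word-boundary match are assumptions made for the example, and a positive result only means the request claims to be AI2Bot.

```python
import re
from typing import Optional

# Match the stable token only, never a full version string.
# Word boundaries avoid hits inside longer, unrelated names.
# Case-insensitive matching is an assumption of this sketch.
AI2BOT_TOKEN = re.compile(r"\bAI2Bot\b", re.IGNORECASE)


def claims_ai2bot(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header carries the AI2Bot token.

    This checks the claim only; it does not prove the request
    really came from the Allen Institute for AI.
    """
    if not user_agent:
        return False
    return AI2BOT_TOKEN.search(user_agent) is not None
```

Because the check looks only for the token, it keeps working if the rest of the user-agent string changes between crawler versions.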

robots.txt considerations

To disallow AI2Bot site-wide:

User-agent: AI2Bot
Disallow: /

AI2Bot is expected to honour robots.txt as a compliant crawler. However, robots.txt is a request, not an access-control boundary: it does not technically prevent a client from fetching a URL.
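Before deploying a rule like the one above, it can help to confirm that it parses the way you intend. The sketch below uses Python's standard urllib.robotparser module; the helper name is_allowed and the example.com URLs are hypothetical, and the result reflects how Python's parser reads the file, not a guarantee of how AI2Bot itself interprets it.

```python
from urllib.robotparser import RobotFileParser

# The site-wide disallow rule for the AI2Bot token, one directive per line.
ROBOTS_TXT = """\
User-agent: AI2Bot
Disallow: /
"""


def is_allowed(robots_txt: str, token: str, url: str) -> bool:
    """Return True if robots_txt permits the given robots.txt token to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    # Pass the robots.txt token, not a full user-agent string.
    print(is_allowed(ROBOTS_TXT, "AI2Bot", "https://example.com/any/page"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/any/page"))
```

With this rule, the AI2Bot token is disallowed everywhere, while a token with no matching group (and no wildcard group present) remains allowed.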

How it appears in analytics and logs

A request carrying the AI2Bot token is the Allen Institute for AI's crawler fetching a URL for its datasets — a bot event, not a human visit. Identify it by the token and treat any unverified specifics conservatively.
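For sites that read raw access logs, the following sketch counts requests carrying the AI2Bot token per path. It assumes the common Combined Log Format, where the user agent is the last quoted field; the function name ai2bot_hits_per_path, the regular expressions, and the sample lines in the usage note (including their addresses and user-agent strings) are all hypothetical illustrations rather than real AI2Bot traffic.

```python
import re
from collections import Counter
from typing import Iterable

# Combined Log Format tail (assumed):
#   "METHOD path PROTO" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)
# Match the stable token only, not a full version string.
TOKEN_RE = re.compile(r"\bAI2Bot\b", re.IGNORECASE)


def ai2bot_hits_per_path(lines: Iterable[str]) -> Counter:
    """Count log lines whose user agent carries the AI2Bot token, keyed by path."""
    hits: Counter = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and TOKEN_RE.search(match.group("ua")):
            # A bot event: counted separately from human visits.
            hits[match.group("path")] += 1
    return hits
```

A caller might pass an open log file directly, for example ai2bot_hits_per_path(open("access.log")), where access.log is a placeholder path. Lines that do not fit the assumed format are skipped rather than guessed at.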

Diagnostic use case

Identify AI2Bot in logs by its token and set robots.txt policy for the Allen Institute for AI's dataset crawler.

What WebmasterID can help detect

WebmasterID classifies AI2Bot server-side by its token and surfaces it on the bot-intelligence and AI-visibility surfaces, so you can see Allen Institute crawl activity per page without parsing logs.

Privacy and accuracy notes

Detection uses only the request user-agent. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics, and never attaches it to a visitor profile.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.