Search bots

OpenAlex crawler

OpenAlex, run by the non-profit OurResearch, is a free and open catalogue of the global research system — papers, authors, institutions, venues, and concepts — offered as data and an API. Its crawler and harvesters gather scholarly metadata and links to build an open scientific knowledge graph. It is a research-metadata aggregator rather than a general web search engine.

Verified against primary sources

What this means

OpenAlex provides an open, comprehensive index of scholarly entities and the relationships between them, built as a successor to the discontinued Microsoft Academic Graph. OurResearch publishes it as open data and through an API.

If you host scholarly content or metadata, OpenAlex may gather links and metadata about it to enrich its open graph. This is research-metadata aggregation, not general web search indexing.

How it identifies itself

OpenAlex collection requests carry a user-agent that identifies OpenAlex or OurResearch. Much of its data is assembled from open sources, Crossref, and repository metadata rather than from broad page crawling. Match on the OpenAlex identity in the user-agent rather than on an exact version string.

As with any crawler, the user-agent is a claim and can be copied. Corroborate with behaviour where authenticity matters.

Operator: OurResearch (OpenAlex open scholarly catalogue)
Scope: papers, authors, institutions, venues, concepts
Built largely from open metadata sources and Crossref
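
The matching advice in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not OpenAlex's documented behaviour: the token "openalex" and the function name claims_openalex are hypothetical, so substitute the identity string that OpenAlex actually documents.

```python
import re
from typing import Optional

# Hypothetical identity token (an assumption); replace it with the
# token that OpenAlex documents for its user-agent.
OPENALEX_TOKEN = "openalex"

# Match the identity anywhere in the header, ignoring case, so that a
# changed version suffix does not break detection.
_IDENTITY = re.compile(re.escape(OPENALEX_TOKEN), re.IGNORECASE)


def claims_openalex(user_agent: Optional[str]) -> bool:
    """Return True if the user-agent header claims the OpenAlex identity.

    This checks the claim only. The header can be copied by any client,
    so a True result is not proof of authenticity.
    """
    if not user_agent:
        return False
    return _IDENTITY.search(user_agent) is not None
```

Matching on a substring rather than a full string keeps the check stable across version changes. It remains a check of what the client claims, so it should be paired with behavioural signals where authenticity matters.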

robots.txt considerations

To express a crawl preference for OpenAlex, target its documented user-agent token in robots.txt. Because OpenAlex assembles much of its graph from open metadata and partner sources, blocking a direct crawler may not remove metadata sourced elsewhere.

robots.txt is honoured by compliant crawlers and is not an access control.
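
A robots.txt rule can be checked locally before it is deployed. The sketch below uses Python's standard urllib.robotparser to evaluate a sample rule set. The token "openalex-placeholder", the /drafts/ path, and the example.org URLs are all assumptions made for illustration; the real token is whatever OpenAlex documents.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical token (an assumption); use the user-agent token that
# OpenAlex documents.
OPENALEX_TOKEN = "openalex-placeholder"

# Sample rules: keep the named crawler out of /drafts/ and leave
# everything else open to all crawlers.
ROBOTS_TXT = f"""\
User-agent: {OPENALEX_TOKEN}
Disallow: /drafts/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether a compliant crawler with this user-agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = f"{OPENALEX_TOKEN}/1.0"
    print(is_allowed(ROBOTS_TXT, agent, "https://example.org/drafts/paper.html"))
    print(is_allowed(ROBOTS_TXT, agent, "https://example.org/papers/1"))
```

The check only models what a compliant crawler would do. It does not prevent access, and it has no effect on metadata that reaches OpenAlex through Crossref or repositories.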

How it appears in analytics and logs

An OpenAlex request means an open scholarly catalogue harvested research metadata or links related to your content. It is academic-metadata bot traffic, not a human visit and not a general web-search crawl.
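
Keeping this traffic out of human counts can be sketched as a simple tally over logged user-agent strings. This is an illustrative sketch only and not how WebmasterID classifies traffic: the token "openalex", the label names, and the function tally_hits are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Hypothetical identity token (an assumption); replace it with the
# token that OpenAlex documents.
OPENALEX_TOKEN = "openalex"


def tally_hits(user_agents: Iterable[str]) -> Counter:
    """Count logged requests by label, keeping OpenAlex hits separate.

    Everything that does not claim the OpenAlex identity is labelled
    "other", because it may be a human or a different bot.
    """
    counts: Counter = Counter()
    for user_agent in user_agents:
        if OPENALEX_TOKEN in user_agent.lower():
            counts["academic_metadata_bot"] += 1
        else:
            counts["other"] += 1
    return counts
```

The "other" label is deliberate: a request that does not match the token is not necessarily a human visit, so it should go through further classification before it is counted as a reader.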

Diagnostic use case

Recognise OpenAlex harvesting in the logs of a scholarly site, distinguish open research-metadata aggregation from general web search, and read it as a sign that your content is being included in an open scientific graph.

What WebmasterID can help detect

WebmasterID classifies OpenAlex harvesting server-side as an academic-metadata bot and shows it on the bot-intelligence surface, so research-graph aggregation stays separate from human analytics.

Common mistakes

Confusing an open scholarly-metadata catalogue with a general web search engine.
Assuming a robots.txt block removes metadata sourced from Crossref or repositories.
Counting metadata-harvesting hits as human readers in analytics.

Privacy and accuracy notes

Identification uses only the request user-agent and harvesting context. No visitor identity is involved. WebmasterID records the fetch as a bot event, separate from human analytics, and never attaches it to a profile.


Sources and verification notes

OpenAlex: open catalogue of scholarly works (OurResearch). Open scholarly graph and API; assembled from open metadata sources.

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.