Crawler IP verification methods

Because user-agent strings are trivially copied, the reliable way to confirm a crawler is to verify the source IP of the request. The two documented methods are reverse DNS with a forward-confirm step, and matching the source IP against the engine's published IP ranges. Used together, they defend against spoofed crawler traffic.

Verified against primary sources

Reverse DNS with forward-confirm

The reverse-DNS method looks up the source IP to get a hostname, then confirms that hostname belongs to the engine — for example googlebot.com or google.com for Google, search.msn.com for Bing, or a Yandex domain for Yandex. The essential second step is a forward lookup on that hostname, confirming it resolves back to the original IP.

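The forward-confirm step matters because a PTR record is set by whoever controls the reverse-DNS zone for that IP, so an attacker can make their own address reverse-resolve to a crawler-looking hostname. What they cannot do is make the engine's real hostname resolve forward to their address. Requiring both directions to agree closes that gap.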
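
The two steps above can be sketched in Python with the standard library. This is a minimal illustration, not a definitive implementation: the names CRAWLER_SUFFIXES, reverse_lookup, forward_lookup and verify_by_dns are hypothetical, the suffix list only repeats the domains named above, and the resolver functions are passed in as parameters so the logic can be exercised without live DNS.

```python
import socket

# Hostname suffixes taken from the examples in the text. This list is an
# assumption for illustration; check each engine's current documentation.
CRAWLER_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the set of address strings that hostname resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def verify_by_dns(ip, suffixes=CRAWLER_SUFFIXES,
                  reverse=reverse_lookup, forward=forward_lookup):
    """Reverse DNS with forward-confirm; True only if both directions agree."""
    hostname = reverse(ip)
    if hostname is None:
        return False
    hostname = hostname.rstrip(".").lower()
    # Step 1: the PTR hostname must sit under one of the engine's domains.
    if not hostname.endswith(suffixes):
        return False
    # Step 2: the hostname must resolve back to the original IP.
    # Addresses are compared as strings, which assumes ip is in the same
    # canonical form that the resolver returns.
    return ip in forward(hostname)
```

Each call with the default resolvers performs live DNS lookups, so in practice the result would likely be cached per IP rather than recomputed on every request.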

Published IP-range matching

Several engines publish their crawler IP ranges, sometimes as downloadable, regularly updated lists. Matching a request's source IP against the current published ranges confirms it originates from the engine, without a per-request DNS lookup.

The trade-off is maintenance: published ranges change, so you must refresh them rather than hardcoding addresses. Never paste raw IPs into documentation or rules as permanent facts — treat the engine's published list as the live source of truth. Many operators combine IP matching with reverse DNS for defence in depth.
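
As a rough sketch of range matching, the standard ipaddress module can test an address against a list of CIDR blocks. The function names parse_ranges and ip_in_ranges are hypothetical, and the ranges in the usage example are reserved documentation prefixes, not real crawler ranges. Fetching and parsing an engine's published list is left out because the format differs per engine.

```python
import ipaddress


def parse_ranges(cidrs):
    """Convert CIDR strings into network objects; run once per refresh."""
    return [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]


def ip_in_ranges(ip, networks):
    """Return True if ip falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    # Only compare an address against networks of the same IP version.
    return any(addr.version == net.version and addr in net
               for net in networks)


# Placeholder ranges reserved for documentation (not real crawler ranges).
example_networks = parse_ranges(["192.0.2.0/24", "2001:db8::/32"])
print(ip_in_ranges("192.0.2.77", example_networks))
print(ip_in_ranges("198.51.100.5", example_networks))
```

The parsed list should be rebuilt whenever the published ranges are refreshed, so the check always runs against current data.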

How it appears in analytics and logs

A user-agent string is a claim that anyone can copy. IP verification establishes whether the request actually came from the engine's infrastructure. A request that claims to be a crawler but fails both the reverse-DNS and IP-range checks should be treated as spoofed.
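
One way to apply that rule when labelling log lines is sketched below. Everything here is an assumption for illustration: the function classify_request, the label strings, and the user-agent tokens are hypothetical, and dns_check and range_check stand in for whatever verification functions you already have.

```python
def classify_request(user_agent, ip, dns_check, range_check,
                     tokens=("googlebot", "bingbot", "yandex")):
    """Label a request by comparing its crawler claim with IP verification.

    dns_check and range_check are callables that take an IP string and
    return True when that check passes.
    """
    ua = user_agent.lower()
    # No crawler token in the user agent: nothing to verify.
    if not any(token in ua for token in tokens):
        return "no-crawler-claim"
    # The cheaper range check runs first; either passing check verifies.
    if range_check(ip) or dns_check(ip):
        return "verified"
    # The claim failed both checks.
    return "spoofed"
```

For example, a request whose user agent contains "Googlebot" but whose IP passes neither check would be labelled "spoofed" rather than counted as crawler traffic.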

Diagnostic use case

Confirm a crawler's authenticity before trusting it, using reverse DNS or published IP ranges, so spoofed user agents cannot drive your decisions.

What WebmasterID can help detect

WebmasterID classifies crawlers server-side and distinguishes verified crawlers from spoofed lookalikes, so the verification logic runs once centrally instead of on every request you inspect by hand.

Privacy and accuracy notes

Verification inspects the request's own IP and DNS records, which belong to crawler infrastructure, not to any human visitor. WebmasterID applies verification to classify bots and never builds human profiles from it.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.