Robots & crawl control

How to block the Gigablast crawler

Gigablast was an independent search engine whose crawler, GigaBot, fetched public pages to build its index. The service is no longer operating as it once did, but the token can still appear in logs from residual or impersonating clients. This page shows the robots.txt rule and how to interpret leftover GigaBot activity.

Partially verified

What this means

Gigablast was a long-running independent search engine that ran its own crawler, GigaBot, to build its index. The public search service has effectively wound down, so meaningful indexing-driven crawling is unlikely. Despite that, the GigaBot token can still surface in logs.

When a token from a defunct service appears, the most likely explanations are residual infrastructure, archived crawl jobs, or another client reusing the recognisable token. Treat such hits as low-trust and verify behaviour rather than assuming a legitimate search crawl.

How to block it

To disallow GigaBot, target its token in its own group:

User-agent: GigaBot
Disallow: /

Given the service's status, allowing or blocking GigaBot has minimal practical effect on real search visibility. robots.txt is advisory, so it only stops clients that choose to honour it. If GigaBot-tokened requests persist and ignore the rule, that strongly suggests impersonation rather than the original crawler, in which case a firewall or WAF rule is the appropriate control.
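To confirm that the group is written correctly before publishing it, the rule can be checked offline. The following is a minimal sketch using Python's standard-library urllib.robotparser; the helper name is_allowed and the sample paths are assumptions made for illustration, not part of any Gigablast or WebmasterID tooling.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt group from this page, one directive per line.
ROBOTS_TXT = """\
User-agent: GigaBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given robots.txt text.

    Hypothetical helper: it parses the text locally instead of fetching it,
    so it can be run before the file is deployed.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Pass the bare token; the parser matches on the product token,
    # not on a full browser-style user-agent string.
    print("GigaBot allowed:", is_allowed(ROBOTS_TXT, "GigaBot", "/any/page"))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", "/any/page"))
```

This only verifies how a compliant parser reads the rule. It says nothing about whether a client sending the GigaBot token will actually obey it.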

robots.txt token to target: GigaBot
Original Gigablast search service is largely defunct
Persistent GigaBot hits that ignore the rule likely mean impersonation; use a firewall or WAF rule
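Where a network firewall or WAF is not available, the same idea can be approximated inside the application. The sketch below is an assumption-laden stand-in rather than a firewall rule: a small WSGI middleware, using only the standard library, that rejects requests whose User-Agent header contains the token. The names block_tokens, demo_app, and status_for are hypothetical, and the "GigaBot/1.0" string in the usage note is an illustrative value, not a documented user-agent.

```python
from wsgiref.util import setup_testing_defaults

# Tokens to reject, compared case-insensitively against the User-Agent header.
BLOCKED_TOKENS = ("gigabot",)


def block_tokens(app, tokens=BLOCKED_TOKENS):
    """Wrap a WSGI app so requests carrying a blocked token receive 403."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in tokens):
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Placeholder application standing in for the real site."""
    body = b"ok\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def status_for(agent: str) -> str:
    """Run one fake request through the wrapped app and return its status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = agent
    captured = []

    def start_response(status, headers):
        captured.append(status)

    b"".join(block_tokens(demo_app)(environ, start_response))
    return captured[0]
```

Calling status_for("GigaBot/1.0") exercises the blocked path and status_for("Mozilla/5.0") the normal one. Because the user-agent header is trivially changed by the sender, a match on the token is a coarse filter; a client that is already impersonating GigaBot can switch tokens just as easily.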

How it appears in analytics and logs

A request carrying the GigaBot token is attributed to Gigablast's crawler. Because the original service is largely defunct, current GigaBot hits may be residual infrastructure or a client impersonating the token, so treat them with extra scepticism.
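To verify behaviour in raw logs, a short script can pull out the GigaBot-tokened requests and show which paths they fetched. This is a rough sketch that assumes the server writes the common "combined" access-log format; the sample lines, the documentation-range IP addresses, and the "GigaBot/1.0" user-agent value are invented for illustration, and the function names are hypothetical.

```python
import re
from collections import Counter

# Combined log format:
# ip ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines; real logs would be read from a file instead.
SAMPLE_LINES = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 42 "-" "GigaBot/1.0"',
    '203.0.113.7 - - [01/Jan/2025:00:00:05 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "GigaBot/1.0"',
    '198.51.100.9 - - [01/Jan/2025:00:01:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]


def gigabot_hits(lines):
    """Return (ip, path) pairs for requests whose user-agent contains the token."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "gigabot" in match.group("agent").lower():
            hits.append((match.group("ip"), match.group("path")))
    return hits


def summarize(hits):
    """Count hits per source IP and list paths requested other than /robots.txt."""
    per_ip = Counter(ip for ip, _ in hits)
    beyond_robots = [path for _, path in hits if path != "/robots.txt"]
    return per_ip, beyond_robots


if __name__ == "__main__":
    per_ip, beyond_robots = summarize(gigabot_hits(SAMPLE_LINES))
    print("Hits per IP:", dict(per_ip))
    print("Paths fetched besides robots.txt:", beyond_robots)
```

With a Disallow: / rule in place, any path in the second list that was requested after the rule went live is a request that did not honour it, which is the signal for treating the client as low-trust.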

Diagnostic use case

Disallow GigaBot in robots.txt and understand why a crawler from a discontinued search engine might still show up in your logs.

What WebmasterID can help detect

WebmasterID classifies GigaBot server-side and surfaces its activity, so you can see whether the token still appears in your traffic and whether a robots.txt rule changes anything.

Common mistakes

Assuming GigaBot activity reflects live search-engine indexing today.
Trusting the token without confirming behaviour in logs.
Counting residual crawler hits as human traffic.

Privacy and accuracy notes

Blocking Gigablast relies only on the request user-agent token. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics, and never attaches it to a visitor profile.


Sources and verification notes

Gigablast (open-source search engine project): Gigablast's crawler is GigaBot; the public search service is no longer actively operating.
Robots Exclusion Protocol (RFC 9309)

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.