Gigablast crawler (GigaBot)
Gigablast was an independent search engine, known for running its own web index and open-sourcing parts of its technology. Its crawler (associated with the GigaBot identity) fetched public pages to build that index. Gigablast's public search has wound down, so its crawler is largely a legacy token seen in historic logs rather than an active mainstream engine.
What this means
Gigablast was a self-built search engine that maintained its own large web index, notable in the independent-search community and for releasing parts of its codebase. Its crawler fetched public pages to populate that index.
As the search market consolidated, Gigablast's public-facing search wound down. Its crawler is therefore best read today as a legacy independent search crawler rather than a current mainstream engine.
How it identifies itself
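Gigablast's crawling has historically been associated with the GigaBot identity. Match on the documented token rather than an exact version string, so that a change in version number does not break the match. A user-agent is a self-declared claim that any client can copy, so a match shows what a request claims to be, not where it came from.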
Because Gigablast's public search has wound down and current documentation is sparse, this entry is marked partially verified: the crawler's historic existence and association are documented, but current activity and IP ranges are not actively published.
- Operator: Gigablast, an independent search engine
- Historic crawler identity: GigaBot
- Status today: largely legacy / residual
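The token-matching advice above can be sketched in a few lines of Python. This is a minimal illustration, not a documented Gigablast or WebmasterID method: the lowercase token "gigabot" is taken from the GigaBot identity named in this entry, and the function name and sample user-agent strings are hypothetical.

```python
import re

# Assumption: "gigabot" is the token to look for, based on the GigaBot
# identity named in this entry. Matching is case-insensitive and ignores
# whatever version string follows the token.
GIGABOT_TOKEN = re.compile(r"gigabot", re.IGNORECASE)


def claims_gigabot(user_agent: str) -> bool:
    """Return True if the user-agent string claims the GigaBot identity.

    This reports only what the client claims. A copied user-agent
    passes this check just as well as the real crawler would.
    """
    return bool(GIGABOT_TOKEN.search(user_agent or ""))


# Hypothetical sample strings, for illustration only.
print(claims_gigabot("Gigabot/3.0"))               # token present, any version
print(claims_gigabot("Mozilla/5.0 (X11; Linux)"))  # token absent
```

Matching on the token alone keeps the check stable across version strings, at the cost of also matching any client that copies the token.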
robots.txt considerations
To control Gigablast's crawler, target its documented token in robots.txt. Given its legacy status, this mainly helps tidy historic crawl noise.
robots.txt is a request honoured by compliant crawlers, not an access control, and cannot stop a client that merely copies the user-agent.
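As a rough sketch of such a policy, the Python below embeds a small robots.txt and checks it with the standard library's urllib.robotparser. The rules themselves are an assumption for illustration (block the GigaBot token everywhere, leave other crawlers unrestricted), and example.com and the sample user-agent strings are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed policy for illustration: disallow the GigaBot token site-wide,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: GigaBot
Disallow: /

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant crawler identifying as GigaBot would be asked to stay out.
print(parser.can_fetch("GigaBot/3.0", "https://example.com/page"))
# Other user-agents fall through to the catch-all group.
print(parser.can_fetch("Mozilla/5.0", "https://example.com/page"))
```

The check only models what a compliant crawler would do with these rules; it does not block any request by itself.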
How it appears in analytics and logs
A request carrying the GigaBot token historically meant that Gigablast's crawler was fetching a page. In modern logs it typically indicates legacy or residual activity, or a client that has copied the token, rather than an active major engine; treat it as bot traffic.
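One way to keep such hits out of human counts is to tally them separately when reading an access log. The sketch below assumes the common "combined" log format, in which the user-agent is the last quoted field on each line; the log lines, addresses and function name are hypothetical, and this is not how WebmasterID itself is implemented.

```python
import re
from collections import Counter

# Assumption: "combined" log format, with the user-agent as the last
# double-quoted field on the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def tally_hits(log_lines):
    """Count lines whose user-agent claims GigaBot separately from the rest."""
    counts = Counter()
    for line in log_lines:
        match = LAST_QUOTED_FIELD.search(line)
        user_agent = match.group(1) if match else ""
        if "gigabot" in user_agent.lower():
            counts["gigabot_claimed"] += 1
        else:
            # "other" only means no GigaBot claim; it may still be a bot.
            counts["other"] += 1
    return counts


# Hypothetical log lines using documentation-only addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2010:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Gigabot/3.0"',
    '192.0.2.20 - - [01/Jan/2010:00:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0"',
]

counts = tally_hits(SAMPLE_LOG)
print(counts["gigabot_claimed"], counts["other"])
```

The "gigabot_claimed" label is deliberate: the tally records what the user-agent claims, which is all a log line can show.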
Diagnostic use case
Recognise the historic Gigablast crawler in legacy logs, understand it as an independent search engine's crawler rather than a current major one, and set robots.txt policy if needed.
What WebmasterID can help detect
WebmasterID classifies the Gigablast crawler token server-side as a search crawler and surfaces its activity, so legacy/independent crawling stays visible.
Common mistakes
- Assuming Gigablast is a current major engine driving meaningful traffic.
- Treating a copied user-agent as proof of Gigablast origin.
- Counting legacy crawl hits as human visits.
Privacy and accuracy notes
Identification uses only the request user-agent. No visitor identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics.
Related pages
- ExaBot (Exalead) crawler
ExaBot is the crawler associated with Exalead, a French-origin web search engine that built its own index. ExaBot fetched public pages to populate Exalead's search results. Exalead's consumer web search has long since wound down, so ExaBot is largely a legacy token: you may still see it in historic logs or from residual crawling, identified by the ExaBot user-agent.
- Regional search engines overview
In several markets a regional search engine leads instead of Google: Yandex in Russian-language search, Baidu in China, Naver in South Korea, Seznam in the Czech Republic, and Coc Coc in Vietnam. Recognising their crawlers matters because being indexed by them is how you reach those audiences.
- Cliqz search crawler
Cliqz was a German privacy-focused search engine and browser project that built its own search index rather than relying on the major engines. Its crawler fetched public pages for that index. The Cliqz project was discontinued, so its crawler is a legacy token: you may see it in historic logs, associated with the Cliqz identity.
- Web crawlers
How independent and legacy search crawlers are categorised.
Sources and verification notes
- Gigablast (independent search engine): independent search engine and crawler; public search wound down, current crawl activity not actively published.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.