Sitebulb crawler — desktop/cloud SEO auditor
Sitebulb is a desktop and cloud SEO auditing tool whose crawler fetches pages to map site structure, internal links, and on-page issues. It is a third-party SEO tool crawler, not a search engine. Sitebulb documents its user agent and supports robots.txt handling and a configurable crawl identity.
What this means
Sitebulb is a popular SEO auditing tool available as a desktop app and a cloud service. Its crawler fetches pages to build structure, internal-link, and on-page reports. Like other audit crawlers, it does not feed a search index and does not affect rankings.
Sitebulb audits are configured by the operator, so seeing it usually means a planned audit. Desktop crawls originate from the operator's own machine, so those requests arrive from whatever network the operator is on, which affects how they look in your logs.
How the Sitebulb crawler identifies itself
Sitebulb's crawler self-identifies with a Sitebulb token in its user-agent string, and the tool lets the operator configure the user agent (for example, to mimic a browser or a search bot during testing). Because the UA is configurable, this entry is marked partially verified. Match on the documented Sitebulb token, but be aware that an operator may run the crawl under a different UA, in which case it can resemble other clients.
The user agent is a claim that can be copied; verify where authenticity matters.
- robots.txt token: Sitebulb's documented crawler token (verify current value)
- User agent is configurable, so it may differ from the default token
- An SEO audit crawler, not a search-engine indexer
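The token matching described above can be sketched in Python. This is a minimal sketch, not Sitebulb's or WebmasterID's implementation: it assumes the default token contains the substring "sitebulb" (verify the current value against Sitebulb's documentation), and the function name looks_like_sitebulb and the sample user-agent strings are made up for illustration.

```python
from typing import Optional

# Assumed token substring; confirm against Sitebulb's current documentation.
ASSUMED_TOKEN = "sitebulb"


def looks_like_sitebulb(user_agent: Optional[str]) -> bool:
    """Return True when the UA string carries the assumed Sitebulb token.

    A match is only a claim made by the client. A non-match does not rule
    out Sitebulb, because the operator can configure a different UA.
    """
    if not user_agent:
        return False
    return ASSUMED_TOKEN in user_agent.lower()


# Hypothetical sample strings, not real Sitebulb user agents.
claimed_crawler = "Mozilla/5.0 (compatible; Sitebulb/0.0; +https://example.com/bot)"
plain_browser = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"
```

The check is case-insensitive because header casing can vary between clients. It treats a missing header as a non-match rather than raising.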
robots.txt control
Sitebulb can be set to respect robots.txt during an audit, and it offers an option to ignore it for the operator's own site. To disallow the default crawler site-wide, target its token with a standard Disallow rule.
Because the UA is configurable, robots.txt rules only apply when the operator runs the crawl under the default Sitebulb identity and with robots.txt respect enabled. robots.txt is a request honoured by compliant crawlers, not an access-control mechanism.
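A site-wide Disallow rule for the default crawler can be checked locally with Python's standard-library robots.txt parser. This is a sketch under stated assumptions: the token "Sitebulb" is a placeholder for whatever value Sitebulb currently documents, and the example.com URL and the ExampleBot agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Placeholder token: substitute the value Sitebulb currently documents.
ASSUMED_TOKEN = "Sitebulb"

# robots.txt body: block the assumed token site-wide, allow everyone else.
ROBOTS_TXT = """\
User-agent: Sitebulb
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask how a compliant crawler using each product token would be treated.
sitebulb_allowed = parser.can_fetch(ASSUMED_TOKEN, "https://example.com/pricing")
other_allowed = parser.can_fetch("ExampleBot", "https://example.com/pricing")
```

Here sitebulb_allowed is False and other_allowed is True. This only models a compliant crawler running under the default identity. A crawl configured to ignore robots.txt, or one sent under a different UA, is not affected by these rules.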
How it appears in analytics and logs
A request carrying Sitebulb's crawler token is most likely the Sitebulb tool auditing a URL on an operator's behalf. Treat it as a bot event, not a human visit. It usually reflects a deliberate audit and should be counted as crawl coverage, not audience.
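Keeping audit hits out of audience numbers can be sketched as a simple tally over parsed log records. The record shape (a dict with a user_agent key), the bucket names, and the assumed token substring are all illustrative assumptions, not a WebmasterID or Sitebulb format.

```python
from collections import Counter

# Assumed token substring; confirm against Sitebulb's current documentation.
ASSUMED_TOKEN = "sitebulb"


def split_hits(records):
    """Tally records into SEO-crawler hits and everything else.

    "other" is not the same as "human": it only means the assumed
    token was absent from the user-agent string.
    """
    counts = Counter()
    for record in records:
        ua = (record.get("user_agent") or "").lower()
        bucket = "seo_crawler" if ASSUMED_TOKEN in ua else "other"
        counts[bucket] += 1
    return counts


# Hypothetical parsed log records.
sample_records = [
    {"path": "/", "user_agent": "Mozilla/5.0 (compatible; Sitebulb/0.0)"},
    {"path": "/about", "user_agent": "Mozilla/5.0 (compatible; Sitebulb/0.0)"},
    {"path": "/", "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"},
]
tally = split_hits(sample_records)
```

For the sample records, tally holds two seo_crawler hits and one other hit. The seo_crawler count is the crawl-coverage figure. The other bucket still needs its own bot filtering before it is read as audience.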
Diagnostic use case
Identify Sitebulb when an SEO audit runs against your site, allow it for your own audits, and restrict it via robots.txt when an external crawl is unwanted.
What WebmasterID can help detect
WebmasterID classifies Sitebulb server-side as an SEO crawler and surfaces its activity on the bot-intelligence surface, separate from human analytics, so you can see SEO audit hits without log parsing.
Common mistakes
- Assuming all Sitebulb crawls use the default UA — operators can change it.
- Counting audit hits as human sessions in analytics.
- Expecting robots.txt to stop a Sitebulb crawl configured to ignore it.
Privacy and accuracy notes
Sitebulb crawler detection uses only the request user-agent. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics, and never attaches it to a visitor profile.
Related pages
- Screaming Frog SEO Spider
Screaming Frog SEO Spider is a desktop application that site owners and SEO professionals run themselves to audit a site. It is not a public, continuously operating crawler like Googlebot; its user agent is user-controlled and its crawling is initiated by whoever runs the tool.
- JetOctopus crawler — technical-SEO auditor
JetOctopus is a technical-SEO platform whose crawler audits large sites for structure, indexability, and on-page issues, alongside log-file analysis. It is a third-party SEO tool crawler, not a search engine. JetOctopus documents the crawler and supports robots.txt and crawl-rate controls.
- Managing third-party SEO crawler load
Third-party SEO crawlers such as AhrefsBot and SemrushBot can generate significant request volume without contributing to search visibility. You can manage their load by targeting their tokens in robots.txt, using crawl-delay where the crawler supports it, and blocking those that bring no value to you.
- Bot intelligence
Deterministic categorisation of crawlers, search bots, and audit tools.
Sources and verification notes
- Sitebulb (documentation): Sitebulb documents its crawler and configurable user agent; verify current default token.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.