How to block the SISTRIX crawler
SISTRIX runs a crawler to gather SEO visibility and ranking data for its platform. This page shows how to disallow the SISTRIX crawler in robots.txt, how to throttle it instead of blocking, and how to confirm the directive is honoured.
robots.txt rule
SISTRIX documents the token its crawler uses. To disallow it site-wide:
User-agent: SISTRIX Crawler
Disallow: /
Confirm the exact token from your access logs, because SISTRIX has used more than one identifier over time. SISTRIX documents that its crawler respects robots.txt, so a correctly targeted Disallow is the supported opt-out.
- Target the documented SISTRIX crawler token
- Confirm the exact token from your logs
- Crawler respects robots.txt per SISTRIX docs
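One way to confirm the token from your logs is to list every user-agent string that mentions SISTRIX. The following is a minimal sketch, assuming access logs in the common "combined" format where the user agent is the last quoted field. The function name and the sample log lines are made up for illustration and are not real SISTRIX traffic.

```python
import re
from collections import Counter

# In the combined log format the user agent is the last quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def sistrix_user_agents(log_lines):
    """Count distinct user-agent strings that mention 'sistrix' (case-insensitive)."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if match and "sistrix" in match.group(1).lower():
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [24/Jun/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; SISTRIX Crawler; example)"',
    '203.0.113.5 - - [24/Jun/2026:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" '
    '"Mozilla/5.0 (compatible; SISTRIX Crawler; example)"',
    '198.51.100.7 - - [24/Jun/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]

for agent, hits in sistrix_user_agents(SAMPLE_LOG).items():
    print(f"{hits:5d}  {agent}")
```

In practice you would pass the lines of your own log file instead of SAMPLE_LOG and compare the identifiers you see with the token in your robots.txt group.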
Throttle as an alternative
If you would rather slow the SISTRIX crawler than block it, SISTRIX documents crawl-delay support:
User-agent: SISTRIX Crawler
Crawl-delay: 10
This asks the crawler to space out its requests. Keep the directive inside the SISTRIX group; crawl-delay is honoured only by crawlers that support it and is ignored by Googlebot.
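To check that the delay sits inside the SISTRIX group and nowhere else, you can parse the file locally. This is a rough sketch using Python's standard urllib.robotparser. The helper name crawl_delay_for is an assumption for illustration, and the parser's group matching is its own approximation, not a statement of how SISTRIX or Googlebot evaluate the file.

```python
from urllib.robotparser import RobotFileParser

# The throttle rule from this page, scoped to the SISTRIX group only.
ROBOTS_TXT = """\
User-agent: SISTRIX Crawler
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt, token):
    """Return the Crawl-delay that the group matching `token` declares, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(token)


print(crawl_delay_for(ROBOTS_TXT, "SISTRIX Crawler"))  # 10
print(crawl_delay_for(ROBOTS_TXT, "Googlebot"))        # None: no group declares a delay for it
```

The None result for Googlebot only shows that the directive is scoped to the SISTRIX group in this file; it says nothing about crawler behaviour.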
How it appears in analytics and logs
SISTRIX crawler hits mean an SEO analytics platform is collecting data about your pages. It is third-party tooling, not search indexing, and brings no organic referral traffic.
Diagnostic use case
Prevent the SISTRIX crawler from consuming crawl resources or mapping your site when you do not use the SISTRIX toolbox.
What WebmasterID can help detect
WebmasterID classifies the SISTRIX crawler as an SEO crawler, so you can verify a block takes effect and keep these requests out of human analytics.
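If you only have raw access logs, a similar check can be done by hand. The sketch below assumes combined-format logs and lists SISTRIX requests for anything other than /robots.txt after the time the rule was deployed. The function name, the deployment time and the sample lines are hypothetical. Crawlers re-fetch robots.txt on their own schedule, so a few hits shortly after deployment do not necessarily mean the rule is being ignored.

```python
import re
from datetime import datetime, timezone

# Extract timestamp, request path and user agent from a combined-format log line.
LOG_PATTERN = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:[A-Z]+) (?P<path>\S+)[^"]*" .* "(?P<agent>[^"]*)"\s*$'
)


def hits_after_block(log_lines, deployed_at, marker="sistrix"):
    """Return paths other than /robots.txt requested by matching agents after deployed_at."""
    paths = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match or marker not in match.group("agent").lower():
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if when >= deployed_at and match.group("path") != "/robots.txt":
            paths.append(match.group("path"))
    return paths


# Hypothetical deployment time and sample lines, for illustration only.
DEPLOYED_AT = datetime(2026, 6, 24, 10, 0, 30, tzinfo=timezone.utc)
UA = '"Mozilla/5.0 (compatible; SISTRIX Crawler; example)"'
SAMPLE_LOG = [
    f'203.0.113.5 - - [24/Jun/2026:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" {UA}',
    f'203.0.113.5 - - [24/Jun/2026:10:05:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" {UA}',
    f'203.0.113.5 - - [24/Jun/2026:10:06:00 +0000] "GET /pricing HTTP/1.1" 200 900 "-" {UA}',
]

print(hits_after_block(SAMPLE_LOG, DEPLOYED_AT))
```

An empty list over a long enough window suggests the Disallow is being honoured; a steadily growing list suggests the token in the rule does not match the crawler.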
Common mistakes
- Using an outdated SISTRIX token so the rule never matches.
- Putting Crawl-delay in a group that Googlebot reads and expecting Googlebot to apply it; Googlebot ignores the directive.
- Counting SISTRIX crawler hits as human sessions.
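The first mistake can be illustrated with a local check. This is a minimal sketch using Python's standard urllib.robotparser; the helper name is_blocked and the token ExampleOldToken are hypothetical placeholders, not identifiers SISTRIX has used. The parser's matching is an approximation of how a compliant crawler reads the file.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, token, path="/"):
    """Return True if the rules in robots_txt disallow `path` for `token`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(token, path)


# Rule that targets the token used on this page.
CURRENT_RULE = "User-agent: SISTRIX Crawler\nDisallow: /\n"
# Rule that targets a different token (hypothetical placeholder name).
STALE_RULE = "User-agent: ExampleOldToken\nDisallow: /\n"

print(is_blocked(CURRENT_RULE, "SISTRIX Crawler"))  # True: the group matches the token
print(is_blocked(STALE_RULE, "SISTRIX Crawler"))    # False: no group matches, so nothing is disallowed
```

A rule whose token does not match produces no error; it simply never applies, which is why checking the token against your logs matters.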
Privacy and accuracy notes
The rule matches the SISTRIX crawler token only. No visitor data is involved, and robots.txt is a request to compliant crawlers rather than an access control.
Related pages
- SISTRIX crawler — SISTRIXCrawler bot
The SISTRIX crawler fetches pages to build data for the SISTRIX SEO toolbox, including its visibility and on-page analyses. It is a third-party SEO tool crawler based in Germany, not a search engine. SISTRIX documents the crawler and provides guidance for operators who want to identify or restrict it.
- How to block SemrushBot in robots.txt
SemrushBot is the crawler Semrush uses to build its SEO datasets. Semrush documents several specialised sub-bots under related tokens, so this page covers the base disallow rule and explains why you may need to target multiple tokens to cover the activity you care about.
- The crawl-delay directive in robots.txt
Crawl-delay is a non-standard robots.txt directive that asks a crawler to wait between requests. Support is uneven: Google does not use it and points to Search Console instead, while Bing and Yandex have historically honoured it. This page explains the directive and the safer alternatives.
- Bot intelligence
Separate the SISTRIX crawler from genuine visitors.
Sources and verification notes
- SISTRIX: crawler and robots.txt documentation
Official SISTRIX crawler page: token, robots.txt and crawl-delay.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.