How to block Sogou Spider
Sogou Spider is the web crawler for Sogou, a Chinese search engine. This page shows how to disallow it in robots.txt using its documented user-agent tokens, explains what blocking does and does not affect, and describes how to confirm the rule is honoured.
robots.txt rule
Sogou's crawler identifies itself with user-agent tokens beginning with Sogou (for example a web-spider token). To disallow it site-wide, target the token in its own group:
User-agent: Sogou web spider
Disallow: /
Verify the exact token from your own access logs before committing: Sogou uses more than one crawler token, so a rule written for one token may miss the others. Match on the documented token, not a full version string.
- Group targets the Sogou crawler token only
- Confirm the exact token(s) from your logs
- Leaves Google, Bing, and other engines unaffected
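One way to sanity-check the group before deploying it is to parse the draft with Python's standard urllib.robotparser module. The sketch below is illustrative only: the Googlebot and bingbot names, the example.com URL, and the is_allowed helper are assumptions made for the demo. The standard-library parser matches tokens by case-insensitive substring, which may differ from how Sogou itself matches, so treat the result as a syntax check rather than proof of crawler behaviour.

```python
from urllib.robotparser import RobotFileParser

# Draft rule from this page; swap in the token(s) confirmed from your logs.
ROBOTS_TXT = """\
User-agent: Sogou web spider
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the parsed robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some/page"  # placeholder URL (assumption)
    for agent in ("Sogou web spider", "Googlebot", "bingbot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(agent, verdict)
```

If the Sogou token reports as blocked while the other names report as allowed, the group is at least scoped the way the list above describes.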
What blocking does and does not do
A Disallow asks compliant crawlers to stop fetching; it does not remove already-indexed pages and is not a firewall. If you need pages dropped from a search index, use noindex on a page the crawler can still read. A robots.txt block stops the crawler from fetching the page, so it never sees the noindex directive.
Because any client can send the Sogou user agent, treat the user agent as a claim. If hits persist from outside expected networks, the source may be a non-compliant scraper rather than Sogou itself.
How it appears in analytics and logs
Continued Sogou Spider hits after a Disallow usually mean a token mismatch, a not-yet-refreshed robots.txt cache, or a non-compliant client copying the Sogou user agent.
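To tell a token mismatch apart from the other causes, it can help to list which Sogou-like user-agent strings actually appear in your logs. The following is a minimal sketch, assuming access logs in the common "combined" format where the user agent is the last double-quoted field on each line. The function name, the sample lines, and the version suffix in the sample are invented for illustration and are not real traffic.

```python
import re
from collections import Counter

# Assumes combined log format: the user agent is the last quoted field.
LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def sogou_user_agents(lines):
    """Count each distinct user-agent string that mentions 'sogou'."""
    counts = Counter()
    for line in lines:
        match = LAST_QUOTED.search(line)
        if match and "sogou" in match.group(1).lower():
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines (hypothetical), not real traffic.
    sample = [
        '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Sogou web spider/4.0"',
        '203.0.113.6 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    ]
    for agent, hits in sogou_user_agents(sample).most_common():
        print(hits, agent)
```

Any string in the output whose leading token differs from the one in your robots.txt group is a candidate for the mismatch described above.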
Diagnostic use case
Reduce crawl load from Sogou Spider when your audience is not in Sogou's Chinese market, or keep Sogou from crawling specific sections, without affecting Google or Bing.
What WebmasterID can help detect
WebmasterID records Sogou Spider hits as search-bot events, so after adding a Disallow you can watch whether the crawler's activity actually tapers — the practical signal that the rule is being honoured.
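The same tapering signal can be approximated from raw server logs. This is a rough sketch that does not depend on any analytics product; it assumes combined-log timestamps of the form [24/Jun/2026:10:15:00 +0000], English month abbreviations (the default "C" locale), and a hypothetical helper name.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes combined-log timestamps such as [24/Jun/2026:10:15:00 +0000].
TIMESTAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def daily_sogou_hits(lines):
    """Return {date: hit count} for log lines that mention 'sogou'."""
    counts = Counter()
    for line in lines:
        if "sogou" not in line.lower():
            continue
        match = TIMESTAMP.search(line)
        if match:
            # %b parses English month names under the default C locale.
            day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
            counts[day] += 1
    return dict(sorted(counts.items()))
```

Comparing the counts for the days before and after the rule went live gives a simple view of whether activity is falling. Allow for the crawler's robots.txt cache to refresh before drawing conclusions.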
Common mistakes
- Blocking one Sogou token while another keeps crawling.
- Expecting a Disallow to de-index pages already in Sogou.
- Trusting the Sogou user agent without considering spoofing.
Privacy and accuracy notes
Blocking Sogou Spider concerns a crawler, not a person. The rule matches a user-agent token and involves no visitor data; robots.txt is a request, not an access control.
Related pages
- Sogou Spider — Sogou's web crawler
Sogou Spider is the crawler for Sogou, a Chinese search engine. Its user agent contains the Sogou identifier. English-language documentation is limited and verification options are not well published, so this entry is marked partially verified.
- How to control Baiduspider in robots.txt
Baiduspider is the crawler for Baidu, the dominant search engine in China. You can target it with the Baiduspider token in robots.txt. Blocking it removes you from Baidu over time, which chiefly matters for sites serving Chinese-language or China-based audiences.
- User-agent groups and matching in robots.txt
robots.txt rules are organised into user-agent groups. A crawler does not combine every group — it selects the single most specific group whose token matches its name, falling back to the * group only when no named group matches. Understanding this prevents rules that never apply.
- Bot intelligence
See whether Sogou Spider activity falls after you add a robots.txt rule.
Sources and verification notes
- Sogou — webmaster help (crawler identification)
Sogou publishes crawler help; exact token set should be confirmed from logs.
- Google — How Google interprets robots.txt
General robots.txt syntax and user-agent matching.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.