How to block magpie-crawler

magpie-crawler is a web crawler associated with Brandwatch's social and web monitoring platform, which gathers public content for brand and media analysis. This page shows the robots.txt token to target, what the crawler does, and why a Disallow rule only affects crawlers that choose to comply with it.

Partially verified

What magpie-crawler is

magpie-crawler is a web crawler associated with Brandwatch, a social and web monitoring platform that collects public content for brand listening and media analysis. Operators who do not want their pages folded into that monitoring can disallow it.

Match on the documented magpie-crawler user-agent token rather than a version string. The user agent is self-identifying and contains a URL pointing at the operator.
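As a minimal sketch of token-based matching, the helper below tests a user-agent string for the magpie-crawler token and ignores whatever version follows it. The function name and the sample user-agent strings are hypothetical illustrations, not verified values sent by the crawler.

```python
import re

# Match the product token itself; any "/<version>" suffix is ignored on purpose,
# so a version bump on the crawler side does not break the match.
MAGPIE_TOKEN = re.compile(r"\bmagpie-crawler\b", re.IGNORECASE)


def is_magpie_crawler(user_agent: str) -> bool:
    """Return True when the user-agent string contains the magpie-crawler token."""
    return bool(MAGPIE_TOKEN.search(user_agent or ""))


# Hypothetical sample strings for illustration only.
samples = [
    "magpie-crawler/1.1 (+http://example.invalid/bot)",
    "magpie-crawler/9.9 (+http://example.invalid/bot)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0",
]
for ua in samples:
    print(is_magpie_crawler(ua), ua)
```

Matching on the token keeps a filter or log query stable when the version part of the string changes.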

robots.txt rule

To ask magpie-crawler to stay off your site:

User-agent: magpie-crawler
Disallow: /

This targets only that token and leaves search and AI crawlers unaffected. robots.txt is honoured only by compliant crawlers and is not an enforcement mechanism, so check your server logs afterwards to confirm that the crawler actually backed off.
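One way to check the rule before deploying it is to parse it with Python's standard-library urllib.robotparser and ask which agents it blocks. This is a sketch under assumptions: the helper name and the example.com URL are hypothetical, and the standard-library parser only approximates how any particular crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rule from this page, with each directive on its own line.
ROBOTS_TXT = """\
User-agent: magpie-crawler
Disallow: /
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` according to `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# example.com is a placeholder host.
print(allowed(ROBOTS_TXT, "magpie-crawler", "https://example.com/page"))
print(allowed(ROBOTS_TXT, "Googlebot", "https://example.com/page"))
```

The first call reports the crawler as disallowed and the second reports another agent as still allowed. This confirms the rule is scoped to the one token; it does not show that the crawler obeys it.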

How it appears in analytics and logs

Requests carrying the magpie-crawler token are monitoring-crawler events, not human visits. They indicate a brand/media-intelligence platform is collecting your public content; classify them as bot traffic.

Diagnostic use case

Keep a brand-monitoring crawler from harvesting your public pages for social/media analytics and confirm the rule reached the correct token.
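The confirmation step can be sketched as a small log check that lists magpie-crawler requests made after the rule went live. It assumes access logs in Combined Log Format; the function name, the sample log lines, and the cut-off date are hypothetical. Requests for /robots.txt are skipped because a compliant crawler still has to fetch that file.

```python
import re
from datetime import datetime, timezone

# Combined Log Format (assumed):
# host - - [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def hits_after(lines, rule_added, token="magpie-crawler"):
    """Return (timestamp, request) pairs for `token` at or after `rule_added`."""
    hits = []
    for line in lines:
        m = LINE.search(line)
        if not m or token not in m["ua"].lower():
            continue
        parts = m["req"].split()
        # Fetching robots.txt itself is expected from a compliant crawler.
        if len(parts) >= 2 and parts[1] == "/robots.txt":
            continue
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        if ts >= rule_added:
            hits.append((ts, m["req"]))
    return hits


# Hypothetical log lines and deployment date for illustration only.
sample_log = [
    '203.0.113.7 - - [01/Jun/2026:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "magpie-crawler/1.1 (+http://example.invalid/bot)"',
    '203.0.113.7 - - [05/Jun/2026:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "magpie-crawler/1.1 (+http://example.invalid/bot)"',
    '203.0.113.7 - - [06/Jun/2026:09:30:00 +0000] "GET /b HTTP/1.1" 200 900 "-" "magpie-crawler/1.1 (+http://example.invalid/bot)"',
]
rule_added = datetime(2026, 6, 3, tzinfo=timezone.utc)
for ts, req in hits_after(sample_log, rule_added):
    print(ts.isoformat(), req)
```

An empty result over a long enough window suggests the crawler backed off. Any remaining hits mean the rule is not being honoured, or the user-agent string is being sent by something else.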

What WebmasterID can help detect

WebmasterID classifies magpie-crawler server-side as a crawler and shows whether it keeps reaching your pages after a robots.txt rule is added.

Common mistakes

Privacy and accuracy notes

Blocking magpie-crawler uses only the request user-agent token. No visitor identity is involved, and WebmasterID records the crawl as a bot event separate from human analytics.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.