How to block the BinaryEdge scanner

BinaryEdge runs internet-wide scans that catalogue exposed services and web properties for its attack-surface and threat-intelligence datasets. Where it crawls web content with a declared token, robots.txt can ask it to stop. Much internet-wide scanning, however, operates below the HTTP-courtesy layer, so a firewall rule is usually the real control. This page covers both.

Partially verified

What this means

BinaryEdge builds datasets about exposed internet services and web properties for security and attack-surface monitoring. When it fetches web content with a declared crawler token, robots.txt can ask it to stop. But a large part of internet-wide scanning happens at the service level — probing ports and endpoints — which does not consult robots.txt at all.

So treat a robots.txt block as covering the courteous, token-carrying web crawl, while accepting that broad scanning of public IP space is a separate matter handled at the network layer.

How to block it

Target the BinaryEdge crawler token in its own group:

User-agent: BinaryEdge
Disallow: /

Then confirm in your logs whether token-carrying requests stop. For scanning that continues without honouring robots.txt, the appropriate control is a firewall or WAF rule, optionally rate-limiting or blocking the offending sources. robots.txt is a request to compliant crawlers and is never an access-control mechanism for security scanners.
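Before relying on the rule, it can help to check that the robots.txt group parses the way you expect. The following is a minimal sketch using Python's standard-library urllib.robotparser. The URL https://example.com/page and the second agent name SomeOtherBot are placeholders chosen for illustration, and the check only shows how a compliant parser reads the file, not how BinaryEdge itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The group from this page; the token "BinaryEdge" is the one targeted above.
ROBOTS_TXT = """\
User-agent: BinaryEdge
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant parser would let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL and a hypothetical second crawler name, for illustration only.
blocked = is_allowed(ROBOTS_TXT, "BinaryEdge", "https://example.com/page")
others = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page")

print("BinaryEdge allowed:", blocked)
print("SomeOtherBot allowed:", others)
```

Because the group names only the BinaryEdge token, other crawlers are unaffected by it, which is the point of giving the token its own group.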

How it appears in analytics and logs

A request carrying a BinaryEdge token is an attack-surface or threat-intelligence scan, not a human visit, and should be counted as bot traffic. Internet-wide scanners often probe services directly without sending the token, so an absence of token-carrying requests does not prove that you are not being scanned.
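One way to see whether token-carrying requests stop after the robots.txt change is to count them per day in the access log. The sketch below assumes the common Combined Log Format and a case-insensitive substring match on the user-agent field. The sample log lines, IP addresses (documentation ranges) and the user-agent string "ExampleScanner (BinaryEdge)" are invented for illustration and are not the scanner's verified user-agent.

```python
import re
from collections import Counter

TOKEN = "BinaryEdge"  # token targeted by the robots.txt group

# Assumption: Combined Log Format. Adjust the pattern for other log layouts.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_token_hits(lines, token=TOKEN):
    """Count requests per day whose user-agent contains the token."""
    per_day = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and token.lower() in match.group("agent").lower():
            per_day[match.group("day")] += 1
    return per_day


# Hypothetical sample lines; real user-agent strings may look different.
SAMPLE = [
    '192.0.2.10 - - [10/Jun/2026:08:15:02 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleScanner (BinaryEdge)"',
    '198.51.100.7 - - [10/Jun/2026:09:01:44 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '192.0.2.10 - - [11/Jun/2026:08:15:02 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "ExampleScanner (BinaryEdge)"',
]

hits = count_token_hits(SAMPLE)
for day, count in sorted(hits.items()):
    print(day, count)
```

A daily count that falls to zero suggests the courteous crawl has stopped. It says nothing about service-level probing, which does not send the token and has to be judged from firewall or WAF data instead.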

Diagnostic use case

Ask BinaryEdge's web crawler to skip your pages, and decide when a firewall rule is the correct control for scanning that ignores robots.txt.

What WebmasterID can help detect

WebmasterID classifies scanning crawlers server-side, so you can see BinaryEdge web-crawl activity and judge whether robots.txt is enough or a firewall rule is needed.

Common mistakes

Privacy and accuracy notes

Blocking BinaryEdge through robots.txt relies only on the request user-agent token. WebmasterID does not expose human identity or raw IP addresses as a feature, and it records the scan as a bot event, separate from human analytics.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.