PanguBot — Huawei's AI crawler
PanguBot is a crawler reported in third-party crawler directories and operator logs as associated with Huawei's Pangu large-model effort. The robots.txt token PanguBot is observed and catalogued, but Huawei publishes limited official operator documentation, so this entry identifies it by token while marking unverifiable specifics as such rather than inventing them.
What this means
PanguBot is the robots.txt token reported for a crawler associated with Huawei's Pangu large-model work. It appears in independent crawler directories and in operator logs, which is how the token is catalogued.
Huawei does not publish the kind of detailed operator documentation that OpenAI or Google provide for their crawlers, so we identify PanguBot by its token but stop short of asserting specifics we cannot source. That is why this entry is partially verified rather than verified.
How PanguBot identifies itself
PanguBot uses the robots.txt user-agent token PanguBot, which is the stable identifier to match on. We do not assert an exact full user-agent string, version, or IP range, because Huawei does not publish verifiable operator material for it, and inventing those details would break our no-fabrication rule.
Because any client can copy a user-agent token and no official IP range is published, treat PanguBot identification as a claim you cannot fully verify. Classify requests by token and keep the resulting policy conservative.
- robots.txt token: PanguBot (catalogued in crawler directories)
- Associated with Huawei's Pangu large-model effort
- No published IP ranges — token identification only, not full verification
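Token matching as described above can be sketched in a few lines of Python. This is a minimal illustration, not an official detection method: the function name is hypothetical, and the case-insensitive substring match is an assumption, because the full user-agent string PanguBot sends is not published. A match says only that the request claims the token.

```python
from typing import Optional


def has_pangubot_token(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header value contains the PanguBot token.

    Assumption: the token appears somewhere in the header value, in any case.
    A True result is a claim made by the client, not proof of origin.
    """
    if not user_agent:
        return False
    return "pangubot" in user_agent.lower()


if __name__ == "__main__":
    # Both strings are made-up examples, not observed PanguBot user agents.
    print(has_pangubot_token("ExampleAgent/0.0 (compatible; PanguBot)"))
    print(has_pangubot_token("Mozilla/5.0 (X11; Linux x86_64)"))
```

Keeping the check this narrow is deliberate: it classifies by the one identifier that is catalogued and leaves IP ranges and version strings out, since none are published.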
robots.txt considerations
If PanguBot honours robots.txt, you can request it stay out with:
User-agent: PanguBot
Disallow: /
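Before deploying the rule, you can check offline that it says what you intend. The sketch below uses Python's standard urllib.robotparser; the helper name is_allowed, the example.com URL, and the second agent name ExampleOtherBot are placeholders chosen for illustration. It confirms only how a standards-following parser reads the rule, not how PanguBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule from this entry, written as robots.txt expects it: one directive per line.
ROBOTS_TXT = """\
User-agent: PanguBot
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com and ExampleOtherBot are placeholders, not real crawler data.
    print(is_allowed(ROBOTS_TXT, "PanguBot", "https://example.com/any/page"))
    print(is_allowed(ROBOTS_TXT, "ExampleOtherBot", "https://example.com/any/page"))
```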
Whether a given crawler complies can only be confirmed by observing whether the disallowed paths stop being fetched. robots.txt is a request to compliant crawlers, not enforcement, and compliance for this token is not independently documented here.
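Observing compliance comes down to asking whether requests carrying the token still reach disallowed paths after the rule went live. The following is a rough sketch under stated assumptions: the Hit record, the function name, and the 24-hour grace window (to allow for a crawler working from a cached robots.txt) are all illustrative choices, and your logs would need to be parsed into this shape first.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One parsed log entry (hypothetical shape for this sketch)."""
    when: datetime
    user_agent: str
    path: str


def hits_after_disallow(
    hits: Iterable[Hit],
    rule_published: datetime,
    disallowed_prefix: str = "/",
    token: str = "PanguBot",
    grace: timedelta = timedelta(hours=24),  # assumed allowance for cached robots.txt
) -> List[Hit]:
    """Return token-bearing hits on disallowed paths after the grace window."""
    cutoff = rule_published + grace
    needle = token.lower()
    return [
        h
        for h in hits
        if h.when > cutoff
        and needle in h.user_agent.lower()
        and h.path.startswith(disallowed_prefix)
    ]
```

An empty result is consistent with compliance but does not prove it, since the crawler may simply not have visited in the window. A non-empty result shows that something carrying the token ignored the rule, which, given that tokens can be copied, is still not proof of who sent it.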
How it appears in analytics and logs
A request carrying the PanguBot token should be read as a crawler associated with Huawei's Pangu models fetching a URL: a bot event, not a human visit. Because official operator docs are limited, treat the token as identification, not full proof of origin.
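To keep such requests out of human traffic counts, a log pass can separate token-bearing lines from the rest. This sketch assumes the common "combined" access-log layout; the regular expression, the function name, and the sample line shown in the comment are illustrative, and the user agent in that sample is invented rather than an observed PanguBot string.

```python
import re
from typing import Iterable, List, Tuple

# Assumed "combined" log layout:
# host ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
# e.g. 203.0.113.7 - - [24/Jun/2026:10:00:00 +0000] "GET /docs HTTP/1.1" 200 512 "-" "ExampleAgent PanguBot"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)


def split_events(lines: Iterable[str]) -> Tuple[List[str], int]:
    """Return (paths fetched with the PanguBot token, count of other parsed requests)."""
    bot_paths: List[str] = []
    other = 0
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        if "pangubot" in match.group("ua").lower():
            bot_paths.append(match.group("path"))
        else:
            other += 1
    return bot_paths, other
```

The second value counts every other parsed request, not confirmed humans; other bots would still need their own classification.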
Diagnostic use case
Identify the PanguBot token in logs and set robots.txt policy for it, while treating unpublished specifics (IP ranges, exact behaviour) as not verified.
What WebmasterID can help detect
WebmasterID classifies the PanguBot token server-side as an AI crawler and shows its activity on the bot-intelligence surface, so you can see this crawler's coverage without parsing logs.
Common mistakes
- Asserting PanguBot's IP ranges or exact UA string — Huawei does not publish them.
- Assuming robots.txt compliance without observing whether fetches of disallowed paths actually stop.
- Counting PanguBot crawl hits as human traffic.
Privacy and accuracy notes
Detection uses only the request user-agent token. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics, and never as a visitor profile.
Related pages
- Monitoring for new AI crawlers
New AI crawlers appear regularly, often with tokens you have never seen. Monitoring for them means surfacing unfamiliar bot-like user agents, checking each against the operator's documentation before deciding policy, and resisting both reflexive blocking and reflexive trust. The aim is a deliberate, sourced decision for each new token rather than a static, stale allow/block list.
- Verifying AI crawlers
Any client can copy a user-agent string, so a token alone is a claim, not proof. Some vendors, such as OpenAI for GPTBot, publish IP ranges or verification guidance; many do not. Verify before trusting, and never invent IP ranges to fill the gap.
- Geographic patterns in AI crawl traffic
AI crawl traffic often originates from a small set of cloud regions where the operator runs infrastructure. The coarse edge region of a request is not the operator's headquarters and not a person's location; it reflects where the crawl is hosted. A privacy-respecting reading of crawl geography treats region as a coarse infrastructure estimate, never a precise or personal one.
- Bot intelligence
Surface lesser-documented crawler tokens like PanguBot, server-side.
Sources and verification notes
- Dark Visitors — AI crawler directory (PanguBot)
Third-party directory cataloguing the PanguBot token; official Huawei operator docs are limited.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.