ImagesiftBot — image dataset crawler
ImagesiftBot is an image-focused web crawler associated with ImageSift (linked to Hive). Its robots.txt token is ImagesiftBot. Public documentation is limited in places, so specifics that cannot be confidently sourced are marked partially verified rather than guessed.
What this means
ImagesiftBot is an image-focused web crawler associated with ImageSift, which is linked to Hive. It appears in logs as an automated fetcher carrying the ImagesiftBot token, and it focuses on image content.
Public documentation is limited in places. This entry therefore describes the stable identification pattern and avoids asserting specifics — such as exact dataset use or scope — that cannot be confidently sourced.
How ImagesiftBot identifies itself
ImagesiftBot uses the robots.txt user-agent token ImagesiftBot. Its user-agent string contains that token together with a self-identifying URL. Match on the stable token rather than a full version string.
The user-agent string is a self-reported claim that any client can copy. Do not invent IP ranges to verify it; identify the crawler by its token and make trust-sensitive decisions conservatively.
- robots.txt token: ImagesiftBot
- Image-focused crawler associated with ImageSift/Hive
- Some specifics: not fully documented publicly
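The token match described above can be sketched in Python. This is a minimal illustration, not an official detection method: the function name `claims_imagesiftbot` and the sample user-agent strings are hypothetical, and a positive match only means the request claims to be ImagesiftBot.

```python
import re
from typing import Optional

# Match the stable token only, case-insensitively; the rest of the
# user-agent string (versions, URLs) may change over time.
IMAGESIFT_TOKEN = re.compile(r"imagesiftbot", re.IGNORECASE)


def claims_imagesiftbot(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header contains the ImagesiftBot token.

    The header is a claim that any client can copy, so this identifies
    the stated crawler; it does not verify the sender.
    """
    return bool(user_agent) and IMAGESIFT_TOKEN.search(user_agent) is not None


# Hypothetical sample strings, for illustration only.
bot_ua = "Mozilla/5.0 (compatible; ImagesiftBot; +https://example.com/bot)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

print(claims_imagesiftbot(bot_ua))
print(claims_imagesiftbot(browser_ua))
```

Matching on the token alone keeps the check stable if the surrounding string changes, at the cost of also matching any client that copies the token.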
robots.txt considerations
To disallow ImagesiftBot site-wide:
User-agent: ImagesiftBot
Disallow: /
Because the crawler's compliance is not fully documented, treat the rule as a request rather than a guarantee. In any case, robots.txt is never an access-control boundary.
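To confirm that the site-wide disallow rule parses as intended, it can be checked with Python's standard `urllib.robotparser`. This is a sketch under stated assumptions: the helper name `is_allowed` and the example URL are hypothetical, and the check shows only how a standards-following parser reads the file, not how ImagesiftBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The site-wide disallow rule for ImagesiftBot, one directive per line.
ROBOTS_TXT = """\
User-agent: ImagesiftBot
Disallow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URL, for illustration only.
url = "https://example.com/images/photo.jpg"
print(is_allowed(ROBOTS_TXT, "ImagesiftBot", url))  # the rule applies to this token
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", url))  # no rule names this agent
```

A parser-level check like this catches formatting mistakes, such as both directives ending up on a single line, before the file is deployed.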
How it appears in analytics and logs
A request carrying the ImagesiftBot token indicates an image-focused dataset crawler fetching a URL: a bot event, not a human visit. Identify it by the token and treat undocumented specifics conservatively.
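Separating these bot events from other traffic in raw logs can be sketched as follows. The sketch assumes the user agent is the last double-quoted field on each line, as in the common "combined" access-log layout; the function name `tally_events`, the sample lines, and their addresses (taken from documentation-reserved ranges) are hypothetical.

```python
import re
from collections import Counter

# Assumption: the user agent is the last double-quoted field on the line.
# Adjust the pattern for other log formats.
LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def tally_events(log_lines):
    """Count lines whose user agent carries the ImagesiftBot token vs. all others."""
    counts = Counter()
    for line in log_lines:
        match = LAST_QUOTED.search(line)
        user_agent = match.group(1) if match else ""
        if "imagesiftbot" in user_agent.lower():
            counts["imagesiftbot"] += 1  # bot event, keep out of human analytics
        else:
            counts["other"] += 1
    return counts


# Hypothetical log lines; the IP addresses are documentation placeholders,
# not addresses used by any real crawler.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2025:00:00:00 +0000] "GET /img/a.jpg HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; ImagesiftBot; +https://example.com/bot)"',
    '198.51.100.9 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

print(tally_events(SAMPLE))
```

The "other" bucket is not a human count: it still contains every other crawler and automated client, so it would need further classification before being read as sessions.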
Diagnostic use case
Identify ImagesiftBot in logs by its token and set robots.txt policy for the image-focused dataset crawler.
What WebmasterID can help detect
WebmasterID classifies ImagesiftBot server-side by its token and surfaces it on the bot-intelligence surface, so you can see its activity per page without parsing logs.
Common mistakes
- Presenting assumed behaviour as documented fact where public docs are sparse.
- Inventing IP ranges to verify ImagesiftBot.
- Counting crawler hits as human sessions.
Privacy and accuracy notes
Detection uses only the request user-agent. No human identity is involved. WebmasterID records the crawl as a bot event, separate from human analytics, and never attaches it to a visitor profile.
Related pages
- CCBot — Common Crawl crawler
CCBot is the crawler operated by Common Crawl to build its open, freely available web dataset. That dataset is widely reused as a training source by many AI projects. Common Crawl documents the crawler and its robots.txt token, and CCBot honours robots.txt.
- Omgilibot — Webz.io data crawler
Omgilibot is a web data crawler operated by Webz.io, also seen under the omgili name. Its robots.txt token is omgilibot. Public documentation is limited in places, so specifics that cannot be confidently sourced are marked partially verified rather than guessed.
- Bot intelligence
Deterministic categorisation of crawlers, search bots, and automation.
Sources and verification notes
- ImageSift — crawler reference (token observed)
Token ImagesiftBot is observed; comprehensive official docs are limited, so some specifics are marked partially verified.
Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.