ai.txt, TDM reservation, and llms.txt

Beyond robots.txt, several conventions aim to express AI and machine-use preferences: ai.txt proposals, text-and-data-mining (TDM) reservation signals tied to EU copyright law, and llms.txt. Adoption and legal weight vary and are still settling, so this page describes the intent without overclaiming enforcement.

Partially verified

The conventions in brief

Several signals have been proposed to express AI-related preferences. ai.txt has been floated as a robots.txt-style file specifically for AI usage. Text-and-data-mining (TDM) reservation is a rights-reservation concept rooted in EU copyright law (Article 4 of Directive (EU) 2019/790), under which rightsholders can reserve their works against certain data-mining uses, including by machine-readable means. llms.txt is a community proposal for a content map aimed at language models.

These differ in origin and intent: some are informal proposals, while TDM reservation is tied to a legal framework. None is a single, universally adopted standard.
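As one illustration of what a machine-readable reservation can look like, the sketch below reads TDM-related HTTP response headers. It assumes the header names tdm-reservation and tdm-policy as described in the W3C community group's TDM Reservation Protocol (TDMRep) report; those names are not confirmed by this page, so check them against the current text of that report before relying on them. The function name read_tdm_signal is hypothetical.

```python
from typing import Mapping, Optional, TypedDict


class TdmSignal(TypedDict):
    reserved: Optional[bool]  # True, False, or None when no usable signal
    policy: Optional[str]     # URL of a policy document, if one is given


def read_tdm_signal(headers: Mapping[str, str]) -> TdmSignal:
    """Interpret TDM reservation headers from an HTTP response.

    Assumption: header names follow the TDMRep report
    (`tdm-reservation` with value 1 or 0, optional `tdm-policy`).
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    raw = lowered.get("tdm-reservation")
    if raw == "1":
        reserved: Optional[bool] = True
    elif raw == "0":
        reserved = False
    else:
        reserved = None  # header absent or value not recognised

    return {"reserved": reserved, "policy": lowered.get("tdm-policy")}


if __name__ == "__main__":
    example = {
        "TDM-Reservation": "1",
        "TDM-Policy": "https://example.com/tdm-policy.json",
    }
    print(read_tdm_signal(example))
```

A result of None for reserved only means no recognised signal was found in the headers; it says nothing about whether rights are reserved by other means.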

Uncertain status — do not overclaim

Adoption and enforceability vary widely and are still evolving. robots.txt remains the most broadly honoured technical signal for crawl control. The newer conventions may carry weight as stated preferences or, in the TDM case, as a legal rights reservation in some jurisdictions, but there is no guarantee that any given crawler or party respects them.

Treat these as supplementary expressions of intent. Where you need a real technical limit, pair robots.txt with authentication, since robots.txt on its own only asks crawlers to stay away; where rights matter, seek qualified legal advice rather than relying on a file alone.
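Because robots.txt is the baseline signal, it can help to confirm what your own rules say for a given crawler before layering other files on top. The following is a minimal sketch using Python's standard urllib.robotparser; the crawler token ExampleAIBot, the sample rules, and the helper name is_allowed are hypothetical, and the result reflects only what the rules state, not how any real crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one AI crawler token, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/guide"
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

A check like this verifies the published rules only. Content that must not be fetched still needs authentication in front of it.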

How it appears in analytics and logs

These conventions signal preferences for AI and data-mining use. Their presence does not change crawl permissions and does not guarantee any tool or party honours them.
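One way to see these files in your own logs is to count which user agents request them. The sketch below assumes access logs in the common "combined" format and assumes the files live at /robots.txt, /ai.txt, and /llms.txt; the sample log lines, the ExampleAIBot agent, and the function name count_policy_file_fetches are all made up for illustration. A fetch shows only that a client requested the file, not that it honours what the file says.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed locations of the preference files (site root).
WATCHED_PATHS = {"/robots.txt", "/ai.txt", "/llms.txt"}

# Matches the request, status, size, referer, and user-agent fields
# of a combined-format access log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_policy_file_fetches(lines: Iterable[str]) -> Counter:
    """Count requests for the watched files, keyed by (path, user agent)."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        # Ignore any query string when comparing paths.
        path = match.group("path").split("?", 1)[0]
        if path in WATCHED_PATHS:
            counts[(path, match.group("agent"))] += 1
    return counts


# Fabricated sample lines using documentation-reserved IP addresses.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"',
    '203.0.113.5 - - [01/Jan/2026:00:00:02 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleAIBot/1.0"',
    '198.51.100.7 - - [01/Jan/2026:00:00:03 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [01/Jan/2026:00:00:04 +0000] "GET /llms.txt?v=2 HTTP/1.1" 200 512 "-" "ExampleAIBot/1.0"',
]

if __name__ == "__main__":
    for (path, agent), hits in count_policy_file_fetches(SAMPLE_LOG).items():
        print(path, agent, hits)
```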

Diagnostic use case

Understand the emerging AI-control conventions that sit alongside robots.txt, and decide whether to publish them while treating their effect as uncertain.

What WebmasterID can help detect

WebmasterID shows which AI crawlers and assistants reach your pages, so you can observe real activity regardless of whether any party honours these emerging conventions.

Privacy and accuracy notes

Like robots.txt, these files are public. They express usage preferences, not access control, and involve no visitor data.

Sources and verification notes

Last reviewed 2026-06-24. Facts are checked against primary/official sources where available; uncertain specifics are marked “Data not yet verified” rather than guessed.