Anthropic’s AI crawler ignores websites’ anti-scraping policies.

TLDR:

Anthropic’s web crawler, ClaudeBot, has been scraping websites such as iFixit despite policies that prohibit scraping for AI training. iFixit CEO Kyle Wiens called out Anthropic for violating iFixit’s Terms of Use and consuming the site’s resources. Although Anthropic says its crawler respects robots.txt files, the crawler’s behavior has raised concerns among website owners and, in some cases, led to site outages.

In a single 24-hour period, ClaudeBot accessed iFixit’s website almost a million times, prompting a response from iFixit’s CEO. iFixit’s Terms of Use explicitly prohibit reproducing or distributing its content for AI training, yet Anthropic’s crawler continued to scrape the site. Other website owners have reported the same behavior, which suggests a pattern of disregard for anti-scraping policies.

Some AI companies, like OpenAI, rely on robots.txt files as the way for website owners to block their crawlers, but this method does not give website owners the flexibility to specify what scraping is and isn’t allowed. Perplexity, another AI company, has been known to ignore these exclusions entirely. The incident with Anthropic’s ClaudeBot highlights how difficult it is for website owners to keep their data from being used for AI training.
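
To make the robots.txt mechanism concrete, here is a minimal sketch of how a compliant crawler could evaluate a site’s exclusion rules, using Python’s standard urllib.robotparser module. The rules, the example.com URL, and the is_allowed helper are hypothetical and for illustration only; it also assumes the crawler identifies itself with the user-agent token "ClaudeBot". This is not iFixit’s actual file or Anthropic’s actual logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-guide"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "ClaudeBot", url))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", url))
```

The sketch also shows the limits described above: rules are keyed only on user agent and path, so a site cannot express that content may be indexed for search but not used for AI training. The file is also advisory, so it only works when the crawler chooses to check it.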