AI Security

AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt

Ars Technica | April 07, 2025
Last summer, Anthropic inspired backlash when its ClaudeBot AI crawler was accused of hammering websites a million or more times a day. And it wasn't the only artificial intelligence company making headlines for supposedly ignoring instructions in robots.txt files to avoid scraping web content on certain sites. Around the same time, Reddit's CEO called out all AI companies whose crawlers he said were "a pain in the ass to block," despite the tech industry otherwise agreeing to respect "no scraping" robots.txt rules.

Watching the controversy unfold was a software developer whom Ars has granted anonymity to discuss his development of malware (we'll call him Aaron). Shortly after he noticed Facebook's crawler exceeding 30 million hits on his site, Aaron began plotting a new kind of attack on crawlers "clobbering" websites, one that he told Ars he hoped would give "teeth" to robots.txt. Building on an anti-spam cybersecurity tactic known as tarpitting, he created Nepenthes, malicious software named after a carnivorous plant that will "eat just about anything that finds its way inside."

Aaron clearly warns users that Nepenthes is aggressive malware. It's not to be deployed by site owners uncomfortable with trapping AI crawlers and sending them down an "infinite maze" of static files with no exit links, where they "get stuck" and "thrash around" for months, he tells users. Once trapped, the crawlers can be fed gibberish data, aka Markov babble, which is designed to poison AI models. That's likely an appealing bonus feature for any site owners who, like Aaron, are fed up with paying for AI scraping and just want to watch AI burn.

Tarpits were originally designed to waste spammers' time and resources, but creators like Aaron have now evolved the tactic into an anti-AI weapon. As of this writing, Aaron confirmed that Nepenthes can effectively trap all the major web crawlers. So far, only OpenAI's crawler has managed to escape.
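Nepenthes itself is not reproduced here, but the basic tarpit idea described above is easy to picture as a rough sketch: a server endpoint that answers every request under some "maze" path slowly, with procedurally generated gibberish (a crude stand-in for real Markov babble) and links that only lead deeper into the maze. The Python below is purely illustrative; the /maze/ path, the word list, the two-second delay, and all function names are assumptions for this sketch, not details of Nepenthes.

```python
# Illustrative tarpit sketch (NOT Nepenthes): every maze URL returns a slow,
# deterministic page of gibberish whose links only point deeper into the maze.
import hashlib
import random
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

MAZE_PREFIX = "/maze/"  # hypothetical path the tarpit lives under
WORDS = "lorem ipsum dolor sit amet consectetur adipiscing elit".split()


def babble(seed: str, n_words: int = 200) -> str:
    """Deterministic word salad; a real tarpit would use trained Markov text."""
    rng = random.Random(hashlib.sha256(seed.encode()).hexdigest())
    return " ".join(rng.choice(WORDS) for _ in range(n_words))


def fake_links(seed: str, count: int = 10) -> str:
    """Links that only lead deeper into the maze; there is no exit link."""
    rng = random.Random(seed)
    return "".join(
        f'<a href="{MAZE_PREFIX}{rng.getrandbits(64):x}">more</a> '
        for _ in range(count)
    )


class Tarpit(BaseHTTPRequestHandler):
    def do_GET(self):
        if not self.path.startswith(MAZE_PREFIX):
            self.send_error(404)
            return
        time.sleep(2)  # drip-feed the response to waste crawler time
        body = f"<html><body><p>{babble(self.path)}</p>{fake_links(self.path)}</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body.encode())


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Tarpit).serve_forever()
```

A real deployment would presumably serve statistically plausible Markov text rather than a fixed word list, and would be linked only from locations that a crawler respecting robots.txt would never reach, so that well-behaved bots are unaffected.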
Related Articles
Critical Vulnerability Discovered in Popular AI Development Framework
A critical vulnerability in DeepLearn AI framework could allow attackers to...
October 24, 2025

3 takeaways from red teaming 100 generative AI products | Microsoft Security Blog
The growing sophistication of AI systems and Microsoft's increasing...
April 11, 2025

New hack uses prompt injection to corrupt Gemini's long-term memory
There's yet another way to inject malicious prompts into chatbots.
April 10, 2025

New Defense Against Adversarial Attacks Demonstrates 90% Effectiveness
A new defense against adversarial attacks on computer vision systems shows...
April 10, 2025

Using ChatGPT to make fake social media posts backfires on bad actors
OpenAI claims cyber threats are easier to detect when attackers use ChatGPT.
April 09, 2025