Remember the good old days when a simple robots.txt at the root of your website was enough to keep indexing robots at bay? Well, those days are over: today, AI companies’ crawlers behave like digital boors, hoovering up all your content while blithely ignoring any form of digital politeness. Worse yet, they keep changing their names and finding ways around traditional protections. In short, it’s a pain!
This is where Nepenthes comes in, a tool created by Aaron B. and named in homage to those fascinating carnivorous plants that digest their prey in a pitcher-shaped trap.
It works on the same principle as the SSH tarpit I told you about a few months ago, and it is just as effective. But rather than a simple static trap, Nepenthes creates an infinite maze designed specifically for crawlers. Each page leads to more pages, and those to even more pages, in an endless loop of randomly generated links. The crawler downloads a URL, sees links, follows them… and finds itself trapped in an endless spiral.
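To make the idea concrete, here is a minimal Python sketch of such a link maze. It is not Nepenthes’s actual code: the function names, the `/nepenthes-demo` prefix default and the choice to seed the generator from the requested path are all assumptions made for illustration. Seeding from the path means the same URL always returns the same links, so the maze looks like a stable set of pages rather than obvious random noise.

```python
import hashlib
import random


def maze_links(path, prefix="/nepenthes-demo", count=5):
    """Return `count` pseudo-random child URLs for the requested `path`.

    Hypothetical helper: the seed is derived from the path, so a given
    URL always produces the same set of outgoing links.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    # 64-bit tokens give an effectively unbounded number of distinct pages.
    return [f"{prefix}/{rng.getrandbits(64):016x}" for _ in range(count)]


def maze_page(path, prefix="/nepenthes-demo", count=5):
    """Render a tiny HTML page whose only content is links deeper into the maze."""
    items = "\n".join(
        f'<li><a href="{url}">{url}</a></li>'
        for url in maze_links(path, prefix, count)
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"


if __name__ == "__main__":
    print(maze_page("/nepenthes-demo/start"))
```

Every link on a generated page points to another page built the same way, so a crawler that follows links never runs out of URLs to fetch. A real tarpit would also answer slowly on purpose, which this sketch leaves out.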
Here is an example nginx configuration for deploying Nepenthes:
location /nepenthes-demo/ {
    proxy_pass http://localhost:8893;
    proxy_set_header X-Prefix '/nepenthes-demo';
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_buffering off;
}
Be careful, though: this tool is deliberately malicious. Only deploy it if you fully understand the implications, because legitimate crawlers (Google, Bing…) can get caught too, which could hurt your SEO.
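One possible way to limit that risk is to disallow the trap’s path in robots.txt, so that crawlers which respect the file never enter the maze. The Python sketch below uses the standard library’s `urllib.robotparser` to check such a rule. The robots.txt content, the `example.com` URLs and the `is_allowed` helper are assumptions made for illustration, and the check only says what a well-behaved crawler should do, not what any given bot actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps compliant crawlers out of the trap path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /nepenthes-demo/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/blog/post",
                "https://example.com/nepenthes-demo/abc"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Crawlers that ignore robots.txt will still walk into the maze, which is the point of the tool.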
If you want to see what it looks like, you can test Nepenthes in action on this demo page. It’s deliberately slow, because the whole idea is to slow these crawlers down! And to deploy your own instance, head to the project page, which explains everything.
Honestly, even if it’s a bit tricky to deploy, I find Nepenthes a good and very creative answer to the growing problem of out-of-control crawlers. Its creator describes it as “a work of art born from rage at the evolution of the internet into a panopticon of monetary extraction”.
So, who’s going to set this up?