
16 Jul 2025, 6:57 p.m.
Once upon a time, Marco Moock <mm@dorfdsl.de> said:
> Place a link to a file that is hidden to normal people. Exclude the directory via robots.txt.
> Then use fail2ban to block all IP addresses that poll the file.
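
For illustration, the detection half of that suggestion could be sketched roughly as below. This is only a sketch under assumptions: the trap path `/trap/do-not-fetch.html`, the function name `trap_hits`, and the common/combined access-log layout are all hypothetical and not taken from the thread. It only collects the addresses that fetched the trap file; the actual banning would still be left to fail2ban or a firewall.

```python
import re

# Hypothetical trap path (an assumption for this sketch). The directory
# would also be excluded for well-behaved crawlers in robots.txt, e.g.:
#   User-agent: *
#   Disallow: /trap/
TRAP_PATH = "/trap/do-not-fetch.html"

# Start of a common/combined log format line: client address, two
# fields, the bracketed timestamp, then the quoted request line.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)'
)


def trap_hits(lines, trap_path=TRAP_PATH):
    """Return the set of client addresses that requested the trap path."""
    offenders = set()
    for line in lines:
        m = LOG_LINE.match(line)
        # Ignore any query string when comparing against the trap path.
        if m and m.group("path").split("?", 1)[0] == trap_path:
            offenders.add(m.group("ip"))
    return offenders


# Made-up sample log lines using documentation-only address ranges.
sample = [
    '203.0.113.7 - - [16/Jul/2025:18:57:00 +0000] "GET /trap/do-not-fetch.html HTTP/1.1" 200 12',
    '198.51.100.4 - - [16/Jul/2025:18:57:02 +0000] "GET /index.html HTTP/1.1" 200 512',
    '203.0.113.9 - - [16/Jul/2025:18:57:05 +0000] "GET /trap/do-not-fetch.html?x=1 HTTP/1.1" 200 12',
]

print(sorted(trap_hits(sample)))
```

A single hit is enough to flag an address here, since no legitimate visitor or robots.txt-respecting crawler should ever request the trap file.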
The problem with a lot of the "AI" scrapers is that they're apparently using botnets and will often only make a single request from a given IP address, so reactive blocking doesn't work (and can cause other issues, like trying to block 100,000 IPs, which fail2ban, for example, doesn't really handle well).

-- 
Chris Adams <cma@cmadams.net>