
As Chris states, broad IP-based blocking is unlikely to be very effective, and likely to be more problematic down the line anyway. The slightly more 'honorable' crawlers will respect robots.txt, and you can block their UAs there. Fail2ban is a very good option right now. It will be even better if nepenthes eventually integrates with it. Then you can have some real fun.

On Wed, Jul 16, 2025 at 3:39 PM Andrew Latham via NANOG <nanog@lists.nanog.org> wrote:
Chris
Spot on, and I get the feeling this is where a geo-IP service comes into play: one that offers defined "eyeball networks" to allowlist.
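The allowlist approach above could be sketched with nginx's ngx_http_geo_module, for example. This is purely illustrative: the prefixes below are documentation ranges, not a real eyeball-network feed, and a real deployment would generate the geo block from whatever feed the service provides.

```nginx
# Hypothetical sketch: permit only prefixes from an "eyeball network" feed.
# 203.0.113.0/24 and 198.51.100.0/24 are documentation examples only.
geo $eyeball {
    default         0;
    203.0.113.0/24  1;
    198.51.100.0/24 1;
}

server {
    listen 80;
    location / {
        # Refuse anything outside the allowlisted prefixes.
        if ($eyeball = 0) {
            return 403;
        }
        # ... normal content handling ...
    }
}
```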
On Wed, Jul 16, 2025 at 12:57 PM Chris Adams via NANOG <nanog@lists.nanog.org> wrote:
Once upon a time, Marco Moock <mm@dorfdsl.de> said:
Place a link to a file that is hidden from normal visitors. Exclude the directory via robots.txt.
Then use fail2ban to block all IP addresses that poll the file.
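The recipe above might look something like this. The trap path, filter name, and log location are all hypothetical; compliant crawlers skip the excluded directory, so anything fetching it gets banned on the first hit.

```
# robots.txt -- honest crawlers will stay out of the trap directory
User-agent: *
Disallow: /secret-trap/
```

```ini
# /etc/fail2ban/filter.d/honeypot.conf (hypothetical filter)
[Definition]
failregex = ^<HOST> .* "(GET|POST) /secret-trap/

# /etc/fail2ban/jail.local
[honeypot]
enabled  = true
port     = http,https
filter   = honeypot
logpath  = /var/log/nginx/access.log
maxretry = 1
bantime  = 86400
```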
The problem with a lot of the "AI" scrapers is that they're apparently using botnets and will often only make a single request from a given IP address, so reactive blocking doesn't work (and can cause other issues, like trying to block 100,000 IPs, which fail2ban for example doesn't really handle well).
-- 
Chris Adams <cma@cmadams.net>
_______________________________________________
NANOG mailing list
https://lists.nanog.org/archives/list/nanog@lists.nanog.org/message/AFJF4UQJ...
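On the scaling point Chris raises: one partial mitigation, when banned addresses do cluster, is collapsing the list into covering prefixes so the firewall holds a handful of network entries instead of 100,000 host rules. A minimal sketch using Python's stdlib ipaddress module (sample addresses are documentation ranges, not real scraper IPs; this doesn't help when a botnet sends one request per scattered IP):

```python
# Collapse adjacent banned /32s into covering prefixes.
import ipaddress

banned = [
    "198.51.100.4", "198.51.100.5", "198.51.100.6", "198.51.100.7",
    "203.0.113.9",
]

nets = [ipaddress.ip_network(ip) for ip in banned]
collapsed = list(ipaddress.collapse_addresses(nets))
for net in collapsed:
    print(net)
# -> 198.51.100.4/30
# -> 203.0.113.9/32
```

The resulting prefixes could then be loaded into a kernel-side set (ipset or an nftables set), which copes with large ban lists far better than per-IP iptables rules.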
-- 
- Andrew "lathama" Latham -
_______________________________________________
NANOG mailing list
https://lists.nanog.org/archives/list/nanog@lists.nanog.org/message/DHUYTBIX...