
On Wed, Jul 16, 2025 at 1:57 PM Chris Adams via NANOG <nanog@lists.nanog.org> wrote:
The problem with a lot of the "AI" scrapers is that they're apparently using botnets and will often only make a single request from a given IP address, so reactive blocking doesn't work (and can cause other issues,
Append a canary ID to all URLs displayed on the page. For example:

https://example.com/example.html?visitor=1234ABCDEF&signature=XYZ

Upon receiving a page request that is missing a valid "visitor=" tag, or where the visitor's IP address does not match the IP address linked to that visitor tag, create a new visitor tag in the database and return an empty page with a 302 redirect that sends the visitor back to the homepage with the new tag added. Refuse to display the individual page requested until they click a link provided by the website. Do the same if the "signature=" attribute is missing or fails to verify.

The signature attribute is an HMAC which authenticates that the combination of the URL path and visitor ID came from a page displayed by the web server and has not been altered by the client. For example, they cannot simply learn their visitor ID and append it on their own to https://example.com/example.html; they need the unique "signature=" added to the link by the web server in order to have access to the example.html page.

--
-JA
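
P.S. A minimal sketch of the signing and checking steps, in Python using only the standard library. Everything named here is an assumption for illustration: the in-memory `visitors` dict stands in for the database, `SECRET_KEY` is a hypothetical server-side key, HMAC-SHA256 is one possible choice of HMAC, and the helper names are made up. It also assumes the homepage link in the redirect is itself signed, so the visitor does not get bounced a second time.

```python
import hashlib
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical server-side key; a real deployment would persist and rotate it.
SECRET_KEY = secrets.token_bytes(32)

# Stand-in for the database: visitor ID -> IP address the ID was issued to.
visitors = {}


def new_visitor(ip):
    """Create a fresh visitor tag bound to the requesting IP address."""
    visitor_id = secrets.token_hex(5).upper()
    visitors[visitor_id] = ip
    return visitor_id


def sign(path, visitor_id):
    """HMAC over the URL path and visitor ID (newline-separated)."""
    msg = f"{path}\n{visitor_id}".encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def make_link(path, visitor_id):
    """Build the link the server prints on a page for this visitor."""
    query = urlencode({"visitor": visitor_id, "signature": sign(path, visitor_id)})
    return f"{path}?{query}"


def check_request(url, ip):
    """Return (200, None) to serve the page, or (302, homepage_link)."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    visitor_id = params.get("visitor", [""])[0]
    signature = params.get("signature", [""])[0]
    expected = sign(parts.path, visitor_id)
    ok = visitors.get(visitor_id) == ip and hmac.compare_digest(
        signature.encode(), expected.encode()
    )
    if ok:
        return 200, None
    # Missing or invalid tag, IP mismatch, or bad signature:
    # issue a new tag and send the client back to the homepage.
    return 302, make_link("/", new_visitor(ip))
```

A request built from a server-issued link, such as `check_request(make_link("/example.html", vid), ip)`, is served only when `vid` was issued to that same `ip`. A link with the visitor ID pasted onto a different path fails the signature check and gets the redirect instead.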