
Hello,
We operate as a cloud service provider, and much of our traffic is indeed—per Cloudflare’s terminology—“bot traffic.” For about a month, users behind IP addresses we announce have been prompted to solve captchas when accessing Cloudflare-proxied sites. When we contacted Cloudflare support, they referred us to their customers (e.g., Stack Overflow, OpenAI), but support from those sites directed us back to Cloudflare.
I reviewed Cloudflare Radar but found it limited in actionable insights. We also announce prefixes to Cloudflare where the originating AS primarily serves end-users—and where Radar shows over 80% human traffic—yet users still encounter captchas, suggesting the issue may be related to our announcing AS.
Has anyone experienced this or found effective ways to resolve it? Any advice or pointers would be greatly appreciated.
Best regards,
Johannes

On 01/07/2025 15:05:16, "Johannes Müller Aguilar via NANOG" <nanog@lists.nanog.org> wrote:
For about a month, users behind IP addresses we announce have been prompted to solve captchas when accessing Cloudflare-proxied sites.
I've seen that increase and now regularly get it on home broadband services; others have reported this too. I suspect many are getting it and have assumed this is the new normal.
Has anyone experienced this or found effective ways to resolve it?
The only option seems to be to keep reporting it to the web sites and hope they move to another provider if it isn't resolved.
brandon

Didn't CF turn off all Cloudflare-served CAPTCHAs almost 2 years ago in favor of Turnstile?
On Tue, Jul 1, 2025 at 10:16 AM Brandon Butterworth via NANOG <nanog@lists.nanog.org> wrote:
On 01/07/2025 15:05:16, "Johannes Müller Aguilar via NANOG" <nanog@lists.nanog.org> wrote:
For about a month, users behind IP addresses we announce have been prompted to solve captchas when accessing Cloudflare-proxied sites.
I've seen that increase and now regularly get it on home broadband services, others have reported this too. I suspect many are getting it and assumed this is the new normal.
Has anyone experienced this or found effective ways to resolve it?
Keep reporting to the web sites and hope they move to another provider if not resolved seems to be the only option.
brandon

On Tue, 1 Jul 2025 at 09:15, Brandon Butterworth via NANOG <nanog@lists.nanog.org> wrote:
On 01/07/2025 15:05:16, "Johannes Müller Aguilar via NANOG" <nanog@lists.nanog.org> wrote:
For about a month, users behind IP addresses we announce have been prompted to solve captchas when accessing Cloudflare-proxied sites.
I've seen that increase and now regularly get it on home broadband services, others have reported this too. I suspect many are getting it and assumed this is the new normal.
I'm seeing this on StackOverflow / StackExchange on my home broadband as well.
Having to wait half a minute to glance at a search result completely ruins the use case for said result. If your time is worth $120/h, that's $1 for each StackOverflow visit just to open the page; at that point it's obviously cheaper to use AI, so I have no idea what they're thinking in killing their own market.
I wish Google Search would let people blacklist StackOverflow for as long as they're a Cloudflare customer; or, heck, anything with these captchas. It's effectively just search spam with all those captchas.
But the "best" part about the security industry is that, because I do close the window in less than a second, Cloudflare probably reports my visit attempt as saving StackOverflow from yet another bot! "Look how many bots we've saved you from!"
I'd like to see the metrics from Cloudflare and the other captcha vendors on how they justify wasting billions of dollars in lost productivity. It probably costs way, way, way less than $0.01 to serve a page for which the legitimate users must now waste $1 in lost income. There's probably a 10000x amplification factor of real users wasting resources compared to how many resources are saved from the most basic bots that can't get through. Bravo! All for what?
Did anyone think of the environment, and how much computing power is wasted by everyone proving that they're not a bot?
C.

The problem is the bots.
The captchas are just a symptom.
Josh Reynolds
Chief Technology Officer | SPITwSPOTS

But the bots are not a problem if you're doing proper caching and throttling.
I mean, if your site has more bots than actual users, maybe you're doing it wrong. If what looks like a static page requires a captcha, you're doing something wrong. If it takes you $1 to generate a page, so you have to make sure all your visitors waste $1 of their time to view it, you're doing something wrong.
Yes, captchas are a symptom, but they're a symptom of incompetence, not of bots. Bots don't cause captchas; poor engineering does. Bots aren't a problem, captchas are.
C.
On Tue, 1 Jul 2025 at 21:16, Josh Reynolds <joshr@spitwspots.com> wrote:
The problem is the bots.
The captchas are just a symptom.
Josh Reynolds Chief Technology Officer | SPITwSPOTS

If bots aren't a problem, why are all of these companies spending money to prevent bots? Hmmmmmmmmmmmmmm
Josh Reynolds
Chief Technology Officer | SPITwSPOTS
On Tue, Jul 1, 2025, 10:22 PM Constantine A. Murenin <mureninc@gmail.com> wrote:
But the bots are not a problem if you're doing proper caching and throttling.
I mean, if your site has more bots than actual users, maybe you're doing it wrong.
If what looks like a static page requires a captcha, you're doing something wrong.
If it takes you $1 to generate a page, so you have to make sure all your visitors waste $1 of their time to view it, you're doing something wrong.
Yes, captchas are a symptom, but it's a symptom of incompetence, not of bots.
Bots don't cause captchas, poor engineering does.
Bots aren't a problem, captchas are.
C.

* Constantine A. Murenin [Wed 02 Jul 2025, 05:23 CEST]:
But the bots are not a problem if you're doing proper caching and throttling.
Have you been following the news at all lately? Website operators are complaining left and right about the load from scrapers related to AI companies. They're seeing 10x, 100x the normal visitor load, with not just User-Agents but also source IP addresses masked to present as regular visitors. Captchas are unfortunately one of the more visible ways to address this, even if not perfect.
For example, https://arstechnica.com/ai/2025/03/devs-say-ai-crawlers-dominate-traffic-for...
-- Niels.

On Wed, Jul 02, 2025 at 12:50:28PM +0200, niels=nanog--- via NANOG wrote:
Have you been following the news at all lately? Website operators are complaining left and right about the load from scrapers related to AI companies. They're seeing 10x, 100x the normal visitor load, with not just User-Agents but also source IP addresses masked to present as regular visitors. Captchas is unfortunately one of the more visible ways to address this, even if not perfect.
"Not perfect" is an interesting way to describe this, given that the very AI companies this claims to be effective against are the same AI companies who can, at will, defeat every captcha. This tactic is like trying to drown the ocean. ---rsk

On Wed, 2 Jul 2025 at 05:50, niels=nanog--- via NANOG <nanog@lists.nanog.org> wrote:
* Constantine A. Murenin [Wed 02 Jul 2025, 05:23 CEST]:
But the bots are not a problem if you're doing proper caching and throttling.
Have you been following the news at all lately? Website operators are complaining left and right about the load from scrapers related to AI companies. They're seeing 10x, 100x the normal visitor load, with not just User-Agents but also source IP addresses masked to present as regular visitors. Captchas is unfortunately one of the more visible ways to address this, even if not perfect.
For example, https://arstechnica.com/ai/2025/03/devs-say-ai-crawlers-dominate-traffic-for...
That article describes a classic case of impedance mismatch, which is an engineering issue. It also fails to provide any actual engineering details, beyond simply the fact that it's about a git webservice and, more specifically, about a self-hosted instance of Gitea.
Git requires a LOT of resources and is NOT web scale. Git is, in fact, super fast for its tasks when used LOCALLY, compared to CVS/SVN, but it's simply NOT web scale. Putting git onto the web, for anonymous access, using a tool optimised for local access, without doing any of the most basic caching or rate limiting with nginx or the like, as well as on the app itself, is the cause of the issue here. Adjusting robots.txt is NOT a "standard defensive measure" in this case; using the rate limits in nginx would be.
If your website only has 10 active daily users, with each visit lasting only a few minutes, of course 100 daily bots using the site 24/7 will bring it down. The solution? Maybe have more than 10 daily users so you don't have to worry about the 100 bots? Rate limiting with nginx isn't even mentioned. Why is the website so inefficient that what is presumably the equivalent of 100 users can bring it down? Why are you even running the website by yourself instead of using GitHub for public access in that case?
On another note, about this Anubis spam… The most ridiculous recent adoption of Anubis was on www.OpenWrt.org. It's literally a "static" website with like 230 pages (ironically, they didn't adopt it on their forum.openwrt.org as of now). They've been using Anubis on all pages of www.openwrt.org, even on the front page. Why? The site literally changes like twice a year, and might as well be simply cached in its entirety on a daily basis without any external users noticing a thing. That would be the correct solution:
* Not logged into the Wiki? You get last day's cache.
* Logged in? Fresh data.
And, I mean, how inefficient does your wiki have to be that 100 simultaneous users could bring the site down? What's the purpose of having Anubis again?
C.
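As a rough illustration of the nginx-level rate limiting being suggested above (an untested sketch, not anything from the article or this thread): the hostname, upstream port, zone size and rate are placeholder assumptions, not tuned recommendations.

# Sketch: per-IP request throttling in front of a self-hosted Gitea instance.
# The first directive goes in the http{} context.
limit_req_zone $binary_remote_addr zone=git_anon:20m rate=2r/s;

server {
    listen 80;
    server_name git.example.org;                  # hypothetical hostname

    location / {
        limit_req zone=git_anon burst=10 nodelay; # absorb small bursts, then reject
        limit_req_status 429;
        proxy_pass http://127.0.0.1:3000;         # Gitea's default HTTP port
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

As others note further down the thread, a per-IP key only helps against clients that reuse addresses; it does little against crawls spread across tens of thousands of source IPs.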

As far as I'm aware, or as far as anyone's ever made me aware when I asked, there remains zero evidence that the high-intensity high-anonymity bots some sites are seeing have anything to do with AI.
If they are AI, a court just ruled that AI scraping is fair use, so maybe you should offer them a zipped copy of your site and they won't have to scrape it.
The actual reason for CAPTCHAs is revenue. Site operators would like to give a zipped copy of their sites to OpenAI - for $100,000. And they want that to be the only way OpenAI can get a copy.
On 2 July 2025 12:50:28 pm GMT+02:00, niels=nanog--- via NANOG <nanog@lists.nanog.org> wrote:
* Constantine A. Murenin [Wed 02 Jul 2025, 05:23 CEST]:
But the bots are not a problem if you're doing proper caching and throttling.
Have you been following the news at all lately? Website operators are complaining left and right about the load from scrapers related to AI companies. They're seeing 10x, 100x the normal visitor load, with not just User-Agents but also source IP addresses masked to present as regular visitors. Captchas is unfortunately one of the more visible ways to address this, even if not perfect.
For example, https://arstechnica.com/ai/2025/03/devs-say-ai-crawlers-dominate-traffic-for...
-- Niels.

* nanog@immibis.com [Sun 06 Jul 2025, 20:46 CEST]:
As far as I'm aware, or as far as anyone's ever made me aware when I asked, there remains zero evidence that the high-intensity high-anonymity bots some sites are seeing have anything to do with AI.
If they are AI, a court just ruled that AI scraping is fair use, so maybe you should offer them a zipped copy of your site and they won't have to scrape it.
The actual reason for CAPTCHAs is revenue. Site operators would like to give a zipped copy of their sites to OpenAI - for $100,000. And they want that to be the only way OpenAI can get a copy.
The FSF disagrees:
https://news.slashdot.org/story/25/07/06/1737253/the-fsf-faces-active-ongoin...
-- Niels.

Yes, many sites are currently getting DDoSed. I remain continually surprised that none of them are attempting legal action. Proxy operators can be subpoenaed and forced to log. The FSF surely has lawyers.
The FSF has provided no evidence they have anything to do with AI, either. The FSF is also persistently wrong about a lot of things, such as the idea that Anubis is malware, and shouldn't be cited as an authority on anything except perhaps the text of the GPL.
On 7/07/25 11:49, niels=nanog--- via NANOG wrote:
* nanog@immibis.com [Sun 06 Jul 2025, 20:46 CEST]:
As far as I'm aware, or as far as anyone's ever made me aware when I asked, there remains zero evidence that the high-intensity high-anonymity bots some sites are seeing have anything to do with AI.
If they are AI, a court just ruled that AI scraping is fair use, so maybe you should offer them a zipped copy of your site and they won't have to scrape it.
The actual reason for CAPTCHAs is revenue. Site operators would like to give a zipped copy of their sites to OpenAI - for $100,000. And they want that to be the only way OpenAI can get a copy.
The FSF disagrees: https://news.slashdot.org/story/25/07/06/1737253/the-fsf-faces-active-ongoin...
-- Niels.

On 7/1/25 8:22 PM, Constantine A. Murenin via NANOG wrote:
But the bots are not a problem if you're doing proper caching and throttling.
Not all site traffic is cacheable or can be farmed out to a CDN. Dynamic (especially per-session) requests (think ecommerce) can't be cached.
Putting an item into the shopping cart is typically one of the more resource-driven events. We have seen bots that will select the buy button and put items into the cart, possibly to see any discounts given. You end up with hundreds of active 'junk' cart sessions on a small site that was not designed for that much traffic.
Forcing the bot (or a legit customer) to create yet another login to create a cart can help, but that generates pushback from the store owner. The owners don't want that until the payment-details phase, or they want purchasers to be able to do a guest checkout. They will point out that on amazon.com you don't have to log in to put an item in the cart.
Rate limiting is not effective when they come from different IP ranges. The old days of using a Class C (/24) as a rate-limit key are no longer effective. The bots come from all over the providers' space (often Azure) but can be from any of the larger providers, and often from different regions. If you throttle EVERYONE, then legit customers can get locked out with 429s or even 503s.
And as has been pointed out, relying on the browser string is no longer effective. They use common strings and change them dynamically.
Sincerely,
William Kern
PixelGate Networks.

On Wed, 2 Jul 2025 at 14:38, William Kern via NANOG <nanog@lists.nanog.org> wrote:
On 7/1/25 8:22 PM, Constantine A. Murenin via NANOG wrote:
But the bots are not a problem if you're doing proper caching and throttling.
Not all site traffic is cacheable or can be farmed out to a CDN.
That's just an excuse for inadequate planning and misplaced priorities. If you start with the requirement that it all be cacheable, then EVERYTHING can be cached, especially for the ecommerce and the catalogue stuff.
OSS nginx is free and relatively easy to use, with excellent documentation, and it offers superb caching functionality. You don't need an external CDN to do the caching. You can even cache search results, especially for the non-logged users. Why would you NOT?
If, to quote arstechnica, "a GitLab link is shared in a chat room", why would you want ANYONE to wait an extra millisecond, let alone "having to wait around two minutes" for Anubis proof-of-work, to access the result, if the result was already computed and known, because it was already assembled for the person who posted the link in the first place?
These things could even be cached in the app itself, and even shared between all logged and non-logged users, if performance and web scale are paramount. Otherwise, it can be architected to be cacheable with nginx.
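To make that concrete, here is a rough nginx sketch (my illustration, not anything from the thread) of the "cached for anonymous visitors, fresh for logged-in users" scheme described above and earlier for the OpenWrt wiki. The cookie name, cache sizes, hostname and backend address are assumptions.

# Sketch: serve anonymous visitors from a shared cache, bypass it for logged-in sessions.
# The proxy_cache_path and map directives go in the http{} context.
proxy_cache_path /var/cache/nginx/site keys_zone=site_cache:50m max_size=2g inactive=24h;

map $cookie_session $skip_cache {
    default 0;
    ~.+     1;    # any session cookie present: treat as logged in
}

server {
    listen 80;
    server_name www.example.org;                  # hypothetical

    location / {
        proxy_cache        site_cache;
        proxy_cache_key    $scheme$host$request_uri;
        proxy_cache_valid  200 301 1h;
        proxy_cache_bypass $skip_cache;           # logged in: go to the app
        proxy_no_cache     $skip_cache;           # and don't store their pages
        proxy_cache_use_stale error timeout updating;
        add_header X-Cache-Status $upstream_cache_status;
        proxy_pass http://127.0.0.1:8080;         # hypothetical application backend
    }
}

With something along these lines, bots and humans who aren't logged in all hit the same cached copy, and only authenticated sessions ever reach the application.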
Dynamic (especially per-session) requests (think ecommerce) can't be cached.
Putting an item into the shopping cart is typically one of the more resource driven events.
We have seen bots that will select the buy button and put items into the cart, possibly to see
any discounts given. You end up with hundreds of active 'junk' cart sessions on a small site
that was not designed for that much traffic.
Why is the simple act of placing an item in a shopping cart a resource-driven event? This can literally be done on the front end, without any server requests at all, let alone resource-driven ones.
If you DO store an expensive session on the server for this, instead of in the browser, then you also likely expire said carts even for users who intended to return and complete the purchase. Does the owner know? Yes, it's more work to have a separate cookie cart for anonymous users, but if that's a business requirement, why not? This way, even if someone comes back many months later, if they've never cleared the cookies, their cart will still be there, waiting for them, at zero cost to your shopping cart database. Isn't that how it should be?
Stores that empty your cart in 3 days, or which require captchas for basic product viewing, are the best example of misplaced priorities. I usually click the X button before they can complete their captcha. And I won't bother adding anything to the shopping cart again if the store is known for data loss.
Forcing the bot (or a legit customer) to create yet another login to create a cart can help
but that generates push back from the store owner. The owners don't want that until
the payment details phase or they want purchasers to be able to do a guest checkout.
They will point that on amazon.com you don't have to login to put an item in the cart.
Rate limiting is not effective when they come from different IP ranges. The old days of using a Class C (/24) as a rate-limit key are no longer effective. The bots come from all over the providers' space (often Azure) but can be from any of the larger providers, and often from different regions. If you throttle EVERYONE, then legit customers can get locked out with 429s or even 503s. And as has been pointed out, relying on the browser string is no longer effective. They use common strings and change them dynamically.
Sincerely,
William Kern
PixelGate Networks.
Rate limiting would make sense for expensive things like search (and `git blame`), which should also be combined with caching, especially if you aren't even using AI or past purchases/views. Things like adding an item to a cart should be a local event for anonymous users, so it should be impossible to rate-limit that.
Product listings and categories should 100% be cached, absolutely no exceptions. Search pages also absolutely have to be cached; I don't know who ever thought of the brilliant idea that search somehow isn't cacheable, especially on all those sites where it's 100% deterministic and identical for all users.
If someone wants to get the entire site of all the products, I don't see a good reason to preclude that. In the old days, any vendor would be happy to send you the entire catalogue of their offerings all at once: in print form in the US for major brands, and in Microsoft Excel for the more local vendors. But now suddenly we want to prevent people from viewing several products at a time, or being able to shop the way they want to, or seeing the prices for more than a handful of products at a time?! Misplaced priorities, 100%.
Best regards,
Constantine.
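Putting the two earlier sketches together for a single expensive endpoint, a compact illustration (again mine, with assumed zone names, rate, TTL and backend) might look like:

# Sketch: throttle and cache an expensive, deterministic /search endpoint.
# The first two directives go in the http{} context.
limit_req_zone $binary_remote_addr zone=search_rl:10m rate=1r/s;
proxy_cache_path /var/cache/nginx/search keys_zone=search_cache:10m inactive=10m;

server {
    listen 80;
    server_name shop.example.org;             # hypothetical store

    location /search {
        limit_req zone=search_rl burst=5;     # throttle rapid-fire queries per client
        proxy_cache search_cache;
        proxy_cache_key $request_uri;         # deterministic search: same query, same page
        proxy_cache_valid 200 10m;
        proxy_pass http://127.0.0.1:8080;     # hypothetical application backend
    }
}

Identical queries from different visitors (or bots) then cost the backend one render every ten minutes instead of one per request.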

* Constantine A. Murenin [Thu 03 Jul 2025, 00:46 CEST]:
Why is the simple act of placing an item in a shopping cart a resource-driven event?
Tell me you don't know how modern e-commerce works without saying you don't know how modern e-commerce works.
Say you're a modern seller. You have widgets you're looking to sell. You have a certain number on hand and will need to order more in time to satisfy future demand. The moment somebody places a widget in their shopping basket, you make a reservation in the backend system so you know you're soon running out. People leaving shopping baskets just sitting around are an active drag on the JIT delivery process.
That's why some e-commerce websites will send you emails with discount coupons to incentivise you to make up your mind and order if you leave items in your shopping basket for too long.
-- Niels.

On Thu, 3 Jul 2025 at 04:39, niels=nanog--- via NANOG <nanog@lists.nanog.org> wrote:
* Constantine A. Murenin [Thu 03 Jul 2025, 00:46 CEST]:
Why is the simple act of placing an item in a shopping cart a resource-driven event?
Tell me you don't know how modern e-commerce works without saying you don't know how modern e-commerce works.
Sometimes it takes an outsider to see the inefficiency in the process. The bottom line is that there's absolutely no justification for having captchas everywhere, and especially on ecommerce and other cacheable things. And as the quote goes: 'It is difficult to get a man to understand something, when his salary depends on his not understanding it.'
Say you're a modern seller. You have widgets you're looking to sell. You have a certain number on hand and will need to order more in time to satisfy future demand.
The moment somebody places a widget in their shopping basket you make a reservation in the backend system so you know you're soon running out. People leaving shopping baskets just sitting around are an active drag on the JIT delivery process.
That's why some e-commerce websites will send you emails with discount coupons to incentivise you to make up your mind and order if you leave items in your shopping basket for too long.
That absolutely does make sense, but then why are you complaining about the bots placing things in the cart? You can't have it both ways.
* If it's expensive to place items in the cart, maybe make it cheaper?
* If the cost is actually justified, how is it a problem that bots do it, too?
Also, BTW, I kind of fail to see the business logic behind needing this info to re-order to avoid running out:
* If people place items in the cart and buy right away, why would you need to reorder anything before anyone actually pays for something? How does it make any difference to reorder something a few minutes earlier than otherwise?
* If people/bots place items in the cart and never buy, how exactly is it justified to bother anyone with anything before they're able to do such a simple task, if the sale doesn't happen anyway and no reordering is needed in the first place?
Sorry, but something just doesn't add up! And how exactly do we get to a point where any captchas are required at all? Either users order, or they don't. Why did you make it expensive for yourself to handle the case when they don't order? How exactly do the captchas help here? What makes you think actual real human users don't mind spending $1+ to solve each captcha?
This is a classic example of the lack of ownership on all levels, where business requirements are misinterpreted and non-existent problems are subsequently created that now suddenly need to be urgently solved, without any sight of the original business statement being solved, and at a cost that is misrepresented to the owner of the store. (How exactly do you value the user having to waste 30 seconds solving a captcha for each page at least once a day? I value it at $1 per solve; BTW, I'm pretty certain the cost of this to bots is far LOWER than $1, with the $1 being the cost to actual, real users, in lost productivity.)
Captchas are the biggest nuisance by far, and probably the biggest modern contributor to global warming and lost productivity for everyone.
C.

On Tue, Jul 01, 2025 at 09:16:02PM -0500, Josh Reynolds via NANOG wrote:
The problem is the bots.
Bots are certainly a problem, but only one of many. There are also enormous cloud operations (*cough*) that are systemic and persistent sources and sinks of abuse and attacks; there are hosting operations that are the same; there are "security researchers" that launch repeated attacks; there's the IOT (which is why the "dumpsterfire" mailing list exists); there are large email operations that source/sink/support spam and phishing; there are ~1000 worthless gTLDs that are overrun with abusers; there are rapacious/abusive AI operations; and there are, unfortunately, a fair number of people pushing idiotic security theater (e.g., captchas, passkeys) that doesn't solve these problems, only (a) covers them up and/or (b) makes them worse.
The best solutions I've found for these are combinations of null routing, firewall rules (including geographic restrictions), and members-only web sites. (E.g., dumpsterfire's archive is no longer public because of AI crawlers.)
We can't have nice things generously built for the common good any more because there are too many selfish and greedy thugs who don't care about anything except their own wealth, power, and egos.
---rsk
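As a rough web-server-layer illustration (mine, not Rich's configuration) of two of those measures, address blocking and members-only access: null routing and real firewall rules would normally live in the router or packet filter, country-level blocking would normally use a GeoIP database rather than hand-listed prefixes, and the prefixes, realm and htpasswd path below are placeholders.

# Sketch only. The geo block goes in the http{} context.
geo $deny_client {
    default         0;
    203.0.113.0/24  1;    # example: a persistently abusive prefix (documentation range)
    198.51.100.0/24 1;
}

server {
    listen 80;
    server_name lists.example.org;            # hypothetical

    if ($deny_client) {
        return 403;
    }

    # Members-only archive: no anonymous access at all.
    location /archives/ {
        auth_basic           "members only";
        auth_basic_user_file /etc/nginx/htpasswd;   # placeholder path
        root /var/www;
    }
}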

On Sun, Jul 6, 2025 at 7:11 AM Rich Kulawiec via NANOG <nanog@lists.nanog.org> wrote:
We can't have nice things generously built for the common good any more because there are too many selfish and greedy thugs who don't care about anything except their own wealth, power, and egos.
I've wondered if it'd work to place invisible links in the page and then block the source for a while any time one of the invisible links is clicked. Just the classic one-pixel transparent graphic, but with a link that the log reaper understands to mean "bot was here."
Haven't tried it. Nearly all of my content is static, so I don't have enough of a crawler problem to bother.
Regards,
Bill
--
William Herrin
bill@herrin.us
https://bill.herrin.us/
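For what it's worth, the nginx half of that idea might look roughly like this (an untested sketch: the trap path and log name are invented, and the log reaper that turns logged IPs into temporary blocks, for example a cron script or a fail2ban jail watching this log, is not shown):

# Sketch: a hidden-link bot trap. The log_format goes in the http{} context.
log_format bottrap '$remote_addr $time_iso8601 "$request_uri" "$http_user_agent"';

server {
    listen 80;
    server_name www.example.org;              # hypothetical

    # Pages carry an invisible link to this path; a human-driven browser
    # should never fetch it, so anything landing in this log is presumed a bot.
    location = /assets/px-3f9c.gif {          # hypothetical trap URL
        access_log /var/log/nginx/bot-trap.log bottrap;
        empty_gif;                            # serves the built-in 1x1 transparent GIF
    }
}

Well-behaved crawlers can be kept out of the trap by disallowing the path in robots.txt, so only clients that ignore robots.txt end up in the log.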

On Sun, Jul 6, 2025, at 10:47 AM, William Herrin via NANOG wrote:
On Sun, Jul 6, 2025 at 7:11 AM Rich Kulawiec via NANOG <nanog@lists.nanog.org> wrote:
We can't have nice things generously built for the common good any more because there are too many selfish and greedy thugs who don't care about anything except their own wealth, power, and egos.
I've wondered if it'd work to place invisible links in the page and then block the source for while any time one of the invisible links is clicked. Just the classic one-pixel transparent graphic but with a link that the log reaper understands to mean "bot was here."
Haven't tried it. Nearly all of my content is static so I don't have enough of a crawler problem to bother.
For the crawler mitigations I've personally been involved with, this would not have worked. The source IPs numbered at least in the tens of thousands. They didn't repeat; each source IP made one request and was never seen again. They didn't aggregate together into prefixes we could filter. They didn't use any common identifier we could find to filter on, including the user-agent (which were valid-looking and randomized).
In my case, we were able to simply put 99% of the site behind a login, which mitigated the problem. Many sites don't have that option.
Dan

On 7/7/25 13:46, Dan Lowe via NANOG wrote:
For the crawler mitigations I've personally been involved with, this would not have worked. The source IPs numbered at least in the tens of thousands. They didn't repeat; each source IP made one request and was never seen again. They didn't aggregate together into prefixes we could filter. They didn't use any common identifier we could find to filter on, including the user-agent (which were valid-looking and randomized).
In my case, we were able to simply put 99% of the site behind a login, which mitigated the problem. Many sites don't have that option..
A perhaps interesting question would be how the entities involved in this crawling activity have come to control so much IP space. It doesn't seem like a use-case that readily justifies it. (Yes, I know that hardly matters)

On Mon, Jul 7, 2025, at 9:50 PM, Brandon Martin via NANOG wrote:
A perhaps interesting question would be how the entities involved in this crawling activity have come to control so much IP space. It doesn't seem like a use-case that readily justifies it.
There were too many addresses to research all of them, but taking several dozen and looking at whois, I found they belonged to all sorts of different kinds of organization, in many different countries on multiple continents. IMO the reasonable conclusion is they're hiring botnets. Dan

On 8/07/25 03:50, Brandon Martin via NANOG wrote:
On 7/7/25 13:46, Dan Lowe via NANOG wrote:
For the crawler mitigations I've personally been involved with, this would not have worked. The source IPs numbered at least in the tens of thousands. They didn't repeat; each source IP made one request and was never seen again. They didn't aggregate together into prefixes we could filter. They didn't use any common identifier we could find to filter on, including the user-agent (which were valid-looking and randomized).
In my case, we were able to simply put 99% of the site behind a login, which mitigated the problem. Many sites don't have that option..
A perhaps interesting question would be how the entities involved in this crawling activity have come to control so much IP space. It doesn't seem like a use-case that readily justifies it.
(Yes, I know that hardly matters)
This is a service you can buy, and it even has different service classes.
If you want to launder your traffic through residential addresses it's expensive, on the order of dollars per GB; they typically have to use real connections at real eyeball ISPs. The more reputable ones pay people money to run a proxy tool on their computer, and perhaps also get their own connections under aliases. The less reputable ones rent time on botnets. If you can deal with renting datacenter addresses, it's much cheaper. This particular provider's smallest plan is for unlimited use of 1000 datacenter proxies for $25/month. You can also rent mobile phone proxies; it's even more expensive than residential. I expect they run through a lot of burner SIM cards on a mobile phone farm, as well as paying people to run them on their legitimate phones.
The service already existed prior to LLMs and was (and still is) used for a wide variety of purposes against non-cooperative websites. Apparently one surprisingly popular category is "sneaker botting", the practice of buying out stocks of limited edition shoes as soon as they become available, so you can resell them to actual people for higher prices. You can also imagine more beneficial uses such as scraping Amazon to make historical price charts (bad for Amazon, good for society) and more destructive uses such as spamming web forms.
AWS Lambda used to be a popular way to get a large pool of IPs (and EC2 for a smaller pool), but site operators quickly saw that traffic from their network was almost always bots, and blocked the entirety of AWS. Blocking all of Comcast or Deutsche Telekom is harder to justify, so they throw up a CAPTCHA instead. This is why you see CAPTCHAs on your home internet connection today - it's because other users of your ISP are running AI scraping proxies for money. (In today's economy, who can blame them?)
The blocking decision is outsourced to even more 3rd-party companies which compile lists of IP ranges and classify them as datacenter, residential, mobile phone, Tor exit node, etc., with scores of how likely they are to be a proxy. Cloudflare probably has their own internal division, but a site like Reddit is likely to query some IP info company, and if it doesn't say "residential, not a proxy" then it serves you a CAPTCHA or severely rate-limits you.
The end game here is that almost all addresses are marked as "probably a proxy" and it becomes a useless signal, and half of all HTTP packets get routed across the world three times for no good reason.

* nanog@immibis.com [Wed 09 Jul 2025, 16:39 CEST]:
This is a service you can buy, and it even has different service classes.
If you want to launder your traffic through residential addresses it's expensive, on the order of dollars per GB; they typically have to use real connections at real eyeball ISPs. The more reputable ones pay people money to run a proxy tool on their computer, and perhaps also get their own connections under aliases. The less reputable ones rent time on botnets.
There's another option here: the proxy-for-rent service preys on unwitting users who installed a free app on their (mostly Android) smartphones, which shipped with an SDK that set up the phone as a proxy. The clickthrough EULA may or may not mention it.
https://www.humansecurity.com/learn/blog/satori-threat-intelligence-alert-pr...
-- Niels.

Everything folks have mentioned was my functional assumption. It seems unlikely that they'd have that much IP space under their direct control, and if they did, it would be easier to block using WHOIS info, origination details, etc.
Now, as mentioned, there are some botnets out there that are somewhat above board in that the controlled endpoints are voluntarily doing the stuff they're hired to do. They may even get paid a little. It's doubtful that the owner of the endpoint really understands the ramifications of their participation, but at least they're otherwise consenting and an economic beneficiary.
Of course, as we all know, many or most botnets are made up primarily or essentially entirely of compromised endpoints. Now we're in "just flat out illegal" territory. Lovely.
It took far less shady action than this to get (albeit very weak and not particularly timely) US federal legislation on email SPAM. I'm not holding my breath for this to get addressed any time soon, though.

Have you, or anyone, tried legal action? Surely you have some amount of reasonable suspicion that this is a proxy network, and surely if OpenAI can be forced to log all conversations, then the operator of a proxy network can be forced to log who is making connections to a certain site. You can then press charges against this person for DDoS. On 9/07/25 22:00, Brandon Martin via NANOG wrote:
Everything folks have mentioned was my functional assumption. It seems unlikely that they'd have that much IP space under their direct control, and if they did, it would be easier to block using WHOIS info, origination details, etc.
Now, as mentioned, there are some botnets out there that are somewhat above board in that the controlled endpoints are voluntarily doing the stuff they're hired to do. They may even get paid a little. It's doubtful that the owner of the endpoint really understands the ramifications of their participation, but at least they're otherwise consenting and an economic beneficiary.
Of course, as we all know, many or most botnets are made up primarily or essentially entirely of compromised endpoints. Now we're in "just flat out illegal" territory. Lovely.
It took far less shady action than this to get (albeit very weak and not particularly timely) US federal legislation on email SPAM. I'm not holding my breath for this to get addressed any time soon, though.

On 7/9/25 17:10, nanog--- via NANOG wrote:
Have you, or anyone, tried legal action? Surely you have some amount of reasonable suspicion that this is a proxy network, and surely if OpenAI can be forced to log all conversations, then the operator of a proxy network can be forced to log who is making connections to a certain site. You can then press charges against this person for DDoS.
I'm not personally impacted by this as I don't operate any web services that are being crawled. My skin in the game is not wanting the IP space associated with the networks I run to be made unusable due to factors essentially outside my control, or having my subscribers upset that they're constantly getting intercepted, again for reasons outside my control.
The question, of course, would be whether the proxy networks involved even intend to operate within legal bounds, at least as we see them in North America. And of course that assumes there's any viable way to identify the operators and contact them.

On 10/07/25 00:02, Brandon Martin via NANOG wrote:
On 7/9/25 17:10, nanog--- via NANOG wrote:
Have you, or anyone, tried legal action? Surely you have some amount of reasonable suspicion that this is a proxy network, and surely if OpenAI can be forced to log all conversations, then the operator of a proxy network can be forced to log who is making connections to a certain site. You can then press charges against this person for DDoS.
I'm not personally impacted by this as I don't operate any web services that are being crawled. My teeth in the game is not wanting the IP space associated with the networks I run getting made unusable due to factors essentially outside my control or having my subscribers upset that they're constantly getting intercepted again for reasons outside my control.
The question of course would be if the proxy networks involved are even intending to operate within legal bounds at least as far as we see them in North America. And of course that assumes there's any viable way to identify the operator of them and contact that operator.
Of course they are. These are registered businesses in western countries (not always America) that take credit card payments. The owner either answers subpoenas, or goes to jail until he "decides" to start answering subpoenas.

On Tue, Jul 1, 2025 at 7:04 PM Constantine A. Murenin via NANOG < nanog@lists.nanog.org> wrote:
Cloudflare probably reports my visit attempt as saving StackOverflow from yet another bot! "Look how many bots we've saved you from!"
Hey now...StackOverflow wants to make sure your data is protected from everyone harvesting it to train their own AI coding models. ...so StackOverflow can profit from selling your data to people who want to train AI coding models. -A

On Wed, 2 Jul 2025 at 10:49, Aaron C. de Bruyn <aaron@heyaaron.com> wrote:
On Tue, Jul 1, 2025 at 7:04 PM Constantine A. Murenin via NANOG <nanog@lists.nanog.org> wrote:
Cloudflare probably reports my visit attempt as saving StackOverflow from yet another bot! "Look how many bots we've saved you from!"
Hey now...StackOverflow wants to make sure your data is protected from everyone harvesting it to train their own AI coding models. ...so StackOverflow can profit from selling your data to people who want to train AI coding models.
-A
But will they be able to do that more than once? Who will contribute to StackOverflow, the "Cloudflare captcha" site? With such a strategy they'll stop converting real users, lose the existing ones too, and become obsolete.
Meanwhile, as has been shown by the countless citations, any new AI startup worth its salt could easily find a way to bypass said captchas programmatically.
C.

I've never tried it, but allegedly, if you want to block sites from your search results, Kagi can do this. It's a paid product. Hopefully that results in better incentive alignment.
Google has been caught showing worse results on purpose, so you'll view more pages of results and therefore more ads.

Another example of the broken security space. The security vendor points at the tech-incapable customer. The tech-incapable customer points right back at the security vendor. Wash. Rinse. Repeat.
-----
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Johannes Müller Aguilar via NANOG" <nanog@lists.nanog.org>
To: nanog@lists.nanog.org
Cc: "Johannes Müller Aguilar" <JMuellerAguilar@anexia.com>
Sent: Tuesday, July 1, 2025 9:05:16 AM
Subject: Captchas on Cloudflare-Proxied Sites

Another example of the broken security space. The security vendor points at the tech-incapable customer. The tech-incapable customer points right back at the security vendor. Wash. Rinse. Repeat.
If a customer chooses to put their website on a CDN, they MUST know:
- What the CDN does (or does not do)
- What options are available to them (free or paid)
- How to effectuate changes (portal or ticket) that they need to get done
- How to get support from the CDN should a user encounter a problem they cannot solve
A CDN is not a magic get-out-of-jail-free card that allows you to completely abdicate all technical responsibility for your website.

This is especially ironic given that captchas were pretty much defeated 10-15 years ago, thoroughly defeated in the last 2-3 years, and are now exclusively deployed by people who (a) haven't been paying attention and (b) like to pretend that they still work.
Oh, citation needed? Okay, here are a few. Let's start in 2008:
Gone in 60 seconds: Spambot cracks Live Hotmail CAPTCHA
http://arstechnica.com/news.ars/post/20080415-gone-in-60-seconds-spambot-cra...
Cheap CAPTCHA Solving Changes the Security Game
https://freedom-to-tinker.com/blog/felten/cheap-captcha-solving-changes-secu...
and 2011:
Stanford researchers outsmart captcha codes
http://www.physorg.com/news/2011-11-stanford-outsmart-captcha-codes.html
and 2012:
How a trio of hackers brought Google's reCAPTCHA to its knees | Ars Technica
http://arstechnica.com/security/2012/05/google-recaptcha-brought-to-its-knee...
Troy Hunt: Breaking CAPTCHA with automated humans
http://www.troyhunt.com/2012/01/breaking-captcha-with-automated-humans.html
and 2014:
Snapchat account registration CAPTCHA defeated
https://techienews.co.uk/snapchat-account-registration-captcha-defeated/
and 2017:
Artificial Intelligence Beats CAPTCHA - IEEE Spectrum
https://spectrum.ieee.org/artificial-intelligence-beats-captcha
unCAPTCHA Breaks 450 ReCAPTCHAs in Under 6 Seconds
https://www.bleepingcomputer.com/news/technology/uncaptcha-breaks-450-recapt...
and 2023:
AI bots are better than humans at solving CAPTCHA puzzles
https://qz.com/ai-bots-recaptcha-turing-test-websites-authenticity-185073435...
[2307.12108] An Empirical Study & Evaluation of Modern CAPTCHAs
https://arxiv.org/abs/2307.12108
and 2024:
AI researchers demonstrate 100% success rate in bypassing online CAPTCHAs
https://www.tomshardware.com/tech-industry/artificial-intelligence/ai-resear...
[2409.08831] Breaking reCAPTCHAv2
https://arxiv.org/abs/2409.08831
---rsk

On 7/1/25 2:43 PM, Rich Kulawiec via NANOG wrote:
and 2023:
AI bots are better than humans at solving CAPTCHA puzzles https://qz.com/ai-bots-recaptcha-turing-test-websites-authenticity-185073435...
[2307.12108] An Empirical Study & Evaluation of Modern CAPTCHAs https://arxiv.org/abs/2307.12108
All I have to say on this subject is that there is a certain indignity in needing to put on my fricking reading glasses to solve them. Where can I get this AI? Mike

I don't know if it was apocryphal, but the story that went around was that miscreants just ran porn sites and would forward the captcha (or similar) to anyone trying to access their porn site, who would solve it; the solution would be forwarded back, and they were in.
On July 1, 2025 at 15:04 nanog@lists.nanog.org (Michael Thomas via NANOG) wrote:
On 7/1/25 2:43 PM, Rich Kulawiec via NANOG wrote:
and 2023:
AI bots are better than humans at solving CAPTCHA puzzles https://qz.com/ai-bots-recaptcha-turing-test-websites-authenticity-185073435...
[2307.12108] An Empirical Study & Evaluation of Modern CAPTCHAs https://arxiv.org/abs/2307.12108
All I have to say on this subject is that there is a certain indignity in needing to put on my fricking reading glasses to solve them. Where can I get this AI?
Mike
-- -Barry Shein Software Tool & Die | bzs@TheWorld.com | http://www.TheWorld.com Purveyors to the Trade | Voice: +1 617-STD-WRLD | 800-THE-WRLD The World: Since 1989 | A Public Information Utility | *oo*

Just replace the captcha with the question of how many R's there are in the word strawberry ;-)
Jeroen Wunnink
Sr. Manager - Integration Engineering
www.gtt.net
On 7/1/25 2:43 PM, Rich Kulawiec via NANOG wrote:
and 2023:
AI bots are better than humans at solving CAPTCHA puzzles https://qz.com/ai-bots-recaptcha-turing-test-websites-authenticity-185073435...
[2307.12108] An Empirical Study & Evaluation of Modern CAPTCHAs https://arxiv.org/abs/2307.12108
All I have to say on this subject is that there is a certain indignity in needing to put on my fricking reading glasses to solve them. Where can I get this AI?
Mike
participants (17)
- Aaron C. de Bruyn
- Brandon Butterworth
- Brandon Martin
- bzs@theworld.com
- Constantine A. Murenin
- Dan Lowe
- Jeroen Wunnink
- Johannes Müller Aguilar
- Josh Reynolds
- Michael Thomas
- Mike Hammett
- nanog@immibis.com
- niels=nanog@bakker.net
- Rich Kulawiec
- Tom Beecher
- William Herrin
- William Kern