We run a smaller ISP of about 7.5k customers and the other day we got an email (excerpt below) from one of Google's automated tools.

We are seeing automated scraping of Google Web Search from a large number of your IPs. Automated scraping violates our /robots.txt file and also our Terms of Service. We request that you terminate this traffic immediately. Failure to do so may cause your network to be blocked by our abuse systems.

To allow you to identify the traffic, we are providing a list of your IPs they used today (Source field), as well as the most common destination (Google) IP and port and a timestamp of a recent request (in UTC) to aid in your identification. Note that this list may not be exhaustive, and we request that you terminate all such traffic, not just traffic from IPs in this list.

All of the destination ports are either 80 or 443, so they at least appear to be legit web traffic on the surface. They are obviously spoofed IP addresses, as there are network addresses in the list and one of the IPs belongs to a router that doesn't appear to be compromised in any way. The initial letter included 700+ IP addresses from our network.

It's now affecting our customers, who are getting CAPTCHAs for every couple of Google searches they perform.

Does anyone know of a good way to track the perpetrator(s) down and/or know of a way to mitigate this?

--
Christopher Tyler
Senior Network Engineer
MTCRE/MTCNA/MTCTCE/MTCWE
Total Highspeed Internet Solutions
1091 W. Kathryn Street
Nixa, MO 65714
(417) 851-1107 x. 9002
www.totalhighspeed.com
On Fri, Jun 19, 2020 at 9:15 AM Christopher Tyler <chris@totalhighspeed.net> wrote:
We run a smaller ISP of about 7.5k customers and the other day we got an email (excerpt below) from one of Google's automated tools.
We are seeing automated scraping of Google Web Search from a large number of your IPs. Automated scraping violates our /robots.txt file and also our Terms of Service. We request that you terminate this traffic immediately. Failure to do so may cause your network to be blocked by our abuse systems.
To allow you to identify the traffic, we are providing a list of your IPs they used today (Source field), as well as the most common destination (Google) IP and port and a timestamp of a recent request (in UTC) to aid in your identification. Note that this list may not be exhaustive, and we request that you terminate all such traffic, not just traffic from IPs in this list.
All of the destination ports are either 80 or 443, so they at least appear to be legit web traffic on the surface. They are obviously spoofed IP addresses, as there are network addresses in the list and one of the IPs belongs to a router that doesn't appear to be compromised in any way. The initial letter included 700+ IP addresses from our network.
Hi Christopher,

Presumably Google is smart enough to know the difference between spoofed port scanning and completed TCP connections performing a web search. If you take Google's report at face value, the addresses aren't spoofed; something else is happening. The question is how.

There was a company revealed on NANOG earlier this year (or maybe last year, I'm not great with dates) which contracts small ISPs and virtual server providers to use their "spare bandwidth" to pseudonymously originate web requests. They don't require you to assign them IP addresses because they overload their activity on all of your IP addresses. In theory they do this without disturbing your customers and only access web sites whose owners have contracted them to do so, generally to test connectivity.

In practice, there's a device inline with your traffic flow that injects TCP connections and captures the associated return packets across your entire address space. Including, for example, your routers' IP addresses.

Do you, or perhaps your upstream, have such a contract?

Regards,
Bill Herrin

--
William Herrin
bill@herrin.us
https://bill.herrin.us/
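One way to check this theory against Google's list is to sort the reported source addresses by whether a customer host could plausibly own them. Below is a minimal Python sketch of that idea, not anything used in this thread. The function name `classify_sources`, the bucket names, and the sample addresses (RFC 5737 documentation ranges) are all made up for illustration, and it assumes IPv4 and that `subnets` lists the subnets as actually configured, not aggregates. If many reported sources land in the "network", "broadcast" or "router" buckets, that would fit traffic being originated on-path across the whole address space rather than by individual customer machines.

```python
import ipaddress


def classify_sources(reported, subnets, router_ips=()):
    """Bucket reported source IPs by whether a customer host could own them.

    reported   -- source IP strings taken from the abuse report
    subnets    -- the ISP's subnets in CIDR form, as configured (assumption)
    router_ips -- addresses known to sit on the ISP's own routers (assumption)
    """
    nets = [ipaddress.ip_network(s) for s in subnets]
    routers = {ipaddress.ip_address(r) for r in router_ips}
    buckets = {"network": [], "broadcast": [], "router": [],
               "host": [], "unknown": []}

    for text in reported:
        addr = ipaddress.ip_address(text)
        # First configured subnet containing this address, if any.
        net = next((n for n in nets if addr in n), None)
        if net is None:
            buckets["unknown"].append(text)
        elif addr in routers:
            buckets["router"].append(text)
        # /31 and /32 have no separate network or broadcast address.
        elif net.prefixlen < 31 and addr == net.network_address:
            buckets["network"].append(text)
        elif net.prefixlen < 31 and addr == net.broadcast_address:
            buckets["broadcast"].append(text)
        else:
            buckets["host"].append(text)
    return buckets


# Hypothetical sample data, not taken from the actual report.
sample = classify_sources(
    reported=["192.0.2.0", "192.0.2.1", "192.0.2.57",
              "192.0.2.255", "203.0.113.9"],
    subnets=["192.0.2.0/24"],
    router_ips=["192.0.2.1"],
)
for bucket, addrs in sample.items():
    print(f"{bucket}: {addrs}")
```

Addresses in the "host" bucket are not cleared by this check; they would still need to be compared against flow or NAT logs for the timestamps Google supplied.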
On Fri, Jun 19, 2020 at 1:44 PM Sabri Berisha <sabri@cluecentral.net> wrote:
----- On Jun 19, 2020, at 9:40 AM, William Herrin bill@herrin.us wrote:
Hi,
Do you, or perhaps your upstream have such a contract?
I'd be pretty unhappy if someone that I'm paying for transit spoofs traffic with my IP space as the source.
I don't think William's description is 'spoofing'; it's perhaps "manufacturing hosts on the fly". It's still skeezy though ;(
Pretty sure that we traced it to a service, DiviNetworks, that uses "unused" IP space/bandwidth and tracks invalid DNS queries.

Thanks for all of the input and assistance, special thanks to William Herrin who pointed this out.

--
Christopher Tyler
Senior Network Engineer
MTCRE/MTCNA/MTCTCE/MTCWE
Total Highspeed Internet Solutions
1091 W. Kathryn Street
Nixa, MO 65714
(417) 851-1107 x. 9002
www.totalhighspeed.com

----- Original Message -----
From: "Christopher Morrow" <morrowc.lists@gmail.com>
To: "Sabri Berisha" <sabri@cluecentral.net>
Cc: "nanog" <nanog@nanog.org>
Sent: Friday, June 19, 2020 12:58:46 PM
Subject: Re: Google captcha issue
On Fri, Jun 19, 2020 at 1:44 PM Sabri Berisha <sabri@cluecentral.net> wrote:
----- On Jun 19, 2020, at 9:40 AM, William Herrin bill@herrin.us wrote:
Hi,
Do you, or perhaps your upstream have such a contract?
I'd be pretty unhappy if someone that I'm paying for transit spoofs traffic with my IP space as the source.
I don't think William's description is 'spoofing'; it's perhaps "manufacturing hosts on the fly". It's still skeezy though ;(
participants (4)
- Christopher Morrow
- Christopher Tyler
- Sabri Berisha
- William Herrin