So I guess I'm new at internets, as my colleagues told me, because I haven't gone around to the 30-40 systems I control (minus customer self-managed gear) and installed a restrictive robots.txt everywhere to make the web less useful to everyone.

Does that really mean that a big outfit like Yahoo should be expected to download stuff at high speed off my customers' servers? For varying values of 'high speed', ~500K/s (4Mbps+) for a 3 gig file is kinda... a bit harsh. Especially for an exe a user left exposed in a webdir that's possibly (C) software and shouldn't have been there (now removed by the customer; some kinda OS boot CD/toolset thingy).

This makes it look like Yahoo is actually trafficking in pirated software, but that's kinda too funny to expect to be true, unless some Yahoo tech decided to use that IP/server @yahoo for his nefarious activity, but there are better sites than my customer's box to get his 'juarez'. At any rate:

From Address        To Address              Proto  Bytes      CPS
==================================================================
67.196.xx.xx..80    67.195.112.151..44507   tcp    14872000   523000

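For anyone checking my math on the rates, here is a minimal sketch of the conversion from the CPS column in the flow table to Mbps. It assumes CPS means bytes per second, that 1 Mbps is 1,000,000 bits/s, and that "3 gig" means 3e9 bytes; the function names are made up for illustration.

```python
def bytes_per_sec_to_mbps(cps: float) -> float:
    """Convert bytes/second to megabits/second (assumes 1 Mbps = 1,000,000 bit/s)."""
    return cps * 8 / 1_000_000


def transfer_seconds(size_bytes: float, cps: float) -> float:
    """Time to move size_bytes at a steady cps bytes/second."""
    return size_bytes / cps


if __name__ == "__main__":
    flow_cps = 523_000          # CPS column from the flow table
    file_size = 3_000_000_000   # assumption: "3 gig" taken as 3e9 bytes

    print(f"one flow: {bytes_per_sec_to_mbps(flow_cps):.2f} Mbps")
    print(f"whole file: {transfer_seconds(file_size, flow_cps) / 3600:.2f} hours")
    # 12 crawlers at roughly 300 KB/s each
    print(f"12 crawlers: {bytes_per_sec_to_mbps(12 * 300_000):.1f} Mbps")
```

So a single flow at that rate is a bit over 4 Mbps sustained for as long as the file lasts, and a dozen crawlers at 300K/s each land just under 30 Mbps.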

$ host 67.195.112.151 8.8.8.8
151.112.195.67.in-addr.arpa domain name pointer b3091122.crawl.yahoo.net.

CIDR:     67.195.0.0/16
NetName:  A-YAHOO-US8

So that's Yahoo, or really well spoofed. Is this expected/my own fault or what?

A number of years ago, there were 1000s of videos on a customer site (training for elderly care, extremely exciting stuff for someone into -1-day movies to post on torrent sites). Customer called me to say his bw was gone, and I checked and found 12 Yahoo crawlers hitting the site at 300K/s each (~30Mbps+), downloading all the videos. This was all the more injurious as it was only 2004 and bandwidth was more than $1/Mbps back then. I did the really crass thing and nullrouted the whole /20 or whatever they were on per ARIN.

It was the new-at-the-time video.yahoo.com search engine coming to index the whole site. I suppose they can't be too slow about it, or they'll never index a whole webfull of videos this century, but still, 12x 300K/s in 2004?

(At the time Rasmus thought it was kinda funny. I do too, now.)

/kc
-- 
Ken Chase - ken@heavycomputing.ca - +1 416 897 6284 - Toronto CANADA
Heavy Computing - Clued bandwidth, colocation and managed linux VPS @151 Front St. W.
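
P.S. On the "or really well spoofed" part: a PTR record alone only proves what whoever controls the reverse zone wants it to say. A rough sketch of a forward-confirmed reverse DNS check follows. The suffix ".crawl.yahoo.net" is just taken from the PTR above, not from any official Yahoo list, and the function name and the injectable resolver parameters are my own invention so it can be exercised without touching the network.

```python
import socket


def forward_confirmed(ip, allowed_suffixes=(".crawl.yahoo.net",),
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if ip's PTR name ends in an allowed suffix AND that
    name resolves back to the same ip (forward-confirmed reverse DNS).

    allowed_suffixes is an assumption based on the PTR seen in this post.
    reverse/forward default to the stdlib resolvers and are parameters
    only so the check can be tried with canned answers.
    """
    try:
        hostname, _aliases, _addrs = reverse(ip)
    except OSError:          # socket.herror / socket.gaierror: no PTR
        return False

    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(tuple(allowed_suffixes)):
        return False

    try:
        _name, _aliases, forward_ips = forward(hostname)
    except OSError:          # name does not resolve forward
        return False

    return ip in forward_ips


if __name__ == "__main__":
    # Needs working DNS; the result depends on what the records say today.
    print(forward_confirmed("67.195.112.151"))
```

If the forward lookup doesn't come back to the same address, the PTR was probably set by someone who controls the reverse zone but not the forward one.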