
I participate on a few other mailing lists focused on eyeball networks. For a couple of years I've been hearing complaints that this CDN or that CDN was behaving badly. The complaints have ramped up severely over the past few months. There have been some wild allegations, but I would like to develop a more standardized process for evidence collection. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there have been any others.
The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for collecting what exactly is happening at the time and how to address it.
One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent far more than that for extended periods of time.
A frequent occurrence has someone with a single-digit megabit/s limitation consuming 2x - 3x more than their plan on the other side of the rate limiter.
Last month on my own network I saw someone with 2x - 3x being consumed upstream and they had *190* connections downloading said data from Microsoft.
The past week or two I've been hearing of people with only a single connection downloading at more than their plan rate.
These situations effectively shut out all other Internet traffic to that customer, or even to a portion of the network in low-capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS, when you didn't even notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-)
Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it to?
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

On Sep 19, 2016, at 1:34 PM, Mike Hammett <nanog@ics-il.net> wrote:
These situations effectively shut out all other Internet traffic to that customer or even portion of the network for low capacity NLOS areas
I think the growing gap between those with high-speed links and so-called slower links will be an ongoing issue. I’ve heard of tricks that the SP can do to keep super large windows from occurring, but with the increased focus on TCP Fast Open and this data stuffing, I would expect the impact on these less-studied low-speed links to get worse. I’ve been helping a local WISP prepare their fiber OSP installation to try to mitigate some of the problems they have with local capacity and to work around the worsening NLOS situations that occur with annual tree growth.
- Jared

On Mon, 19 Sep 2016, Mike Hammett wrote:
The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for collecting what exactly is happening at the time and how to address it.
One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent excessively more for extended periods of time.
It sounds like either the rate-limiting just isn't working, or the CDNs are trying too hard to ramp up the transfer rate in spite of your dropping some/most of the packets. I assume drops are happening either as part of the rate-limiting/policing, or simply as a result of trying to stuff 45 Mbit/s onto a 1.5 Mbit/s pipe... roughly 96.7% packet loss... and they're not slowing down at the sender?!?
This is kind of a funny problem though, because CDNs get paid to deliver data, and they get compared/graded according to who can deliver the bits the fastest... and here you are complaining that they're delivering the bits too fast (or at least faster than you'd like them to).
----------------------------------------------------------------------
Jon Lewis, MCP :) | I route | therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
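As a quick check of that arithmetic, here is a minimal sketch (the function name is hypothetical, not from any tool mentioned in the thread) of the fraction of offered traffic a rate limiter has to discard when the sender does not slow down. For 45 Mbit/s offered into a 1.5 Mbit/s plan it comes out at about 96.7%.

```python
def drop_fraction(offered_mbps: float, limit_mbps: float) -> float:
    """Fraction of offered traffic a limiter must discard if the sender never backs off."""
    if offered_mbps <= 0 or limit_mbps <= 0:
        raise ValueError("rates must be positive")
    if offered_mbps <= limit_mbps:
        return 0.0  # everything fits under the limit, nothing is dropped
    return 1.0 - limit_mbps / offered_mbps


if __name__ == "__main__":
    # 45 Mbit/s into a 1.5 Mbit/s plan, plus the 2x and 3x cases from the thread.
    for offered, limit in [(45.0, 1.5), (2.0, 1.0), (3.0, 1.0)]:
        share = drop_fraction(offered, limit)
        print(f"{offered} Mbit/s into {limit} Mbit/s: {share:.1%} dropped")
```

A well-behaved TCP sender would not keep offering traffic at anything near these loss rates, which is why sustained figures like this point at either the limiter or the sender.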

* Jon Lewis:
This is kind of a funny problem though, because CDNs get paid to deliver data, and they get compared/graded according to who can deliver the bits the fastest...and here you are complaining that they're delivering the bits too fast (or at least faster than you'd like them to).
Surely CDNs bill packets which are subsequently dropped by the network. :)

How come we have never seen this problem? We have a ton of DSL and many of those lines are slow, but no customer complaints about overloaded lines from CDN networks. Could it be that the way you throttle the bandwidth is defective? It is easy to blame the other guy, but could it be that you are doing it wrong? Regards, Baldur

Likewise, why was it never an issue before and why does it only affect certain types of traffic from certain CDNs? ----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP

With so many geographically diverse complaints on many hardware routing and switching platforms, I'm going to go with a "no".

It appears all complaints are from SPs doing wireless. I am going to go with a yes and put forth a thesis that these guys have a common factor somewhere. It could be equipment from some popular vendor of wireless gear, or maybe some common method of throttling that is popular in the wireless community.
I note that while we have slow links, we have no throttling or bandwidth management going on except for the buffering that happens in the DSLAM. Also, there is no way to cheat. If you send 4 Mbps to a 2 Mbps DSL line, it will drop half of the traffic and TCP will not survive that. The CDN would have an effective transfer rate approaching zero for that customer. That seems to be a rather bad business proposition from the point of view of the CDN, so they would not do that. The other customers will be unaffected, as the DSLAM itself has plenty of capacity.
Regards, Baldur
On 21 Sep 2016 at 14:36, "Josh Reynolds" <josh@kyneticwifi.com> wrote:
With so many geographically diverse complaints on many hardware routing and switching platforms, I'm going to go with a "no".

I've had DSL and AE service providers respond with the issues. So far there is not a common element other than CDNs. That's the point of the questions I'm asking, to gather a ton of information and then figure out how to act on it. You're assuming that the CDNs are using an unmolested, vanilla TCP stack. That may not be the case, especially if doing something like Fast TCP. ----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP

http://www.theregister.co.uk/2016/06/08/is_win_10_ignoring_sysadmins_qos_set... This explains the recent situations (well, not really an explanation, but a bit more information from other people). Not so much for the ones going back a year or two. ----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP

I have witnessed this issue first hand for several years. Four for sure, maybe five or six.
The very first one I remember is a customer doing Usenet downloads and using what he called an "internet download manager", which I assumed was screwing with TCP ACKs. I believe he was a 4Mbps user at the time and this download manager thing was causing 2 to maybe 2.5x his subscribed rate, as Mike says, on the upstream-facing router interface. He shut down or uninstalled the software and it stopped. Yes, this customer is on PTMP fixed wireless. Traffic policing was taking place via a MikroTik simple queue at the site router. I could cut his downstream rate in half and the traffic hitting the backhaul would follow, still at double his rate. I could also move his queue all the way to the border router and it was still there at double rate.
BTW, we still have this guy as a customer on fixed wireless. He's been on 25/5Mbps for over a year. And we're about to upgrade him to 50/10Mbps with new gear. 25/5 and 50/10 is a far cry from this claimed "slow" WISP service. This shit ain't cheap to get to bumfsck Illinois so farmer Joe can watch porn and his kids can watch Netflix at the same time. Yup, we have slow NLOS service too, because customers decide they want the rural life buried in a mile of trees while "needing" the city benefits. If you want the gigabits, then move outta the sticks. Running a hundred combined miles of fiber to get to 20 customers that want to pay less than $50/mo is not feasible. /rant off
Another time, maybe three years ago, we had a customer on Canopy 5.7 FSK at 4/1Mbps using the built-in QoS. He was watching Netflix and I saw 8Mbps hitting the AP's ethernet interface. I thought the Canopy scheduler was broken, until I looked deeper and saw that it was working exactly as designed, with a 50% discard rate on his VC. I want to say this was from LLNW at the time. I could be totally wrong about that, I really don't remember.
Now let's move to the Windows 10 updates. A 'buried in the sticks' customer on Canopy 900 FSK. 1.5Mbps/384k. Multiple streams from Microsoft and LLNW at the same time. LLNW alone had maybe 10 streams going and was sending at over 15Mbps on average and at worst about 25Mbps... to a 1.5Mbps subscriber. I could throw in a MikroTik queue upstream, which only moved the problem, as that 15-25Mbps was still hitting backhaul links. And when I have a 100Mbps link going into the site, 25Mbps is a lot.
We've had numerous customers call in for the last month or two with 'teh innernets is down, my phoen wyfy don't work either'. No, your Windows 10 updates are overloading your service. Shut off your PC to use your internet service. Telling a customer those exact words is ridiculous, but we have to do it.
We had a known issue with a particular licensed microwave vendor's radios that we have in use. It was the ethernet buffer becoming saturated at nowhere near the RF link capacity. They put out a new software release and that was resolved. And that was well before this Windows 10 update overload stuff started.
Normal TCP congestion control behavior works perfectly fine. It's not the network. It's the sender not doing normal TCP stuffs. I don't know why the CDNs and/or Microsoft think this is a good idea, but to me, it looks like a DDoS.
I'm on some of the same lists as Mike and we know of many others reporting similar issues. A couple to the tune of 50-100Mbps overload destined for 5 or 10Mbps tier subscribers. So thanks to Mike for trying to get a conversation going on this topic. And it's not just us red headed step children WISPs.
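One simple way to put numbers on observations like "8Mbps hitting the AP's ethernet interface" for a 4Mbps subscriber is to sample interface octet counters on both sides of the limiter over the same interval. The following is a minimal sketch under stated assumptions: the function name and the sample values are hypothetical, and how the counters are read (SNMP, CLI, API) is left out entirely.

```python
def rate_mbps(octets_start: int, octets_end: int, seconds: float,
              counter_bits: int = 64) -> float:
    """Average rate in Mbit/s between two octet-counter samples.

    The modulo handles a single wrap of a 32-bit or 64-bit counter.
    """
    if seconds <= 0:
        raise ValueError("sample interval must be positive")
    delta = (octets_end - octets_start) % (1 << counter_bits)
    return delta * 8 / seconds / 1e6


if __name__ == "__main__":
    # Hypothetical 60-second samples: upstream-facing port vs. subscriber-facing side.
    upstream = rate_mbps(0, 60_000_000, 60.0)    # traffic arriving at the limiter
    delivered = rate_mbps(0, 30_000_000, 60.0)   # traffic passed on to the subscriber
    print(f"upstream {upstream:.1f} Mbit/s, delivered {delivered:.1f} Mbit/s, "
          f"discarded {1 - delivered / upstream:.0%}")
```

Taking both samples over the same window, and noting the window length, makes reports from different operators comparable.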

On 20 Sep 2016 9:14 am, "George Skorup" <george@cbcast.com> wrote:
Now lets move the Windows 10 updates. A 'buried in the sticks' customer on Canopy 900 FSK. 1.5Mbps/384k. Multiple streams from Microsoft and LLNW at the same time. LLNW alone had maybe 10 streams going and was sending at over 15Mbps on average and at worst about 25Mbps... to a 1.5Mbps subscriber. I could throw in a MikroTik queue upstream which only moved the problem as that 15-25Mbps was still hitting backhaul links. And when I have a 100Mbps link going into the site, 25Mbps is a lot.
Maybe I'm being naive but this sounds like an issue primarily with buffers. Police rather than shape the traffic, and reduce the burst size, and a lot of this should disappear...
M
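To make the policing suggestion concrete, here is a minimal sketch of a single-rate token-bucket policer fed by a sender that never backs off. It is a simplified model with hypothetical class and function names, not any vendor's implementation. It shows what the policer can and cannot do: the subscriber side is held to the configured rate plus the burst allowance, but every dropped packet has already crossed the upstream link.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucketPolicer:
    rate_bps: float    # committed rate in bits per second
    burst_bytes: int   # bucket depth; a smaller value tolerates less burstiness
    tokens: float = field(init=False)
    last_time: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        self.tokens = float(self.burst_bytes)  # start with a full bucket

    def allow(self, now: float, size_bytes: int) -> bool:
        """Refill for elapsed time, then pass the packet only if tokens cover it."""
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(float(self.burst_bytes),
                          self.tokens + elapsed * self.rate_bps / 8.0)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False  # policed: dropped, not queued


def simulate(offered_bps: float, rate_bps: float, burst_bytes: int,
             seconds: float = 10.0, packet_bytes: int = 1500) -> tuple[int, int]:
    """Send evenly spaced packets at offered_bps; return (passed, dropped) counts."""
    policer = TokenBucketPolicer(rate_bps, burst_bytes)
    count = int(seconds * offered_bps / (packet_bytes * 8))
    interval = packet_bytes * 8 / offered_bps
    passed = sum(policer.allow(i * interval, packet_bytes) for i in range(count))
    return passed, count - passed


if __name__ == "__main__":
    # 15 Mbit/s offered to a 1.5 Mbit/s subscriber, 15 kB burst (illustrative values).
    passed, dropped = simulate(15e6, 1.5e6, 15_000)
    print(f"passed {passed}, dropped {dropped} ({dropped / (passed + dropped):.0%})")
```

With an unresponsive sender, roughly nine packets in ten are discarded in this scenario, which matches the observation that moving the queue upstream only moves the problem; policing helps only if the loss makes the sender slow down.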

What do most broadband platforms do for rate limiting? ----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP

This is what I'm asking of them:
=====
Have you seen a CDN overloading a customer? Help me gather information on the issue.
What CDN?
What have you identified the traffic to be?
What is the access network?
Where is the rate limiting done?
How is the rate limiting done (policing vs. queueing, SFQ, PFIFO, etc.)?
What is doing the rate limiting?
What is the rate limit set to?
Upstream of the rate limiter, what are you seeing for inbound traffic? One connection or many? How much traffic?
How does other traffic behave when exceeding the rate limit?
Where is NAT performed?
What is doing NAT?
Shared NAT or isolated to that customer?
Have you done a packet capture before and after the rate limiter? The NAT device?
Would you be willing to send a filtered packet capture (only the frames that relate to this CDN) to the CDN if they want it?
There have been reports of CDNs sending more traffic than the customer can handle and ignoring the TCP convention to slow down. I'm trying to investigate this thoroughly so we can get the CDNs to fix their systems. Multiple CDNs have been shown to do this.
=====
----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP
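For anyone collecting answers to these questions from several operators, a consistent record format makes the reports easier to compare. Below is a minimal sketch of one possible structure. The class and field names are hypothetical, chosen only to mirror the questions above, and the example values are illustrative figures loosely based on numbers mentioned in this thread.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class OverloadReport:
    cdn: str
    traffic_description: str
    access_network: str
    limiter_location: str
    limiter_method: str            # e.g. policing, queueing, SFQ, PFIFO
    limiter_device: str
    rate_limit_mbps: float
    upstream_observed_mbps: float  # inbound traffic seen upstream of the limiter
    connection_count: Optional[int] = None
    nat_location: Optional[str] = None
    nat_device: Optional[str] = None
    nat_shared: Optional[bool] = None
    has_capture_both_sides: bool = False
    will_share_capture: bool = False

    def overload_ratio(self) -> float:
        """How many times the plan rate is arriving upstream of the limiter."""
        return self.upstream_observed_mbps / self.rate_limit_mbps

    def to_json(self) -> str:
        record = asdict(self)
        record["overload_ratio"] = round(self.overload_ratio(), 2)
        return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    report = OverloadReport(
        cdn="LLNW",
        traffic_description="Windows 10 updates",
        access_network="Canopy 900 FSK",
        limiter_location="unknown",
        limiter_method="unknown",
        limiter_device="unknown",
        rate_limit_mbps=1.5,
        upstream_observed_mbps=15.0,
        connection_count=10,
    )
    print(report.to_json())
```

Unanswered questions stay as `None` or "unknown" rather than being guessed, so gaps in a report remain visible.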

That's interesting. Once, a few years/jobs/etc. ago, I observed a flow from mobile YouTube being really, really bursty, peaking at 40-50 Mbps on a 10 Mbps circuit, but that was the only time I've ever seen such an issue. After that one flow died, it never happened again. That aside, I do work for a business-focused, mostly-wireless SP at the moment and I haven't had any issues with CDNs so far. The only similar incidents I can recall involved customers running programs like Aspera and Signiant which, when misconfigured, can result in quite some volume coming your way. My thoughts and words are my own. Spyros

https://goo.gl/forms/LvgFRsMdNdI8E9HF3 I have made this into a Google Form to make it easier to track compared to randomly formatted responses on multiple mailing lists, Facebook Groups, etc. ----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP

https://docs.google.com/spreadsheets/d/1Jdm0dOBf81kSnXEvVfI6ZJbWFNt5AbYUV8CD... I have made the anonymized answers public. This will obviously have some bias to it given that I mostly know fixed wireless operators, but I'm hoping this gets some good distribution to catch more platforms. ----- Mike Hammett Intelligent Computing Solutions Midwest Internet Exchange The Brothers WISP

Mike,

I will forward this to the requisite group for a look. Have you brought this to our attention previously? I don't see anything. If you did, please forward me the ticket numbers or message(s) (peering@ is best) so we can track them down and see if someone already has it in queue.

Jared alluded to fasttcp a few emails ago. Astute man.

Best,

Martin Hannigan AS 20940 // AS 32787

Thanks Marty. I have only experienced this on my network once and it was directly with Microsoft, so I haven't done much until a couple days ago when I started this campaign. I don't know if anyone else has brought this to anyone's attention. I just sent an e-mail to Owen when I saw yours.

-----
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

No problem. If you can drop a pcap file that was created during the event somewhere we can reach (and send me an email saying where), that'd be great.

Thanks again, and great use of the list.

Best,

Martin Hannigan AS 20940 // AS 32787
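Before sending a capture along, it can help to confirm that it really shows the overage. The following is only a minimal sketch of one way to do that, not anything prescribed in this thread: it reads a classic libpcap file (not pcapng) with the Python standard library and totals the on-the-wire bytes per second. The function names, the `plan_mbps` parameter, and the assumption that the capture was already filtered to a single customer's downstream traffic are all hypothetical.

```python
import io
import struct
from collections import defaultdict

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic libpcap file, microsecond timestamps


def mbps_per_second(stream):
    """Return {unix_second: megabits captured in that second} for a classic pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number tells us which byte order the capture was written in.
    if struct.unpack("<I", header[:4])[0] == PCAP_MAGIC_USEC:
        endian = "<"
    elif struct.unpack(">I", header[:4])[0] == PCAP_MAGIC_USEC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    bits = defaultdict(int)
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break  # end of file (or a truncated trailing record)
        ts_sec, _ts_usec, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.seek(incl_len, io.SEEK_CUR)  # skip the stored packet bytes
        bits[ts_sec] += orig_len * 8        # count the original on-the-wire length
    return {sec: b / 1_000_000 for sec, b in sorted(bits.items())}


def seconds_over_plan(stream, plan_mbps):
    """Return only the seconds whose captured rate exceeded the plan rate (assumed input)."""
    return {sec: rate for sec, rate in mbps_per_second(stream).items() if rate > plan_mbps}
```

For example, with a hypothetical file name, `with open("event.pcap", "rb") as f: print(seconds_over_plan(f, 1.5))` would list each second in which more than 1.5 megabit/s was captured. Every packet in the file is counted, so the result is only meaningful if the capture was taken upstream of the rate limiter and filtered to the affected customer.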

I have seen traffic from Microsoft in Europe to single hosts on our campus that seemed to be unusually high (in bps) and long-lived.

I don’t recall whether the few hosts I noticed this on over time were only on our campus wifi.

If not, perhaps the common factor is longer latency? Both connections over wireless and connections from Europe to the US would have longer latency.

Perhaps this longer latency combined with some other factor is triggering a bug in modern TCP congestion control algorithms?

This mentions that there have been bugs in TCP congestion control algorithm implementations. Perhaps there could be other bugs that result in the described issue?

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/ms_feb07_eval.ppt.pdf

I have seen cases on our campus where too-small buffers on an Ethernet switch caused a Linux TCP congestion control algorithm to act badly, resulting in slower downloads than a simple algorithm that depended on dropped packets rather than trying to determine window sizes, etc. The fix in that case was to increase the buffer size. Of course, buffer bloat is also known to play havoc with TCP congestion control algorithms. Just wondering if some combination of higher latency and another unknown variable, or just a bug, might cause a TCP congestion control algorithm to think it can safely try to increase the transmit rate?
--- Bruce Curtis bruce.curtis@ndsu.edu Certified NetAnalyst II 701-231-8527 North Dakota State University
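To put rough numbers on the latency question above, here is a small sketch of the standard bandwidth-delay relationship: the amount of data a sender keeps in flight, divided by the round-trip time, sets its sending rate. This does not diagnose the reported behavior. The 100 ms round-trip time is purely an assumed illustrative value, the function names are hypothetical, and the 1.5 and 45 megabit/s figures are the ones from the original report in this thread.

```python
def bdp_bytes(rate_mbps, rtt_ms):
    """Bytes that must be in flight to sustain rate_mbps at a round-trip time of rtt_ms."""
    return rate_mbps * 1_000_000 * rtt_ms / 8 / 1000


def rate_mbps_for_window(window_bytes, rtt_ms):
    """Sending rate in megabit/s implied by keeping window_bytes in flight at rtt_ms."""
    return window_bytes * 8 * 1000 / rtt_ms / 1_000_000


if __name__ == "__main__":
    rtt_ms = 100  # assumed round-trip time, for illustration only
    plan = bdp_bytes(1.5, rtt_ms)     # in-flight data that fits a 1.5 megabit/s plan
    observed = bdp_bytes(45, rtt_ms)  # in-flight data needed to push 45 megabit/s
    print(f"plan window:     {plan:.0f} bytes")
    print(f"observed window: {observed:.0f} bytes ({observed / plan:.0f}x the plan)")
```

Under these assumptions, a sender would have to keep about 30 times more data in flight than the plan rate supports to produce the 45 megabit/s case, which is the kind of window growth a misbehaving congestion control algorithm could in principle cause.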

Do we have any contacts at Microsoft that we can talk to about this? This time around, they are the common denominator. I know people have been complaining about this for longer than Windows 10 has been out, so there must be some other reasons why other parties were to blame.

-----
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

Mike, I have the right contact there and I'll flag this thread that way in case they havent already seen it. Best, Martin Hannigan AS 20940 // AS 32787 On Thursday, September 22, 2016, Mike Hammett <nanog@ics-il.net> wrote:
Do we have any contacts at Microsoft that we can talk to about this? This time around, they are the common denominator. I know people have been complaining about this for longer than Windows 10 has been out, so there must be some other reasons why other parties we are to blame.
-----Mike HammettIntelligent Computing SolutionsMidwest Internet ExchangeThe Brothers WISP
----- Original Message ----- From: Bruce Curtis <bruce.curtis@ndsu.edu <javascript:;>> To: Mike Hammett <nanog@ics-il.net <javascript:;>> Cc: Martin Hannigan <hannigan@gmail.com <javascript:;>>, NANOG < nanog@nanog.org <javascript:;>> Sent: Thu, 22 Sep 2016 16:28:17 -0500 (CDT) Subject: Re: CDN Overload?
I have seen traffic from Microsoft in Europe to single hosts on our campus that seemed to be unusually (high bps) and long.
I don’t recall if the few multiple hosts I noticed this on over time were only on our campus wifi.
If not perhaps the common factor is longer latency? Both connects over wireless and connections from Europe to the US would have longer latency.
Perhaps this longer latency combined with some other factor is triggering a but in modern TCP Congestion Control algorithms?
This mentions that there have been bugs in TCP Congestion Control algorithm implementations. Perhaps there could be other bugs that result in the descried issue?
https://www.microsoft.com/en-us/research/wp-content/ uploads/2016/08/ms_feb07_eval.ppt.pdf
I have seen cases on our campus where too small buffers on an ethernet switch caused a Linux TCP Congestion Control algorithm to act badly resulting in slower downloads than a simple algorithm that depended on dropped packets rather than trying to determine window sizes etc. The fix in that case was to increase the buffer size. Of course buffer bloat is also known to play havoc with TCP Congestion Control algorithms. Just wondering if some combination of higher latency and another unknown variable or just a bug might cause a TCP Congestion Control algorithm to think it can safely try to increase the transmit rate?
On Sep 21, 2016, at 8:29 PM, Mike Hammett <nanog@ics-il.net <javascript:;>> wrote:
Thanks Marty. I have only experienced this on my network once and it was directly with Microsoft, so I haven't done much until a couple days ago when I started this campaign. I don't know if anyone else has brought this to anyone's attention. I just sent an e-mail to Owen when I saw yours.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Martin Hannigan" <hannigan@gmail.com <javascript:;>> To: "Mike Hammett" <nanog@ics-il.net <javascript:;>> Cc: "NANOG" <nanog@nanog.org <javascript:;>> Sent: Wednesday, September 21, 2016 8:19:35 PM Subject: Re: CDN Overload?
Mike,
I will forward to the requisite group for a look. Have you brought this to our attention previously? I don't see anything. If you did, please forward me the ticket numbers or message(s) (peering@ is best) so wee can track down and see if someone already has it in queue.
Jared alluded to fasttcp a few emails ago. Astute man.
Best,
Martin Hannigan AS 20940 // AS 32787
On Sep 21, 2016, at 14:30, Mike Hammett < nanog@ics-il.net <javascript:;> > wrote:
https://docs.google.com/spreadsheets/d/1Jdm0dOBf81kSnXEvVfI6ZJbWFNt5A bYUV8CDxGwLSm8/edit?usp=sharing
I have made the anonymized answers public. This will obviously have some bias to it given that I mostly know fixed wireless operators, but I'm hoping this gets some good distribution to catch more platforms.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Mike Hammett" < nanog@ics-il.net <javascript:;> > To: "NANOG" < nanog@nanog.org <javascript:;> > Sent: Wednesday, September 21, 2016 9:08:55 AM Subject: Re: CDN Overload?
https://goo.gl/forms/LvgFRsMdNdI8E9HF3
I have made this into a Google Form to make it easier to track compared to randomly formatted responses on multiple mailing lists, Facebook Groups, etc.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Mike Hammett" < nanog@ics-il.net <javascript:;> > To: "NANOG" < nanog@nanog.org <javascript:;> > Sent: Monday, September 19, 2016 12:34:48 PM Subject: CDN Overload?
I participate on a few other mailing lists focused on eyeball networks. For a couple years I've been hearing complaints from this CDN or that CDN was behaving badly. It's been severely ramping up the past few months. There have been some wild allegations, but I would like to develop a bit more standardized evidence collection. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there have been any others.
The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for
collecting what exactly is happening at the time and how to address it. > > One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent excessively more for extended periods of time. > > An often occurrence has someone with a single digit megabit/s limitation consuming 2x - 3x more than their plan on the other side of the rate limiter. > > Last month on my own network I saw someone with 2x - 3x being consumed upstream and they had *190* connections downloading said data from Microsoft. > > The past week or two I've been hearing of people only having a single connection downloading at more than their plan rate. > > > These situations effectively shut out all other Internet traffic to that customer or even portion of the network for low capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS and you didn't even notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-) > > > > > Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it to? > > > > > ----- > Mike Hammett > Intelligent Computing Solutions > > Midwest Internet Exchange > > The Brothers WISP > > > > > >
---
Bruce Curtis bruce.curtis@ndsu.edu
Certified NetAnalyst II 701-231-8527
North Dakota State University

Thanks.

----- Mike Hammett Intelligent Computing Solutions
http://www.ics-il.com
Midwest-IX
http://www.midwest-ix.com

----- Original Message -----
From: Martin Hannigan <hannigan@gmail.com>
To: Mike Hammett <nanog@ics-il.net>
Cc: NANOG <nanog@nanog.org>
Sent: Thu, 22 Sep 2016 18:29:38 -0500 (CDT)
Subject: Re: CDN Overload?

Mike,

I have the right contact there and I'll flag this thread that way in case they haven't already seen it.

Best,

Martin Hannigan AS 20940 // AS 32787

On Thursday, September 22, 2016, Mike Hammett <nanog@ics-il.net> wrote:
Do we have any contacts at Microsoft that we can talk to about this? This time around, they are the common denominator. I know people have been complaining about this for longer than Windows 10 has been out, so there must be some other reasons why other parties were to blame.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: Bruce Curtis <bruce.curtis@ndsu.edu>
To: Mike Hammett <nanog@ics-il.net>
Cc: Martin Hannigan <hannigan@gmail.com>, NANOG <nanog@nanog.org>
Sent: Thu, 22 Sep 2016 16:28:17 -0500 (CDT)
Subject: Re: CDN Overload?
I have seen traffic from Microsoft in Europe to single hosts on our campus that seemed to be unusually high (in bps) and long-running.
I don’t recall if the few multiple hosts I noticed this on over time were only on our campus wifi.
If not, perhaps the common factor is longer latency? Both connections over wireless and connections from Europe to the US would have longer latency.
Perhaps this longer latency combined with some other factor is triggering a bug in modern TCP Congestion Control algorithms?
This mentions that there have been bugs in TCP Congestion Control algorithm implementations. Perhaps there could be other bugs that result in the described issue?
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/08/ms_feb07_eval.ppt.pdf
I have seen cases on our campus where too small buffers on an ethernet switch caused a Linux TCP Congestion Control algorithm to act badly resulting in slower downloads than a simple algorithm that depended on dropped packets rather than trying to determine window sizes etc. The fix in that case was to increase the buffer size. Of course buffer bloat is also known to play havoc with TCP Congestion Control algorithms. Just wondering if some combination of higher latency and another unknown variable or just a bug might cause a TCP Congestion Control algorithm to think it can safely try to increase the transmit rate?
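As a rough illustration of why latency matters in this line of reasoning (an editorial sketch, not something from the original message): the amount of data a sender needs in flight to fill a path is the bandwidth-delay product, rate x RTT. The 1.5 Mbit/s plan rate comes from this thread; the RTT values and labels below are assumptions chosen only for illustration, not measurements.

```python
def bdp_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return rate_bps * rtt_s / 8


PLAN_RATE_BPS = 1.5e6  # 1.5 Mbit/s plan rate mentioned in the thread

# Hypothetical round-trip times (assumed for illustration, not measured)
ASSUMED_RTTS_MS = [("regional", 20), ("transatlantic", 120), ("loaded wireless", 300)]

for label, rtt_ms in ASSUMED_RTTS_MS:
    window = bdp_bytes(PLAN_RATE_BPS, rtt_ms / 1000)
    print(f"{label:>16}: RTT {rtt_ms:3d} ms -> ~{window / 1024:.1f} KiB in flight")
```

A sender whose window grows well beyond this figure is offering more than the policed rate can carry, and the excess has to be queued or dropped at the rate limiter. That is consistent with the symptom described in the thread, though it does not by itself say which algorithm or bug is responsible.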
On Sep 21, 2016, at 8:29 PM, Mike Hammett <nanog@ics-il.net> wrote:
Thanks Marty. I have only experienced this on my network once and it was directly with Microsoft, so I haven't done much until a couple days ago when I started this campaign. I don't know if anyone else has brought this to anyone's attention. I just sent an e-mail to Owen when I saw yours.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Martin Hannigan" <hannigan@gmail.com>
To: "Mike Hammett" <nanog@ics-il.net>
Cc: "NANOG" <nanog@nanog.org>
Sent: Wednesday, September 21, 2016 8:19:35 PM
Subject: Re: CDN Overload?
Mike,
I will forward to the requisite group for a look. Have you brought this to our attention previously? I don't see anything. If you did, please forward me the ticket numbers or message(s) (peering@ is best) so we can track down and see if someone already has it in queue.
Jared alluded to fasttcp a few emails ago. Astute man.
Best,
Martin Hannigan AS 20940 // AS 32787
On Sep 21, 2016, at 14:30, Mike Hammett <nanog@ics-il.net> wrote:
https://docs.google.com/spreadsheets/d/1Jdm0dOBf81kSnXEvVfI6ZJbWFNt5AbYUV8CDxGwLSm8/edit?usp=sharing
I have made the anonymized answers public. This will obviously have some bias to it given that I mostly know fixed wireless operators, but I'm hoping this gets some good distribution to catch more platforms.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Mike Hammett" <nanog@ics-il.net>
To: "NANOG" <nanog@nanog.org>
Sent: Wednesday, September 21, 2016 9:08:55 AM
Subject: Re: CDN Overload?
https://goo.gl/forms/LvgFRsMdNdI8E9HF3
I have made this into a Google Form to make it easier to track compared to randomly formatted responses on multiple mailing lists, Facebook Groups, etc.
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
----- Original Message -----
From: "Mike Hammett" <nanog@ics-il.net>
To: "NANOG" <nanog@nanog.org>
Sent: Monday, September 19, 2016 12:34:48 PM
Subject: CDN Overload?
---
Bruce Curtis bruce.curtis@ndsu.edu
Certified NetAnalyst II 701-231-8527
North Dakota State University

Mike, you might want to reference this thread - http://mailman.nanog.org/pipermail/nanog/2016-July/thread.html#87147 - as another data point. LLNW was sending data at levels ~10x greater than my policed DSL user's subscription rates. It seems to me that either the client or the server TCP stack was not working in a desirable manner.

--Blake

Mike Hammett wrote on 9/19/2016 12:34 PM:
I participate on a few other mailing lists focused on eyeball networks. For a couple of years I've been hearing complaints that this CDN or that CDN was behaving badly. It's been ramping up severely over the past few months. There have been some wild allegations, but I would like to develop a bit more standardized evidence collection. Initially LimeLight was the only culprit, but recently it has been Microsoft as well. I'm not sure if there have been any others.
The principal complaint is that upstream of whatever is doing the rate limiting for a given customer there is significantly more capacity being utilized than the customer has purchased. This could happen briefly as TCP adjusts to the capacity limitation, but in some situations this has persisted for days at a time. I'll list out a few situations as best as I can recall them. Some of these may even be merges of a couple situations. The point is to show the general issue and develop a better process for collecting what exactly is happening at the time and how to address it.
One situation had approximately 45 megabit/s of capacity being used up by a customer that had a 1.5 megabit/s plan. All other traffic normally held itself within the 1.5 megabit/s, but this particular CDN sent excessively more for extended periods of time.
A frequent occurrence has someone with a single-digit megabit/s limitation consuming 2x - 3x more than their plan on the other side of the rate limiter.
Last month on my own network I saw someone with 2x - 3x being consumed upstream and they had *190* connections downloading said data from Microsoft.
The past week or two I've been hearing of people only having a single connection downloading at more than their plan rate.
These situations effectively shut out all other Internet traffic to that customer or even a portion of the network for low capacity NLOS areas. It's a DoS caused by downloads. What happened to the days of MS BITS and you didn't even notice the download happening? A lot of these guys think that the CDNs are just a pile of dicks looking to ruin everyone's day and I'm certain that there are at least a couple people at each CDN that aren't that way. ;-)
Lots of rambling, sure. What do I need to have these guys collect as evidence of a problem and who should they send it to?
----- Mike Hammett Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
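One way to put the figures in the message above on a common footing (an editorial sketch with assumed inputs, not a tool anyone in the thread described): sample a byte counter upstream of the rate limiter twice, convert the difference to bits per second, and divide by the customer's plan rate. The function name and the counter values are hypothetical; the 45 megabit/s and 1.5 megabit/s figures are taken from the message.

```python
def overage_ratio(bytes_start: int, bytes_end: int,
                  interval_s: float, plan_bps: float) -> float:
    """Ratio of the observed upstream rate to the customer's plan rate."""
    observed_bps = (bytes_end - bytes_start) * 8 / interval_s
    return observed_bps / plan_bps


# Hypothetical counter samples taken 60 s apart, chosen so the
# observed rate works out to 45 Mbit/s (the figure from the thread).
start_bytes, end_bytes = 0, 337_500_000
ratio = overage_ratio(start_bytes, end_bytes, 60.0, plan_bps=1.5e6)
print(f"upstream traffic is {ratio:.0f}x the plan rate")
```

Recording the ratio together with the sample timestamps, the remote addresses, and the number of concurrent connections would give reports from different operators a comparable shape.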
participants (12)
- Baldur Norddahl
- Blake Hudson
- Bruce Curtis
- Florian Weimer
- George Skorup
- Jared Mauch
- Jon Lewis
- Josh Reynolds
- Martin Hannigan
- Matthew Walster
- Mike Hammett
- Spyros Kakaroukas