Armchair quarterbacking...

Discussions I've seen from operators on Facebook show some that had PNIs that worked just fine, while others with PNIs and cache boxes didn't fare so well. Some with just cache boxes were fine, while others were not.

What were your educated observations, preferably with supporting data?

Did we have a problem with congestion where the cache boxes phone home to, such that they just fell over?

AWS used to be the data source of last resort. Did anyone notice congestion going from AWS to the cache boxes?

-----
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP
Nonstop and fast 502s and 504s here on a Mac with Chrome. That points to the edge having enough sockets, but the proxy just inside it not having enough; some ratio that was expected got exceeded. IMHO.

Pete
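A toy model of the ratio Pete describes (purely illustrative; the slot counts and fan-in figure below are made up, not anything known about Netflix's edge): an edge tier that accepts far more client connections than its inner proxy pool can absorb will surface the overload as fast 502s/504s from just behind the edge.

    # Illustrative only: a toy edge tier fronting a smaller upstream proxy pool.
    # All numbers are hypothetical; the point is the fan-in ratio, not the values.

    EDGE_CLIENT_SLOTS = 100_000    # concurrent client connections the edge accepts
    UPSTREAM_POOL_SLOTS = 2_000    # concurrent edge -> inner proxy connections
    EXPECTED_FANIN = 25            # ratio the system was presumably sized for

    def classify(concurrent_clients: int) -> str:
        """Very rough outcome for a given client load."""
        if concurrent_clients > EDGE_CLIENT_SLOTS:
            return "connections refused at the edge"
        if concurrent_clients / UPSTREAM_POOL_SLOTS > EXPECTED_FANIN:
            # The edge accepted the socket, but upstream slots are exhausted:
            # requests queue (504 Gateway Timeout) or get bounced (502 Bad Gateway).
            return "fast 502/504 from just behind the edge"
        return "served normally"

    for load in (10_000, 40_000, 60_000, 90_000):
        print(f"{load:>7} concurrent clients -> {classify(load)}")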
Yeah, normally I hold them up as the poster child for a scalable CDN, but I'm hoping they release an RCA explaining what happened.
To understand the (historic) event by the numbers: https://about.netflix.com/en/news/60-million-households-tuned-in-live-for-ja... . On Sun, Nov 17, 2024 at 11:47 AM Maurice Brown <maurice@pwnship.com> wrote:
Yeah, normally I hold them up as the poster child for a scalable CDN but I’m hoping they release an RCA explaining what happened.
Yeah, normally I hold them up as the poster child for a scalable CDN but I’m hoping they release an RCA explaining what happened.
I guess for them it was the difference between the pre-recorded content they are used to vs. a live event. I wonder what the latency between live and the stream looked like: a few seconds, or more like 30+ seconds?

JL
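As a back-of-the-envelope answer to the latency question (assuming a conventional segmented HLS/DASH-style chain; the segment lengths and buffer depths below are illustrative, not Netflix's actual settings): delay behind live is roughly encode/packaging time plus CDN hops plus a few segment durations of player buffer, which is why standard configurations land at 20-30+ seconds and low-latency tunings get down to a few seconds.

    # Back-of-the-envelope live latency for a segmented (HLS/DASH-style) pipeline.
    # All inputs are hypothetical.

    def glass_to_glass_s(segment_s: float,
                         buffered_segments: int = 3,
                         encode_package_s: float = 4.0,
                         cdn_origin_s: float = 1.0) -> float:
        """Rough delay behind live: encode + packaging + CDN + player buffer."""
        return encode_package_s + cdn_origin_s + buffered_segments * segment_s

    # 6-second segments with a three-segment buffer: 20+ seconds behind live.
    print(glass_to_glass_s(segment_s=6))                         # ~23 s
    # 2-second segments with a shallower buffer: closer to "a few seconds".
    print(glass_to_glass_s(segment_s=2, buffered_segments=2,
                           encode_package_s=2))                  # ~7 s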
What were your educated observations, preferably with supporting data?
If it was capacity issues, they learned a hard lesson: you should have other CDNs available to shed traffic over to if yours hits a problem that can't be quickly solved in real time. If it was a server/software/livestream technical issue, then /shrug. Fix those. :)
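A sketch of the kind of shedding described above, at the client side (the hostnames are hypothetical; real multi-CDN steering is usually driven by DNS or a steering service rather than a hard-coded list):

    # Hypothetical sketch of client-side multi-CDN failover: try the primary CDN,
    # fall back to alternates on 5xx or timeouts. Hostnames are made up.
    import urllib.request
    import urllib.error

    CDN_HOSTS = [
        "primary-cdn.example.net",
        "backup-cdn-a.example.net",
        "backup-cdn-b.example.net",
    ]

    def fetch_segment(path: str, timeout: float = 3.0) -> bytes:
        last_err = None
        for host in CDN_HOSTS:
            url = f"https://{host}{path}"
            try:
                with urllib.request.urlopen(url, timeout=timeout) as resp:
                    return resp.read()
            except urllib.error.HTTPError as e:
                if e.code in (500, 502, 503, 504):
                    last_err = e       # CDN-side trouble: shed to the next host
                    continue
                raise                  # 4xx etc. won't improve on another CDN
            except (urllib.error.URLError, TimeoutError) as e:
                last_err = e           # unreachable or slow: shed to the next host
                continue
        raise RuntimeError(f"all CDNs failed for {path}: {last_err}")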
My experience over a home internet fiber connection wasn't great (like everyone else's), but my son was watching it over his mobile device without any issues.
On Mon, Nov 18, 2024 at 6:17 AM Livingood, Jason via NANOG <nanog@nanog.org> wrote:
> My experience over a home internet fiber connection wasn’t great (like everyone else’s) but my son was watching it over his mobile device without any issues.
That may be an interesting data point – because mobile networks typically rate-shape video streams.
JL
Perhaps, sometimes, less is more. Perhaps the “greedy” nature of adaptive bitrate always trying to bid up the data rate is counterproductive… at a system level.
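A toy version of the two behaviors in the last couple of posts (the bitrate ladder and throughput samples are made up, and real players also weigh buffer occupancy, not just throughput): a greedy ABR client keeps bidding up to the highest rendition its throughput estimate allows, while a rate-shaped mobile connection effectively pins it to one modest rung.

    # Toy adaptive-bitrate selection. Ladder and throughput samples are hypothetical.

    LADDER_KBPS = [235, 750, 1750, 3000, 4500, 7000, 16000]  # made-up renditions

    def pick_rendition(throughput_kbps: float, headroom: float = 0.8) -> int:
        """Greedy choice: highest rung that fits under (throughput * headroom)."""
        budget = throughput_kbps * headroom
        candidates = [r for r in LADDER_KBPS if r <= budget]
        return candidates[-1] if candidates else LADDER_KBPS[0]

    # Unshaped connection: estimates swing, and the client keeps bidding up.
    for est in (3000, 9000, 25000, 6000):
        print("unshaped", est, "->", pick_rendition(est))

    # Mobile connection rate-shaped to ~1.5 Mbps: the client settles on one low
    # rung and stops competing for more, which may be why the phone "just worked".
    for est in (1500, 1500, 1500):
        print("shaped  ", est, "->", pick_rendition(est))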
Also, how far have we come that 65 MILLION streams were active at the same time and we’re like “omg, so bad!”? Five years ago, never possible.

That being said, I watched it on my iPad with no problems whatsoever. Not even one hiccup. Meanwhile I had X open in a side-by-side and I saw people complaining about it.
On Nov 18, 2024, at 09:08, Cooke, David via NANOG <nanog@nanog.org> wrote:
My experience over a home internet fiber connection wasn’t great (like everyone else’s) but my son was watching it over his mobile device without any issues.
Something that would be interesting to see (particularly if someone has eyes in Comcast’s network) is how customers in areas where L4S trials are happening fared in comparison to others. This is part of the narrative of what we are seeing in the CDN market, i.e., some not necessarily being prepared for a streaming era.

On Mon, Nov 18, 2024 at 9:03 AM <joel@joelesler.net> wrote:
Also, how far have we come that 65 MILLION streams were active at the same time and we’re like “omg, so bad!”
5 years ago, never possible.
That being said. I watched it on my iPad with *no problems whatsoever*. Not even one hiccup. Meanwhile I had X open in a side by side and I saw people complaining about it.
Something that would be interesting to see (particularly if someone has eyes in Comcast’s network) is how customers in areas where L4S trials are happening fared in comparison to others.
The sample area of the deployment is still too small to draw conclusions from (~20K homes). We’ll know more in a few weeks about how things look in comparison. But in this example, I think the bottleneck was more likely on the server/CDN side of things, so CPE and last-mile AQM and/or dual-queue L4S would probably not have made a difference. But you never know without knowing the full root cause. I have no doubt the Netflix folks will sort it out; they’ve got some very smart transport-layer and CDN folks.

JL
I have three OCAs: two connected at 100G each, and one on a dual-100G LAG, with the operational throughput capacity of the nodes being something less than that (I forget the exact node throughput specs). Anyway, about the 11/15/2024 Tyson/Paul Netflix fights:

From 6-7 p.m. Central time I saw an extreme ramp-up in my OCA utilization, reaching an all-time high: 15G + 27G + 50G = 92G.

At 7:31 p.m. I saw what equated to a ~40G dive, total, across all three of my OCA caches: 10G + 17G + 27G = 54G. I never saw utilization ramp up to the same level again after that. Actually, the first one did get back to 16G, but the other two never ramped up that much again. I was waiting for the main event (Paul/Tyson) to generate an even higher load than what I saw at 7 p.m., but it didn't happen.

The ramp-up seen from 6-7 p.m. was a clean scaling graph, as you would expect as more and more eyeballs were "tuning in". After the sharp drop at 7:31 p.m. the graphs never really cleaned up; they were just down and up:

- 7:31 p.m. - sharp sag/drop
- 7:51 p.m. - sharp sag/drop
- 8:18 p.m. - sharp sag/drop
- 9:04 p.m. - sharp sag/drop
- 9:53 p.m. - ramp up
- 10:08 p.m. - aggressive ramp down

I wonder if the overall nationwide/worldwide issues affected even my local caches. I figured my local caches would have been "protected" from, or unaffected by, issues outside of my network, but I'm not so sure about that.

I can say that we didn't have a ton of customer complaints from our 60k resi broadband subs. I did hear about some complaints, but I don't think it was many.

I wonder if there were some sort of adaptive rate changes in the streams, altering the overall raw bandwidth utilization I observed and causing the main event not to show as high a peak on the graph, or if it was just that Netflix was having issues everywhere. I don't know. Hopefully Netflix NFL Christmas Day is much better.

Aaron
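Rough arithmetic on the adaptive-rate hypothesis (the stream counts and per-stream bitrates below are hypothetical; only the ~92G peak comes from the post above): the same audience stepped down to a lower rendition shows up as a much lower aggregate on the OCA graphs, even though nobody stopped watching.

    # Rough arithmetic for the "adaptive rate changes" hypothesis. Viewer counts
    # and per-stream bitrates are hypothetical; only the ~92G peak is from above.

    def aggregate_gbps(streams: int, avg_mbps: float) -> float:
        """Aggregate OCA output in Gbps for a stream count and average bitrate."""
        return streams * avg_mbps / 1000.0

    # Roughly what a 92G peak could look like at a made-up HD/4K average bitrate.
    print(aggregate_gbps(streams=13_000, avg_mbps=7.0))   # ~91 Gbps

    # Same viewers stepped down to a lower rendition: aggregate drops sharply
    # even though nobody stopped watching.
    print(aggregate_gbps(streams=13_000, avg_mbps=3.0))   # ~39 Gbps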
We have several caches and PNIs. Call volume was high.
participants (11)
- Aaron1
- Bryan Holloway
- Ca By
- Cooke, David
- Innocent Obi
- joel@joelesler.net
- Livingood, Jason
- Maurice Brown
- Mike Hammett
- Tebault, Peter H
- Tom Beecher