Severe latency at both San Jose and Los Angeles Level3/AT&T peering
Hi Nanog,

I have a ticket open with Level 3, with whom I have 1gig pipes in Oakland, CA and Las Vegas, NV.

One of our users noticed very slow file transfer/media delivery from the Bay Area to L.A., and on investigating it appears as though the peering point between Level3 and AT&T in SF was saturated and had 300ms avg. latency.

90 minutes later, after receiving no call from Level3, I escalated to a P1 ticket, as the latency is now > 1000ms and we're seeing 20% packet loss.

I decided to statically route to the destination via our DR cluster in Vegas, and interestingly I found the same situation where AT&T and Level3 peer in Tustin.

mtr traceroutes, for those curious:

Via Oakland:

My traceroute [v0.71]
hivemind (0.0.0.0)                              Fri Apr 11 15:36:08 2014
Keys: Help Display mode Restart statistics Order of fields quit
                                        Packets              Pings
    Host                                  Loss%   Last    Avg   Best   Wrst  StDev
 1. 138.72.xxx.xxx                         0.0%    0.3    0.2    0.1    0.3    0.1
 2. pan5060-ae1-401.routerland.pixar.com   0.0%    0.4    0.4    0.4    0.5    0.0
 3. verge-vlan66.pixar.com                 0.0%    0.6    0.7    0.6    0.9    0.1
 4. ge-6-24.car1.Oakland1.Level3.net       0.0%    0.7  105.5    0.7  307.3  110.9
 5. ae-5-5.ebr2.SanJose1.Level3.net        0.0%    1.7    1.7    1.6    2.8    0.4
 6. ae-92-92.csw4.SanJose1.Level3.net      0.0%    1.6    1.7    1.6    3.0    0.4
 7. ae-4-90.edge2.SanJose1.Level3.net      0.0%    1.6    4.6    1.6   37.1   10.2
 8. 192.205.32.209                        41.7%  1042.  1048.  1038.  1059.    9.1
 9. cr1.sffca.ip.att.net                  25.0%  1052.  1059.  1046.  1072.   10.0
10. cr1.la2ca.ip.att.net                  27.3%  1043.  1060.  1043.  1071.   10.7
11. cr83.la2ca.ip.att.net                 16.7%  1058.  1060.  1045.  1073.    8.8
12. gar7.la2ca.ip.att.net                 16.7%  1059.  1061.  1044.  1087.   13.3
13. 12.249.143.98                         33.3%  1059.  1057.  1048.  1071.    7.8
14. ???

My traceroute [v0.71]
hivemind (0.0.0.0)                              Fri Apr 11 15:36:43 2014
Resolver: Received error response 2. (server failure)
                                        Packets              Pings
    Host                                  Loss%   Last    Avg   Best   Wrst  StDev
 1. 138.72.xxx.xxx                         0.0%    0.2    0.1    0.1    0.2    0.0
 2. pan5060-ae1-401.routerland.pixar.com   0.0%    0.4    0.4    0.3    0.6    0.1
 3. cat-vegas-01-vlan66.pixar.com          0.0%   22.0   21.8   21.7   22.3    0.2
 4. 205.129.21.101                         0.0%   19.4   19.5   19.3   19.9    0.2
 5. ae-2-5.bar1.LasVegas1.Level3.net       0.0%   19.3   21.8   19.3   40.7    5.9
 6. ae-4-4.ebr1.LosAngeles1.Level3.net     0.0%   22.0   22.4   21.9   26.8    1.3
 7. ae-6-6.ebr1.Tustin1.Level3.net         0.0%   20.0   20.2   19.9   21.8    0.5
 8. ae-107-3507.bar2.Tustin1.Level3.net    0.0%   22.0   22.0   21.9   22.1    0.0
 9. 192.205.37.145                        30.8%  1052.  1063.  1048.  1072.    8.1
10. cr1.la2ca.ip.att.net                  35.7%  1050.  1060.  1050.  1070.    7.3
11. cr83.la2ca.ip.att.net                 28.6%  1049.  1064.  1049.  1072.    7.8
12. gar7.la2ca.ip.att.net                 21.4%  1048.  1061.  1048.  1072.    6.7
13. 12.249.143.98                         28.6%  1050.  1061.  1050.  1072.    7.9
14. ???

Just wanted to share in case anyone else is running into similar issues. I know, I should be on the outages list. I will add myself now. :)

Regards,
Dave Sotnick
This should provide some background:
http://apps.fcc.gov/ecfs/document/view?id=7022026095

Drive Slow,
Paul

On Fri, Apr 11, 2014 at 6:50 PM, David Sotnick <sotnickd-nanog@ddv.com> wrote:
Hi Nanog,
I have a ticket open with Level 3, with whom I have 1gig pipes in Oakland, CA and Las Vegas, NV.
One of our users noticed very slow file transfer/media delivery from the Bay Area to L.A., and on investigating it appears as though the peering point between Level3 and AT&T in SF was saturated and had 300ms avg. latency.
90 minutes later, after receiving no call from Level3, I escalated to a P1 ticket, as the latency is now > 1000ms and we're seeing 20% packet loss.
I decided to statically route to the destination via our DR cluster in Vegas, and interestingly I found the same situation where AT&T and Level3 peer in Tustin.
mtr traceroutes, for those curious:
Via Oakland:
My traceroute [v0.71]
hivemind (0.0.0.0)                              Fri Apr 11 15:36:08 2014
Keys: Help Display mode Restart statistics Order of fields quit
                                        Packets              Pings
    Host                                  Loss%   Last    Avg   Best   Wrst  StDev
 1. 138.72.xxx.xxx                         0.0%    0.3    0.2    0.1    0.3    0.1
 2. pan5060-ae1-401.routerland.pixar.com   0.0%    0.4    0.4    0.4    0.5    0.0
 3. verge-vlan66.pixar.com                 0.0%    0.6    0.7    0.6    0.9    0.1
 4. ge-6-24.car1.Oakland1.Level3.net       0.0%    0.7  105.5    0.7  307.3  110.9
 5. ae-5-5.ebr2.SanJose1.Level3.net        0.0%    1.7    1.7    1.6    2.8    0.4
 6. ae-92-92.csw4.SanJose1.Level3.net      0.0%    1.6    1.7    1.6    3.0    0.4
 7. ae-4-90.edge2.SanJose1.Level3.net      0.0%    1.6    4.6    1.6   37.1   10.2
 8. 192.205.32.209                        41.7%  1042.  1048.  1038.  1059.    9.1
 9. cr1.sffca.ip.att.net                  25.0%  1052.  1059.  1046.  1072.   10.0
10. cr1.la2ca.ip.att.net                  27.3%  1043.  1060.  1043.  1071.   10.7
11. cr83.la2ca.ip.att.net                 16.7%  1058.  1060.  1045.  1073.    8.8
12. gar7.la2ca.ip.att.net                 16.7%  1059.  1061.  1044.  1087.   13.3
13. 12.249.143.98                         33.3%  1059.  1057.  1048.  1071.    7.8
14. ???
My traceroute [v0.71]
hivemind (0.0.0.0)                              Fri Apr 11 15:36:43 2014
Resolver: Received error response 2. (server failure)
                                        Packets              Pings
    Host                                  Loss%   Last    Avg   Best   Wrst  StDev
 1. 138.72.xxx.xxx                         0.0%    0.2    0.1    0.1    0.2    0.0
 2. pan5060-ae1-401.routerland.pixar.com   0.0%    0.4    0.4    0.3    0.6    0.1
 3. cat-vegas-01-vlan66.pixar.com          0.0%   22.0   21.8   21.7   22.3    0.2
 4. 205.129.21.101                         0.0%   19.4   19.5   19.3   19.9    0.2
 5. ae-2-5.bar1.LasVegas1.Level3.net       0.0%   19.3   21.8   19.3   40.7    5.9
 6. ae-4-4.ebr1.LosAngeles1.Level3.net     0.0%   22.0   22.4   21.9   26.8    1.3
 7. ae-6-6.ebr1.Tustin1.Level3.net         0.0%   20.0   20.2   19.9   21.8    0.5
 8. ae-107-3507.bar2.Tustin1.Level3.net    0.0%   22.0   22.0   21.9   22.1    0.0
 9. 192.205.37.145                        30.8%  1052.  1063.  1048.  1072.    8.1
10. cr1.la2ca.ip.att.net                  35.7%  1050.  1060.  1050.  1070.    7.3
11. cr83.la2ca.ip.att.net                 28.6%  1049.  1064.  1049.  1072.    7.8
12. gar7.la2ca.ip.att.net                 21.4%  1048.  1061.  1048.  1072.    6.7
13. 12.249.143.98                         28.6%  1050.  1061.  1050.  1072.    7.9
14. ???
Just wanted to share in case anyone else is running into similar issues. I know, I should be on the outages list. I will add myself now. :)
Regards, Dave Sotnick
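
For anyone who wants to pick the congested hop out of mtr output like the traces above without eyeballing it, here is a minimal Python sketch. It is not part of the original post. The function names (parse_hops, first_sustained_jump) and the 100 ms threshold are assumptions chosen for illustration, and the parser assumes the column order shown above (Loss%, Last, Avg, Best, Wrst, StDev). It looks for the first hop where average latency rises and stays elevated for every later hop, which is the usual sign of congestion on the path itself rather than a router that is merely slow to answer probes.

```python
import re
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    index: int
    host: str
    loss_pct: float
    avg_ms: float


# One mtr row: "<n>. <host> <loss>% <last> <avg> <best> <wrst> <stdev>"
HOP_RE = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+([\d.]+)%"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_hops(report: str) -> List[Hop]:
    """Parse mtr rows; rows that do not match (headers, '???') are skipped."""
    hops = []
    for line in report.splitlines():
        m = HOP_RE.match(line)
        if m:
            # float("1042.") is valid, so mtr's truncated values parse fine.
            hops.append(
                Hop(int(m.group(1)), m.group(2), float(m.group(3)), float(m.group(5)))
            )
    return hops


def first_sustained_jump(hops: List[Hop], threshold_ms: float = 100.0) -> Optional[Hop]:
    """Return the first hop after which every hop's average latency is at
    least threshold_ms above the preceding hop's average (hypothetical
    heuristic; the threshold is an arbitrary illustrative value)."""
    for i in range(1, len(hops)):
        baseline = hops[i - 1].avg_ms
        if all(h.avg_ms - baseline >= threshold_ms for h in hops[i:]):
            return hops[i]
    return None


# Rows copied from the "Via Oakland" trace in the post.
OAKLAND_TRACE = """\
 1. 138.72.xxx.xxx 0.0% 0.3 0.2 0.1 0.3 0.1
 2. pan5060-ae1-401.routerland.pixar.com 0.0% 0.4 0.4 0.4 0.5 0.0
 3. verge-vlan66.pixar.com 0.0% 0.6 0.7 0.6 0.9 0.1
 4. ge-6-24.car1.Oakland1.Level3.net 0.0% 0.7 105.5 0.7 307.3 110.9
 5. ae-5-5.ebr2.SanJose1.Level3.net 0.0% 1.7 1.7 1.6 2.8 0.4
 6. ae-92-92.csw4.SanJose1.Level3.net 0.0% 1.6 1.7 1.6 3.0 0.4
 7. ae-4-90.edge2.SanJose1.Level3.net 0.0% 1.6 4.6 1.6 37.1 10.2
 8. 192.205.32.209 41.7% 1042. 1048. 1038. 1059. 9.1
 9. cr1.sffca.ip.att.net 25.0% 1052. 1059. 1046. 1072. 10.0
10. cr1.la2ca.ip.att.net 27.3% 1043. 1060. 1043. 1071. 10.7
11. cr83.la2ca.ip.att.net 16.7% 1058. 1060. 1045. 1073. 8.8
12. gar7.la2ca.ip.att.net 16.7% 1059. 1061. 1044. 1087. 13.3
13. 12.249.143.98 33.3% 1059. 1057. 1048. 1071. 7.8
14. ???
"""

if __name__ == "__main__":
    jump = first_sustained_jump(parse_hops(OAKLAND_TRACE))
    if jump is not None:
        print(jump.index, jump.host)  # 8 192.205.32.209
```

Requiring the increase to persist is what keeps hop 4 from being flagged: its average is 105.5 ms, but hop 5 is back at 1.7 ms, so that spike most likely reflects the Oakland router deprioritizing probe replies. The sustained jump starts at hop 8, the first hop past the Level3 edge in San Jose.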
Participants (2): David Sotnick, Paul WALL