reachability problems Europe->US?
Hi, are there any known problems with reachability from Europe to the US? We have customer complaints that they can't reach US-based sites like Microsoft and others. It seems to depend only on the source prefix, but several ISPs in Europe are affected. Or is the same problem visible in the States? Regards, Thomas
Hi, On 07.10.2010 14:35, Heath Jones wrote:
Seems to be only source-prefix-based, but several ISPs in europe are affected. Can you post source and destination IP's ?
source: 131.220.0.0/16, 212.201.68.0/22, 212.201.72.0/21
destination: 65.122.178.73, 63.228.223.104

traceroute to 65.122.178.73 (65.122.178.73), 30 hops max, 40 byte packets
 1  er-rz-gig-3-3.stw-bonn.de (131.220.99.62)  1.792 ms  1.275 ms  1.125 ms
 2  xr-bon1-te2-3.x-win.dfn.de (188.1.233.193)  0.705 ms  2.132 ms  0.755 ms
 3  xr-bir1-te2-3.x-win.dfn.de (188.1.144.9)  1.477 ms  1.936 ms  1.051 ms
 4  zr-fra1-te0-7-0-5.x-win.dfn.de (188.1.145.46)  4.034 ms  3.734 ms  4.957 ms
 5  64.213.78.237 (64.213.78.237)  3.866 ms  3.295 ms  26.854 ms
 6  jfk-brdr-04.inet.qwest.net (63.146.26.225)  119.511 ms  92.735 ms  99.019 ms
 7  * * *

or quote from DE-CIX tech-list:

[www.microsoft.com]
-----------
We also have some connectivity problems to ms, changing the bgp routing to another tier 1 carrier don't resolve the problem
-----------

Cheers,
Thomas
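When comparing many traces like the one above, it can help to extract the last hop that answered. The following is a minimal sketch, assuming the usual Linux `traceroute` text layout; the names `TRACE`, `HOP_RE` and `last_responding_hop` are made up for this illustration. Note that a silent hop 7 does not by itself prove the fault sits at or after hop 6, because replies may return over a different path.

```python
import re

# The hop lines from the trace above, copied verbatim.
TRACE = """\
 1  er-rz-gig-3-3.stw-bonn.de (131.220.99.62)  1.792 ms  1.275 ms  1.125 ms
 2  xr-bon1-te2-3.x-win.dfn.de (188.1.233.193)  0.705 ms  2.132 ms  0.755 ms
 3  xr-bir1-te2-3.x-win.dfn.de (188.1.144.9)  1.477 ms  1.936 ms  1.051 ms
 4  zr-fra1-te0-7-0-5.x-win.dfn.de (188.1.145.46)  4.034 ms  3.734 ms  4.957 ms
 5  64.213.78.237 (64.213.78.237)  3.866 ms  3.295 ms  26.854 ms
 6  jfk-brdr-04.inet.qwest.net (63.146.26.225)  119.511 ms  92.735 ms  99.019 ms
 7  * * *
"""

# Matches "<hop number> <name> (<IPv4 address>)"; lines such as "7  * * *"
# contain no parenthesised address and therefore do not match.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)")


def last_responding_hop(trace_text):
    """Return (hop number, name, address) of the last hop that replied, or None."""
    last = None
    for line in trace_text.splitlines():
        match = HOP_RE.match(line)
        if match:
            last = (int(match.group(1)), match.group(2), match.group(3))
    return last


print(last_responding_hop(TRACE))
```

Running the same function over traces taken from affected and unaffected source prefixes would show whether they all stop at the same point.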
An update: we shut down GBLX and routing now goes via Telia. This seems to have helped. It looks like there is an issue in the path GBLX - Qwest - ? Thomas
Based on all that, it looks like Qwest is not propagating your routes within their network. I was going to recommend route-views, but it might not reflect that now if you have dropped GBLX. Historical routing updates will show, though, whether Qwest was advertising reachability to you (which would be a good indicator of whether they were filtering at their edge).
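The check suggested here amounts to scanning the AS paths that collectors recorded for the affected prefixes and looking for Qwest in them. A rough sketch follows, assuming the paths have already been extracted from a route-views dump into lists of AS numbers. The sample paths are INVENTED for illustration and are not real route-views data; the function names are hypothetical as well.

```python
# ASNs used below: 680 = DFN, 3549 = Global Crossing, 209 = Qwest,
# 1299 = TeliaSonera. 3356 and 7018 merely stand in for arbitrary
# collector peers.
QWEST = 209
GBLX = 3549

# INVENTED sample input, NOT real route-views output. Each path reads
# from the collector's peer (left) to the origin AS (right).
SAMPLE_PATHS = [
    [209, 3549, 680],
    [3356, 1299, 680],
    [7018, 3549, 680],
]


def paths_via(paths, asn):
    """Return the paths in which the given AS appears anywhere."""
    return [path for path in paths if asn in path]


def has_adjacency(path, left, right):
    """True if `left` is immediately followed by `right` in the path."""
    return any(a == left and b == right for a, b in zip(path, path[1:]))


via_qwest = paths_via(SAMPLE_PATHS, QWEST)
print(len(via_qwest), "of", len(SAMPLE_PATHS), "sample paths include AS209")
print([p for p in SAMPLE_PATHS if has_adjacency(p, QWEST, GBLX)])
```

If historical updates for a prefix never show AS209 in any path, that would point towards Qwest not having carried the route at all; if AS209 does appear, the prefix was at least being advertised and the fault is more likely in forwarding.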
Global Crossing is having major issues (since yesterday, actually) in Seattle. Every path I see to dfn.de is via GBLX, and Microsoft hosts most of those sites out of the Seattle area, so they may be seeing the same issue.

Based on what we can see, GBLX has a broken port-channel or something similar here, as random traffic (into) their network via our transit link gets black-holed. We could not even reach Global Crossing's own name servers for a while. We gave up and turned down BGP yesterday until we hear from them. Based on graphs at the time things broke, they appeared to be black-holing roughly 1/4 of what we were sending them.

Thanks,
John van Oppen
Spectrum Networks / AS 11404
It looked like a broken aggregated Ethernet bundle or something similar... Most annoying was that the issue moved around a bit: over about five hours, all the broken test IPs we had started working again, and then other destinations started failing. All was well when we turned down GBLX. As of now, though, we are seeing the issue as fixed and have turned up GBLX again.

Thanks,
John

-----Original Message-----
From: Heath Jones [mailto:hj1980@gmail.com]
Sent: Thursday, October 07, 2010 9:22 AM
To: John van Oppen
Cc: Thomas Schmid; nanog@nanog.org
Subject: Re: reachability problems Europe->US?
... random traffic (into) their network via our transit link gets black-holed. So for the same source & destination, sometimes it works, sometimes it doesn't?
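One way to picture the "broken bundle" guess: link bundles usually place each flow on a member link by hashing header fields, so a single silently dead member drops a fixed subset of flows rather than random packets. The sketch below is a toy model under stated assumptions, not Global Crossing's actual hashing: four members, one of them dead, SHA-256 over the address pair as a stand-in for whatever hash real hardware uses, and randomly generated placeholder addresses.

```python
import hashlib
import random

MEMBERS = 4   # ASSUMED number of links in the bundle
BROKEN = {2}  # ASSUMED index of the member that silently drops traffic


def member_for(src, dst):
    """Pick a bundle member from a hash of the flow's address pair."""
    digest = hashlib.sha256(f"{src}->{dst}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % MEMBERS


def is_blackholed(src, dst):
    """A flow is lost if its hash lands on the broken member."""
    return member_for(src, dst) in BROKEN


# Placeholder flows; the addresses are arbitrary test values.
rng = random.Random(1)
flows = [
    (f"10.0.{rng.randrange(256)}.{rng.randrange(256)}",
     f"192.0.2.{rng.randrange(256)}")
    for _ in range(10000)
]

lost = sum(is_blackholed(src, dst) for src, dst in flows)
print(f"{lost / len(flows):.1%} of flows black-holed")

# The same address pair always hashes to the same member.
print(is_blackholed(*flows[0]) == is_blackholed(*flows[0]))
```

In this model roughly one flow in four is lost, and a given source and destination pair either always works or always fails until the bundle membership or hash input changes. Such a change would shift which pairs fail, which would be consistent with the failing destinations moving around over time.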
On 07.10.2010 18:46, John van Oppen wrote:
It looked like a broken aggregated Ethernet bundle or something similar... Most annoying was that the issue moved around a bit, over about five hours all the broken test IPs we had started working again and then other destinations started failing. All was well when we turned down gblx. As of now though we are seeing the issue as fixed and turned up GBLX again.
Yes, I can confirm that the situation is back to normal now after we re-enabled the GBLX session. I heard from others that it was again a broken LSP problem in GBLX (unconfirmed :) ) Cheers, Thomas
On Thu, Oct 07, 2010 at 07:12:33PM +0200, Thomas Schmid wrote:
yes, I can confirm that situation is back to normal now after we re-enabled the GBLX session. I heard from others that it was again a broken LSP problem in GBLX (unconfirmed :) )
Global Crossing recently started deploying Foundry/Brocade XMR's in their MPLS core, as a lower cost alternative to their old T640/OC192 MPLS core model. Unfortunately these boxes are buggy as all hell, and seem to blackhole LSPs somewhere in their network on at least a weekly basis. I think we've seen at least a dozen issues similar to this over the last couple months, though most of them were out of LA, so I didn't know they had actually done a Seattle deployment.

Honestly GX deserves what they get on this one. I'm not aware of any other large network who has ever done a serious MPLS deployment using these boxes (and if you're thinking of replying to this and saying "hey we do some vll's between 2 routers and it seems to work", stop and think about what I might mean when I say a SERIOUS mpls deployment first :P), so this was pretty much to be expected.

I'll also say that I'm remarkably underwhelmed by their response to this issue, and suggest that anyone who doesn't want their packets blackholed by the Floundrys be prepared to vote with their wallet.

--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
From the symptoms the OP was seeing, it seemed that Qwest was the issue. Has GBLX reported to you that they are having a fault? If not, perhaps try tagging your exported routes to GBLX with 8010 as per this: http://onesc.net/communities/as3549/
I know for certain it was GBLX; their NOC confirmed it. We saw this to multiple destinations, all with outbound traffic towards GBLX (not just DFN). We are on the same GBLX POP that the sites they are talking about are connected to (Westin), and almost every path I see back to DFN (from seven upstreams in Seattle) was via GBLX, not Qwest; the only exceptions were Level3's and Savvis' routes, which are via AS1299. I think the asymmetric routing was obfuscating the problem a bit for the guys attached to DFN.

John
participants (4)
- Heath Jones
- John van Oppen
- Richard A Steenbergen
- Thomas Schmid