Cogent --> Google Public DNS routing issue

Robert Glover

17 Aug 2011

4:09 a.m.

Hello,

We have noticed that from our Cogent link (as well as from ALL U.S. based points we tested via the Cogent Looking Glass: http://www.cogentco.com/en/network/looking-glass), traceroutes to 8.8.8.8 and 8.8.5.5 all seem to go over to Europe:

TRACE from Los Angeles to 8.8.4.4
 1  gi2-6.99.mpd03.lax01.atlas.cogentco.com (66.28.3.81)  0.514 ms  0.332 ms
 2  te0-0-0-6.ccr22.lax01.atlas.cogentco.com (154.54.28.145)  0.401 ms  0.518 ms
 3  te0-3-0-6.ccr22.iah01.atlas.cogentco.com (154.54.3.185)  36.538 ms
    te0-2-0-5.ccr22.iah01.atlas.cogentco.com (154.54.27.18)  36.449 ms
 4  te0-0-0-2.ccr22.atl01.atlas.cogentco.com (154.54.5.93)  63.200 ms
    te0-2-0-7.ccr22.atl01.atlas.cogentco.com (66.28.4.89)  62.909 ms
 5  te0-1-0-2.ccr22.dca01.atlas.cogentco.com (154.54.28.230)  62.985 ms
    te0-0-0-7.ccr22.dca01.atlas.cogentco.com (154.54.28.221)  62.894 ms
 6  te0-5-0-6.ccr22.jfk02.atlas.cogentco.com (154.54.42.30)  69.434 ms
    te0-5-0-2.ccr22.jfk02.atlas.cogentco.com (154.54.42.26)  69.229 ms
 7  te0-3-0-2.ccr22.lon13.atlas.cogentco.com (154.54.30.22)  144.224 ms
    te0-2-0-2.ccr22.lon13.atlas.cogentco.com (154.54.30.133)  146.474 ms
 8  te8-1.mpd02.lon01.atlas.cogentco.com (154.54.57.182)  143.890 ms
    te7-1.mpd02.lon01.atlas.cogentco.com (154.54.57.178)  146.320 ms
 9  ldn-b4-link.telia.net (213.248.70.237)  145.967 ms  145.872 ms
10  80.91.247.93 (80.91.247.93)  155.170 ms
    ldn-bb1-link.telia.net (213.155.130.46)  145.971 ms
11  ldn-b3-link.telia.net (213.155.133.33)  143.909 ms
    80.91.249.170 (80.91.249.170)  146.213 ms
12  google-ic-126258-ldn-b3.c.telia.net (213.248.67.66)  146.158 ms  146.291 ms
13  64.233.175.25 (64.233.175.25)  146.166 ms
    64.233.175.27 (64.233.175.27)  144.256 ms
14  209.85.253.92 (209.85.253.92)  146.435 ms  146.401 ms
15  72.14.232.134 (72.14.232.134)  150.128 ms
    66.249.95.173 (66.249.95.173)  152.591 ms
16  209.85.252.83 (209.85.252.83)  149.993 ms
    209.85.251.231 (209.85.251.231)  152.270 ms
17  209.85.243.85 (209.85.243.85)  152.514 ms  152.332 ms
18  google-public-dns-b.google.com (8.8.4.4)  152.436 ms  152.299 ms

The routes from our backup carrier stay state-side and result in ~30ms ping times. Through Cogent, the ping times are around ~164ms

What is going on here? Can someone from Cogent chime in on this?

-Robert
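For readers who want to locate the detour without eyeballing every hop, here is a minimal Python sketch. It takes the first probe of each hop from the trace above and reports where the round-trip time steps up the most. The location-code regex is an assumption about how Cogent names its routers (a three-letter code plus digits before `.atlas.cogentco.com`), and the helper names `site` and `largest_jump` are made up for this illustration.

```python
import re

# (hop, first responding host, first RTT in ms), copied from the trace above.
# Hops 10-17 are omitted for brevity; they all sit in the 143-156 ms range.
HOPS = [
    (1, "gi2-6.99.mpd03.lax01.atlas.cogentco.com", 0.514),
    (2, "te0-0-0-6.ccr22.lax01.atlas.cogentco.com", 0.401),
    (3, "te0-3-0-6.ccr22.iah01.atlas.cogentco.com", 36.538),
    (4, "te0-0-0-2.ccr22.atl01.atlas.cogentco.com", 63.200),
    (5, "te0-1-0-2.ccr22.dca01.atlas.cogentco.com", 62.985),
    (6, "te0-5-0-6.ccr22.jfk02.atlas.cogentco.com", 69.434),
    (7, "te0-3-0-2.ccr22.lon13.atlas.cogentco.com", 144.224),
    (8, "te8-1.mpd02.lon01.atlas.cogentco.com", 143.890),
    (9, "ldn-b4-link.telia.net", 145.967),
    (18, "google-public-dns-b.google.com", 152.436),
]

# Assumption: the label just before ".atlas.cogentco.com" is a three-letter
# location code followed by digits (lax01, jfk02, lon13, ...).
SITE_RE = re.compile(r"\.([a-z]{3})\d+\.atlas\.cogentco\.com$")


def site(host):
    """Return the location code embedded in a hostname, or None."""
    match = SITE_RE.search(host)
    return match.group(1) if match else None


def largest_jump(hops):
    """Return (previous_hop, next_hop, rtt_increase_ms) for the biggest RTT step."""
    pairs = zip(hops, hops[1:])
    prev, cur = max(pairs, key=lambda p: p[1][2] - p[0][2])
    return prev, cur, cur[2] - prev[2]


if __name__ == "__main__":
    prev, cur, delta = largest_jump(HOPS)
    print(f"hop {prev[0]} ({site(prev[1])}) -> hop {cur[0]} ({site(cur[1])}): "
          f"+{delta:.1f} ms")
```

On this data the largest step falls between the jfk02 and lon13 routers, which is where the path appears to leave the US. A single probe per hop is noisy, so treat the result as a pointer rather than a measurement.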


Dave Pooser

17 Aug

4:19 a.m.

On 8/16/11 11:09 PM, "Robert Glover" <robertg@garlic.com> wrote:

...

What is going on here?

Cogent finally depeered the entire US? :^)

--
Dave Pooser
Manager of Information Services
Alford Media
http://www.alfordmedia.com

Christopher Morrow

5:07 a.m.

On Wed, Aug 17, 2011 at 12:09 AM, Robert Glover <robertg@garlic.com> wrote:

...

Hello,

We have noticed that from our Cogent link (as well as from ALL U.S. based points we tested via the Cogent Looking Glass: http://www.cogentco.com/en/network/looking-glass), traceroutes to 8.8.8.8 and 8.8.5.5 all seem to go over to Europe:

8.8.5.5 ain't the droids you are looking for...

Patrick W. Gilmore

1:13 p.m.

On Aug 17, 2011, at 1:07 AM, Christopher Morrow wrote:

...

On Wed, Aug 17, 2011 at 12:09 AM, Robert Glover <robertg@garlic.com> wrote:

...
Hello,

We have noticed that from our Cogent link (as well as from ALL U.S. based points we tested via the Cogent Looking Glass: http://www.cogentco.com/en/network/looking-glass), traceroutes to 8.8.8.8 and 8.8.5.5 all seem to go over to Europe:

8.8.5.5 ain't the droids you are looking for...

In the traceroute appended to the original post, he did trace to 8.8.4.4.

While it did go all over, I don't see the problem - it got to the destination host.

Anycast is OK for some things, but it depends on BGP. BGP has zero concept of latency, loss, or geography. Expecting anycast to guarantee an optimal path or location is a grave error.

The possible reasons for this are nearly innumerable. Perhaps Cogent <> Google is congested in the US so one or the other prefers EU? Perhaps there is some IGP metric messed up inside Cogent that prefers the EU? Perhaps more nefarious problems, such as Google de-peering Cogent in the US? Etc., etc.

You may be able to find out if you look, and you may not (I didn't even try). But even if you do figure out the answer, you can't fix it. Only Cogent and/or Google can.

Moreover, you can see things like this with anycast even when there is no problem!

--
TTFN,
patrick
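The point that BGP has no notion of latency can be made concrete with a small Python sketch. It implements only a simplified subset of the BGP decision process (local preference, then AS-path length, then MED) and is not a full implementation; real routers apply more tie-breakers and compare MED only under certain conditions. The `Route` class, the route names, the local-preference values, and the AS numbers (taken from the range reserved for documentation) are all hypothetical and say nothing about how Cogent or Google actually configure their networks.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Route:
    name: str                 # label used only in this example
    local_pref: int           # set by local policy; higher wins
    as_path: Tuple[int, ...]  # shorter wins when local_pref ties
    med: int                  # lower wins when the steps above tie
    rtt_ms: float             # measured latency; best_path() never reads it


def best_path(routes):
    """Pick a route using a simplified subset of the BGP decision process."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))


if __name__ == "__main__":
    candidates = [
        # Hypothetical: the same anycast prefix learned at two locations.
        Route("via-europe", local_pref=100, as_path=(64500, 64510), med=0, rtt_ms=150.0),
        Route("via-us", local_pref=90, as_path=(64510,), med=0, rtt_ms=30.0),
    ]
    winner = best_path(candidates)
    print(f"selected {winner.name} at {winner.rtt_ms} ms")
```

With these made-up values the higher local preference wins, so the 150 ms route is selected even though a 30 ms route with a shorter AS path is available. Nothing in the comparison key refers to `rtt_ms`.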

David Miller

4:01 p.m.

On 8/17/2011 9:13 AM, Patrick W. Gilmore wrote:

...

On Aug 17, 2011, at 1:07 AM, Christopher Morrow wrote:

...
On Wed, Aug 17, 2011 at 12:09 AM, Robert Glover <robertg@garlic.com> wrote:

...
Hello,

We have noticed that from our Cogent link (as well as from ALL U.S. based points we tested via the Cogent Looking Glass: http://www.cogentco.com/en/network/looking-glass), traceroutes to 8.8.8.8 and 8.8.5.5 all seem to go over to Europe:

8.8.5.5 ain't the droids you are looking for...

In the traceroute appended to the original post, he did trace to 8.8.4.4.

While it did go all over, I don't see the problem - it got to the destination host.

Anycast is OK for some things, but it depends on BGP. BGP has zero concept of latency, loss, or geography. Expecting anycast to guarantee an optimal path or location is a grave error.

There are two basic types of anycast:

1. Simple anycast - announce an anycast prefix to whoever/wherever in more than one location.
2. Global anycast + careful configuration - announce an anycast prefix to particular providers at specific geographically disparate locations and using other options to achieve geographic and/or performant inbound traffic distribution.

Perhaps we need a new term for 2.

Google is clearly attempting to implement 2 and not 1 for their resolving DNS service. Based on Google's claims of speed (and my testing of their response times), they have either found a way to exceed the speed of light with packets or they are managing to keep most of their traffic "local ish" to the requester.

To say that anycast "relies on BGP", and that expecting an optimal path is therefore an error, is disingenuous (I want a better word, but this one will do). The internet as a whole "relies on BGP" and yet we expect mostly optimal paths.

While it is true that BGP has no capacity to account for latency or loss, IGPs which can take into account these factors end at the borders of networks (where prefixes are passed using BGP). This is what makes up the "inter net".

If you were tracing from a host in Ashburn to a unicast host in NYC and your path passed through San Jose, then you would say that was an issue. The same would be true with an anycast destination address.

As to geography, IGPs don't have a concept of geography either. A router in NYC doesn't know or care that the router at the other end of a link is in CHI. All it knows is the prefixes that it gets from that router and metrics to choose a best path for them. BGP combined with "proper" (i.e. distributed) peering of networks does provide performant paths for traffic. In an anycast configuration the "careful configuration" is selecting providers to announce anycast prefixes to and communities that you put on the prefixes to control redistribution. Global anycast + careful configuration can and does provide mostly performant paths and a very high level of geographic fidelity - though, granted, not "guaranteed" (at least not guaranteed at a higher level than unicast prefixes).

You can't "guarantee" performant paths ever (regardless of anycast or unicast) if any path between the source and destination crosses the border between two networks because some networks will choose a "primary" upstream (single homed or heavily pref'ed) that only picks up a prefix in a particular area and sends all of the traffic there. The originator of the prefix can depref that provider to try to influence path selection, but some networks will doggedly prefer to send packets to that network despite the efforts of the originator. The only thing to do then is to ask why this network selected that particular upstream and then to explain to them why that might not have been the best choice, if they want performant paths...
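The speed-of-light remark can be turned into a rough bound. The Python sketch below converts a round-trip time into the maximum one-way distance the responding server could be from the client. It assumes light in fiber travels at about two thirds of its vacuum speed and ignores queuing, serialization, and indirect cable routes, so real servers are closer than the bound suggests. The function name and the fiber factor are choices made for this illustration; the RTT values come from the original post.

```python
C_KM_PER_MS = 299.792458   # speed of light in vacuum, km per millisecond
FIBER_FACTOR = 2.0 / 3.0   # assumption: light in fiber travels at ~2/3 c


def max_one_way_km(rtt_ms, factor=FIBER_FACTOR):
    """Upper bound on the one-way distance implied by a round-trip time."""
    return (rtt_ms / 2.0) * C_KM_PER_MS * factor


if __name__ == "__main__":
    # ~30 ms via the backup carrier, ~152 ms via Cogent (final hop of the trace)
    for rtt in (30.0, 152.0):
        print(f"{rtt:6.1f} ms RTT -> server within {max_one_way_km(rtt):8.0f} km")
```

Under these assumptions a 30 ms round trip puts the server within roughly 3,000 km, which is consistent with an answer from inside North America, while a 152 ms round trip allows for a server about five times farther away.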

...

The possible reasons for this are nearly innumerable. Perhaps Cogent <> Google is congested in the US so one or the other prefers EU? Perhaps there is some IGP metric messed up inside Cogent that prefers the EU? Perhaps more nefarious problems, such as Google de-peering Cogent in the US? Etc., etc.

You may be able to find out if you look, and you may not (I didn't even try). But even if you do figure out the answer, you can't fix it. Only Cogent and/or Google can.

My traces show all the Cogent locations in the US that I traced from going to Telia in EU and then to Google. My traces from Telia locations in the US all (properly) reach Google destinations in the US. So, Cogent is only receiving/using/preferring these two prefixes from their peering(s) with Telia in EU. As to the root cause of that... only the players in that game can say.

...

Moreover, you can see things like this with anycast even when there is no problem!

The OP believes that it is a problem. You *can* see this with anycast, but I would say that this *is* a "problem" (for my definition of "problem" which admittedly may be different from others). There are many potential solutions to the problem, the most obvious of which is for the OP to stop preferring to send traffic to these prefixes over Cogent.

To the OP: I have to wonder what factors were used to decide "primary" vs "backup" provider. If "price", then you should expect issues with less performant routing. If "quality", then what measures were used to determine a "quality" ranking? I am also curious as to who the "backup" is (but that is just morbid curiosity).

-DMM

Dennis Burgess

4:27 p.m.

...

-----Original Message-----
From: David Miller [mailto:dmiller@tiggee.com]
Sent: Wednesday, August 17, 2011 11:02 AM
To: nanog@nanog.org
Subject: Re: Cogent --> Google Public DNS routing issue

...

The .129 is our peer to cogent, it just drops the traffic now..

Tracing route to google-public-dns-a.google.com [8.8.8.8]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  172.25.0.1
  2     1 ms     1 ms     1 ms  10.250.0.129
  3  10.250.0.129  reports: Destination host unreachable.

Trace complete.

-----------------------------------------------------------
Dennis Burgess, Mikrotik Certified Trainer
Link Technologies, Inc -- Mikrotik & WISP Support Services
Office: 314-735-0270
Website: http://www.linktechs.net
LIVE On-Line Mikrotik Training - Author of "Learn RouterOS"

Justin Wilson

22 Aug

12:03 a.m.

Cogent and Google have had several issues over the past month. I have opened 3 tickets up with Cogent over inability to reach Google (web, apps, dns, etc). Each time they have been known issues by the time I called.

Justin

--
Justin Wilson <j2sw@mtin.net>
Aol & Yahoo IM: j2sw
http://www.mtin.net/blog xISP News
http://www.twitter.com/j2sw Follow me on Twitter

On 8/17/11 12:09 AM, "Robert Glover" <robertg@garlic.com> wrote:

...

