Rapidly-variable routing on the time scale of seconds to minutes?
We ran an end-to-end "traceroute" routing measurement in 2004 and found that about 5-10% of measurements exhibited rapidly variable routing on the time scale of a single traceroute (seconds to minutes). In other words, the packets belonging to a single traceroute took multiple paths. In 1997 Vern Paxson described one mechanism that can lead to this "route fluttering" behavior, "route splitting", which is explicitly allowed by RFC 1812 (Requirements for IP Version 4 Routers). Route changes on such a short time scale for packets in the same flow could be troublesome, yet the behavior does not seem to have become less common over the years, at least in our measurements. Does anyone know how to explain this behavior? Thanks!

An example traceroute record containing the fluttering is shown below (see the 5th hop):

Fri Apr 09 09:35:35 2004
 1 cisfhfb.fh-friedberg.de (212.201.24.1) 1.095 ms 0.402 ms 0.321 ms
 2 ar-frankfurt2.g-win.dfn.de (188.1.42.9) 120.105 ms 198.766 ms 200.040 ms
 3 cr-frankfurt1-ge5-0.g-win.dfn.de (188.1.80.1) 2.093 ms 2.142 ms 2.087 ms
 4 so-6-0-0.ar2.FRA2.gblx.net (208.48.23.141) 2.461 ms 2.349 ms 2.333 ms
 5 pos5-0-2488M.cr2.FRA2.gblx.net (67.17.65.53) 2.448 ms pos6-0-2488M.cr1.FRA2.gblx.net (67.17.65.77) 2.368 ms 2.281 ms
 6 so3-0-0-2488M.ar2.FRA3.gblx.net (67.17.65.82) 2.676 ms 2.750 ms so2-0-0-2488M.ar2.FRA3.gblx.net (67.17.65.58) 2.569 ms
 7 ge-7-2.Frankfurt1.Level3.net (195.122.136.245) 10.971 ms 10.967 ms 10.882 ms
 8 ae-0-55.mp1.Frankfurt1.Level3.net (195.122.136.97) 11.488 ms 11.417 ms 11.353 ms
 9 so-0-0-0.mp1.London2.Level3.net (212.187.128.61) 27.203 ms 27.042 ms 27.048 ms
10 so-1-0-0.bbr1.Washington1.Level3.net (212.187.128.138) 91.004 ms 91.006 ms 90.977 ms
11 ge-0-0-0.mpls1.Honolulu2.Level3.net (4.68.128.13) 212.254 ms 212.321 ms 212.351 ms
12 so-7-0.hsa1.Honolulu2.Level3.net (4.68.112.90) 212.407 ms 212.250 ms 212.365 ms
13 s1.lavanet.bbnplanet.net (4.24.134.18) 212.609 ms 212.372 ms 213.270 ms
14 malasada.lava.net (64.65.64.17) 212.260 ms 212.460 ms 212.226 ms

Best regards,
Charles
On Mon, Jan 31, 2005 at 04:20:31AM -0500, Charles Shen wrote:
We did a "traceroute" end-to-end routing measurement in 2004 and found about 5-10% of measurements exhibiting rapidly-variable routing on the time scale of a single traceroute (seconds to minutes). In other words, the packets belonging to a single traceroute took multiple paths. [...] Route change in such a short scale for packets in the same flow could be troublesome. But the occurrence of such behavior does not seem to have reduced over the past years at least from our measurements. Does anyone know how to explain this behavior? Thanks!
Yes, this is normal per-flow load balancing on parallel backbone lines. Usually, "flows" are defined via a hash on L3 and possibly L4 addressing information (IP source/dest, and TCP/UDP port source/dest, ICMP code, etc.). If the "flow hash" contains L4 information, every traceroute probe packet is considered a different "flow" and you see exactly that:
 5 pos5-0-2488M.cr2.FRA2.gblx.net (67.17.65.53) 2.448 ms pos6-0-2488M.cr1.FRA2.gblx.net (67.17.65.77) 2.368 ms 2.281 ms
 6 so3-0-0-2488M.ar2.FRA3.gblx.net (67.17.65.82) 2.676 ms 2.750 ms so2-0-0-2488M.ar2.FRA3.gblx.net (67.17.65.58) 2.569 ms
You would NOT see the same effect with packets of e.g. the same TCP session, so this (multipath forwarding) is usually no problem (for TCP and UDP applications no reordering happens). So your analysis results (traceroute) are misleading for most real-life applications.

I agree that it's irritating, and I personally favor using aggregated SONET/Ethernet devices (IEEE 802.3ad) to bundle parallel lines if possible.

Best regards,
Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
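A minimal sketch of the per-flow hashing described above may make the point concrete. It is not any vendor's actual algorithm: the function name `pick_link`, the use of SHA-256, and the example addresses and ports are all assumptions chosen for illustration. It only shows that a fixed 5-tuple always selects the same link, and that leaving L4 fields out of the key makes the choice independent of ports.

```python
import hashlib

def pick_link(src_ip, dst_ip, proto, src_port, dst_port, n_links, use_l4=True):
    """Pick one of n_links equal-cost links from a hash of the flow key.

    Hypothetical helper: real routers use their own (usually simpler) hashes.
    """
    key = [src_ip, dst_ip, str(proto)]
    if use_l4:
        # Including L4 ports means every distinct port pair is its own "flow".
        key += [str(src_port), str(dst_port)]
    digest = hashlib.sha256("|".join(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links

# One TCP session: every packet carries the same 5-tuple,
# so the selected link never changes and no reordering is introduced.
session = ("192.0.2.10", "198.51.100.20", 6, 40000, 80)  # example values only
links = {pick_link(*session, n_links=2) for _ in range(100)}
print(len(links))  # 1
```

With `use_l4=False` the same function ignores the ports entirely, which corresponds to a hash on L3 information only.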
Daniel Roesen wrote:
You would NOT see the same effect with packets of e.g. the same TCP session, so this (multipath forwarding) is usually no problem (as for TCP and UDP applications there is no reordering happening). So your analysis results (traceroute) are misleading for most real-life applications. I agree that it's irritating and I personally favor using aggregated SONET/Ethernet devices (IEEE 802.3ad) to bundle parallel lines if possible.
Best regards, Daniel
Good point, Daniel. Perhaps the researchers should be using Layer Four traceroute.

John
Charles Shen wrote:
An example traceroute record containing the fluttering is shown below (see the 5th hop)
Fri Apr 09 09:35:35 2004
 1 cisfhfb.fh-friedberg.de (212.201.24.1) 1.095 ms 0.402 ms 0.321 ms
 2 ar-frankfurt2.g-win.dfn.de (188.1.42.9) 120.105 ms 198.766 ms 200.040 ms
 3 cr-frankfurt1-ge5-0.g-win.dfn.de (188.1.80.1) 2.093 ms 2.142 ms 2.087 ms
 4 so-6-0-0.ar2.FRA2.gblx.net (208.48.23.141) 2.461 ms 2.349 ms 2.333 ms
 5 pos5-0-2488M.cr2.FRA2.gblx.net (67.17.65.53) 2.448 ms pos6-0-2488M.cr1.FRA2.gblx.net (67.17.65.77) 2.368 ms 2.281 ms
That sure looks like ECMP to me: Equal-Cost Multi-Path. This is NOT anything new. What's the big deal?
6 so3-0-0-2488M.ar2.FRA3.gblx.net (67.17.65.82) 2.676 ms 2.750 ms so2-0-0-2488M.ar2.FRA3.gblx.net (67.17.65.58) 2.569 ms
Same here. -- John Fraizer
Please see inline.
-----Original Message-----
From: John Fraizer [mailto:nanog@enterzone.net]
Sent: Monday, January 31, 2005 8:21 AM
To: Charles Shen; nanog@merit.edu
Subject: Re: Rapidly-variable routing on the time scale of seconds to minutes?
Charles Shen wrote:
An example traceroute record containing the fluttering is shown below (see the 5th hop)
Fri Apr 09 09:35:35 2004
 1 cisfhfb.fh-friedberg.de (212.201.24.1) 1.095 ms 0.402 ms 0.321 ms
 2 ar-frankfurt2.g-win.dfn.de (188.1.42.9) 120.105 ms 198.766 ms 200.040 ms
 3 cr-frankfurt1-ge5-0.g-win.dfn.de (188.1.80.1) 2.093 ms 2.142 ms 2.087 ms
 4 so-6-0-0.ar2.FRA2.gblx.net (208.48.23.141) 2.461 ms 2.349 ms 2.333 ms
 5 pos5-0-2488M.cr2.FRA2.gblx.net (67.17.65.53) 2.448 ms pos6-0-2488M.cr1.FRA2.gblx.net (67.17.65.77) 2.368 ms 2.281 ms
That sure looks like ECMP to me: Equal-Cost Multi-Path. This is NOT anything new. What's the big deal?
From the responses, the answer to "the rapidly-variable routing on the time scale of seconds to minutes" seems to be:
1. It could be link layer load balancing, with the two interfaces belonging to the same router.
2. It could be per-flow load balancing where flows are defined via both L3 and L4 info, so traceroute probe could not reflect the truth.

My question is then: would it be safe to argue that the above two causes explain all (or most of?) the observed "fluttering" routers? (some examples listed below)

What we are concerned about is per-packet load balancing (packets in the same flow go through different paths), which will cause trouble to protocols that install state information in routers along the flow path.

Example pairs:

144.223.27.146  sl-telia1-1-0.sprintlink.net
144.232.230.30  sl-telia1-4-0.sprintlink.net

216.140.0.66    s3-0-0.a1.hywr.broadwing.net
216.140.0.70    s4-0-0.a1.hywr.broadwing.net

67.17.65.53     pos5-0-2488M.cr2.FRA2.gblx.net
67.17.65.77     pos6-0-2488M.cr1.FRA2.gblx.net

67.17.65.54     so5-0-0-2488M.ar2.FRA2.gblx.net
67.17.65.78     so4-0-0-2488M.ar2.FRA2.gblx.net

67.17.65.57     pos11-0-2488M.cr2.FRA2.gblx.net
67.17.65.81     pos11-0-2488M.cr1.FRA2.gblx.net

67.17.65.58     so2-0-0-2488M.ar2.FRA3.gblx.net
67.17.65.82     so3-0-0-2488M.ar2.FRA3.gblx.net

67.17.64.66     pos6-0-2488M.cr2.SFO1.gblx.net
67.17.74.157    pos8-0-2488M.cr1.SFO1.gblx.net

129.250.2.183   p16-3-0-0.r01.snjsca04.us.bb.verio.net
129.250.5.136   p16-7-0-0.r00.snjsca04.us.bb.verio.net
On Mon, Jan 31, 2005 at 09:59:39PM -0500, Charles Shen wrote: [ snip ]
From the responses, the answer to "the rapidly-variable routing on the time scale of seconds to minutes" seems to be:
1. It could be link layer load balancing, with the two interfaces belonging to the same router. 2. It could be per-flow load balancing where flows are defined via both L3 and L4 info, so traceroute probe could not reflect the truth.
My question is then: would it be safe to argue that the above two causes explain all (or most of?) the observed "fluttering" routers? (some examples listed below) What we are concerned about is per-packet load balancing (packets in the same flow go through different paths), which will cause trouble to protocols that install state information in routers along the flow path.
AFAIK, multiple routers showing up in a single hop in a traceroute response is a sign of packet-by-packet load balancing, not flow-based. I could be wrong, though this was my past observation.

P.S.: What router-interacting applications are you using?

-J

--
James Jun                TowardEX Technologies, Inc.
Technical Lead           Boston IPv4/IPv6 Web Hosting, Colocation and
james@towardex.com       Network design/consulting & configuration services
cell: 1(978)-394-2867    web: http://www.towardex.com , noc: www.twdx.net
On Mon, Jan 31, 2005 at 09:59:39PM -0500, Charles Shen wrote: [ snip ]
From the responses, the answer to "the rapidly-variable routing on the time scale of seconds to minutes" seems to be:
1. It could be link layer load balancing, with the two interfaces belonging to the same router. 2. It could be per-flow load balancing where flows are defined via both L3 and L4 info, so traceroute probe could not reflect the truth.
My question is then: would it be safe to argue that the above two causes explain all (or most of?) the observed "fluttering" routers? (some examples listed below) What we are concerned about is per-packet load balancing (packets in the same flow go through different paths), which will cause trouble to protocols that install state information in routers along the flow path.
AFAIK, multiple routers showing up in a single-hop in traceroute response is a sign of packet-by-packet load balancing, not flow based.
I could be wrong, though this was my past observation.
P.S.: What router-interacting applications are you using?
I am talking about e.g. QoS reservation signaling applications.
James wrote:
AFAIK, multiple routers showing up in a single-hop in traceroute response is a sign of packet-by-packet load balancing, not flow based.
I could be wrong, though this was my past observation.
P.S.: What router-interacting applications are you using?
-J
I would venture to guess that in 99% of the cases it's not multiple routers showing up in a single hop but rather multiple interfaces on the same router.

John
On Mon, Jan 31, 2005 at 10:08:39PM -0500, James wrote:
AFAIK, multiple routers showing up in a single-hop in traceroute response is a sign of packet-by-packet load balancing, not flow based.
Not necessarily, and in most cases probably not. Don't forget that standard UNIX traceroute uses UDP, with the destination port incremented for each subsequent probe. So per-flow balancing hashes that take L4 header information into account will see each traceroute probe as a distinct "flow".

Best regards,
Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
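To illustrate why each probe looks like its own flow, here is a small sketch that enumerates the 5-tuples a classic UDP traceroute would send. The base port 33434 is the traditional traceroute default; the exact starting offset, the fixed source port, the example addresses, and the helper name `udp_probes` are simplifying assumptions.

```python
BASE_PORT = 33434  # traditional default base port of UNIX traceroute

def udp_probes(src_ip, dst_ip, src_port, max_ttl, probes_per_hop=3):
    """Yield (ttl, 5-tuple) per probe; the destination port rises by one per probe.

    Hypothetical helper for illustration; it builds tuples and sends nothing.
    """
    seq = 0
    for ttl in range(1, max_ttl + 1):
        for _ in range(probes_per_hop):
            yield ttl, (src_ip, dst_ip, 17, src_port, BASE_PORT + seq)  # 17 = UDP
            seq += 1

probes = list(udp_probes("192.0.2.10", "198.51.100.20", 40000, max_ttl=5))

# Flow keys as seen by a hash over L3+L4 fields versus L3 addresses only.
l3_l4_keys = {flow for _, flow in probes}
l3_keys = {(flow[0], flow[1]) for _, flow in probes}

print(len(probes), len(l3_l4_keys), len(l3_keys))  # 15 15 1
```

Every one of the 15 probes has a unique L3+L4 key, so a balancer hashing on ports is free to send each one down a different parallel link, while an L3-only hash sees a single flow.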
On Mon, Jan 31, 2005 at 09:59:39PM -0500, Charles Shen wrote:
From the responses, the answer to "the rapidly-variable routing on the time scale of seconds to minutes" seems to be:
1. It could be link layer load balancing, with the two interfaces belonging to the same router. 2. It could be per-flow load balancing where flows are defined via both L3 and L4 info, so traceroute probe could not reflect the truth.
That's no contradiction as far as I read it. Whether the two equal-cost paths terminate on the same router doesn't actually matter.
My question is then: would it be safe to argue that the above two causes explain all (or most of?) the observed "fluttering" routers?
Apart from seldom observed, transient control plane convergence effects (IGP/BGP converging while traceroute is used), probably yes.
(some examples listed below)
Well, to see whether flow-balancing is used, use e.g. TCP traceroute. If you see "stable" results (all three probes of a hop matching) there all the time, ...
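The "all three probes of a hop matching" check can be applied mechanically to traceroute output. The sketch below assumes the output format shown in this thread (responder address in parentheses); the names `hop_addresses` and `flutters` are made up for illustration.

```python
import re

# Matches a dotted-quad IPv4 address inside parentheses, as traceroute prints it.
ADDR = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")

def hop_addresses(line):
    """Return the distinct responder addresses in one hop line, in order of appearance."""
    seen = []
    for addr in ADDR.findall(line):
        if addr not in seen:
            seen.append(addr)
    return seen

def flutters(line):
    """True if more than one address answered the probes for this hop."""
    return len(hop_addresses(line)) > 1

hop4 = "4 so-6-0-0.ar2.FRA2.gblx.net (208.48.23.141) 2.461 ms 2.349 ms 2.333 ms"
hop5 = ("5 pos5-0-2488M.cr2.FRA2.gblx.net (67.17.65.53) 2.448 ms "
        "pos6-0-2488M.cr1.FRA2.gblx.net (67.17.65.77) 2.368 ms 2.281 ms")

print(flutters(hop4), flutters(hop5))  # False True
```

Running the same check over UDP and TCP traceroutes of the same path would show whether the fluttering disappears once the probes share one 5-tuple.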
What we are concerned about is per-packet load balancing (packets in the same flow go through different paths), which will cause trouble to protocols that install state information in routers along the flow path.
Modern core router hardware like Juniper (IP2 ASIC) can't do classic per-packet load balancing at all anymore, only per-flow balancing. I'm not sure about the GSR platform, but as far as I remember, it's not supported at all on Engine 2 line cards, and has a performance penalty otherwise.

Exec summary: I seriously doubt the larger shops do so, either because their hardware can't do it at all (Juniper-based cores) and/or because people know that per-packet load balancing leads to packet reordering, which might make your customers quite unhappy. It's generally a bad idea.

Best regards,
Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
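A toy model shows how per-packet balancing produces the reordering mentioned above. It assumes two parallel links whose one-way delays differ (for example because of unequal queueing); the delay values, the send gap, and the function name `arrival_order` are invented for illustration and are not measurements.

```python
def arrival_order(n_packets, delays_ms, per_packet, send_gap_ms=1.0, flow_link=0):
    """Return packet sequence numbers in the order they arrive at the far end.

    per_packet=True  : round-robin each packet over all links.
    per_packet=False : the whole flow stays on link `flow_link`.
    """
    arrivals = []
    for seq in range(n_packets):
        link = seq % len(delays_ms) if per_packet else flow_link
        arrivals.append((seq * send_gap_ms + delays_ms[link], seq))
    return [seq for _, seq in sorted(arrivals)]

delays = [2.0, 5.5]  # hypothetical one-way delays of the two links, in ms

print(arrival_order(6, delays, per_packet=False))  # [0, 1, 2, 3, 4, 5]
print(arrival_order(6, delays, per_packet=True))   # [0, 2, 4, 1, 3, 5]
```

With per-flow balancing the packets of one flow arrive in the order sent; with round-robin per-packet balancing the packets on the slower link are overtaken by later ones.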
On Tue, Feb 01, 2005 at 08:17:03AM +0100, Daniel Roesen wrote:
I'm not sure for the GSR platform, but as far as I remember, it's not supported at all on Engine 2 line cards, and has a performance penalty otherwise.
Found some reference on that: http://www.cisco.com/en/US/products/sw/iosswrel/ps1829/products_feature_guid...

Bottom line: works on E0 and E1 linecards, and with caveats on E2. Not with newer ones. So Cisco is dumping that too. Good to see that.

Best regards,
Daniel

--
CLUE-RIPE -- Jabber: dr@cluenet.de -- dr@IRCnet -- PGP: 0xA85C8AA0
participants (4)
- Charles Shen
- Daniel Roesen
- James
- John Fraizer