So Philip Smith / Geoff Huston's CIDR report becomes worth a good hard look today
512K routes, here we come. Lots of TCAM based routers suddenly become really expensive doorstops.
Maybe time to revisit this old 2007 nanog thread?
http://www.gossamer-threads.com/lists/engine?do=post_view_flat;post=99870;pa...
FYI nanog - https://puck.nether.net/pipermail/outages/2014-August/007091.html
[outages] Major outages today, not much info at this time
Teun Vink teun at teun.tv Tue Aug 12 11:42:05 EDT 2014
On Tue, 2014-08-12 at 15:20 +0000, Hoyle Anderson (AM) via Outages wrote:
I know this isn’t much help, but there are major problems with multiple ISPs since around 4-5 AM EST. I really don’t have much detail, but I have sites that are unreachable from some providers. Looks like Comcast, level3, ATT, cogent, etc.
So, it’s probably not just you, but I’m afraid I don’t know who it is. I heard one report of a datacenter outage.
Hi,
Some routing tables hit 512K routes today. Some old hardware and software can't handle that and either crash or ignore newly learned routes. So this may cause some disturbances in the force.
HTH, Teun
-----------------
On Tue, 12 Aug 2014, Hank Nussbacher wrote:
Many don't need to buy anything new. Just follow the instructions here: http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-swit...
We did this in the 1st week of June. Problem solved.
-Hank
On Tue, Aug 12, 2014 at 2:42 PM, Hank Nussbacher <hank@efes.iucc.ac.il> wrote:
http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-swit...
I note that the recommended command in that article, "mls cef maximum-routes ip 1000", will throw most of your IPv6 routes out of the TCAM instead. Which if you have any IPv6 traffic of substance just kills you in the other direction. Might want to try something more like "mls cef maximum-routes ip 900".
Regards, Bill Herrin
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/> Can I solve your unusual networking challenges?
On 12/08/14 23:10, William Herrin wrote:
I note that the recommended command in that article, "mls cef maximum-routes ip 1000", will throw most of your IPv6 routes out of the TCAM instead. Which if you have any IPv6 traffic of substance just kills you in the other direction. Might want to try something more like "mls cef maximum-routes ip 900".
And if you want any MPLS labels (especially if running 6PE) you might want to claw that back a bit further. tl;dr buy new routers next year. :) Tom
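To put rough numbers on the trade-off discussed above, here is a minimal sketch. It rests on assumptions that are not stated in the thread: a FIB TCAM of 1024K entries shared by all protocols, one entry per IPv4 route or MPLS label, two entries per IPv6 route, and an `mls cef maximum-routes ip` argument expressed in units of 1024 entries. The function name `leftover_for_ipv6` is made up for illustration. Actual partitioning depends on the supervisor and software, so treat the output as an estimate only.

```python
# Assumed hardware figures (not from the thread); adjust for your platform.
TOTAL_K = 1024          # assumed total FIB TCAM size, in units of 1024 entries
ENTRIES_PER_V6 = 2      # assumed TCAM cost of a single IPv6 route


def leftover_for_ipv6(ipv4_k, mpls_k=0):
    """Return how many IPv6 routes fit after reserving IPv4 and MPLS space.

    ipv4_k and mpls_k are reservations in units of 1024 entries.
    """
    remaining_k = TOTAL_K - ipv4_k - mpls_k
    if remaining_k < 0:
        raise ValueError("reservations exceed the assumed TCAM size")
    return remaining_k * 1024 // ENTRIES_PER_V6


# Compare the setting from the article with the one suggested in the thread.
for ipv4_k in (1000, 900):
    print(f"ip {ipv4_k}: room for about {leftover_for_ipv6(ipv4_k)} IPv6 routes")

# Setting aside a hypothetical 16K entries for MPLS labels shrinks it further.
print("ip 900 + 16K MPLS:", leftover_for_ipv6(900, mpls_k=16))
```

Under these assumptions, `ip 1000` leaves room for roughly 12K IPv6 routes and `ip 900` for roughly 63K, before anything is reserved for MPLS labels or multicast.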
http://www.zdnet.com/internet-hiccups-today-youre-not-alone-heres-why-7000032566/
“According to NANOG, and complaints tracker DownDetector, many Internet providers — including Comcast, Level3, AT&T, Cogent, Sprint, Verizon, and others — have suffered from serious performance problems at various times on Tuesday.”
While we had a few multi-homed customers have problems with their routers, we did not see anything in the core. Is this just a ZDNET reporting error?
- Kevin
On 8/12/14, 6:10 PM, "William Herrin" <bill@herrin.us> wrote:
On Tue, Aug 12, 2014 at 2:42 PM, Hank Nussbacher <hank@efes.iucc.ac.il> wrote:
http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-switches/117712-problemsolution-cat6500-00.html
I note that the recommended command in that article, "mls cef maximum-routes ip 1000", will throw most of your IPv6 routes out of the TCAM instead. Which if you have any IPv6 traffic of substance just kills you in the other direction. Might want to try something more like "mls cef maximum-routes ip 900".
Regards, Bill Herrin
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/> Can I solve your unusual networking challenges?
On Tue, Aug 12, 2014 at 5:06 PM, McElearney, Kevin < Kevin_McElearney@cable.comcast.com> wrote:
http://www.zdnet.com/internet-hiccups-today-youre-not-alone-heres-why-7000032566/
“According to NANOG, and complaints tracker DownDetector, many Internet providers — including Comcast, Level3, AT&T, Cogent, Sprint, Verizon, and others — have suffered from serious performance problems at various times on Tuesday.”
While we had a few multi-homed customers have problems with their routers, we did not see anything in the core. Is this just a ZDNET reporting error?
- Kevin
Unless you guys are miraculously managing to terminate Nx100G bundles into 6509s with Sup2 or sup3s, I would be really, really surprised if this even made it on your radar. Chalk it up to poorly-researched reporting. And if you *are* handling Nx100G bundles on 6509s, please contact me off-list, I need to get the details on your source for magic router pixie dust. ;) Matt
On Tue, 12 Aug 2014, Matthew Petach wrote:
On Tue, Aug 12, 2014 at 5:06 PM, McElearney, Kevin < Kevin_McElearney@cable.comcast.com> wrote:
http://www.zdnet.com/internet-hiccups-today-youre-not-alone-heres-why-7000032566/
“According to NANOG, and complaints tracker DownDetector, many Internet providers — including Comcast, Level3, AT&T, Cogent, Sprint, Verizon, and others — have suffered from serious performance problems at various times on Tuesday.”
While we had a few multi-homed customers have problems with their routers, we did not see anything in the core. Is this just a ZDNET reporting error?
Unless you guys are miraculously managing to terminate Nx100G bundles into 6509s with Sup2 or sup3s, I would be really, really surprised if this even made it on your radar. Chalk it up to poorly-researched reporting.
There are/have been multiple fiber provider outages the past two days, but I suspect there's always a fiber cut / outage somewhere.
And if you *are* handling Nx100G bundles on 6509s, please contact me off-list, I need to get the details on your source for magic router pixie dust. ;)
Cisco white papers. Where else?
----------------------------------------------------------------------
Jon Lewis, MCP :) | I route
| therefore you are
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
From: Matthew Petach <mpetach@netflight.com>
Unless you guys are miraculously managing to terminate Nx100G bundles into 6509s with Sup2 or sup3s, I would be really, really surprised if this even made it on your radar. Chalk it up to poorly-researched reporting.
And if you *are* handling Nx100G bundles on 6509s, please contact me off-list, I need to get the details on your source for magic router pixie dust. ;)
It made the radar with the consumer impact. We traced the issue quickly to customer datacenter routers/512K and worked with them to correct. We were surprised (or not really) with this being called a widespread provider issue. Just checking whether others really had an issue or whether this was isolated to a few data centers. No pixie dust ;-)
- Kevin
On Aug 12, 2014, at 1:02 PM, Hank Nussbacher <hank@efes.iucc.ac.il> wrote:
Many don't need to buy anything new. Just follow the instructions here: http://www.cisco.com/c/en/us/support/docs/switches/catalyst-6500-series-swit... We did this in the 1st week of June. Problem solved.
s/Problem solved/Critical limit pushed out long enough to give us a few more years/ -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/
Subject: So Philip Smith / Geoff Huston's CIDR report becomes worth a good hard look today
Date: Tue, Aug 12, 2014 at 09:40:55PM +0530
Quoting Suresh Ramasubramanian (ops.lists@gmail.com):
512K routes, here we come. Lots of TCAM based routers suddenly become really expensive doorstops.
We had a planned outage yesterday 2300 UTC to perform the operation Hank mentions. Alas, around 0850 UTC the table went "critical" and we had to do an emergency reboot. Well, the good part is that all 10G line cards survived, and we're back in operation.
The new routers are bought or in the investment plan for this year. Just need to wait until it's time for our vendors' fiscal year end race...
-- Måns Nilsson primary/secondary/besserwisser/machina MN-1334-RIPE +46 705 989668
Am I accompanied by a PARENT or GUARDIAN?
half the routing table is deagg crap. filter it. you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why. randy
Nice little article
http://www.bgpmon.net/what-caused-todays-internet-hiccup/
-----Original Message-----
From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Randy Bush
Sent: Wednesday, August 13, 2014 4:43 PM
To: Suresh Ramasubramanian
Cc: North American Network Operators' Group
Subject: Re: So Philip Smith / Geoff Huston's CIDR report becomes worth a good hard look today
half the routing table is deagg crap. filter it.
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
randy
Same reason no vendor has bothered to prune redundant RIB entries (i.e. more-specific pointing to the same NH as a covering route) when programming the TCAM...
-C
On Aug 13, 2014, at 1:42 PM, Randy Bush <randy@psg.com> wrote:
half the routing table is deagg crap. filter it.
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
randy
On Wed, Aug 13, 2014 at 6:47 PM, Chris Woodfield <rekoil@semihuman.com> wrote:
On Aug 13, 2014, at 1:42 PM, Randy Bush <randy@psg.com> wrote:
half the routing table is deagg crap. filter it.
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Same reason no vendor has bothered to prune redundant RIB entries (i.e. more-specific pointing to the same NH as a covering route) when programming the TCAM...
Hi Chris,
Not so much, no. Pruning seemingly redundant entries from BGP is actually impossible to do safely, or if not impossible at least no one has demonstrated a successful algorithm that can prune even a single entry anywhere but the BGP source node or a BGP leaf node. And there's not much point in pruning the BGP RIB at a BGP leaf node -- DRAM to hold the RIB once received and processed is plentiful and inexpensive.
Pruning FIB entries, on the other hand, can be done quite safely as long as you're willing to accept the conversion of "null route" to "don't care." Some experiments were done on this in the IETF a couple years back. Draft-zhang-fibaggregation maybe? Savings of 30% in typical backbone nodes looked possible. That's 30% of your TCAM reclaimable.
For the moment it seems to be cheaper to just build bigger TCAMs. Cheaper for the router vendors anyway.
Regards, Bill Herrin
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/> Can I solve your unusual networking challenges?
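The conversion of "null route" to "don't care" mentioned in this message can be illustrated with a small sketch. The prefixes, the next-hop label "A", and the `lookup` helper are made up for illustration; this is not the algorithm from the draft, only a picture of what changes when a hole in the address space gets swallowed by an aggregate.

```python
import ipaddress


def lookup(fib, addr):
    """Longest-prefix match; returns None when no route covers addr."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, next_hop)
    return best[1] if best else None


# Exact FIB: 10.10.6.0/24 has no route, so packets to it are dropped.
exact = {"10.10.4.0/23": "A", "10.10.7.0/24": "A"}

# Lossy FIB: one entry instead of two, but the hole is now forwarded to A.
lossy = {"10.10.4.0/22": "A"}

for addr in ("10.10.5.1", "10.10.7.1", "10.10.6.1"):
    print(addr, "exact:", lookup(exact, addr), "lossy:", lookup(lossy, addr))
```

Every destination that had a route keeps its next hop. The only difference is that traffic to the previously unrouted 10.10.6.0/24 is forwarded to A instead of being dropped, which is the price of the saved entry.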
Pruning FIB entries, on the other hand, can be done quite safely as long as you're willing to accept the conversion of "null route" to "don't care." Some experiments were done on this in the IETF a couple years back. Draft-zhang-fibaggregation maybe? Savings of 30% in typical backbone nodes looked possible. That's 30% of your TCAM reclaimable.
Hence the “when programming the TCAM” part of my original statement :)
For the moment it seems to be cheaper to just build bigger TCAMs. Cheaper for the router vendors anyway.
I think of it more like “why spend development dollars on a feature that will cause my customers to keep their existing hardware longer and delay upgrades?” Yes, vendors do think like that. -C
On Wed, Aug 13, 2014 at 8:20 PM, Chris Woodfield <rekoil@semihuman.com> wrote:
Hence the “when programming the TCAM” part of my original statement :)
Hi Chris,
My point was that Randy's BGP RIB pruning knobs are missing for a different reason than your router FIB pruning knobs. Neither the science nor the technology exists to create Randy's BGP pruning knobs.
Regards, Bill Herrin
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/> Can I solve your unusual networking challenges?
My point was that Randy's BGP RIB pruning knobs are missing for a different reason than your router FIB pruning knobs. Neither the science nor the technology exists to create Randy's BGP pruning knobs.
ahhh, you dug out the [j]tac tickets, or are you just conjecturbating? if the former, ticket numbers would be cool. randy
On Thu, Aug 14, 2014 at 4:57 PM, Randy Bush <randy@psg.com> wrote:
My point was that Randy's BGP RIB pruning knobs are missing for a different reason than your router FIB pruning knobs. Neither the science nor the technology exists to create Randy's BGP pruning knobs.
ahhh, you dug out the [j]tac tickets, or are you just conjecturbating?
Neither. I'm reporting the state of the science having been engrossed in its research for the better part of a decade. Places like the IRTF RRG. Because no science, also no tech.
If you think you have some magic new algorithm that the RRG didn't consider, feel free to explain it and I'll demonstrate a scenario for you where it fails too.
And I tip my hat to the router vendors for declining to implement knobs which would damage the network, despite the demands of Randy Bush.
Regards, Bill Herrin
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/> Can I solve your unusual networking challenges?
On Thu, Aug 14, 2014 at 6:07 PM, Randy Bush <randy@psg.com> wrote:
ahhh, you dug out the [j]tac tickets, or are you just conjecturbating?
Neither. I'm reporting the state of the science.
ROFL. so just ad hominem. smart.
That phrase "ad hominem," I don't think it means what you think it means.
-Bill
-- William Herrin ................ herrin@dirtside.com bill@herrin.us Owner, Dirtside Systems ......... Web: <http://www.dirtside.com/> Can I solve your unusual networking challenges?
On Aug 13, 2014, at 16:42 , Randy Bush <randy@psg.com> wrote:
half the routing table is deagg crap. filter it.
We disagree. Just because you don't like all more specifics doesn't mean they are useless. Not everything is about minimizing FIB size. (And RIB size hasn't been relevant for years.) People pay an ass-ton of money to save a few ms off their RTT, if a more specific will allow packets to travel LHR->FRA directly instead of packets going from LHR -> SFO -> FRA, they are useful even if there is a covering prefix.
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Might be useful if you mentioned what you considered a "smart" way to trim the fib. But then you couldn't bitch and moan about people not understanding you, which is the real reason you post to NANOG. -- TTFN, patrick
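The point about more specifics carrying real routing information can be shown with a short longest-prefix-match sketch. The prefixes, the next-hop labels, and the `lookup` helper are hypothetical; the /22 cutoff simply stands in for any policy that drops more specifics when a covering prefix exists.

```python
import ipaddress


def lookup(fib, addr):
    """Longest-prefix match; returns None when no route covers addr."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, next_hop)
    return best[1] if best else None


# Full table: the /24 steers traffic onto a shorter path than the covering /22.
full = {
    "10.10.4.0/22": "transit via SFO",
    "10.10.7.0/24": "direct peer in FRA",
}

# Filtered table: anything longer than /22 is discarded (hypothetical policy).
filtered = {
    prefix: next_hop
    for prefix, next_hop in full.items()
    if ipaddress.ip_network(prefix).prefixlen <= 22
}

print("full:    ", lookup(full, "10.10.7.1"))
print("filtered:", lookup(filtered, "10.10.7.1"))
```

The destination stays reachable after filtering, but the packets now follow the covering route's next hop, which is exactly the detour being objected to here.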
On Wed, Aug 13, 2014 at 07:53:45PM -0400, Patrick W. Gilmore wrote:
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Might be useful if you mentioned what you considered a "smart" way to trim the fib. But then you couldn't bitch and moan about people not understanding you, which is the real reason you post to NANOG.
Optimization #1 -- elimination of more specifics where there's a less specific that has the same next hop (obviously only in cases where the less specific is the one that would be used if the more specific were left out).
Example: if 10.10.4.0/22 has the same next hop as 10.10.7.0/24, the latter can be left out of TCAM (assuming there's not a 10.10.6.0/23 with a different next hop).
Optimization #2 -- concatenation of adjacent routes when they have the same next hop
Example: If 10.10.12.0/23 and 10.10.14.0/23 have the same next hop, leave them both out of TCAM and install 10.10.12.0/22
Optimization #3 -- elimination of routes that have more specifics for their entire range.
Example: Don't program 10.10.4.0/22 in TCAM if 10.10.4.0/23, 10.10.6.0/24 and 10.10.7.0/24 all exist
Some additional points:
-- This isn't that hard to implement. Once you have a FIB and primitives for manipulating it, it's not especially difficult to extend them to also maintain a minimal-size-FIB.
-- The key is that aggregation need not be limited to identical routes. Any two routes *that have the same next hop from the perspective of the router doing the aggregating* can be aggregated in TCAM. DFZ routers have half a million routes, but comparatively few direct adjacencies. So lots of opportunity to aggregate.
-- What I've described above gives forwarding behavior *identical* to unaggregated forwarding behavior, but with fewer TCAM entries. Obviously, you can get further reductions if you're willing to accept different behavior (for example, ignoring more specifics when there's a less specific, even if the less specific has a different next hop).
(This might or might not be what Randy was talking about. Maybe he's looking for knobs to allow some routes to be excluded from TCAM at the expense of changing forwarding behavior. But even without any such things, there's still opportunity to meaningfully reduce usage just by handling the cases where forwarding behavior will not change.)
-- Brett
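The three optimizations can be sketched in a few dozen lines. This is an illustrative toy under stated assumptions, not anyone's shipping implementation: the table is a plain dict of IPv4 prefixes to next-hop labels, the names `covering`, `compress` and `lookup` are invented here, the next hops "A" through "E" are arbitrary, and no attention is paid to performance or to incremental updates.

```python
import ipaddress


def covering(table, net):
    """Return the longest less-specific route that covers net, or None."""
    for plen in range(net.prefixlen - 1, -1, -1):
        sup = net.supernet(new_prefix=plen)
        if sup in table:
            return sup
    return None


def compress(routes):
    """Apply the three optimizations repeatedly until nothing changes."""
    table = dict(routes)
    changed = True
    while changed:
        changed = False
        # Visit the most specific routes first.
        for net in sorted(table, key=lambda n: -n.prefixlen):
            if net not in table:
                continue
            nh = table[net]
            # 1: same next hop as the nearest covering route, so drop it.
            parent = covering(table, net)
            if parent is not None and table[parent] == nh:
                del table[net]
                changed = True
                continue
            # 3: more specifics cover the whole range, so it is never used.
            subs = [n for n in table if n != net and n.subnet_of(net)]
            if subs and list(ipaddress.collapse_addresses(subs)) == [net]:
                del table[net]
                changed = True
                continue
            # 2: merge with the adjacent sibling when next hops match.
            if net.prefixlen > 0:
                sup = net.supernet()
                sibling = next(s for s in sup.subnets() if s != net)
                if table.get(sibling) == nh:
                    del table[net], table[sibling]
                    table[sup] = nh
                    changed = True
    return table


def lookup(table, ip):
    """Longest-prefix match over a dict keyed by network objects."""
    best = None
    for net, nh in table.items():
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, nh)
    return best[1] if best else None


N = ipaddress.ip_network
routes = {
    N("10.10.4.0/22"): "A", N("10.10.7.0/24"): "A",    # case 1
    N("10.10.12.0/23"): "B", N("10.10.14.0/23"): "B",  # case 2
    N("10.10.8.0/22"): "C", N("10.10.8.0/23"): "D",    # case 3
    N("10.10.10.0/24"): "E", N("10.10.11.0/24"): "D",
}
small = compress(routes)
print(len(routes), "->", len(small), "entries")
# Forwarding must be unchanged for every address in the test range.
same = all(lookup(routes, ip) == lookup(small, ip) for ip in N("10.10.0.0/20"))
print("forwarding unchanged:", same)
```

The brute-force comparison at the end is the useful part of the design: each rule only removes entries that longest-prefix match would resolve the same way without them, and the check confirms that for the sample table.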
Swisscom or some other European SP has / used to have a limit where they would not accept more specific routes than say a /22 from provider x, so if you wanted to take a /24 and announce it you were SOL sending packets to them from that /24 over provider y.
Still, for elderly and capacity limited routers, that might work.
On Thursday, August 14, 2014, Brett Frankenberger <rbf+nanog@panix.com> wrote:
-- --srs (iPad)
Composed on a virtual keyboard, please forgive typos.
On Aug 13, 2014, at 22:59, Suresh Ramasubramanian <ops.lists@gmail.com> wrote:
Swisscom or some other European SP has / used to have a limit where they would not accept more specific routes than say a /22 from provider x, so if you wanted to take a /24 and announce it you were SOL sending packets to them from that /24 over provider y.
Still, for elderly and capacity limited routers, that might work.
And Sprint used to filter on /19s outside swamp space. (See NANOG 1999 archives for my [wrong then corrected] interpretation of ACL112.) Etc., etc.
For stub networks, especially ones who are not as performance sensitive, this can help extend the life of their routers. But not everyone can make AGS+s work for years past their useful life or get "-doran" IOS builds. The 6500 was first sold in 1999. I'm impressed it has lasted this long, even with new sups. Time to start thinking about upgrading.
As for networks providing transit, those were highly unsound policies, IMHO. I specifically did not buy from Sprint then or Verio later when they did it, and I was not alone. Giving your customers less than full routes has lots of bad side effects, such as less revenue when they don't pick you because you don't have the route.
-- TTFN, patrick
Sprint used to proxy aggregate… I remember 128.0.0.0/3
the real question, imho, is if folks are going to look into their crystal balls and roadmap where the default offered is a /32 (either v4 or v6) and plan accordingly, or just slap another bandaid on the oozing wound...
/bill PO Box 12317 Marina del Rey, CA 90295 310.322.8102
On 13August2014Wednesday, at 21:15, Patrick W. Gilmore <patrick@ianai.net> wrote:
Sprint also had 192/2 in the RADB :)
manning bill wrote:
Sprint used to proxy aggregate… I remember 128.0.0.0/3
the real question, imho, is if folks are going to look into their crystal balls and roadmap where the default offered is a /32 (either v4 or v6) and plan accordingly, or just slap another bandaid on the oozing wound...
/bill PO Box 12317 Marina del Rey, CA 90295 310.322.8102
I believe at one point, SPRINT had in the RADB (and actively advertised) 0.0.0.0/2, 64.0.0.0/2, 128.0.0.0/2, and 192.0.0.0/2 under something called “Quarter Default Route, see Rational Default Project” or words to that effect.
I could be wrong. It was a long time ago and I barely remember SPRINT any more.
Owen
On Aug 13, 2014, at 9:47 PM, Steve Noble <snoble@sonn.com> wrote:
Sprint also had 192/2 in the RADB :)
On Thu, Aug 14, 2014 at 12:15:36AM -0400, Patrick W. Gilmore wrote:
And Sprint used to filter on /19s outside swamp space. (See NANOG 1999 archives for my [wrong then corrected] interpretation of ACL112.) Etc., etc.
For stub networks, especially ones who are not as performance sensitive, this can help extend the life of their routers. But not everyone can make AGS+s work for years past their useful life or get "-doran" IOS builds. The 6500 was first sold in 1999. I'm impressed it has lasted this long, even with new sups. Time to start thinking about upgrading.
Just as a historical note, Sprint didn't have AGS+ or such equipment that were being propped up by the /19 filters (at least for the vast majority of the filter's existence). Neither did Verio. Those filters were primarily an attempt to enforce a certain behavior.
Also, my recollection is that during that era "named" builds were typically named after the recipient's well-known email ID, e.g. "-smd", or first name, e.g. "-sean", and I don't think I've ever seen one named after a last name unless that was their email ID as well.
-dorian
On Thu, Aug 14, 2014 at 01:47:20AM -0400, Dorian Kim wrote:
On Thu, Aug 14, 2014 at 12:15:36AM -0400, Patrick W. Gilmore wrote:
Composed on a virtual keyboard, please forgive typos.
On Aug 13, 2014, at 22:59, Suresh Ramasubramanian <ops.lists@gmail.com> wrote:
Swisscom or some other European SP has / used to have a limit where they would not accept more specific routes than say a /22 from provider x, so if you wanted to take a /24 and announce it you were SOL sending packets to them from that /24 over provider y.
Still, for elderly and capacity limited routers, that might work.
And Sprint used to filter on /19s outside swamp space. (See NANOG 1999 archives for my [wrong then corrected] interpretation of ACL112.) Etc., etc.
For stub networks, especially ones who are not as performance sensitive, this can help extend the life of their routers. But not everyone can make AGS+s work for years past their useful life or get "-doran" IOS builds. The 6500 was first sold in 1999. I'm impressed it has lasted this long, even with new sups. Time to start thinking about upgrading.
Just as a historical note, Sprint didn't have AGS+ or such equipment that were being propped up by the /19 filters (at least for the vast majority of the filter's existence). Neither did Verio. Those filters were primarily an attempt to enforce a certain behavior.
It was kindly pointed out to me in private that my phrasing could be misleading here. When ACL112 came into being, there was old equipment being protected by the /19 filters. However, the filters were in place long after that equipment was replaced.
-dorian
It was kindly pointed out to me in private that my phrasing could be misleading here.
When ACL112 came into being, there was old equipment being protected by the /19 filters. However, the filters were in place long after that equipment was replaced.
but by then it had driven all sorts of filtering and a negotiated (at danvers) treaty with the rirs to allocate on /19.
another note from our private aside, it is worth noting that verio's satanic phyltres meant we did not even notice the 7007 and 128/9 disasters. we read about them on nanog (or com-priv?).
randy
On Aug 14, 2014, at 02:36 , Randy Bush <randy@psg.com> wrote:
It was kindly pointed out to me in private that my phrasing could be misleading here.
When ACL112 came into being, there was old equipment being protected by the /19 filters. However, the filters were in place long after that equipment was replaced.
but by then it had driven all sorts of filtering and a negotiated (at danvers) treaty with the rirs to allocate on /19.
another note from our private aside, it is worth noting that verio's satanic phyltres meant we did not even notice the 7007 and 128/9 disasters. we read about them on nanog (or com-priv?).
Everything has pluses & minuses. The as7007 debacle was actually made far, far worse by Sprint's policies at the time, including a "-smb" (thanx, Dorian) build. Vinny may have made a major boo-boo by pumping BGP through RIPv1 then back into BGP, but the fact Sprint filtered only on AS path _and_ had an IOS which ignored withdrawals was the real killer.
Let's work on the primary protection of the INTERNET. When you were at Verio, you were driving a policy that you wanted, despite being clearly and objectively a tiny minority of the population in question. It might have made the Internet safer, but it had lots of bad side effects, including making it so that large networks have an advantage over small ones. Since those "small networks" are frequently the people paying the bills, and I am here to make money, I am not terribly happy with such policies.
A quick list off the top of my head: BCP38, filtering customer announcements properly, putting pressure on networks that needlessly deaggregate, ensuring information (e.g. "your 6500 is about to crash") is properly disseminated, etc. These will have far larger impacts, disadvantage no one, and will not lose you business like your previous policies did. Everyone wins.
All that said, I still abide by my primary rule: Your network, your decision. I am arguing for things we can all agree help everyone, not a select few.
On Aug 14, 2014, at 02:13 , Randy Bush <randy@psg.com> wrote:
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Might be useful if you mentioned what you considered a "smart" way to trim the fib. But then you couldn't bitch and moan about people not understanding you, which is the real reason you post to NANOG.
i did not get the original of this post, but the ad hominem speaks for its pathetic self.
Ad hominem implies I was going after your character without facts. However, the statement above _is_ fact - at least I believe so and given the private replies I received (and especially who replied), I am not alone.
Also, you calling an ad hominem attack "pathetic" is hilarious in more ways than I can count. (Again, not ad hominem. It is trivial to objectively prove that statement hypocritical at least, which I find amusing.)
--
TTFN,
patrick
When ACL112 came into being, there was old equipment being protected by the /19 filters. However, the filters were in place long after that equipment was replaced.
This was done for commercial reasons, not to protect the Internet. You know it, I know it, and I'm pretty sure the statute of limitations has expired, so now everyone can know it.
Randy may have created Verio's filters for "the good of the Internet" (even though I disagree, as I just posted), but Sean's reasons for keeping those filters were absolutely not so pristine.
--
TTFN,
patrick
I think you mean what is best described here: http://www.swinog.ch/meetings/swinog7/BGP_filtering-swinog.ppt --Aris
Suresh Ramasubramanian <mailto:ops.lists@gmail.com>
Thursday, August 14, 2014 04:59
Swisscom or some other European SP has / used to have a limit where they would not accept more specific routes than say a /22 from provider x, so if you wanted to take a /24 and announce it you were SOL sending packets to them from that /24 over provider y.
Still, for elderly and capacity limited routers, that might work.
On Thursday, August 14, 2014, Brett Frankenberger <rbf+nanog@panix.com>
Brett Frankenberger <mailto:rbf+nanog@panix.com> Thursday, August 14, 2014 04:49
Optimization #1 -- elimination of more specifics where there's a less specific that has the same next hop (obviously only in cases where the less specific is the one that would be used if the more specific were left out).
Example: if 10.10.4.0/22 has the same next hop as 10.10.7.0/24, the latter can be left out of TCAM (assuming there's not a 10.10.6.0/23 with a different next hop).
Optimization #2 -- concatenation of adjacent routes when they have the same next hop
Example: If 10.10.12.0/23 and 10.10.14.0/23 have the same next hop, leave them both out of TCAM and install 10.10.12.0/22
Optimization #3 -- elimination of routes that have more specifics for their entire range.
Example: Don't program 10.10.4.0/22 in TCAM if 10.10.4.0/23, 10.10.6.0/24 and 10.10.7.0/24 all exist
Some additional points:
-- This isn't that hard to implement. Once you have a FIB and primitives for manipulating it, it's not especially difficult to extend them to also maintain a minimal-size-FIB.
-- The key is that aggregation need not be limited to identical routes. Any two routes *that have the same next hop from the perspective of the router doing the aggregating* can be aggregated in TCAM. DFZ routers have half a million routes, but comparatively few direct adjacencies. So lots of opportunity to aggregate.
-- What I've described above gives forwarding behavior *identical* to unaggregated forwarding behavior, but with fewer TCAM entries. Obviously, you can get further reductions if you're willing to accept different behavior (for example, ignoring more specifics when there's a less specific, even if the less specific has a different next hop).
(This might or might not be what Randy was talking about. Maybe he's looking for knobs to allow some routes to be excluded from TCAM at the expense of changing forwarding behavior. But even without any such things, there's still opportunity to meaningfully reduce usage just by handling the cases where forwarding behavior will not change.)
-- Brett
Patrick W. Gilmore <mailto:patrick@ianai.net>
Thursday, August 14, 2014 01:53
On Aug 13, 2014, at 16:42 , Randy Bush <randy@psg.com> wrote:
half the routing table is deagg crap. filter it.
We disagree.
Just because you don't like all more specifics doesn't mean they are useless.
Not everything is about minimizing FIB size. (And RIB size hasn't been relevant for years.) People pay an ass-ton of money to save a few ms off their RTT. If a more specific will allow packets to travel LHR -> FRA directly instead of going LHR -> SFO -> FRA, it is useful even if there is a covering prefix.
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Might be useful if you mentioned what you considered a "smart" way to trim the fib. But then you couldn't bitch and moan about people not understanding you, which is the real reason you post to NANOG.
Randy Bush <mailto:randy@psg.com>
Wednesday, August 13, 2014 22:42
half the routing table is deagg crap. filter it.
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
randy
Suresh Ramasubramanian <mailto:ops.lists@gmail.com>
Tuesday, August 12, 2014 18:10
512K routes, here we come. Lots of TCAM based routers suddenly become really expensive doorstops.
Maybe time to revisit this old 2007 nanog thread?
http://www.gossamer-threads.com/lists/engine?do=post_view_flat;post=99870;pa...
FYI nanog - https://puck.nether.net/pipermail/outages/2014-August/007091.html
[outages] Major outages today, not much info at this time
Teun Vink teun at teun.tv Tue Aug 12 11:42:05 EDT 2014
Hi,
Some routing tables hit 512K routes today. Some old hardware and software can't handle that and either crash or ignore newly learned routes. So this may cause some disturbances in the force.
HTH, Teun
-----------------
Once upon a time, Brett Frankenberger <rbf+nanog@panix.com> said:
-- This isn't that hard to implement. Once you have a FIB and primitives for manipulating it, it's not especially difficult to extend them to also maintain a minimal-size-FIB.
I would say it is hard to implement, or at least non-trivial. Building a reduced FIB from a given RIB is not hard, but then RIB changes become more complex (possibly significantly so) to process into FIB updates.
For example, the control plane receives a BGP update that removes a route from the RIB. Right now, the check is simply "was this the best path" (and if so, "was it in the FIB" for systems with RIB->FIB filtering methods). If so, remove it from the FIB. If there's a next-best path in the RIB, add it to the FIB.
With a compacted FIB, a RIB update has to check a bunch of different things to see what (if any) FIB updates are required. This could also require other RIB updates (to mark other RIB entries as now in or out of the FIB).
The lowest-overhead method would probably be for the control plane to keep a separate (non-compacted) copy of the FIB, with all the pointers to how things were compacted before sending to the forwarding plane compacted FIB. That would take up more control plane RAM (and still add CPU overhead to every RIB change).
If you thought things like rpd stalls on JUNOS were fun before, imagine the excitement you could have with FIB compacting!
-- Chris Adams <cma@cmadams.net>
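A rough illustration of the point about update cost, as a minimal Python sketch and not any vendor's implementation. It assumes IPv4 only, a control plane that keeps an uncompacted shadow copy of the FIB, and a deliberately naive strategy of recompacting the whole table on every change and diffing the result against what is installed (a real design would track dependencies rather than recompute). The names `compact`, `fib_delta` and `ControlPlane` are hypothetical, and only the "same next hop as the covering route" reduction is applied.

```python
import ipaddress


def compact(full):
    """Drop more specifics whose nearest covering route has the same next hop."""
    out = dict(full)
    # Visit most specific routes first so each check sees the current table.
    for net in sorted(full, key=lambda n: n.prefixlen, reverse=True):
        for plen in range(net.prefixlen - 1, -1, -1):
            cover = net.supernet(new_prefix=plen)
            if cover in out:
                if out[cover] == out[net]:
                    del out[net]
                break  # only the nearest covering route matters
    return out


def fib_delta(installed, wanted):
    """Return (prefixes to remove, prefixes to add or rewrite) for the hardware."""
    removes = [n for n in installed if n not in wanted]
    adds = {n: nh for n, nh in wanted.items() if installed.get(n) != nh}
    return removes, adds


class ControlPlane:
    def __init__(self):
        self.full = {}       # uncompacted shadow copy of the FIB
        self.installed = {}  # compacted view currently programmed in hardware

    def update(self, prefix, next_hop=None):
        """Apply one RIB change; next_hop=None means the route was withdrawn."""
        net = ipaddress.ip_network(prefix)
        if next_hop is None:
            self.full.pop(net, None)
        else:
            self.full[net] = next_hop
        wanted = compact(self.full)
        removes, adds = fib_delta(self.installed, wanted)
        self.installed = wanted
        return removes, adds


cp = ControlPlane()
cp.update("10.10.4.0/22", "A")
for more_specific in ("10.10.5.0/24", "10.10.6.0/24", "10.10.7.0/24"):
    cp.update(more_specific, "A")   # suppressed: covered by the /22
removes, adds = cp.update("10.10.4.0/22")  # withdraw the covering route
print(len(removes), len(adds))
```

In this toy case a single withdrawal of the covering /22 turns into one hardware delete plus three installs, because the previously suppressed /24s have to reappear. That is the extra bookkeeping being described, in miniature.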
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Might be useful if you mentioned what you considered a "smart" way to trim the fib. But then you couldn't bitch and moan about people not understanding you, which is the real reason you post to NANOG.
i did not get the original of this post, but the ad hominem speaks for its pathetic self.
Optimization #1 -- elimination of more specifics where there's a less specific that has the same next hop (obviously only in cases where the less specific is the one that would be used if the more specific were left out).
Example: if 10.10.4.0/22 has the same next hop as 10.10.7.0/24, the latter can be left out of TCAM (assuming there's not a 10.10.6.0/23 with a different next hop).
Optimization #2 -- concatenation of adjacent routes when they have the same next hop
Example: If 10.10.12.0/23 and 10.10.14.0/23 have the same next hop, leave them both out of TCAM and install 10.10.12.0/22
Optimization #3 -- elimination of routes that have more specifics for their entire range.
Example: Don't program 10.10.4.0/22 in TCAM if 10.10.4.0/23, 10.10.6.0/24 and 10.10.7.0/24 all exist
those are some of the cases. i guess i should dig up the old [j]tac tickets.
randy
It looks great, though I would not want to troubleshoot the RIB to FIB programming errors unless there's a note somewhere saying what abbreviation to search for in the FIB. The other thing that comes to mind is that the more specifics could have different backup next-hops programmed.
adam
From: NANOG [mailto:nanog-bounces@nanog.org] On Behalf Of Brett Frankenberger
Sent: Thursday, August 14, 2014 4:50 AM
On Wed, Aug 13, 2014 at 07:53:45PM -0400, Patrick W. Gilmore wrote:
you mean your vendor won't give you the knobs to do it smartly ([j]tac tickets open for five years)? wonder why.
Might be useful if you mentioned what you considered a "smart" way to trim the fib. But then you couldn't bitch and moan about people not understanding you, which is the real reason you post to NANOG.
Optimization #1 -- elimination of more specifics where there's a less specific that has the same next hop (obviously only in cases where the less specific is the one that would be used if the more specific were left out).
Example: if 10.10.4.0/22 has the same next hop as 10.10.7.0/24, the latter can be left out of TCAM (assuming there's not a 10.10.6.0/23 with a different next hop).
Optimization #2 -- concatenation of adjacent routes when they have the same next hop
Example: If 10.10.12.0/23 and 10.10.14.0/23 have the same next hop, leave them both out of TCAM and install 10.10.12.0/22
Optimization #3 -- elimination of routes that have more specifics for their entire range.
Example: Don't program 10.10.4.0/22 in TCAM if 10.10.4.0/23, 10.10.6.0/24 and 10.10.7.0/24 all exist
Some additional points:
-- This isn't that hard to implement. Once you have a FIB and primitives for manipulating it, it's not especially difficult to extend them to also maintain a minimal-size-FIB.
-- The key is that aggregation need not be limited to identical routes. Any two routes *that have the same next hop from the perspective of the router doing the aggregating* can be aggregated in TCAM. DFZ routers have half a million routes, but comparatively few direct adjacencies. So lots of opportunity to aggregate.
-- What I've described above gives forwarding behavior *identical* to unaggregated forwarding behavior, but with fewer TCAM entries. Obviously, you can get further reductions if you're willing to accept different behavior (for example, ignoring more specifics when there's a less specific, even if the less specific has a different next hop).
(This might or might not be what Randy was talking about. Maybe he's looking for knobs to allow some routes to be excluded from TCAM at the expense of changing forwarding behavior. But even without any such things, there's still opportunity to meaningfully reduce usage just by handling the cases where forwarding behavior will not change.)
-- Brett
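For anyone who wants to play with the three optimizations quoted above, here is a minimal Python sketch using only the standard `ipaddress` module (Python 3.7+ for `subnet_of`). It is an illustration under stated assumptions, not how any router does it: IPv4 only, a FIB modelled as a dict of prefix to an opaque next-hop label, and a simple repeat-until-stable loop that is quadratic in table size. The function names (`to_fib`, `compact`, `lookup`) and the sample prefixes and next-hop labels are made up for the example.

```python
import ipaddress


def to_fib(routes):
    """Turn {"10.0.0.0/8": "nh"} into a dict keyed by IPv4Network objects."""
    return {ipaddress.ip_network(p): nh for p, nh in routes.items()}


def covering_route(net, fib):
    """Longest less-specific prefix of `net` present in `fib`, or None."""
    for plen in range(net.prefixlen - 1, -1, -1):
        cand = net.supernet(new_prefix=plen)
        if cand in fib:
            return cand
    return None


def compact(fib):
    """Return a smaller table with the same longest-match result everywhere."""
    fib = dict(fib)
    changed = True
    while changed:
        changed = False
        # #1: drop a more specific whose nearest covering route has the same next hop.
        for net in sorted(fib, key=lambda n: n.prefixlen, reverse=True):
            cover = covering_route(net, fib)
            if cover is not None and fib[cover] == fib[net]:
                del fib[net]
                changed = True
        # #2: replace two adjacent halves with the same next hop by their parent.
        for net in sorted(fib, key=lambda n: n.prefixlen, reverse=True):
            if net not in fib or net.prefixlen == 0:
                continue
            parent = net.supernet()
            sibling = next(s for s in parent.subnets() if s != net)
            if fib.get(sibling) == fib[net]:
                next_hop = fib.pop(net)
                del fib[sibling]
                fib[parent] = next_hop  # any old parent entry was fully shadowed
                changed = True
        # #3: drop a route whose whole range is covered by more specifics.
        for net in list(fib):
            subs = [n for n in fib if n != net and n.subnet_of(net)]
            if subs and list(ipaddress.collapse_addresses(subs)) == [net]:
                del fib[net]
                changed = True
    return fib


def lookup(addr, fib):
    """Longest-prefix match for a dotted-quad address; None if no route."""
    for plen in range(32, -1, -1):
        net = ipaddress.ip_network(f"{addr}/{plen}", strict=False)
        if net in fib:
            return fib[net]
    return None


example = to_fib({
    "10.10.4.0/22": "A", "10.10.7.0/24": "A",    # case 1
    "10.10.12.0/23": "B", "10.10.14.0/23": "B",  # case 2
    "10.20.4.0/22": "C", "10.20.4.0/23": "D",    # case 3
    "10.20.6.0/24": "E", "10.20.7.0/24": "D",
})
small = compact(example)
print(len(example), "->", len(small))  # 8 -> 5
```

Comparing `lookup(addr, example)` with `lookup(addr, small)` for a handful of addresses is a cheap sanity check that forwarding behavior is unchanged. The sketch ignores backup next hops and incremental updates, both raised elsewhere in the thread.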
FlowViewer version 4.4 (open-source) is now available on SourceForge.
FlowViewer provides a dynamic web front-end to two powerful open-source netflow data collectors and analyzers, flow-tools and SiLK. FlowViewer provides the user with the ability to report, graph and track (MRTG-like) user-specified subsets of network traffic (IPv4 and IPv6).
Version 4.4 is a significant upgrade with several new key features:
* A visual Analysis feature that simplifies the identification of major contributors to traffic events (e.g., peak flows).
* The ability to create multiple Dashboards for different user classes (individuals, groups, networks, data centers, etc.).
* More flexibility for interfacing with a wide variety of SiLK configurations.
The new features extend FlowViewer's security analysis capabilities and enhance the user's general traffic situational awareness.
https://sourceforge.net/projects/flowviewer
Regards,
Joe
participants (22)
- Aris Lambrianidis
- Brett Frankenberger
- Chris Adams
- Chris Woodfield
- Dorian Kim
- Hank Nussbacher
- Joe Loiacono
- Jon Lewis
- Leo Bicknell
- manning bill
- Matthew Petach
- McElearney, Kevin
- Måns Nilsson
- Owen DeLong
- Patrick W. Gilmore
- Randy Bush
- Romeo Czumbil
- Steve Noble
- Suresh Ramasubramanian
- Tom Hill
- Vitkovský Adam
- William Herrin