RE: Route table growth and hardware limits...talk to the filter

Church, Charles

11 Sep 2007

3:34 p.m.

I'm willing to bet that Cisco's sales team is pushing back a little on this enhancement. Given the choice between prolonging the usefulness of Sup2 for another year or two versus selling a massive amount of new Sup720-3BXLs, I'm betting they'll do the latter.

Chuck

-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Sam Stickland
Sent: Tuesday, September 11, 2007 10:55 AM
To: Forrest
Cc: nanog@nanog.org
Subject: Re: Route table growth and hardware limits...talk to the filter

...

There are a couple of feature requests open at Cisco that would do this. Rodney Dunn (at Cisco) pushed quite hard for these but couldn't get any traction. As he's previously pointed out, they would get traction if people voted for them (by attaching a case to them).

FORWARDED MESSAGE:

Rodney Dunn wrote:

Take your pick. :)

CSCsa45474 Ability to block overlapping BGP prefixes from being installed in RIB

or better:

CSCsa46049 IOS RIB should not install overlapping routes with same next hop

Rodney
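To make the second request concrete, here is a minimal Python sketch of the idea of not installing a more-specific route when the route that already covers it has the same next hop. It is a toy model only: the function name `suppress_covered`, the dict-based table and the next-hop labels are all assumptions for illustration, and nothing here reflects how IOS actually implements its RIB.

```python
import ipaddress


def suppress_covered(routes):
    """Return a copy of routes without redundant more-specifics.

    routes maps a prefix string to a next-hop label (both hypothetical).
    A route is dropped when the longest kept prefix that covers it
    already points at the same next hop, so forwarding is unchanged.
    """
    nets = {ipaddress.ip_network(p): nh for p, nh in routes.items()}
    kept = {}
    # Shortest prefixes first, so every covering route is decided
    # before the more-specifics underneath it.
    for net in sorted(nets, key=lambda n: n.prefixlen):
        covering = [c for c in kept
                    if c.version == net.version and net.subnet_of(c)]
        if covering:
            parent = max(covering, key=lambda c: c.prefixlen)
            if kept[parent] == nets[net]:
                continue  # same next hop as the covering route: skip it
        kept[net] = nets[net]
    return {str(n): nh for n, nh in kept.items()}


example = {
    "10.0.0.0/8": "A",
    "10.1.0.0/16": "A",   # covered by the /8 with the same next hop
    "10.1.2.0/24": "B",   # different next hop, must stay
    "10.2.0.0/16": "B",
    "192.0.2.0/24": "A",  # nothing covers it, must stay
}
compact = suppress_covered(example)
```

In this example only 10.1.0.0/16 is dropped. The saving on a real table depends entirely on how many deaggregated prefixes share a next hop with their covering route.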


Kevin Oberman

12 Sep

4:09 p.m.


...

Date: Tue, 11 Sep 2007 10:34:20 -0500
From: "Church, Charles" <cchurc05@harris.com>
Sender: owner-nanog@merit.edu

I'm willing to bet that Cisco's sales team is pushing back a little on this enhancement. Given the choice between prolonging the usefulness of Sup2 for another year or two versus selling a massive amount of new Sup720-3BXLs, I'm betting they'll do the latter.

Yes, but higher management should understand that when customers feel that they are being forced to buy expensive hardware because a software "fix" is deemed bad for sales, they are likely to buy some other company's products.

--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

Matt Liotta

20 Sep

6:52 p.m.


I was playing with a sup2 adding in extra routes to the point that it ran out of memory. Unfortunately, it didn't just drop routes like I thought it would. CEF disabled itself as well, which on a busy box would be a disaster. Is this what people expect will happen in a few months to people using sup2s? Or am I missing something else? -Matt

Jon Lewis

9:03 p.m.


On Thu, 20 Sep 2007, Matt Liotta wrote:

...

I was playing with a sup2 adding in extra routes to the point that it ran out of memory. Unfortunately, it didn't just drop routes like I thought it would. CEF disabled itself as well, which on a busy box would be a disaster.

Is this what people expect will happen in a few months to people using sup2s? Or am I missing something else?

That's not good. What software version was it running?

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

Bora Akyol

11:07 p.m.


On 9/20/07 2:03 PM, "Jon Lewis" <jlewis@lewis.org> wrote:

...

On Thu, 20 Sep 2007, Matt Liotta wrote:

...
I was playing with a sup2 adding in extra routes to the point that it ran out of memory. Unfortunately, it didn't just drop routes like I thought it would. CEF disabled itself as well, which on a busy box would be a disaster.

Is this what people expect will happen in a few months to people using sup2s? Or am I missing something else?

That's not good. What software version was it running?

While it is not good, the alternative approach would leave an indeterminate routing table in hardware. Would you like the packets to go in randomized directions?

The box is trying to do the right thing by turning off CEF and switching everything in software, since in this case software is the only entity in the system with a consistent FIB.

An alternative would be to use the hardware forwarding tables as a limited-size cache (similar to, but not exactly as in, the 7000 router). I am sure that this is a large software effort, and whether the hardware can support this is questionable.

SUP2 was a great RP with a really long life, but maybe it is time to move on to a SUP720 with the large table option and then grab a cold one ;-)

Cheers,
Bora
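The "indeterminate routing table" concern can be illustrated with a small Python sketch. It models a hardware table that simply stops accepting entries once it is full and compares its longest-prefix-match answer with that of the complete software FIB. Everything here is an assumption made for illustration: the prefixes, the next-hop labels, the capacity of two entries and the helper names `lookup` and `truncate`. It is not a description of how any Cisco platform programs its TCAM.

```python
import ipaddress


def lookup(table, addr):
    """Longest-prefix match over a dict of network -> next hop."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, nh in table.items():
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, nh)
    return best[1] if best else None


def truncate(table, capacity):
    """Naive hardware install: keep the first `capacity` routes in
    arrival order and silently ignore the rest."""
    return dict(list(table.items())[:capacity])


# Complete software FIB, listed in (hypothetical) arrival order.
software_fib = {
    ipaddress.ip_network("10.0.0.0/8"): "A",
    ipaddress.ip_network("192.0.2.0/24"): "C",
    ipaddress.ip_network("10.1.0.0/16"): "B",  # arrives last, does not fit
}
hardware_fib = truncate(software_fib, capacity=2)

correct = lookup(software_fib, "10.1.2.3")   # matches 10.1.0.0/16
actual = lookup(hardware_fib, "10.1.2.3")    # falls back to 10.0.0.0/8
```

In this model the missing /16 does not cause a lookup miss that could be punted to software. The covering /8 still matches, so the packet is forwarded to the wrong next hop. Any partial-install scheme would have to deal with covering prefixes in some way, which is one reason falling back to software for everything is the conservative choice.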

Jon Lewis

11:40 p.m.


On Thu, 20 Sep 2007, Bora Akyol wrote:

...

...
...
I was playing with a sup2 adding in extra routes to the point that it ran out of memory. Unfortunately, it didn't just drop routes like I thought it would. CEF disabled itself as well, which on a busy box would be a disaster.

Is this what people expect will happen in a few months to people using sup2s? Or am I missing something else?

That's not good. What software version was it running?

While it is not good, the alternative approach would leave an indeterminate routing table in hardware. Would you like the packets to go to randomized directions?

No, but someone previously posted that with later software versions, when TCAM runs out, packets for those routes that fit in TCAM are hardware switched, and only traffic for the remaining routes that didn't fit is software switched. That could potentially go unnoticed for some time, while software switching all traffic is likely to be impossible on many installations. I kind of doubt the MSFC2 can software switch gigabits/s of traffic (or anything close to gigabits/s).

...

SUP2 was a great RP with a really long life, but maybe it is time to move on to a SUP720 with the large table option and then grab a cold one ;-)

Or start filtering some of the twit networks that totally deagg their CIDRs. I see a game of internet chicken in the near future...only some of the players don't realize they're playing.

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

John A. Kilpatrick

21 Sep

8:52 a.m.


On 9/20/07 4:40 PM, "Jon Lewis" <jlewis@lewis.org> wrote:

...

No, but someone previously posted that with later software versions, when TCAM runs out, packets for those routes that fit in TCAM are hardware switched, and only traffic for the remaining routes that didn't fit is software switched.

That is what I thought as well, but I'm afraid the only MSFC2s I have are attached to my SUP32s in production. And so destructive testing like this would be, you know, bad. But it would be really good to know what to expect. One possibility means that we might see occasional CPU spikes. The other possibility means the box will start sucking more than this year's factory Honda team. I suppose I could ask Cisco, but if anyone else has done any testing it'd be good to hear about it...

Meanwhile, I have brought myself to three options:

1. Upgrade to RSP720-3CXL (same price, more memory, faster CPU compared to SUP720-3BXL) + 6148s
2. Cisco 7304 + pair of 3750s
3. Juniper M7i + pair of 3750s

Even with the need for the 6148s it's still cheaper for me to keep my 7604s, although not by too much. Now to get into the nitty gritty.

Imagine my shock when my Cisco rep said he's been having a lot of these conversations lately...

--
John A. Kilpatrick
john@hypergeek.net Email| http://www.hypergeek.net/
john-page@hypergeek.net Text pages| ICQ: 19147504
remember: no obstacles/only challenges

Pekka Savola

11:59 a.m.


On Fri, 21 Sep 2007, John A. Kilpatrick wrote:

...

Meanwhile, I have brought myself to three options: ..

Has the option of using default route(s) occurred to you?

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

Randy Bush

1:41 p.m.


...

...
Meanwhile, I have brought myself to three options:

Has the option of using default route(s) occurred to you?

welcome to v6. we forgot to sort out routing, so just don't do it. you're kidding, right? randy

Pekka Savola

2:18 p.m.


On Fri, 21 Sep 2007, Randy Bush wrote:

...

...
...
Meanwhile, I have brought myself to three options:

Has the option of using default route(s) occurred to you?

welcome to v6. we forgot to sort out routing, so just don't do it. you're kidding, right?

No, I'm not kidding, but maybe we're talking about a different thing (you may have a more generalized network in mind).

The way I see it, a network which is considering "Juniper M7i or Cisco 7300 plus a couple of switches" as an option does not _need_ 220K IPv4 routes in its routing table. Whether it has 150K, 40K (Hi Simon!) or 5K shouldn't matter that much from the functionality perspective.

If we still disagree, it might be interesting to hear why filtered BGP feeds from upstream and appropriately placed default routes to cover the holes wouldn't provide a functionally and operationally equivalent solution?

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

Randy Bush

2:27 p.m.


Pekka Savola wrote:

...

it might be interesting to hear why filtered BGP feeds from upstream and appropriately placed default routes to cover the holes wouldn't provide a functionally and operationally equivalent solution?

as we scale, if the vendors can't maintain simple/promised functionality they seem to ask the customers to add complexity and hence unreliability and ops cost or upgrade and upgrade and upgrade on the "we're working on that for next year" path. randy

Michael Smith

4:05 p.m.


Hello Pekka: On Sep 21, 2007, at 7:18 AM, Pekka Savola wrote:

...

On Fri, 21 Sep 2007, Randy Bush wrote:

...
...
...
Meanwhile, I have brought myself to three options:

Has the option of using default route(s) occurred to you?

welcome to v6. we forgot to sort out routing, so just don't do it. you're kidding, right?

No, I'm not kidding but maybe we're talking about a different thing (you may have a more generalized network in mind).

The way I see it, a network which is considering "Juniper M7i or Cisco 7300 plus a couple of switches" as an option does not _need_ 220K IPv4 routes in its routing table. Whether it has 150K, 40K (Hi Simon!) or 5K shouldn't matter that much from the functionality perspective.

If we still disagree, it might be interesting to hear why filtered BGP feeds from upstream and appropriately placed default routes to cover the holes wouldn't provide a functionally and operationally equivalent solution?

Well, how do you determine which routes to select from each provider and what to cover with defaults? How do you modify those settings once they're in place, particularly when you find exceptions in your design? I know the answers, but these are not easy questions to answer if you are a small provider that is smart enough to have multiple transit providers and enough clue to configure .* and ^$, but not enough clue to filter based upon upstream provider communities, flows and/or other dynamic means. The whole point of BGP, to my mind, is so that I *can* accept full routes from multiple providers and *may* elect to change that behavior for other reasons. I shouldn't have to modify my BGP configuration to support my vendors' inability to provide a device that can scale to the present demands of the global routing table. Last time I checked, they are here to support me, not the other way around. Regards, Mike

Pekka Savola

4:33 p.m.


On Fri, 21 Sep 2007, Michael Smith wrote:

...

Well, how do you determine which routes to select from each provider and what to cover with defaults? How do you modify those settings once they're in place, particularly when you find exceptions in your design? I know the answers, but these are not easy questions to answer if you are a small provider that is smart enough to have multiple transit providers and enough clue to configure .* and ^$, but not enough clue to filter based upon upstream provider communities, flows and/or other dynamic means.

One approach is to accept everything up to some prefix length, e.g., /16, /20, /21 or whatever, and filter the rest -- and point the primary default route to a non-tier1 so that it should _always_ have connectivity to all the world. Now, I guess one question is, what do you do when your tierN upstream you point default route to has routing or forwarding broken in such a way that your packets get dropped? Answer: you fix it manually or just ignore it and get SLA credits. However, very probably the same problem (e.g., BGP works but forwarding broken) would happen with full BGP feeds as well, so it's not like you're losing much. (FWIW, not sure if such a small provider needs other than default route and potentially routes of networks directly attached to its upstreams, but filtered full feeds may be a more politically correct approach for network administrators)
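As a rough illustration of the approach described above, the following Python sketch filters a feed by prefix length and adds a default route, then compares lookups against the unfiltered feed. The prefixes, the upstream labels, the /21 cutoff and the helper names `filter_feed` and `lookup` are assumptions chosen for the example. A real deployment would express the filter in router configuration, not in Python.

```python
import ipaddress


def filter_feed(feed, max_len, default_nh):
    """Keep prefixes no longer than max_len and add a default route.

    feed maps prefix strings to upstream labels (hypothetical data).
    """
    table = {ipaddress.ip_network("0.0.0.0/0"): default_nh}
    for prefix, nh in feed.items():
        net = ipaddress.ip_network(prefix)
        if net.prefixlen <= max_len:
            table[net] = nh
    return table


def lookup(table, addr):
    """Longest-prefix match over a dict of network -> next hop."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in table if ip in net]
    if not matches:
        return None
    return table[max(matches, key=lambda n: n.prefixlen)]


full_feed = {
    "172.16.0.0/12": "upstream1",
    "198.51.100.0/24": "upstream2",  # longer than /21, filtered out
    "203.0.113.0/24": "upstream1",   # longer than /21, filtered out
}
full_table = {ipaddress.ip_network(p): nh for p, nh in full_feed.items()}
small_table = filter_feed(full_feed, max_len=21, default_nh="upstream1")

# Still reachable through the default, but the path choice is lost:
via_full = lookup(full_table, "198.51.100.7")    # upstream2
via_small = lookup(small_table, "198.51.100.7")  # upstream1 (default)
```

The filtered table holds two entries instead of three and every destination still resolves, but traffic for the dropped /24s follows the default instead of the upstream that announced them. That is the tradeoff being argued about in the rest of the thread.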

...

The whole point of BGP, to my mind, is so that I *can* accept full routes from multiple providers and *may* elect to change that behavior for other reasons. I shouldn't have to modify my BGP configuration to support my vendors' inability to provide a device that can scale to the present demands of the global routing table. Last time I checked, they are here to support me, not the other way around.

But the vendors aren't unable -- AFAIK, such devices have been available on the market for, what, 7-8 years now? It's just your wallet that's unable to get equipment that's needed to face the network that's getting more complex.

In this case your choices seem to be:

a) dig out more money and get a better router,
b) complain to vendor so that they make their implementations better (e.g., better memory or FIB utilization, transparently) so that you can continue doing exactly as before, at least for a while,
c) change the configuration in such a manner that your gear remains viable for a longer while, or
d) complain to IETF, ITU-T, ... or whoever to create a new protocol that would accomplish the same thing as b).

I don't oppose b) but I fail to see how that could provide more than a short-term fix, as the number of routes is climbing and the mountaintop is nowhere in sight. Similarly, d) would take so long that it won't help you here.

So your real options are either a) or c). Whether the drawbacks of letting go of full, unfiltered BGP feeds are worth the cost of a) is up to you.

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

Donald Stahl

5 p.m.


...

But the vendors aren't unable -- AFAIK, such devices have been available on the market for, what, 7-8 years now? It's just your wallet that's unable to get equipment that's needed to face the network that's getting more complex.

Cisco reps are still selling SUP32's to people despite being told the customer wants to take full routes. That's either incompetence, or dishonesty.

It has nothing to do with budget. If you are told Product A will do the job and costs $100k, and product B will also do the job but costs only $50k, you'd be an idiot to go with product A.

Furthermore, Cisco doesn't have a product to meet their needs. A Sup32 with a 3bxl is what a lot of people need, but Cisco seems intent on forcing people to upgrade to a Sup720. That's just overkill for most people.

-Don

John A. Kilpatrick

5:55 p.m.


On 9/21/07 7:18 AM, "Pekka Savola" <pekkas@netcore.fi> wrote:

...

The way I see it, a network which is considering "Juniper M7i or Cisco 7300 plus a couple of switches" as an option does not _need_ 220K IPv4 routes in its routing table. Whether it has 150K, 40K (Hi Simon!) or 5K shouldn't matter that much from the functionality perspective.

There are a couple of reasons:

1. The "captain obvious" suggestion of a default means that now I'm paying for multiple links but can only use one. That's not cost effective and will provide lower performance for some destinations. I have done defaults in the past where appropriate but it's not appropriate in this application.

2. The idea of a complex filtering strategy is, from my perspective, an even worse idea. You get all of the downsides of a default with increased operational complexity that may not scale across multiple sites depending on the size of your ops team. Oh, and don't forget, for testing and validation you'd need to buy a router that can take these multiple feeds to test the results of the filtering policy.

Both of those options are viable (#1 obviously over #2) if just basic connectivity is required. However I find myself not really wanting to have to continually support solutions with such limitations when there are other options.

--
John A. Kilpatrick
john@hypergeek.net Email| http://www.hypergeek.net/
john-page@hypergeek.net Text pages| ICQ: 19147504
remember: no obstacles/only challenges

Pekka Savola

6:22 p.m.


On Fri, 21 Sep 2007, John A. Kilpatrick wrote:

...

1. The "captain obvious" suggestion of a default means that now I'm paying for multiple links but can only use one. That's not cost effective and will provide lower performance for some destinations. I have done defaults in the past where appropriate but it's not appropriate in this application.

That's not the case at all. If you use only defaults, you could do load balancing but in a very crude fashion. If you use a default route and filtered version of BGP feed (e.g., accept everything up to /21) probably up to 90-95% of traffic would go over that link, or multiple ones if you have multiple BGP sessions. If you want more control than _only_ a default route or two (and many do), the default route would in principle be just a safeguard for more specifics (or other routes, based on a metric of your choosing) you filter out.

...

2. The idea of a complex filtering strategy is, from my perspective, an even worse idea. You get all of the downsides of a default with increased operational complexity that may not scale across multiple sites depending on the size of your ops team.

I'd probably agree if you used complex filtering without a default route. Having a default route, as long as it points to a sufficiently good (non-tier1, not cogent) upstream allows you not to care so much about how you filter the BGP feed.

But as should be obvious, you don't need to worry about this problem if you're willing to put money into router upgrades. However, I'm just suggesting there is an alternative to router upgrades if you're comfortable with the somewhat different tradeoffs that will bring with it.

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

Warren Kumari

7:30 p.m.


On Sep 21, 2007, at 2:22 PM, Pekka Savola wrote:

...

On Fri, 21 Sep 2007, John A. Kilpatrick wrote:

...
1. The "captain obvious" suggestion of a default means that now I'm paying for multiple links but can only use one. That's not cost effective and will provide lower performance for some destinations. I have done defaults in the past where appropriate but it's not appropriate in this application.

That's not the case at all. If you use only defaults, you could do load balancing but in a very crude fashion. If you use a default route and filtered version of BGP feed (e.g., accept everything up to /21) probably up to 90-95% of traffic would go over that link, or multiple ones if you have multiple BGP sessions.

Sure, but you do still run the (not insignificant) risk of following the default to the "sufficiently good (non-tier1, not cogent) upstream", only to discover that, for whatever reason, it has no reachability to the prefix. If I have spent the time and effort to get multiple providers, presumably I believe that my bits are important enough to not trust to "this will probably work most of the time..."

W

...

If you want more control than _only_ a default route or two (and many do), the default route would in principle be just a safeguard for more specifics (or other routes, based on a metric of your choosing) you filter out.

...
2. The idea of a complex filtering strategy is, from my perspective, an even worse idea. You get all of the downsides of a default with increased operational complexity that may not scale across multiple sites depending on the size of your ops team.

I'd probably agree if you used complex filtering without a default route. Having a default route, as long as it points to a sufficiently good (non-tier1, not cogent) upstream allows you not to care so much about how you filter the BGP feed.

But as should be obvious, you don't need to worry about this problem if you're willing to put money into router upgrades. However, I'm just suggesting there is an alternative to router upgrades if you're comfortable with the somewhat different tradeoffs that will bring with it.

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

-- Hope is not a strategy. -- Ben Treynor, Google

Deepak Jain

8:18 p.m.


I think one of the points here is that we've gotten beyond the space where two uplinks was "good enough" for virtually all cases and that either uplink was "good enough" provided it is up. And "up" was a binary state rather than an array of binary states with associated keys defined by destinations.

More explicitly, I think those using MSFC2s to take full routes are largely saying "hey, we know what we are doing and why" and "Cisco should have redesigned their boards to support more routes earlier", so items like the SUP32 would have a 3BXL option and the like.

The folks on here who are saying that using a default or aggressive route filtering isn't sufficient are also implying they have more than 2 views of the Internet... in many cases many more than 2 transit views, and possibly peers as well.

Certain snobbery ("Anyone who needs a M7i doesn't need a full routing table") seems uncalled for. I am not going to comment on J's line up, but a M7i should be able to route 3-4Gb/s to a full table without sweating too hard over a handful of interfaces. How many people are routing 3-4 Gb/s to the Internet and don't have at least several uplinks and LOTS of customers that would get *exceptionally* pissed off at less than ideal routing or routing holes (in this case defined as a default to a provider that has a hole)?

The Cisco 6500/7600 line is amazingly stable and supports a ridiculously high number of 1GE and 10GE Ethernet L3 ports mated to a Cisco<tm> BGP talker. Yes, in the majority, the ports have small interface buffers. Therefore, these are best suited at interfaces between other networks over low-latency intra-building or metro-area cross-connects rather than large-latency international circuits. I suspect that is where the majority are being operated.

If you need a router to talk to your >40ms interfaces, it's not for you. If you like to mix and match a lot of media in your router, it's not for you. If you have gotten rid of most of that SONET-speed craziness (OC3, OC12, OC48) in your core, even if that just means upgrading to Nx10GE everywhere, and everything has started looking like ethernet, they are exceptionally tasty.

As an operator of such a successful series of equipment that has had a surprisingly long set of legs, I think I would be more impressed if Cisco had a board that had dramatically greater routing capabilities (not just speed, but table size) than the 3BXL. Or if Cisco demonstrated that it understands where these boxes are being used and that they aren't all deployed in super-high-density PoE applications or on high-latency overseas interfaces.

But that is neither here nor there. The idea of how to filter has been brought up; in fact, someone posted an actively worked-on filter for US-centric providers that provides some immediate relief. The idea of a code improvement that gives MSFC2s a more graceful fail pattern has been brought up by Lincoln c/o Cisco. So far, nobody has spoken of a Cisco plan to provide a SUP32-3BXL or similar board for immediate relief of the problem -- so either the NDA has no leaks or it's not going to happen in the next few months of operational planning.

Speculation about the alternative platforms (from C or J) is fair game. I suspect J is trying to upsell lots of people from 6500s and 7600s and is realizing that lots of 6500/7600 users don't see the point of paying $7,000 per GigE interface no matter how many bells and whistles one can turn on at the same time. C isn't worried about many defections, nor should it be -- no one has a competing box with the same kind of reputation for ethernet density/stability at the price point. I suspect whoever owns the product line at C is going to get a big bonus this year while he/she struggles to justify why they need to keep increasing the routing capabilities beyond the Sup720-3BXL, at least for the 6500.

This is longer than I had intended. Hopefully something in it is operational.

Deepak Jain
AiNET

Pekka Savola

22 Sep

5:10 a.m.


On Fri, 21 Sep 2007, Warren Kumari wrote:

...

On Sep 21, 2007, at 2:22 PM, Pekka Savola wrote:

...
On Fri, 21 Sep 2007, John A. Kilpatrick wrote:

...
1. The "captain obvious" suggestion of a default means that now I'm paying for multiple links but can only use one. That's not cost effective and will provide lower performance for some destinations. I have done defaults in the past where appropriate but it's not appropriate in this application.

That's not the case at all. If you use only defaults, you could do load balancing but in a very crude fashion. If you use a default route and filtered version of BGP feed (e.g., accept everything up to /21) probably up to 90-95% of traffic would go over that link, or multiple ones if you have multiple BGP sessions.

Sure, but you do still run the (not insignificant) risk of following the default to the "sufficiently good (non-tier1, not cogent) upstream", only to discover that, for whatever reason, it has no reachability to the prefix. If I have spent the time and effort to get multiple providers, presumably I believe that my bits are important enough to not trust to "this will probably work most of the time..."

Our perceptions differ -- you seem to think that having a full, unfiltered BGP feed protects from these problems. That's not the case. E.g., in the TeliaSonera routing problem I sent on the m-l on Sep 6, all prefixes were received fine through TSIC, but certain traffic ended up being dropped for the duration of about 9 hours. Unless you made an administrative action on the router, some networks would have been blackholed for 9 hours regardless of whether you used unfiltered BGP or filtered BGP.

So, if you're uncomfortable with such major networks causing problems in your connectivity, you'll need the ops staff to look after the routing and change it if need be. Ergo, if you need the ops staff, you could, just as easily as you would shut down or depref a badly behaving transit, switch the default or change the other priorities.

I guess the main point here is how prevalent the "no reachability, no prefix" scenario is compared to "routing/forwarding broken, manual action required". My take is that the former is rare with good upstreams and while the latter might not be as frequent as the former, you'll need to prepare for it in any case so the difference likely doesn't matter that much.

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

Jon Lewis

1:23 p.m.


On Sat, 22 Sep 2007, Pekka Savola wrote:

...

Our perceptions differ -- you seem to think that having a full, unfiltered BGP feed protects from these problems. That's not the case. E.g., in the TeliaSonera routing problem I sent on the m-l on Sep 6, all prefixes were received fine through TSIC, but certain traffic ended up being dropped for the duration of about 9 hours.

Has everyone forgotten the "Tier 1 depeerings" of several years ago? i.e. If you were pointing default at C&W, PSINet, Cogent, or Level3 when they each had or caused depeering issues, parts of the internet ceased to be reachable. In such cases, having full routes from multiple providers was the only way to be automatically protected from such games.

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

Joe Provo

4:26 p.m.


On Sat, Sep 22, 2007 at 09:23:11AM -0400, Jon Lewis wrote: [snip]

...

Has everyone forgotten the "Tier 1 depeerings" of several years ago? i.e. If you were pointing default at C&W, PSINet, Cogent, or Level3 when they each had or caused depeering issues, parts of the internet ceased to be reachable. In such cases, having full routes from multiple providers was the only way to be automatically protected from such games.

The triumph of marketing in the so-called tier-1s is just sad. The continued success of them reflects the lack of... oh wait, didn't 3561 change hands a lot? And didn't supposedly inferior edge networks pick up 701, 7018, 174 .... Perhaps having marketing dictate a fragile network strategy isn't in the best business interest after all. -- RSUC / GweepNet / Spunk / FnB / Usenix / SAGE

michael.dillon@bt.com

23 Sep

2:18 p.m.


...

Has everyone forgotten the "Tier 1 depeerings" of several years ago? i.e. If you were pointing default at C&W, PSINet, Cogent, or Level3 when they each had or caused depeering issues, parts of the internet ceased to be reachable. In such cases, having full routes from multiple providers was the only way to be automatically protected from such games.

Not so. Anyone who had sufficient transit was also protected from the games. Lots of so-called regionals and tier-2 networks were shielded from this monkey-business. And, of course, they shielded their customers as well. A tier-1 network operator who operates such a fragile network becomes a single point of failure. And not just because of peering as the AT&T frame relay collapse shows. --Michael Dillon

Bill Woodcock

2:38 p.m.


On Sun, 23 Sep 2007 michael.dillon@bt.com wrote:

> > having full routes from multiple providers was the only way
> > to be automatically protected.
>
> Not so. Anyone who had sufficient transit was also protected from
> the games. And they shielded their customers as well.

Michael, how are these two statements not in agreement? It looks to me like you're saying the same thing:

A network which claims "tier 1" status by failing to buy any transit, subjects its customers to connectivity failures when depeering happens, while a normal multi-homed network does not inflict that failure upon its customers.

Isn't that what you're both saying?

Disclaimer: this is my first posting of the morning, thus it's inevitably dunderheaded or offensive, for which everyone has my apologies in advance.

-Bill

michael.dillon＠bt.com

6:57 p.m.

New subject: Route table growth and hardware limits...talk to the filter

...

On Sun, 23 Sep 2007 michael.dillon@bt.com wrote:

> > having full routes from multiple providers was the only way to be automatically protected.
>
> Not so. Anyone who had sufficient transit was also protected from the games. And they shielded their customers as well.

Michael, how are these two statements not in agreement? It looks to me like you're saying the same thing: A network which claims "tier 1" status by failing to buy any transit, subjects its customers to connectivity failures when depeering happens, while a normal multi-homed network does not inflict that failure upon its customers. Isn't that what you're both saying?

I suppose that if you dig deeper, which most people don't seem to do, then buying transit is just one form of having full routes from multiple providers. But on the surface, the comment that I responded to seemed to be repeating that commonly held belief that only transit-free, default-free providers with multiple peers for any given prefix can be considered Tier 1. Last century, there was lots of boasting in the business and people needed rules of thumb such as "default free" and "transit free" to sift the wheat from the chaff. But I don't think that is true anymore, especially not on a global scale (even a partly global scale). There are providers who provide high levels of service and reliability who have some transit and some default routes in the mix. I'd like to see a lot more focus on how a network deals with single points of failure, physical separacy of links, and the like. These are more important than whether they are a pure-play peering network.

...

Disclaimer: this is my first posting of the morning, thus it's inevitably dunderheaded or offensive, for which everyone has my apologies in advance.

Not at all. It is inevitable to have misunderstandings when going through a paradigm change. We went through the last one when the telecom industry bought up the ISP industry. But now we are going through another one as businesses higher up the OSI stack, like Google, are getting into running an IP WAN. Also, traditional telecom companies are diversifying into other service areas higher up the stack in a similar way to how IBM branched out from being a computer hardware manufacturer into a services company. --Michael Dillon

Bill Woodcock

8:20 p.m.

New subject: Route table growth and hardware limits...talk to the filter

On Sun, 23 Sep 2007 michael.dillon@bt.com wrote:

> On the surface, the comment that I responded to seemed to be repeating that commonly held belief that only transit-free, default-free providers with multiple peers for any given prefix can be considered Tier 1.

Well, taken in its entirety, that's the null set. Hypothetically, setting aside the issue of mainland China, it could be the case that there would be a set of providers which were transit-free. However, if they were transit-free, they would, by definition, never have more than one peer for any single-homed prefix.

But in any event, pretty much the definition of "tier 1" is the subset of providers which claim not to buy transit, and peer with each other, and not with anyone else. Whether or not that set is empty or populated is one issue. Whether the term is a useful one is a different issue. How much of a liability it would be to one's self and one's customers to find one's self in that set is a third issue.

But I'm not convinced we have a disagreement on our hands here. Just more of an argument. :-)

-Bill

michael.dillon＠bt.com

8:54 p.m.

New subject: Route table growth and hardware limits...talk to the filter

...

However, if they were transit-free, they would, by definition, never have more than one peer for any single-homed prefix.

And that sounds like a single point of failure to me. Let's look at it another way by considering the path to any prefix. If there is only one single path available, and a single event, such as the depeering by one ASN, can lead to that path being broken, then you have a network whose connectivity is not terribly robust. If a network bites the bullet, and either openly buys transit, or works out some partnership peering plus transit deal to hide the fact that they have transit, then there is the possibility of having two paths for every prefix. If they then take the trouble to analyze the paths and adjust things to make sure that the multiple paths to a single prefix don't share fate, then they stand a good chance of having a robust network. The thinking, and the work involved, are a lot like what you need to do in order to ensure physical separacy of fibre paths. It's the same fundamental problem but perhaps more dynamic since circuits tend to get groomed less often than paths change. --Michael Dillon
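The shared-fate analysis described above can be sketched roughly as follows. This is only an illustration, assuming the AS paths to each prefix have already been collected into a plain dictionary; the function names, the documentation-range ASNs, and the example prefixes are all hypothetical and not taken from the thread.

```python
from typing import Dict, List, Set


def shared_fate_asns(paths: List[List[int]]) -> Set[int]:
    """Return the transit ASNs that appear on every known path to a prefix.

    Each path is a list of ASNs ending in the origin AS. The origin is
    dropped before comparing, since it is unavoidably on every path.
    """
    transit_sets = [set(path[:-1]) for path in paths]
    if not transit_sets:
        return set()
    return set.intersection(*transit_sets)


def fragile_prefixes(rib: Dict[str, List[List[int]]]) -> Dict[str, Set[int]]:
    """Map each prefix to the ASNs whose loss would break all of its paths."""
    result = {}
    for prefix, paths in rib.items():
        common = shared_fate_asns(paths)
        if common:
            result[prefix] = common
    return result


# Hypothetical table: prefix -> AS paths learned from different neighbors.
EXAMPLE_RIB = {
    # Two paths with no transit AS in common: no shared fate.
    "192.0.2.0/24": [[64500, 64510, 64496], [64501, 64511, 64496]],
    # Two paths that both cross AS 64510.
    "198.51.100.0/24": [[64500, 64510, 64497], [64501, 64510, 64497]],
    # Only one path: every transit AS on it is a single point of failure.
    "203.0.113.0/24": [[64500, 64498]],
}

if __name__ == "__main__":
    for prefix, asns in fragile_prefixes(EXAMPLE_RIB).items():
        print(prefix, "depends on", sorted(asns))
```

A prefix that shows up in the result is one where a single depeering or outage at the listed ASN would remove every known path, which is the case the message argues a robust network should hunt down and fix.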

Raymond L. Corbin

9:45 p.m.

New subject: Yahoo! Mail/Sys Admin

Hello, Can a Yahoo! Mail/SysAdmin contact me off list? I am having a problem with multiple mail servers within our network not being able to send to Yahoo mail servers. Thanks, Raymond Corbin Support Analyst HostMySite.com

Suresh Ramasubramanian

24 Sep

12:58 a.m.

New subject: Yahoo! Mail/Sys Admin

On 9/24/07, Raymond L. Corbin <rcorbin@hostmysite.com> wrote:

...

Can a Yahoo! Mail/SysAdmin contact me off list? I am having a problem with multiple mail servers within our network not being able to send to Yahoo mail servers.

http://help.yahoo.com/l/us/yahoo/mail/yahoomail/postmaster/ -- Suresh Ramasubramanian (ops.lists@gmail.com)

Raymond L. Corbin

1:36 a.m.

New subject: Yahoo! Mail/Sys Admin

I've used those forms. All I get are canned responses :/

-Ray

-----Original Message-----
From: Suresh Ramasubramanian [mailto:ops.lists@gmail.com]
Sent: Sunday, September 23, 2007 8:58 PM
To: Raymond L. Corbin
Cc: nanog@nanog.org
Subject: Re: Yahoo! Mail/Sys Admin

On 9/24/07, Raymond L. Corbin <rcorbin@hostmysite.com> wrote:

...

Can a Yahoo! Mail/SysAdmin contact me off list? I am having a problem with multiple mail servers within our network not being able to send to Yahoo mail servers.

http://help.yahoo.com/l/us/yahoo/mail/yahoomail/postmaster/ -- Suresh Ramasubramanian (ops.lists@gmail.com)

Ken Simpson

3:55 p.m.

New subject: Yahoo! Mail/Sys Admin

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

...

I've used those forms. All I get are canned responses :/

Such is the art of sending email these days... - -- Ken Simpson CEO, MailChannels Fax: +1 604 677 6320 Web: http://mailchannels.com MailChannels - Reliable Email Delivery (tm) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFG993U2YHPr/ypq5QRAs+VAJ9xJpZHtm1FU++nEWOCpnVxqrAVKQCeMext bfQ1V+OmTJ10usk+SRXbOVE= =o7gz -----END PGP SIGNATURE-----

Jason J. W. Williams

4:18 p.m.

New subject: Yahoo! Mail/Sys Admin

Hi Ray,

And Yahoo's better than MSN at having a live body resolve the issue... Good luck. Hopefully, someone at Yahoo! has heard you. :-)

-J

-----Original Message-----
From: owner-nanog@merit.edu [mailto:owner-nanog@merit.edu] On Behalf Of Raymond L. Corbin
Sent: Sunday, September 23, 2007 7:37 PM
To: Suresh Ramasubramanian
Cc: nanog@nanog.org
Subject: RE: Yahoo! Mail/Sys Admin

I've used those forms. All I get are canned responses :/

-Ray

-----Original Message-----
From: Suresh Ramasubramanian [mailto:ops.lists@gmail.com]
Sent: Sunday, September 23, 2007 8:58 PM
To: Raymond L. Corbin
Cc: nanog@nanog.org
Subject: Re: Yahoo! Mail/Sys Admin

On 9/24/07, Raymond L. Corbin <rcorbin@hostmysite.com> wrote:

...

Can a Yahoo! Mail/SysAdmin contact me off list? I am having a problem with multiple mail servers within our network not being able to send to Yahoo mail servers.

http://help.yahoo.com/l/us/yahoo/mail/yahoomail/postmaster/ -- Suresh Ramasubramanian (ops.lists@gmail.com)

Al Iverson

6:51 p.m.

New subject: Yahoo! Mail/Sys Admin

On 9/23/07, Raymond L. Corbin <rcorbin@hostmysite.com> wrote:

...

I've used those forms. All I get are canned responses :/

What did the responses say? They're usually insightful. The canned responses are indeed coming from humans at the other end, and you usually can reply and ask for clarification of various points. Regards, Al Iverson -- Al Iverson on Spam and Deliverability, see http://www.spamresource.com News, stats, info, and commentary on blacklists: http://www.dnsbl.com My personal website: http://www.aliverson.com -- Chicago, IL, USA Remove "lists" from my email address to reach me faster and directly.

Jon Lewis

23 Sep

3:23 p.m.

New subject: Route table growth and hardware limits...talk to the filter

On Sun, 23 Sep 2007 michael.dillon@bt.com wrote:

...

...
Has everyone forgotten the "Tier 1 depeerings" of several years ago? i.e. If you were pointing default at C&W, PSINet, Cogent, or Level3 when they each had or caused depeering issues, parts of the internet ceased to be reachable. In such cases, having full routes from multiple providers was the only way to be automatically protected from such games.

Not so. Anyone who had sufficient transit was also protected from the games. Lots of so-called regionals and tier-2 networks were shielded from this monkey-business. And, of course, they shielded their customers as well. A tier-1 network operator who operates such a fragile network becomes a single point of failure. And not just because of peering as the AT&T frame relay collapse shows.

I think you've completely missed what I said. If you were pointing default at C&W (whether they were your only connection, or you were "multihomed" but couldn't handle full routes, so perhaps you had customer routes from each provider and default pointing at C&W) when they depeered PSI, single homed (or similarly configured non-full routes) customers of PSI ceased to be reachable.

A long time customer of mine was hit by this (their business required communications with one or more single homed PSI customers, and C&W was their sole transit). It was the driving force behind their multihoming. Ever since, they've maintained 3 or more transit providers and full routes from each.

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
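The difference between pointing default at one provider and carrying full routes from several can be modelled in a few lines. This is a toy sketch under simplifying assumptions: each provider is reduced to the set of destinations it can currently deliver to, and the provider names ("A", "C") and destination labels are hypothetical placeholders rather than the networks named above.

```python
from typing import Dict, Optional, Set

# provider name -> destinations that provider can currently deliver to
Routes = Dict[str, Set[str]]


def pick_provider(dest: str, providers: Routes, full_routes: bool,
                  default_via: Optional[str] = None) -> Optional[str]:
    """Choose an exit for dest.

    With full routes, only a provider that actually has a route is chosen.
    With a default route, traffic goes to default_via regardless.
    """
    if full_routes:
        for name in sorted(providers):
            if dest in providers[name]:
                return name
        return None
    return default_via


def is_delivered(dest: str, providers: Routes, full_routes: bool,
                 default_via: Optional[str] = None) -> bool:
    """True if the chosen provider can really reach dest."""
    chosen = pick_provider(dest, providers, full_routes, default_via)
    return chosen is not None and dest in providers.get(chosen, set())


def depeer(providers: Routes, provider: str, lost: Set[str]) -> Routes:
    """Return a copy of the table after provider loses reach to lost."""
    updated = {name: set(dests) for name, dests in providers.items()}
    updated[provider] -= lost
    return updated


if __name__ == "__main__":
    before = {"A": {"site-x", "b-only-customer"},
              "C": {"site-x", "b-only-customer"}}
    # Provider A depeers the network that b-only-customer is single homed to.
    after = depeer(before, "A", {"b-only-customer"})
    print("default via A:", is_delivered("b-only-customer", after, False, "A"))
    print("full routes:  ", is_delivered("b-only-customer", after, True))
```

In this model the default-route site keeps sending traffic to a provider that can no longer deliver it, while the full-routes site sees the route vanish from one provider and falls back to another without anyone touching the configuration.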

Randy Bush

3:36 p.m.

New subject: pointing default (was Re: Route table growth and hardware limits...talk to the filter)

side note: i would not advise relying heavily (e.g. pointing default) on a network which flap damps or relies on upstreams which damp. one teensie weensie flappipoo and you could be dead meat. randy
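For readers unfamiliar with why damping is risky, here is a rough sketch of the exponential-decay penalty model used by route flap damping (RFC 2439 style). The per-flap penalty, suppress and reuse thresholds, and half-life below are illustrative assumptions, not values from the thread or a statement of any vendor's configuration.

```python
import math
from typing import Iterable

# Illustrative parameters (assumptions for this sketch only).
PER_FLAP = 1000.0      # penalty added per flap
SUPPRESS = 2000.0      # route is suppressed above this penalty
REUSE = 750.0          # route is re-advertised once penalty decays below this
HALF_LIFE_MIN = 15.0   # minutes for the penalty to halve


def penalty_at(flap_times: Iterable[float], now: float) -> float:
    """Sum of per-flap penalties, each decayed exponentially since its flap."""
    return sum(
        PER_FLAP * 0.5 ** ((now - t) / HALF_LIFE_MIN)
        for t in flap_times
        if t <= now
    )


def is_suppressed(penalty: float) -> bool:
    """True if the penalty is above the suppress threshold."""
    return penalty > SUPPRESS


def minutes_until_reuse(penalty: float) -> float:
    """Minutes of quiet needed for the penalty to decay to the reuse limit."""
    if penalty <= REUSE:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE)


if __name__ == "__main__":
    flaps = [0.0, 1.0, 2.0]  # three updates within two minutes
    p = penalty_at(flaps, now=2.0)
    print(f"penalty={p:.0f} suppressed={is_suppressed(p)} "
          f"quiet minutes needed={minutes_until_reuse(p):.1f}")
```

Under these assumed numbers a short burst of updates is enough to keep a route suppressed for a long stretch after it has stabilised, and a single upstream event can be observed downstream as several updates. If that route is the one your default depends on, the outage outlasts the flap by a wide margin.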

John A. Kilpatrick

21 Sep

7:55 p.m.

New subject: Route table growth and hardware limits...talk to the filter

On Fri, 21 Sep 2007, Pekka Savola wrote:

...

But as should be obvious, you don't need to worry about this problem if you're willing to put money into router upgrades. However, I'm just suggesting there is an alternative to router upgrades if you're comfortable with the somewhat different tradeoffs that will bring with it.

Yes, I would agree this statement is true but some of the tradeoffs seem pretty high. My statement about routing platforms was more based on the fact that what my Cisco rep said was true - the sup upgrade was gonna be cheaper than 7304s or "option J". I mean yeah, I could buy 7206s but it still wouldn't save me that much. What just chaps my hide is that there is no reason, in this application, to need 40 Gbps/slot performance. Their refusal to sell a cheaper card with improved TCAM suggests that the SUP720/RSP720 has really high margins and they're making a killing on this issue... -- John A. Kilpatrick john@hypergeek.net Email| http://www.hypergeek.net/ john-page@hypergeek.net Text pages| ICQ: 19147504 remember: no obstacles/only challenges

James Jun

22 Sep

7:11 p.m.

New subject: Route table growth and hardware limits...talk to the filter

...

My statement about routing platforms was more based on the fact that what my Cisco rep said was true - the sup upgrade was gonna be cheaper than 7304s or "option J". I mean yeah, I could buy 7206s but it still wouldn't save me that much.

What just chaps my hide is that there is no reason, in this application, to need 40 Gbps/slot performance. Their refusal to sell a cheaper card with improved TCAM suggests that the SUP720/RSP720 has really high margins and they're making a killing on this issue...

Actually, originally Cisco planned to release SUP32-XL or similar variant with higher FIB TCAM space. But they scrapped that plan near the end, screwing many people in the process (I'm sure some cisco account reps got an earful about this from many people who bought sup32's in the past)-- I mean hey, forcing customers to buy SUP720 plus maybe new line cards (depending on situation) is more revenue right? This whole 220k+ ipv4 routing issue is an excellent opportunity :) On the other hand, if you have the guts, try popping in a PFC3BXL card into SUP32. I wonder which IOS versions will actually recognize this and show ~1 mil. entry capacity when doing 'sh mls cef max' ;-) (WARNING: this completely violates warranty and irreparable damage may occur) james

micky coughes

8:12 p.m.

New subject: Route table growth and hardware limits...talk to the filter

On 9/22/07, James Jun <james@towardex.com> wrote:

...

...
My statement about routing platforms was more based on the fact that what my Cisco rep said was true - the sup upgrade was gonna be cheaper than 7304s or "option J". I mean yeah, I could buy 7206s but it still wouldn't save me that much.

What just chaps my hide is that there is no reason, in this application, to need 40 Gbps/slot performance. Their refusal to sell a cheaper card with improved TCAM suggests that the SUP720/RSP720 has really high margins and they're making a killing on this issue...

Actually, originally Cisco planned to release SUP32-XL or similar variant with higher FIB TCAM space. But they scrapped that plan near the end, screwing many people in the process (I'm sure some cisco account reps got an earful about this from many people who bought sup32's in the past)-- I mean hey, forcing customers to buy SUP720 plus maybe new line cards (depending on situation) is more revenue right? This whole 220k+ ipv4 routing issue is an excellent opportunity :)

On the other hand, if you have the guts, try popping in a PFC3BXL card into SUP32. I wonder which IOS versions will actually recognize this and show ~1 mil. entry capacity when doing 'sh mls cef max' ;-) (WARNING: this completely violates warranty and irreparable damage may occur)

james

James, So it is the vendor's fault that you didn't properly engineer your network and size the right kit for the job? Learn a little engineering 101 to avoid these situations.

James Jun

8:20 p.m.

New subject: Route table growth and hardware limits...talk to the filter

...

James, So it is the vendor's fault that you didn't properly engineer your network and size the right kit for the job? Learn a little engineering 101 to avoid these situations.

Did I ever mention that *I* didn't properly engineer my network (there are no sup32's on my network as of date)? Consider your own arrogance before you make idiotic statements that add no value to discussion. james

John A. Kilpatrick

25 Sep

7:48 p.m.

New subject: Route table growth and hardware limits...talk to the filter

On Sat, 22 Sep 2007, James Jun wrote:

...

On the other hand, if you have the guts, try popping in a PFC3BXL card into SUP32.

If the SUP32 had a slot for a daughter card folks would. But the PFC is integrated into the card...thus our problem. No upgrade for you! -- John A. Kilpatrick john@hypergeek.net Email| http://www.hypergeek.net/ john-page@hypergeek.net Text pages| ICQ: 19147504 remember: no obstacles/only challenges

Lincoln Dale

21 Sep

11:31 a.m.

New subject: Route table growth and hardware limits...talk to the filter

...

No, but someone previously posted that with later software versions, when TCAM runs out, packets for those routes that fit in TCAM are hardware switched, and only traffic for the remaining routes that didn't fit is software switched.

that would have been me. and the comments on the logic still stand (as correct) provided you are running the appropriately non-ancient release of code. cheers, lincoln (ltd@cisco.com)
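The overflow behavior described here (routes that fit are switched in hardware, the remainder fall back to software) can be illustrated with a small model. This is a sketch only: the thread does not say which routes win a TCAM slot, so installing in arrival order is an assumption, and the function names, capacity, and traffic figures are hypothetical.

```python
from typing import Dict, List, Tuple


def partition_routes(routes: List[str],
                     tcam_capacity: int) -> Tuple[List[str], List[str]]:
    """Split routes into (hardware, software) sets.

    Assumption for this sketch: routes are installed in arrival order
    until the TCAM is full; everything after that is software switched.
    """
    if tcam_capacity < 0:
        raise ValueError("tcam_capacity must be non-negative")
    return routes[:tcam_capacity], routes[tcam_capacity:]


def software_share(traffic_by_prefix: Dict[str, float],
                   software_routes: List[str]) -> float:
    """Fraction of total traffic that lands on software-switched routes."""
    total = sum(traffic_by_prefix.values())
    if total == 0:
        return 0.0
    punted = sum(traffic_by_prefix.get(p, 0.0) for p in software_routes)
    return punted / total


if __name__ == "__main__":
    routes = ["p0", "p1", "p2", "p3", "p4"]
    traffic = {"p0": 10.0, "p1": 20.0, "p2": 30.0, "p3": 25.0, "p4": 15.0}
    hw, sw = partition_routes(routes, tcam_capacity=3)
    print("hardware:", hw)
    print("software:", sw)
    print("share of traffic hitting the CPU:", software_share(traffic, sw))
```

The useful number is the software share, not the count of overflowed routes. A handful of overflowed prefixes carrying little traffic costs a bit of CPU, while the same count carrying heavy traffic is what leaves a switch unable to cope.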

Jon Lewis

11:52 a.m.

New subject: Route table growth and hardware limits...talk to the filter

On Fri, 21 Sep 2007, Lincoln Dale wrote:

...

...
No, but someone previously posted that with later software versions, when TCAM runs out, packets for those routes that fit in TCAM are hardware switched, and only traffic for the remaining routes that didn't fit is software switched.

that would have been me. and the comments on the logic still stand (as correct) provided you are running the appropriately non-ancient release of code.

Do you know in what version this behavior was changed? At the very least, people are going to want to upgrade IOS, as it'll likely mean the difference between slightly increased MSFC CPU and a switch that can't cope.

----------------------------------------------------------------------
 Jon Lewis                   |  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________

Matt Liotta

1:14 p.m.

New subject: Route table growth and hardware limits...talk to the filter

Jon Lewis wrote:

...

Do you know in what version this behavior was changed? At the very least, people are going to want to upgrade IOS, as it'll likely mean the difference between slightly increased MSFC CPU and a switch that can't cope.

I looked at the IOS version I ran the test on; it was quite old (12.1). I am happy to rerun the test with other versions of IOS. Any suggestions? -Matt

Lincoln Dale

1:26 p.m.

New subject: Route table growth and hardware limits...talk to the filter

...

Jon Lewis wrote:

...
Do you know in what version this behavior was changed? At the very least, people are going to want to upgrade IOS, as it'll likely mean the difference between slightly increased MSFC CPU and a switch that can't cope.

I looked at the IOS version I ran the test on; it was quite old (12.1). I am happy to rerun the test with other versions of IOS. Any suggestions?

i'd suggest the most recent 12.2(18)SXF (it's 12.2(18)SXF11) or 12.2(33)SXH. i think you'll find you get different results than your original test. cheers, lincoln. (ltd@cisco.com)

42 comments

23 participants

participants (23)

Al Iverson
Bill Woodcock
Bora Akyol
Church, Charles
Deepak Jain
Donald Stahl
James Jun
Jason J. W. Williams
Joe Provo
John A. Kilpatrick
Jon Lewis
Ken Simpson
Kevin Oberman
Lincoln Dale
Matt Liotta
Michael Smith
michael.dillon＠bt.com
micky coughes
Pekka Savola
Randy Bush
Raymond L. Corbin
Suresh Ramasubramanian
Warren Kumari