Faster 'Net growth rate raises fears about routers


Hank Nussbacher

2 Apr 2001

11:15 a.m.

I have a feeling this one may start another very large NANOG thread: http://www.nwfusion.com/news/2001/0402routing.html -Hank


Travis Pugh

2 Apr

11:55 a.m.

Not to oversimplify, but assuming we can continue to separate forwarding from the routing process itself, is this really a situation that calls for a complete redesign of BGP? If you look at the routing processors on Cisco and Juniper hardware, Cisco's GSR is using a 200 MHz MIPS RISC processor and Juniper is using a 333 MHz Mobile Pentium II.

With RISC reaching 1 GHz and Intel pushing 2 GHz, it appears that the actual processors in use by the two big vendors are a couple of years behind. What happens to the box's ability to process a 500,000-route table if you quadruple its memory and give it 5 times more processing power?

Also, it would likely require a rewrite of software, but what's keeping us from using SMP in routers?

Cheers.

-travis

On Mon, 2 Apr 2001, Hank Nussbacher wrote:

...

I have a feeling this one may start another very large NANOG thread:

http://www.nwfusion.com/news/2001/0402routing.html

-Hank

Adrian Chadd

12:09 p.m.

On Mon, Apr 02, 2001, Travis Pugh wrote:

...

Not to oversimplify, but assuming we can continue to separate forwarding from the routing process itself, is this really a situation that calls for a complete redesign of BGP? If you look at the routing processors on Cisco and Juniper hardware, Cisco's GSR is using a 200 MHz MIPS RISC processor and Juniper is using a 333 MHz Mobile Pentium II.

With RISC reaching 1 GHz and Intel pushing 2 GHz, it appears that the actual processors in use by the two big vendors are a couple of years behind. What happens to the box's ability to process a 500,000-route table if you quadruple its memory and give it 5 times more processing power?

Also, it would likely require a rewrite of software, but what's keeping us from using SMP in routers?

Performance of a routing protocol is not a function of just the CPU available.

Performance of a routing protocol is a function of the CPU available and the network characteristics.

*shakes head* People keep forgetting this. Do you guys also think you can solve the Internet's problems by adding more bandwidth?

Adrian

--
Adrian Chadd <adrian@creative.net.au>
"The fact you can download a 100 megabyte file from half way around the world should be viewed as an accident and not a right."
  -- Adrian Chadd and Bill Fumerola

Marc Teichtahl

12:28 p.m.

Adrian, to take your point one step further: the architecture of the router and the mechanism by which it forwards packets differ between vendors. To simply state "CPU" is nonsense when it comes to routers. You need to be more specific and look into the scaling effects on scheduler performance, switching fabric performance and architecture, buffering, forwarding design (centralised or distributed), and ASIC development. Remember, Moore's law applies to ASICs and their "widget" density.

My 2c worth.

--
Marc Teichtahl, B. Eng (Comp Sys) RMIT, IEEE 11016334
Network Sloven and Engineer
"I never remember what i did tomorrow"
PGP Key ID: 0x8E69E8A1

On Mon, 2 Apr 2001, Adrian Chadd wrote:

...

On Mon, Apr 02, 2001, Travis Pugh wrote:

...
Not to oversimplify, but assuming we can continue to separate forwarding from the routing process itself, is this really a situation that calls for a complete redesign of BGP? If you look at the routing processors on Cisco and Juniper hardware, Cisco's GSR is using a 200 MHz MIPS RISC processor and Juniper is using a 333 MHz Mobile Pentium II.

With RISC reaching 1 GHz and Intel pushing 2 GHz, it appears that the actual processors in use by the two big vendors are a couple of years behind. What happens to the box's ability to process a 500,000-route table if you quadruple its memory and give it 5 times more processing power?

Also, it would likely require a rewrite of software, but what's keeping us from using SMP in routers?

Performance of a routing protocol is not a function of just the CPU available.

Performance of a routing protocol is a function of the CPU available and the network characteristics.

*shakes head* People keep forgetting this. Do you guys also think you can solve the Internet's problems by adding more bandwidth?

Adrian

Hank Nussbacher

12:29 p.m.

At 20:09 02/04/01 +0800, Adrian Chadd wrote:

...

On Mon, Apr 02, 2001, Travis Pugh wrote:

...
Not to oversimplify, but assuming we can continue to separate forwarding from the routing process itself, is this really a situation that calls for a complete redesign of BGP? If you look at the routing processors on Cisco and Juniper hardware, Cisco's GSR is using a 200 MHz MIPS RISC processor and Juniper is using a 333 MHz Mobile Pentium II.

With RISC reaching 1 GHz and Intel pushing 2 GHz, it appears that the actual processors in use by the two big vendors are a couple of years behind. What happens to the box's ability to process a 500,000-route table if you quadruple its memory and give it 5 times more processing power?

Also, it would likely require a rewrite of software, but what's keeping us from using SMP in routers?

Performance of a routing protocol is not a function of just the CPU available.

Performance of a routing protocol is a function of the CPU available and the network characteristics.

*shakes head* People keep forgetting this. Do you guys also think you can solve the Internet's problems by adding more bandwidth?

I think the current large routers can handle flapping (50,000 routes every 30 seconds):

http://www.lightreading.com/document.asp?site=testing&doc_id=4009&page_number=12

and they can handle large BGP tables (Cisco: 400K, Juniper: 2.4M):

http://www.lightreading.com/document.asp?site=testing&doc_id=4009&page_number=10

The problem is all the legacy Cisco 7500s in the core that are defaultless and currently carry 99,000 routes.

I think Geoff is wrong in his statement that the problem is not routing table size, but rather flapping. To quote Geoff: "It's not the size of the table, but the number of updates per second that kills a router stone dead." But the rate of flapping is proportional to the size of the routing table, IMO. If you have 1,000 routes in your table, and on average 5% of the nets flap every 60 seconds, that comes to 50 flaps a minute. If your table is 100,000 and the same 5% flap, that comes to 5,000 every minute. Reduce the table size and you *will* affect the flapping as well.

-Hank
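The flap arithmetic in this message can be sketched in a few lines of Python. This is only an illustration of the stated proportionality assumption (a fixed 5% of prefixes flapping in each 60-second window); the function name `flaps_per_minute` and the flat 5% default are assumptions made for the sketch, not measurements of any real table.

```python
def flaps_per_minute(table_size: int, flap_fraction: float = 0.05) -> float:
    """Expected number of flapping routes per 60-second window.

    Assumes a fixed fraction of the table flaps in every window, which is
    the proportionality claim made in the message, not an observed rate.
    """
    if table_size < 0:
        raise ValueError("table_size must be non-negative")
    return table_size * flap_fraction


if __name__ == "__main__":
    # The two table sizes used in the message.
    for size in (1_000, 100_000):
        print(f"{size:>7} routes -> {flaps_per_minute(size):.0f} flaps/min")
```

Under this assumption the update load grows linearly with table size, which is the point being argued; whether real flap rates actually scale this way is exactly what the thread disputes.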

...

Adrian

--
Adrian Chadd <adrian@creative.net.au>
"The fact you can download a 100 megabyte file from half way around the world should be viewed as an accident and not a right."
  -- Adrian Chadd and Bill Fumerola

Adrian Chadd

1:01 p.m.

On Mon, Apr 02, 2001, Hank Nussbacher wrote:

...

I think the current large routers can handle flapping (50,000 routes every 30 seconds): http://www.lightreading.com/document.asp?site=testing&doc_id=4009&page_number=12 and they can handle large BGP tables (Cisco: 400K, Juniper: 2.4M): http://www.lightreading.com/document.asp?site=testing&doc_id=4009&page_number=10

How many routers did they test? Did they test 2 routers? Or did they test 1000 routers? Did they plot just the BGP table withdrawal speed and the subsequent BGP table repopulation speed?

What about doing some quick modelling on what effect this flapping "latency" could have on a large mesh of routers? There has been some work done on this. It's been covered at NANOG. Most of its effects on reachability are masked by super-routes (which for most of you will be the default route. :-)

I'd love to see the day when every network running a full BGP table pulled out its default route(s) and ran defaultless.

...

The problem is all the legacy Cisco 7500s in the core that are defaultless and currently carry 99,000 routes. I think Geoff is wrong in his statement that the problem is not routing table size, but rather flapping. To quote Geoff: "It's not the size of the table, but the number of updates per second that kills a router stone dead." But the rate of flapping is proportional to the size of the routing table, IMO. If you have 1,000 routes in your table, and on average 5% of the nets flap every 60 seconds, that comes to 50 flaps a minute. If your table is 100,000 and the same 5% flap, that comes to 5,000 every minute. Reduce the table size and you *will* affect the flapping as well.

Even if every router in the Internet core was upgraded to the latest and greatest 4-way SMP 2 GHz Intel CPUs running the routing protocols with 4 gigabytes of RAM each, the sheer complexity of the routing system would produce some rather interesting dynamics. Hell, even if you threw this at 100,000 routes in today's network topology, I'm pretty sure the nature of BGP would be a little different :)

(Read: Just because it's faster doesn't mean it's better. Sometimes something being slow acts as a regulator. People might want to try grabbing some basic CS programs to do network modelling and start playing. :-)

I'll stop ranting now, since I've already ranted on this topic before.

Adrian

(NOTE: People are probably thinking that I'm just speaking out of my ass. Being a hardware geek, software programmer and routing person has got its advantages. One of them is that I have an insatiable desire to digest any reading I can to figure out how things work, and I currently do this for networking since my current job hat has "programmer" on it.)

--
Adrian Chadd <adrian@creative.net.au>
"The fact you can download a 100 megabyte file from half way around the world should be viewed as an accident and not a right."
  -- Adrian Chadd and Bill Fumerola

Travis Pugh

1:06 p.m.

On Mon, 2 Apr 2001, Adrian Chadd wrote:

...

On Mon, Apr 02, 2001, Travis Pugh wrote:

...
Not to oversimplify, but assuming we can continue to separate forwarding from the routing process itself, is this really a situation that calls for a complete redesign of BGP? If you look at the routing processors on Cisco and Juniper hardware, Cisco's GSR is using a 200 MHz MIPS RISC processor and Juniper is using a 333 MHz Mobile Pentium II.

With RISC reaching 1 GHz and Intel pushing 2 GHz, it appears that the actual processors in use by the two big vendors are a couple of years behind. What happens to the box's ability to process a 500,000-route table if you quadruple its memory and give it 5 times more processing power?

Also, it would likely require a rewrite of software, but what's keeping us from using SMP in routers?

Performance of a routing protocol is not a function of just the CPU available.

Performance of a routing protocol is a function of the CPU available and the network characteristics.

Granted, but are you saying that the 15 minutes it takes one of my BGP sessions to reload has no relevance to the crusty, old processor doing route calculations on 104,000 routes? Multiply the CPU available by 5, and then look for bottlenecks. Seems sane to me. Also seems a hell of a lot easier than trying to redesign the network characteristics in 1 year and implement them ... IPv6 anyone?

-travis

...

*shakes head* People keep forgetting this. Do you guys also think you can solve the Internet's problems by adding more bandwidth?

Adrian

--
Adrian Chadd <adrian@creative.net.au>
"The fact you can download a 100 megabyte file from half way around the world should be viewed as an accident and not a right."
  -- Adrian Chadd and Bill Fumerola

Greg Maxwell

3 Apr

12:37 p.m.

On Mon, 2 Apr 2001, Travis Pugh wrote:

...

...
Performance of a routing protocol is not a function of just the CPU available.

Performance of a routing protocol is a function of the CPU available and the network characteristics.

Granted, but are you saying that the 15 minutes it takes one of my BGP sessions to reload has no relevance to the crusty, old processor doing route calculations on 104,000 routes? Multiply the CPU available by 5, and then look for bottlenecks. Seems sane to me. Also seems a hell of a lot easier than trying to redesign the network characteristics in 1 year and implement them ... IPv6 anyone?

It might not; it might be more a function of the CPU on the other end. It might be more limited by the bandwidth.

Even if it's totally CPU-limited on your end: multiply your CPU by 5 and you have gained somewhat less than a factor of 5 improvement. Replace the Internet with a highly aggregated IPv6 network which uses transport-level multihoming and you gain a factor of 1000 improvement at core routers (and 100,000x further from the core, where you no longer need to be default-free) and still have the opportunity for a further 5x by going to a state-of-the-art CPU (providing that your CPU speed reasoning is valid).

Incremental improvements in router performance are a good thing, but they are nowhere near the level needed to ensure sustainability.

If going to a 5x faster CPU would really help real-world performance that much, the high-end router vendors would have already done it; at the prices they charge, they can afford to be bleeding-edge CPU-wise.

Easier, yes, in the short term. But after you've implemented your state-of-the-art CPU, to scale any further you need to invent working quantum computers and install separate OC3s to carry routing updates. When you consider that, IPv6 doesn't sound bad after all.
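One way to see why a 5x faster CPU buys "somewhat less than a factor of 5" is Amdahl's law. The post does not name it; it is used here purely as an illustration. The CPU-bound share of a session reload is a hypothetical input, and the fractions below are made-up examples, not measurements from any router.

```python
def overall_speedup(cpu_bound_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: only the CPU-bound share of the work gets faster.

    cpu_bound_fraction is the (hypothetical) share of total time spent
    on the local CPU; the rest is waiting on the peer or the link.
    """
    if not 0.0 <= cpu_bound_fraction <= 1.0:
        raise ValueError("cpu_bound_fraction must be between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    remaining = (1.0 - cpu_bound_fraction) + cpu_bound_fraction / cpu_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Illustrative CPU-bound shares only; real values would need measuring.
    for fraction in (1.0, 0.8, 0.5):
        gain = overall_speedup(fraction, 5.0)
        print(f"CPU-bound share {fraction:.0%}: {gain:.2f}x overall")
```

The full 5x only appears when the reload is entirely bound by the local CPU; any time spent waiting on the remote peer or on bandwidth caps the gain below that.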

Philip Smith

9:11 p.m.

If you look at the graphs though, the routing table growth stopped around the end of the year. I've seen 101k prefixes, give or take a few hundred, in my view since 31st December....

Also, the number of /24s being announced has stopped growing. It's been 58k5 for the last 3 months...

Routing table growth is following the state of the Internet economies? Looks like it to me.

philip
--

At 13:15 02/04/2001 +0200, Hank Nussbacher wrote:

...

I have a feeling this one may start another very large NANOG thread:

http://www.nwfusion.com/news/2001/0402routing.html

-Hank

Geoff Huston

11:48 p.m.

At 4/4/01 07:11 AM, Philip Smith wrote:

...

If you look at the graphs though, the routing table growth stopped around the end of the year. I've seen 101k prefixes give or take a few hundred in my view since 31st December.... Also, the number of /24s being announced has stopped growing. It's been 58k5 for the last 3 months...

Routing table growth is following the state of the Internet economies? Looks like it to me.

Absolutely - if you look at the strong US dollar and make the observation that connectivity prices for the Internet are largely set in USD, then the relatively stronger USD makes connectivity more expensive in other economies. This in turn damps down growth, as new non-US markets which are dependent on exposure to lower unit prices remain unexposed until the comms price declines once more.

Hmm - maybe we can use the first derivative of this BGP table metric as a global economy indicator :-)

Hank Nussbacher

4 Apr

9 a.m.

At 07:11 04/04/01 +1000, Philip Smith wrote:

It may also be the concerted effort of certain individuals who have been after the top 10-30 non-CIDRers. I know I have sent out dozens of emails to those ASes since Jan 1 and have gotten, in general, a positive response, and many have fixed their systems.

-Hank

...

If you look at the graphs though, the routing table growth stopped around the end of the year. I've seen 101k prefixes give or take a few hundred in my view since 31st December.... Also, the number of /24s being announced has stopped growing. It's been 58k5 for the last 3 months...

Routing table growth is following the state of the Internet economies? Looks like it to me.

philip --

At 13:15 02/04/2001 +0200, Hank Nussbacher wrote:

...
I have a feeling this one may start another very large NANOG thread:

http://www.nwfusion.com/news/2001/0402routing.html

-Hank

Philip Smith

5 Apr

12:59 a.m.

At 11:00 04/04/2001 +0200, Hank Nussbacher wrote:

...

It may also be the concerted effort of certain individuals who have been after the top 10-30 non-CIDRers. I know I have sent out dozens of emails to those ASes since Jan 1 and have gotten, in general, a positive response and many have fixed their systems.

I'd imagine that efforts to clean up the table would result in step-function changes, rather than the abrupt stop to the increase we have been seeing. It's a little different from 1994 when we all had to CIDRise "or die"...

(I'll look at it a bit more closely - the daily delta should point to whether it is a cleanup or an economic slowdown...)

philip
--
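A rough sketch of the daily-delta check described above: take day-over-day differences in the prefix count and flag any single-day change large enough to look like a step (a cleanup) rather than a gradual slowdown. The function names, the threshold, and the sample series are all invented for illustration; the numbers are not real routing-table data.

```python
from typing import List


def daily_deltas(counts: List[int]) -> List[int]:
    """Day-over-day change in prefix count (first difference)."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def step_changes(counts: List[int], threshold: int) -> List[int]:
    """Indices into counts of days whose change from the previous day
    has magnitude >= threshold, i.e. candidate step-function events."""
    return [
        day + 1
        for day, delta in enumerate(daily_deltas(counts))
        if abs(delta) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical prefix counts, made up for this example only.
    sample = [100_000, 100_150, 100_300, 99_300, 99_450, 99_600]
    print("deltas:", daily_deltas(sample))
    print("step days:", step_changes(sample, threshold=500))
```

A cleanup by a few large ASes would be expected to show up as isolated large negative deltas, while an economic slowdown would look like the deltas shrinking toward zero with no single outlier.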

...

-Hank

...
If you look at the graphs though, the routing table growth stopped around the end of the year. I've seen 101k prefixes give or take a few hundred in my view since 31st December.... Also, the number of /24s being announced has stopped growing. It's been 58k5 for the last 3 months...

Routing table growth is following the state of the Internet economies? Looks like it to me.

philip --


11 comments, 7 participants:

Adrian Chadd
Geoff Huston
Greg Maxwell
Hank Nussbacher
Marc Teichtahl
Philip Smith
Travis Pugh