On 8/10/07, John Paul Morrison <jmorrison@bogomips.com> wrote:
And yet people still say the sky is falling with respect to routing convergence and FIB size. Probably a better comparison, BTW, would be with a Nintendo or PlayStation, as they are MIPS- and PowerPC-based. Even the latest route processor for a decent peering box is only a 1.2 GHz PowerPC with 2 GB RAM (RSP720) - so basically an old iBook is enough for the BGP control plane load these days? I think this has something to do with the vendors giving you just enough to keep you going, but not so much that you delay hardware upgrades :-)
There have been big gains in silicon for the fast switched path, but the route processors even on high end routers are still pretty low end in comparison to what's common on the average desktop. I would say that when control plane/processor power becomes critical, I would hope to see better processors inside.
With the IETF saying that speed and forwarding path are the bottlenecks now, not FIB size, perhaps there just isn't enough load to push Core Duo processors in your routers. (If Apple can switch, why not Cisco?) http://www3.ietf.org/proceedings/07mar/slides/plenaryw-3.pdf
I guess people are still spectacularly missing the real point. The point isn't that the latest-generation CPU du jour you can pick up from the local hardware store is doubling processing power every n months. The point is that getting them qualified, tested, verified, and then deployed is a non-trivial task. We need to be substantially behind Moore's observation to be economically viable. I have some small number of route processors in my network and it is a major hassle to get even those few upgraded. In other words, if you have a network that you can upgrade the RPs on every 18 months, let me know. /vijay

John Paul Morrison, CCIE 8191
Leo Bicknell wrote:
In a message written on Thu, Aug 09, 2007 at 04:21:37PM +0000, bmanning@vacation.karoshi.com wrote:
(1) there are technology factors we can't predict, e.g., moore's law effects on hardware development
Some of that is predictable though. I'm sitting here looking at a heavily peered exchange point router with a rather large FIB. It has in it a Pentium III 700 MHz processor. Per Wikipedia (http://en.wikipedia.org/wiki/Pentium_III) it appears they were released in late 1999 to early 2000. This box is solidly two, perhaps three, and maybe even four doublings behind things that are already available off the shelf at your local Best Buy.
Heck, this chip is slower than the original Xbox chip, a $400 obsolete game console.
_______________________________________________ PPML You are receiving this message because you are subscribed to the ARIN Public Policy Mailing List (PPML@arin.net). Unsubscribe or manage your mailing list subscription at: http://lists.arin.net/mailman/listinfo/ppml Please contact the ARIN Member Services Help Desk at info@arin.net if you experience any issues.
In a message written on Fri, Aug 10, 2007 at 11:08:26AM -0700, vijay gill wrote:
substantially behind moores observation to be economically viable. I have some small number of route processors in my network and it is a major hassle to get even those few upgraded. In other words, if you have a network that you can upgrade the RPs on every 18 months, let me
You're mixing problems. That you may only be able to put in a new route processor every 3-5 years doesn't mean the vendor shouldn't have a faster version every 18 months, or even sooner. It's the addition of the two that's the problem. Your 5-year cycle may come a year before the vendor's 5-year cycle, putting you on 9-year-old gear before you refresh next. Vendor J got it half right. The RP is a separately replaceable component based on a commodity motherboard, hooked in with commodity Ethernet, using the most popular CPU and RAM on the market. And yes, I understand needing to pay extra for the sheet metal, cooling calculations, and other items. But they still cost 10x a PC based on the same components, and are upgraded perhaps every 3 years, at best. They don't even take advantage of perhaps going from a 2.0 GHz processor to a 2.4, using the same motherboard, RAM, disk, etc. But I think the point still stands: I bet Vendor J in particular could pop out a Core 2 Duo based RP with 8 GB of RAM and a 300+ GB hard drive in under 6 months while holding the price point, if BGP convergence demanded it and their customers made it a priority. To Bill's original e-mail: Can we count on 2x every 18 months going forward? No. But betting on 2x every 24 months, and accounting for the delta between currently shipping and currently available hardware, seems completely reasonable when assessing the real problem. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org
On 8/10/07, Leo Bicknell <bicknell@ufp.org> wrote:
In a message written on Fri, Aug 10, 2007 at 11:08:26AM -0700, vijay gill wrote:
substantially behind moores observation to be economically viable. I have some small number of route processors in my network and it is a major hassle to get even those few upgraded. In other words, if you have a network that you can upgrade the RPs on every 18 months, let me
You're mixing problems.
That you may only be able to put in a new route processor every 3-5 years doesn't mean the vendor shouldn't have a faster version every 18 months, or even sooner. It's the addition of the two that's the problem. Your 5-year cycle may come a year before the vendor's 5-year cycle, putting you on 9-year-old gear before you refresh next.
The vendor has to qualify, write code for, and support n versions. This IS a part of the problem. Just blindly swapping out CPUs is non-trivial, as any systems engineer can tell you. The support cost will be passed on to the consumer. /vijay
Leo Bicknell wrote:
To Bill's original e-mail. Can we count on 2x every 18 months going forward? No. But betting on 2x every 24 months, and accounting for the delta between currently shipping and currently available hardware seems completely reasonable when assessing the real problem.
This assumes "the real problem" is CPU performance, where many have argued that the real problem is memory bandwidth. Memory doesn't track Moore's Law. Besides, Moore's Law isn't a law. What's your Plan B? This is where a lot of RRG/RAM work is going on right now. Eliot
In a message written on Mon, Aug 13, 2007 at 02:29:14PM +0200, Eliot Lear wrote:
This assumes "the real problem" is CPU performance, where many have argued that the real problem is memory bandwidth. Memory doesn't track Moore's Law. Besides, Moore's Law isn't a law. What's your Plan B? This is where a lot of RRG/RAM work is going on right now.
I think there are multiple problems with core routers. However, the discussion here was about BGP being able to converge. For that, the FIB is not important; there need be no forwarding plane. What sort of computer does it take to get 200 sessions at an exchange point and compute a FIB in a "reasonable" amount of time? That's determined first by the implementation (algorithm) and second by the processor speed. It may also be impacted by the bandwidth between routers, although I'm skeptical that's an issue. [Why? Let's say 10,000 routes per peer, and 50 peers all on a single gigabit Ethernet exchange. Let's also put an upper bound of 512 bytes per route. That's ~250 Mbytes, or what, maybe 30 seconds?] It seems to me an off-the-shelf PC with a Core 2 Duo processor, 4 GB of memory, and a gigabit Ethernet port would be 1-2 orders of magnitude faster than what's currently in the routers. Optimize for a multithreaded CPU, add a second one, and it would converge really fast. My own experience is that zebra / quagga blow away the performance of any router out there as long as you don't ask them to install the routes in the kernel (which is really slow in a general purpose OS). Now, once the FIB is computed, can we push it into line cards, is there enough memory on them, can they do wire rate lookups, etc. are all good questions and all quickly drift into specialized hardware. There are no easy answers at that step... -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org
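Leo's bracketed back-of-envelope can be sanity-checked in a few lines. The figures (50 peers, 10,000 routes each, 512 bytes per route, gigabit Ethernet) are taken from the message above; the result suggests the raw transfer time is only a couple of seconds, so the "maybe 30 seconds" would be dominated by protocol and processing overhead rather than the wire - which supports the skepticism that bandwidth is the bottleneck.

```python
# Back-of-envelope check of the table-transfer figures above:
# 50 peers x 10,000 routes x 512 bytes/route over gigabit Ethernet.
peers = 50
routes_per_peer = 10_000
bytes_per_route = 512            # generous upper bound, per the message

total_bytes = peers * routes_per_peer * bytes_per_route
print(f"total: {total_bytes / 1e6:.0f} MB")      # ~256 MB, i.e. the ~250 Mbytes above

link_bps = 1e9                   # gigabit Ethernet, ignoring framing overhead
wire_seconds = total_bytes * 8 / link_bps
print(f"raw wire time: {wire_seconds:.1f} s")    # ~2 s of pure transfer time
```

The gap between ~2 s of wire time and any observed 30 s convergence would be spent in TCP dynamics, BGP update packing, and route processing, not in link bandwidth.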
Leo Bicknell wrote:
Now, once the FIB is computed, can we push it into line cards, is there enough memory on them, can they do wire rate lookups, etc are all good questions and all quickly drift into specialized hardware. There are no easy answers at that step...
I think we're agreeing that it's the FIB management that's going to kill you with all the entropy that Paul keeps alluding to. Eliot
On Mon, 13 Aug 2007, Leo Bicknell wrote:
It seems to me an off the shelf PC with a Core 2 Duo processor, 4 gig of memory, and a gigabit ethernet port would be 1-2 orders of magnitude faster than what's currently in the routers. Optimize for a multithreaded CPU, add a second and it would converge really fast. My own experience is that zebra / quagga blow away the performance of any router out there as long as you don't ask them to install the routes in the kernel (which is really slow in a general purpose OS).
Isn't this mostly where we started? I could have sworn the first 'routers' weren't exactly specialized devices... cheers! ========================================================================== "A cat spends her life conflicted between a deep, passionate and profound desire for fish and an equally deep, passionate and profound desire to avoid getting wet. This is the defining metaphor of my life right now."
On August 13, 2007 at 18:14 cat@reptiles.org (Cat Okita) wrote:
Isn't this mostly where we started? I could have sworn the first 'routers' weren't exactly specialized devices...
http://www.cs.purdue.edu/research/technical_reports/1986/TR%2086-575.pdf -- -Barry Shein The World | bzs@TheWorld.com | http://www.TheWorld.com Purveyors to the Trade | Voice: 800-THE-WRLD | Login: Nationwide Software Tool & Die | Public Access Internet | SINCE 1989 *oo*
On 8/13/07, Leo Bicknell <bicknell@ufp.org> wrote:
In a message written on Mon, Aug 13, 2007 at 02:29:14PM +0200, Eliot Lear wrote:
This assumes "the real problem" is CPU performance, where many have argued that the real problem is memory bandwidth. Memory doesn't track Moore's Law. Besides, Moore's Law isn't a law. What's your Plan B? This is where a lot of RRG/RAM work is going on right now.
I think there are multiple problems with core routers. However, the discussion here was about BGP being able to converge. For that, the FIB is not important; there need be no forwarding plane.
What sort of computer does it take to get 200 sessions at an exchange point and compute a FIB in a "reasonable" amount of time? That's determined first by the implementation (algorithm) and second by the processor speed. It may also be impacted due to the bandwidth between routers, although I'm skeptical that's an issue.
[Why? Let's say 10,000 routes per peer, and 50 peers all on a single gigabit Ethernet exchange. Let's also put an upper bound of 512 bytes per route. That's ~250 Mbytes, or what, maybe 30 seconds?]
It seems to me an off the shelf PC with a Core 2 Duo processor, 4 gig of memory, and a gigabit ethernet port would be 1-2 orders of magnitude faster than what's currently in the routers. Optimize for a multithreaded CPU, add a second and it would converge really fast. My own experience is that zebra / quagga blow away the performance of any router out there as long as you don't ask them to install the routes in the kernel (which is really slow in a general purpose OS).
Pick a newly released Core 2 Duo. How long will Intel be selling it? How does that compare with getting it into your RP design, tested, produced, OS support integrated, sold, and stocked in your depots? -Scott
In a message written on Mon, Aug 13, 2007 at 05:53:00PM -0700, Scott Whyte wrote:
Pick a newly released Core 2 Duo. How long will Intel be selling it? How does that compare with getting it into your RP design, tested, produced, OS support integrated, sold, and stocked in your depots?
Intel designates chips for "long life". That's why Vendor J is still sourcing P-III 600's, which were new almost 8 years ago now. Which one of the current chips is in that bucket I don't know, but the vendors could find out. Plus, your argument doesn't hold for the simple reason that servers have the same lifespan as routers in most companies. HP, Dell, IBM - they don't seem to be going under with changes in Intel's line of chips. They don't seem to have support issues. As the vendors move to off-the-shelf parts, the arguments about testing, stocking, and so forth start to go out the window. More importantly, why specialize? Vendor J's RE is basically a PC connected to the backplane with Fast Ethernet. They did a lot of engineering in airflow, sheet metal, and other packaging issues to put it in a Juniper package, but to what end? Compare with Avaya. When they moved to a Linux brain in their phone switch line, they moved the brain out of the specialized forwarding hardware (the old Definity PBX) and into a, wait for it, PC! Yes, an off-the-shelf 2U PC they source from a third party, connected to the backplane with Gigabit Ethernet. Vendors also kill themselves on the depot side because they hate to give you a free upgrade. If vendors changed their maintenance policies to be "what you have or faster", when it became cost-prohibitive to stock P-III 600's they could stop, giving you the P-III 1.2G that came afterwards when you RMA a part. It's probably cheaper to stop stocking multiple parts and provide "free" upgrades on failure than to stock all the varieties. Of course, I think if the RE were an external 2RU PC that they sold for $5,000 (which is still highway robbery) ISPs might upgrade more than once every 10 years.... The problem here is that large companies don't like to take risk, and any change is perceived as a risk. Cisco and Juniper will not be "creative" in finding a solution, particularly when it may reduce cost (and thus, revenue).
Small startups that might take the risk can't play in the specialized forwarding side of things. We can exist in this state primarily because we're not pushing the cutting edge. Route processors are WAY behind current technology. Forwarding hardware is WAY ahead of current need. Cost/bit is the problem, and has been for some number of years. We have OC-768, but can't afford to deploy the DWDM systems to support it. We have 32-way Xeon boxes, but we can't afford to change the design to use them as route processors. -- Leo Bicknell - bicknell@ufp.org - CCIE 3440 PGP keys at http://www.ufp.org/~bicknell/ Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org
On Tue, Aug 14, 2007, Leo Bicknell wrote:
Of course, I think if the RE were an external 2RU PC that they sold for $5,000 (which is still highway robbery) ISP's might upgrade more than once every 10 years....
Sounds like an experiment. Anyone have a spare J M40? (*duck*) Adrian
Of course, I think if the RE were an external 2RU PC that they sold for $5,000 (which is still highway robbery) ISP's might upgrade more than once every 10 years....
Sounds like an experiment. Anyone have a spare J M40?
Since End of Service for M40s is later this year, you should actually expect to see quite a few spare M40s in relatively short time. We have one... Steinar Haug, Nethelp consulting, sthaug@nethelp.no
Hi, I am the chap who did the port of XORP to Microsoft Windows; normally on the Internet I play a FreeBSD committer role. Leo Bicknell wrote:
It seems to me an off the shelf PC with a Core 2 Duo processor, 4 gig of memory, and a gigabit ethernet port would be 1-2 orders of magnitude faster than what's currently in the routers. Optimize for a multithreaded CPU, add a second and it would converge really fast. My own experience is that zebra / quagga blow away the performance of any router out there as long as you don't ask them to install the routes in the kernel (which is really slow in a general purpose OS).
As far as I am aware, zebra/quagga uses a table scanning design for BGP. This is where XORP comes in: http://www.usenix.org/event/nsdi05/tech/full_papers/handley/handley_html/ind... Figure 13, BGP route flow, may be of particular interest here. XORP defers the pushdown of routes in its RIB to the kernel. The current model is to explicitly coroutine on a per-process basis, within a message-passing architecture. This is lockless at process level - the kernel handles synchronization based on the I/O primitives upon which the XRL message passing between XORP routing processes is based, so it does rely on the atomicity of these operations. Normally on a BSD or Linux machine, a local stream socket (TCP or UNIX) is used for such communication. XORP's design could potentially be improved further for multi-core CPUs, by the use of continuations and CPU affinity. The use of a soft-real-time scheduler might also be worth investigation, although XORP is built around cooperative multi-tasking, so that would require explicitly re-entering the scheduler. As regards the latency of route propagation to the kernel: I'd be interested to hear in technical detail problems which people may be experiencing here, e.g. on Linux or BSD platforms. I have on-and-off been researching a new routing implementation for FreeBSD which would draw inspiration from the recent work in Linux and other operating systems, so I want to get as much input from people's experiences as possible. regards, BMS
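This is not XORP code, and the names below are invented for illustration. But the pattern BMS describes - routing processes exchanging messages cooperatively, with the RIB deferring the expensive pushdown rather than installing each route into the kernel synchronously - can be sketched with asyncio's cooperative scheduler standing in for XORP's event loop:

```python
# Toy sketch (hypothetical names, not XORP's XRL API) of cooperative
# message passing with deferred route pushdown.
import asyncio

FIB = []   # stand-in for the kernel forwarding table


async def bgp_process(to_rib: asyncio.Queue, routes):
    # Each learned route becomes a message; the sender never blocks on
    # the slow "kernel" install, only on the queue.
    for prefix in routes:
        await to_rib.put(("add", prefix))
    await to_rib.put(("flush", None))


async def rib_process(to_rib: asyncio.Queue):
    pending = []
    while True:
        op, prefix = await to_rib.get()
        if op == "add":
            pending.append(prefix)      # defer the expensive pushdown
        else:
            FIB.extend(pending)         # one batched install, not N syscalls
            return


async def main():
    q = asyncio.Queue()
    routes = [f"10.0.{i}.0/24" for i in range(4)]
    await asyncio.gather(bgp_process(q, routes), rib_process(q))


asyncio.run(main())
print(FIB)   # all four prefixes installed in one batch
```

The design point is the same one the XORP paper argues: synchronization rides on the message-passing primitives themselves, so the routing processes need no explicit locks, and the per-route kernel cost that makes zebra/quagga slow at install time is amortized.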
[ vijay]
I guess people are still spectacularly missing the real point. The point isn't that the latest generation hardware cpu du jour you can pick up from the local hardware store is doubling processing power every n months.
agreed.
The point is that getting them qualified, tested, verified, and then deployed is a non trivial task. We need to be substantially behind moores observation to be economically viable. I have some small number of route processors in my network and it is a major hassle to get even those few upgraded. In other words, if you have a network that you can upgrade the RPs on every 18 months, let me know.
yow. while i agree that routing processors cannot, and have historically not had to, track moore's law, i am still surprised to see such a heavy focus on the RP. my (ample) gut feeling on this is that system level (combinatorial) effects would limit Internet routing long before moore's law could do so.
On 8/10/07, Paul Vixie <paul@vix.com> wrote:
[ vijay]
I guess people are still spectacularly missing the real point. The point isn't that the latest generation hardware cpu du jour you can pick up from the local hardware store is doubling processing power every n months.
agreed.
The point is that getting them qualified, tested, verified, and then deployed is a non trivial task. We need to be substantially behind moores observation to be economically viable. I have some small number of route processors in my network and it is a major hassle to get even those few upgraded. In other words, if you have a network that you can upgrade the RPs on every 18 months, let me know.
yow. while i agree that routing processors cannot, and have historically not had to, track moore's law, i am still surprised to see such a heavy focus on the RP. my (ample) gut feeling on this is that system level (combinatorial) effects would limit Internet routing long before moore's law could do so.
It is an easy derivative/proxy for the system-level effect, is all. Bandwidth for updates (inter- and intra-system) is another choking point, but folks tend to be even less aware of that than of CPU. /vijay
... is that system level (combinatorial) effects would limit Internet routing long before moore's law could do so.
It is an easy derivative/proxy for the system level effect is all. Bandwidth for updates (inter and intra system) are another choking point but folks tend to be even less aware of those than cpu.
is bandwidth the only consideration? number of graph nodes and number of advertised endpoints and churn rate per endpoint don't enter into the limits? at what system size does speed of light begin to enter into the equation? (note, as i told geoff huston: if it seems like john scudder's outbound BGP announcement compression observations are relevant, or that moore's law is relevant, then you're misunderstanding my question or i'm asking it wrong.)
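As a rough order-of-magnitude answer to the speed-of-light part of Paul's question (the refractive index and path length below are assumptions, not measurements): even a half-global fiber path contributes only about a tenth of a second of one-way propagation delay per traversal. That is negligible against today's protocol timers, but it stops being ignorable the moment system-wide convergence targets approach sub-second.

```python
# Rough scale of propagation delay for a half-global fiber path.
C = 299_792_458        # m/s, speed of light in vacuum
FIBER_INDEX = 1.47     # assumed typical refractive index of optical fiber

path_m = 20_000e3      # ~antipodal great-circle distance, assumed
one_way_s = path_m * FIBER_INDEX / C
print(f"one-way fiber delay: {one_way_s * 1000:.0f} ms")   # ~98 ms
```

So at global system size the light-speed floor is on the order of 0.1 s per crossing, multiplied by however many sequential update exchanges convergence requires.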
On Fri, 10 Aug 2007 18:42:23 +0000 Paul Vixie <paul@vix.com> wrote:
... is that system level (combinatorial) effects would limit Internet routing long before moore's law could do so.
It is an easy derivative/proxy for the system level effect is all. Bandwidth for updates (inter and intra system) are another choking point but folks tend to be even less aware of those than cpu.
is bandwidth the only consideration? number of graph nodes and number of advertised endpoints and churn rate per endpoint don't enter into the limits? at what system size does speed of light begin to enter into the equation?
Right. What is the computational complexity of the current algorithm? --Steve Bellovin, http://www.cs.columbia.edu/~smb
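One hedged way to frame the complexity question: per-prefix best-path selection in a path-vector protocol is a fold over the candidate paths, so a full-table pass costs on the order of (prefixes x paths per prefix) comparisons, and churn multiplies that by the update rate. The sketch below uses a deliberately simplified two-step decision process (real BGP has many more tie-breakers); the names and orderings are illustrative assumptions.

```python
# Simplified best-path selection: one comparison per candidate path,
# so a table walk is O(prefixes x paths_per_prefix).
def better(a, b):
    # Assumed (abbreviated) ordering: higher local-pref wins,
    # then shorter AS path.  Real BGP continues through MED,
    # origin, router-id, and more.
    if a["local_pref"] != b["local_pref"]:
        return a if a["local_pref"] > b["local_pref"] else b
    return a if len(a["as_path"]) <= len(b["as_path"]) else b


def best_path(paths):
    best = paths[0]
    for p in paths[1:]:        # linear scan over candidates
        best = better(best, p)
    return best


paths = [
    {"peer": "A", "local_pref": 100, "as_path": [64500, 64501]},
    {"peer": "B", "local_pref": 100, "as_path": [64502]},
    {"peer": "C", "local_pref": 90,  "as_path": []},
]
print(best_path(paths)["peer"])   # B: equal local-pref, shortest AS path
```

The per-prefix scan itself is cheap; the system-level cost Paul and vijay are pointing at comes from how often churn forces it to be re-run and re-advertised across the whole graph.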
participants (11)

- Adrian Chadd
- Barry Shein
- Bruce M Simpson
- Cat Okita
- Eliot Lear
- Leo Bicknell
- Paul Vixie
- Scott Whyte
- Steven M. Bellovin
- sthaug@nethelp.no
- vijay gill