estimation of number of DFZ IPv4 routes at peak in the future
Hi. We had an interesting discussion the other day at work. We were speculating on how many DFZ IPv4 routes there would be at peak in the future before the count starts to decline again due to less IPv4 usage. The current number is around 350k, and my personal estimation is that it will grow by at least 100k more: the last 5 /8s are being carved up at around /22, meaning each /8 ends up with 16k routes; the last allocations seen from the remaining RIR "normal allocation" pools will be smaller than before; and space will be de-aggregated as people "sell" or "lease" sub-blocks of their allocations. My guess therefore is a peak around 450-500k IPv4 DFZ routes, and that this would happen in around 3-5 years. I wanted to record this for posterity. What is your guess, and why?

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
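[To make the arithmetic behind the estimate explicit, a quick sketch in plain Python; nothing here beyond the numbers in the post above.]

```python
# A /8 carved into /22 allocations yields 2**(22-8) prefixes,
# which is where the "16k routes per /8" figure comes from.

def prefixes_per_block(block_len: int, alloc_len: int) -> int:
    """Number of /alloc_len prefixes that fit in one /block_len."""
    return 2 ** (alloc_len - block_len)

routes_per_slash8 = prefixes_per_block(8, 22)
print(routes_per_slash8)         # 16384, i.e. ~16k routes per /8
print(5 * routes_per_slash8)     # 81920, ~82k routes from the last 5 /8s
```

[The remaining ~20k of the "at least 100k" estimate would then come from the smaller final RIR allocations and de-aggregation mentioned above.]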
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64. randy
On Wed, Mar 09, 2011 at 12:44:05PM +0900, Randy Bush wrote:
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64.
I'll take this one a little further. I suspect that as we reach exhaustion, more people will be forced to break space out of their provider's v4 aggregates, and announce them, and an unfiltered DFZ may well approach the 'million' entries some vendors now claim to support. Conveniently, we've given them enough ASes to do so, with four byte support. At least if our vendors get that working correctly. If we get there, or even close (anything beyond 0.5M), I expect we'll see some of the native dual stack networks actually acquire transport specifically for v6 and start running parallel 4/6 networks to deal with hardware forwarding limitations, particularly those involving v6. Of course, I'd really, really, really love to be wrong here. It'd be great if v4 traffic fell off quickly enough people wouldn't deagg for TE purposes, or v4 growth fell off, and a widespread forwarding problem could be avoided. --msa
On Mar 9, 2011, at 12:43 AM, Majdi S. Abbas wrote:
On Wed, Mar 09, 2011 at 12:44:05PM +0900, Randy Bush wrote:
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64.
I'll take this one a little further.
I suspect that as we reach exhaustion, more people will be forced to break space out of their provider's v4 aggregates, and announce them, and an unfiltered DFZ may well approach the 'million' entries some vendors now claim to support.
This matches my personal view (and could be viewed as "success" compared to the 5M estimate of Mr. Herrin...) /John
On Wed, 2011-03-09 at 09:32 -0500, John Curran wrote:
On Mar 9, 2011, at 12:43 AM, Majdi S. Abbas wrote:
On Wed, Mar 09, 2011 at 12:44:05PM +0900, Randy Bush wrote:
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64.
I'll take this one a little further.
I suspect that as we reach exhaustion, more people will be forced to break space out of their provider's v4 aggregates, and announce them, and an unfiltered DFZ may well approach the 'million' entries some vendors now claim to support.
This matches my personal view (and could be viewed as "success" compared to the 5M estimate of Mr. Herrin...)
/John
Are people going to be relying on using default-routing then in the future if they don't upgrade routers to handle large routing table growth? Or perhaps forgo dual-stack and have a separate physical IPv6 BGP network from IPv4? Are there any other strategies?
On Fri, Mar 11, 2011 at 6:39 PM, Justin Krejci <jkrejci@usinternet.com> wrote:
On Wed, 2011-03-09 at 09:32 -0500, John Curran wrote:
On Mar 9, 2011, at 12:43 AM, Majdi S. Abbas wrote:
I suspect that as we reach exhaustion, more people will be forced to break space out of their provider's v4 aggregates, and announce them, and an unfiltered DFZ may well approach the 'million' entries some vendors now claim to support.
This matches my personal view (and could be viewed as "success" compared to the 5M estimate of Mr. Herrin...)
Are people going to be relying on using default-routing then in the future if they don't upgrade routers to handle large routing table growth? Or perhaps forgo dual-stack and have a separate physical IPv6 BGP network from IPv4? Are there any other strategies?
Hi Justin,

IMHO, the most sensible strategy is to recognize that the cost of a route has been dropping faster than the route count has been rising for the past decade. Then recognize that with today's hardware, building a route processor capable of keeping up with 10M routes instead of 1M routes would cost maybe twice as much... 10M being sufficient to handle the worst-case estimates for the final size of the IPv4 table in parallel with any reasonable estimate of the IPv6 table in the foreseeable future. Better CPU, more DRAM, bigger TCAM. It could be built today.

Finally, get mad at your respective router manufacturers for engineering obsolescence into their product line by declining to give you the option.

But that's just my opinion...

Regards,
Bill Herrin

-- 
William D. Herrin ................ herrin@dirtside.com  bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
On Mar 11, 2011, at 5:43 PM, William Herrin wrote:
On Fri, Mar 11, 2011 at 6:39 PM, Justin Krejci <jkrejci@usinternet.com> wrote:
On Wed, 2011-03-09 at 09:32 -0500, John Curran wrote:
On Mar 9, 2011, at 12:43 AM, Majdi S. Abbas wrote:
I suspect that as we reach exhaustion, more people will be forced to break space out of their provider's v4 aggregates, and announce them, and an unfiltered DFZ may well approach the 'million' entries some vendors now claim to support.
This matches my personal view (and could be viewed as "success" compared to the 5M estimate of Mr. Herrin...)
Are people going to be relying on using default-routing then in the future if they don't upgrade routers to handle large routing table growth? Or perhaps forgo dual-stack and have a separate physical IPv6 BGP network from IPv4? Are there any other strategies?
Hi Justin,
IMHO, the most sensible strategy is to recognize that the cost of a route has been dropping faster than the route count has been rising for the past decade. Then recognize that with today's hardware, building a route processor capable of keeping up with 10M routes instead of 1M routes would cost maybe twice as much... 10M being sufficient to handle the worst case estimates for the final size of the IPv4 table in parallel with any reasonable estimate of the IPv6 table in the foreseeable future. Better CPU, more DRAM, bigger TCAM. It could be built today.
But the RP is the easy cheap part. It's the line cards and the TCAM/etc. that they use that gets pricey fast.
Finally, get mad at your respective router manufacturers for engineering obsolescence into their product line by declining to give you the option.
The option of $60,000 line cards instead of $30,000 or even $25,000 instead of $12,000 does not seem like one that most would have found appealing.
But that's just my opinion...
And the above is just mine. Owen
On Fri, Mar 11, 2011 at 8:53 PM, Owen DeLong <owen@delong.com> wrote:
On Mar 11, 2011, at 5:43 PM, William Herrin wrote:
Finally, get mad at your respective router manufacturers for engineering obsolescence into their product line by declining to give you the option.
The option of $60,000 line cards instead of $30,000 or even $25,000 instead of $12,000 does not seem like one that most would have found appealing.
Ever buy SAS drives instead of SATA even though they're twice as expensive for the same disk space? Why? You may be right, but it'd be nice to have the option. -Bill -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
I'm super-tired of the "but TCAMs are an expensive non-commodity part not subject to economies of scale" line; this has been repeated ad nauseam since the RAWS workshop if not before. You don't have to build a lookup engine around a TCAM, and in fact you can use less power doing so, even though you need more silicon to achieve increased parallelism.

RFC 4984 has a lot of useful insights in it, but it has been flat wrong about two things since 2007: the impact of the rate of growth in the DFZ (for one thing, churn failed to grow in lockstep), and the ability of the technology to keep up. Not all the devices in your network need a 2 million route FIB, yet getting a device today that has one isn't that hard. It'll be a lot easier in five years, and it likely will do so without having a 144Mbit CAM ASIC.

I don't know if we'll be using commercially viable MRAM implementations in place of SRAM cells in a decade, or if we'll have more of the same only smaller and faster and much much wider, or if the LISP religion will take over the world and we'll carry the state for diversely connected edges elsewhere in the stack. What I am confident of is that as an industry we'll be able to deliver something that works without jacking up the Internet routing system and replacing it, and without somehow altering the individual decision-making processes in many tens of thousands of autonomous systems. I am also confident that the early adopters will pay more for the technology than the stragglers, that we will grumble about how much it costs and the inevitability of obsolescence, and that life will somehow go on.

Joel's widget number 2

On Mar 11, 2011, at 17:53, Owen DeLong <owen@delong.com> wrote:
On Mar 11, 2011, at 5:43 PM, William Herrin wrote:
On Fri, Mar 11, 2011 at 6:39 PM, Justin Krejci <jkrejci@usinternet.com> wrote:
On Wed, 2011-03-09 at 09:32 -0500, John Curran wrote:
On Mar 9, 2011, at 12:43 AM, Majdi S. Abbas wrote:
I suspect that as we reach exhaustion, more people will be forced to break space out of their provider's v4 aggregates, and announce them, and an unfiltered DFZ may well approach the 'million' entries some vendors now claim to support.
This matches my personal view (and could be viewed as "success" compared to the 5M estimate of Mr. Herrin...)
Are people going to be relying on using default-routing then in the future if they don't upgrade routers to handle large routing table growth? Or perhaps forgo dual-stack and have a separate physical IPv6 BGP network from IPv4? Are there any other strategies?
Hi Justin,
IMHO, the most sensible strategy is to recognize that the cost of a route has been dropping faster than the route count has been rising for the past decade. Then recognize that with today's hardware, building a route processor capable of keeping up with 10M routes instead of 1M routes would cost maybe twice as much... 10M being sufficient to handle the worst case estimates for the final size of the IPv4 table in parallel with any reasonable estimate of the IPv6 table in the foreseeable future. Better CPU, more DRAM, bigger TCAM. It could be built today.
But the RP is the easy cheap part. It's the line cards and the TCAM/etc. that they use that gets pricey fast.
Finally, get mad at your respective router manufacturers for engineering obsolescence into their product line by declining to give you the option.
The option of $60,000 line cards instead of $30,000 or even $25,000 instead of $12,000 does not seem like one that most would have found appealing.
But that's just my opinion...
And the above is just mine.
Owen
On Sat, Mar 12, 2011 at 2:17 AM, Joel Jaeggli <joelja@bogus.com> wrote:
I'm super-tired of the "but tcams are an expensive non-commodity part not subject to economies of scale". this has been repeated ad nauseam since the raws workshop if not before.
You don't have to build a lookup engine around a tcam and in fact you can use less power doing so even though you need more silicon to achieve increased parallelism.
Hi Joel,

You're either building a bunch of big TCAMs or a radix trie engine with sufficient parallelism to get the same aggregate lookup rate. If there's a materially different 3rd way to build a FIB, one that works at least as well, feel free to educate me. And while RIB churn doesn't grow in lockstep with table size, it does grow.

Either way, when you boost from 1M to 10M you're talking about engineering challenges with heat dissipation and operating challenges with power consumption, not to mention more transistors. I'll be convinced it can be done for less than 2x cost when someone actually does it for less than 2x cost.

Whether it's 2x cost or 1.2x cost, the point remains the same: we could have routers today that handle the terminal size of the IPv4 table without breaking the bank. Your favorite router manufacturer has made vague assertions about how they would build one given sufficient customer demand. So make a demand.

Regards,
Bill Herrin

-- 
William D. Herrin ................ herrin@dirtside.com  bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
On Sat, 2011-03-12 at 08:00 -0500, William Herrin wrote:
You're either building a bunch of big TCAMs or a radix trie engine with sufficient parallelism to get the same aggregate lookup rate. If there's a materially different 3rd way to build a FIB, one that works at least as well, feel free to educate me. And while RIB churn doesn't grow in lockstep with table size, it does grow.
Radix trie traversal can be pipelined, with every step in the search done in a separate memory bank. The upper levels of the trie are small, and the lower levels contain a lot of gunk which is not used often, so they can be cached on-chip. FIB lookup is much easier than executing instructions like CPUs do, precisely because packets are not dependent on each other, so you don't need to stall the pipeline (like CPUs do on jumps; I'll skip the discussion of things like branch prediction and speculative execution). This didn't stop the folks at Intel producing cheap silicon which executes instructions at astonishing speeds. Where TCAMs really shine is packet classification, but you don't generally need a huge TCAM to hold ACLs in.
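[For readers unfamiliar with the data structure under discussion, here is a toy software sketch of the per-bit trie walk; each loop iteration corresponds to one pipeline stage (one memory bank) in a hardware implementation. This is an illustration of the general technique, not any vendor's actual FIB design, and the prefixes and next-hop names are made up.]

```python
# A bitwise radix trie doing IPv4 longest-prefix match. In hardware,
# each trie level can live in its own memory bank so a new lookup
# enters the pipeline every cycle, as described above.

class TrieNode:
    __slots__ = ("children", "next_hop")
    def __init__(self):
        self.children = [None, None]   # 0-bit and 1-bit branches
        self.next_hop = None           # set if a prefix ends here

class Fib:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, prefix: int, length: int, next_hop: str) -> None:
        """Install a route; prefix is a 32-bit int, length in bits."""
        node = self.root
        for i in range(length):
            bit = (prefix >> (31 - i)) & 1
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, addr: int) -> str:
        """Longest-prefix match: remember the last next-hop seen."""
        node, best = self.root, None
        for i in range(32):
            if node.next_hop is not None:
                best = node.next_hop
            bit = (addr >> (31 - i)) & 1
            if node.children[bit] is None:
                break
            node = node.children[bit]
        else:
            if node.next_hop is not None:
                best = node.next_hop
        return best

fib = Fib()
fib.insert(0x0A000000, 8, "peer-A")    # 10.0.0.0/8
fib.insert(0x0A010000, 16, "peer-B")   # 10.1.0.0/16
print(fib.lookup(0x0A010101))          # longest match wins: peer-B
print(fib.lookup(0x0A020202))          # falls back to the /8: peer-A
```

[Note the lookup depth is bounded by the key length, 32 steps for v4 and 128 for v6, regardless of table size, which is why the pipelined approach scales with route count in a way a single shared memory would not.]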
Your favorite router manufacturer has made vague assertions about how they would build one given sufficient customer demand. So make a demand.
OFRV has a track record of producing grossly over-engineered devices, hardware-wise. I've heard a very senior hardware guy who came from OFRV claiming that they do that deliberately to increase barriers to entry for competition, though this doesn't make sense to me. --vadim
On 3/12/11 5:00 AM, William Herrin wrote:
On Sat, Mar 12, 2011 at 2:17 AM, Joel Jaeggli <joelja@bogus.com> wrote:
I'm super-tired of the "but tcams are an expensive non-commodity part not subject to economies of scale". this has been repeated ad nauseam since the raws workshop if not before.
You don't have to build a lookup engine around a tcam and in fact you can use less power doing so even though you need more silicon to achieve increased parallelism.
Hi Joel,
You're either building a bunch of big TCAMs or a radix trie engine with sufficient parallelism to get the same aggregate lookup rate.
The trie is working acceptably in 120Gb/s linecards today.
If there's a materially different 3rd way to build a FIB, one that works at least as well, feel free to educate me.
Don't need one; that's the point. Heroic measures are not in fact required. On the trie side it's the key length O(m) that dictates the lookup time, and the worst case is therefore bounded by the straightforward proposition (how do I forward to this /128). Having generally shorter routes would go a long way towards extending the route capacity of the device, but table size kills you on installing and updating the FIB, not on the search part.
And while RIB churn doesn't grow in lockstep with table size, it does grow.
It does; it just has to not grow faster than our ability to manage it. So long as that remains the case, managing the RIB remains in the domain of straightforward capacity management. Compressing RIB churn out of FIB updates, and tweaks to BGP state machines generally, is I think an area that has a lot of room for innovation, but that doesn't have to involve the forwarding plane.
Either way when you boost from 1M to 10M you're talking about engineering challenges with heat dissipation and operating challenges with power consumption, not to mention more transistors.
We don't need 10 million routes today. We need 2 million, in 3-5 years, in the class of device that currently has 512K (36Mbit CAM) and has been in production for some time. That today can be, and is being, done with 64MB of RLDRAM. In the same time-frame, the networks that currently need 2 million route FIBs will need 4-5M. To harp on RLDRAM since I'm somewhat familiar with it: clearly we need faster parts with lower power consumption, and it's timely that DDR3-derived parts should begin sampling at some point.
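[A rough back-of-envelope on those figures. Scaling to 4-5M routes at a constant bytes-per-entry is my assumption for illustration, not anything from a datasheet; real FIB encodings vary.]

```python
# 64 MB of RLDRAM holding a 2M-route FIB implies roughly 33-34 bytes
# per entry (trie nodes, next-hop pointers, overhead).

rldram_bytes = 64 * 2**20
routes = 2_000_000
bytes_per_entry = rldram_bytes / routes
print(round(bytes_per_entry, 1))   # ~33.6 bytes per route

# At the same density, the 4-5M route FIBs mentioned above would need:
for target in (4_000_000, 5_000_000):
    mb = target * bytes_per_entry / 2**20
    print(f"{target:,} routes -> ~{mb:.0f} MB")
```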
I'll be convinced it can be done for less than 2x cost when someone actually does it for less than 2x cost.
Part of the exercise is neither building nor buying the capacity before you need it. The later the features needed crystallize into silicon, the more likely they are to be usable.
Whether it's 2x cost or 1.2x cost, the point remains the same: we could have routers today that handle the terminal size of the IPv4 table without breaking the bank.
Your favorite router manufacturer has made vague assertions about how they would build one given sufficient customer demand. So make a demand.
Previous $employer crossed the 800K prefix count in FIBs on a couple of devices a while ago. I generally don't have too many cases where vendors roll their eyes and laugh hysterically when I talk about what we project our FIB requirements as; that's reserved for other features.
Regards, Bill Herrin
On Sat, Mar 12, 2011 at 4:43 PM, Joel Jaeggli <joelja@bogus.com> wrote:
On 3/12/11 5:00 AM, William Herrin wrote:
I'll be convinced it can be done for less than 2x cost when someone actually does it for less than 2x cost.
part of the exercise is neither building nor buying the capacity before you need it. the later the features needed crystalize into silicon the more likely they are to be usable.
That must be my mistake then, because I thought the exercise was building it in a way that it stays built for the maximum practical number of years. When it has to be touched again (or tweaked if it gets near its limit) that's manpower. Skilled manpower plus the secondary costs from the inevitable delay deploying it is usually the most expensive thing. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
On Sat, Mar 12, 2011 at 7:27 PM, William Herrin <bill@herrin.us> wrote:
That must be my mistake then, because I thought the exercise was building it in a way that it stays built for the maximum practical number of years. When it has to be touched again (or tweaked if it
So when you upgrade a device, you always buy the suitable device which has the highest capabilities? You put in a top-of-rack switch with 10GbE for servers with no 10GbE ports and no plans of needing 10GbE connectivity to the next round of servers? You buy a modular router for branch offices that have only a few workstations and no predictable need for upgraded connectivity? This is a good way to waste money, and a bad way to ensure that you will have the *features* you may want in the future. New features will not be back-ported to your box of choice, but you will have sunk unnecessary budget resources into that box, making it harder to justify upgrades. -- Jeff S Wheeler <jsw@inconcepts.biz> Sr Network Operator / Innovative Network Concepts
On Sat, Mar 12, 2011 at 8:44 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Sat, Mar 12, 2011 at 7:27 PM, William Herrin <bill@herrin.us> wrote:
That must be my mistake then, because I thought the exercise was building it in a way that it stays built for the maximum practical number of years. When it has to be touched again (or tweaked if it
So when you upgrade a device, you always buy the suitable device which has the highest capabilities? You put in a top-of-rack switch with 10GbE for servers with no 10GbE ports and no plans of needing 10GbE connectivity to the next round of servers? You buy a modular router for branch offices that have only a few workstations and no predictable need for upgraded connectivity?
there's probably a different need in TOR and BO/SOHO locations than core devices, eh?
On Sun, Mar 13, 2011 at 1:27 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
there's probably a different need in TOR and BO/SOHO locations than core devices, eh?
In today's backbone, this is certainly true. Feature-driven upgrades shouldn't be much of a factor for "P boxes" today, because modern networks have the option of simply label-switching in the core (just like 1990s networks could ATM/Frame-switch) without doing much of anything else. Feature-driven upgrades should be largely confined to "PE boxes." For the same reason, upgrading a P box should be easy, not hard. After all, it's just label-switching.

In today's backbones, it should be more practical than ever to buy the most cost-effective box needed for now and the predictable near-term. Cost per gigabit continues to fall. Buying dramatically more capacity than is planned to be necessary sinks capital dollars into a box that does nothing but depreciate.

I realize that organizationally-painful budgeting and purchasing processes often drive networks to buy the biggest thing available. Vendors understand this, too: they love to sell you a much bigger box than you need, just because upgrading is hard to get approved so you don't want to do it any more frequently than necessary, even when that behavior is detrimental to cash-flow and bottom line. The more broken your organization, the more you need to spend extra money on "too big" boxes. Sounds pretty self-defeating, doesn't it?

-- 
Jeff S Wheeler <jsw@inconcepts.biz>
Sr Network Operator / Innovative Network Concepts
On Sun, Mar 13, 2011 at 2:11 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Sun, Mar 13, 2011 at 1:27 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
there's probably a different need in TOR and BO/SOHO locations than core devices, eh?
In today's backbone, this is certainly true. Feature-driven upgrades shouldn't be much of a factor for "P boxes" today, because modern networks have the option of simply label-switching in the core (just like 1990s networks could ATM/Frame-switch) without doing much of anything else. Feature-driven upgrades should be largely confined to "PE boxes."
not everyone drinks the mpls koolaide... so it's not always 'just a label switch' and depending upon how large your PE mesh is, there are still some challenges in scaling this. MPLS also only shifts the burden to another place, if you provide ip-transit and you need a full table you'll have to put those routes somewhere. Sure the 'core' may not need that info, but the edge likely does, yes? Have 100g customers today? planning on having them in the next ~8/12/18 months?
For the same reason, upgrading a P box should be easy, not hard. After all, it's just label-switching. In today's backbones, it should
upgrades aren't hard, unless you get yourself into a SPOF situation with the 'P' router(s)... mechanically the upgrades aren't hard. Cost-wise though it could be, it depends upon your particular cost structure I imagine.
be more practical than ever to buy the most cost-effective box needed for now and the predictable near-term. Cost per gigabit continues to fall. Buying dramatically more capacity than is planned to be necessary sinks capital dollars into a box that does nothing but depreciate.
The discussion at the RAWS meeting, which seems to hold true for larger networks, is that a box lives in the network for ~5-7 years: first in the core for a core-class device, then progressively further toward the edge. Some thought goes into 'today I have X requirements; I can project, based on some set of metrics, that I'll have X+Y tomorrow.'
I realize that organizationally-painful budgeting and purchasing processes often drive networks to buy the biggest thing available. Vendors understand this, too: they love to sell you a much bigger box than you need just because upgrading is hard to get approved so you don't want to do it any more frequently than necessary, even when that behavior is detrimental to cash-flow and bottom line. The more broken your organization, the more you need to spend extra money on "too big" boxes. Sounds pretty self-defeating, doesn't it?
sometimes... sometimes it's just business. I suppose the point here is that a box doesn't live ~12 months or even 24; it lives longer. Planning that horizon today is problematic when a box today (even the largest box) tops out just north of 2M routes (v4; I forget the mixed v4/v6 numbers). Your network design may permit you to side-step that issue in places, but planning for that number is painful today. -Chris
On Sun, Mar 13, 2011 at 3:42 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
not everyone drinks the mpls koolaide... so it's not always 'just a label switch' and depending upon how large your PE mesh is, there are
If it isn't just a label switch, then features can (and sometimes do) drive upgrades (therefore costs.)
not need that info, but the edge likely does, yes? Have 100g customers today? planning on having them in the next ~8/12/18 months?
If you did your purchasing the way Bill Herrin suggests, you'd buy a box with 100GbE ports for a POP or branch that is not projected to have 100GbE customers, just because it's the biggest box. His position is that man-power to do an upgrade is always more costly than capital dollars for the actual equipment, and ignores the fact that the biggest box is by no means guaranteed to offer new *features* which may be required. I think most of your post is responding to a mis-read of my post, so I'll skip back to the FIB size question at hand:
sometimes... sometimes it's just business. I suppose the point here is that a box doesn't live ~12 months or even 24, it lives longer. Planning that horizon today is problematic when a box today (even the largest box) tops out just north of 2m routes (v4, I forget the mix v4/6 numbers). your network design may permit you to side step that issue in places, but planning for that number is painful today.
I'm not comfortable making the generalization that buying the box with the largest available FIB is always the most cost-effective choice. In some "box roles," traffic growth drives upgrades, and increased FIB size in future boxes will be one advantage of a future upgrade that also increases port speed or density. In other "box roles," features drive upgrades, and again, FIB size may increase in future boxes which will be bought anyway to gain desired features. It's foolish and overly-simplistic to assume that every box upgrade will be driven by an eventual exhaustion of FIB capacity.

Currently, FIB capacity is being driven by the needs of service providers' VPN PE boxes. This is great for networks that do not have that need, because it is driving FIB capacity up (or cost down) and further reducing the chance that FIB exhaustion will trigger an upgrade before other factors, such as port speed/density/features.

-- 
Jeff S Wheeler <jsw@inconcepts.biz>
Sr Network Operator / Innovative Network Concepts
On Sun, Mar 13, 2011 at 5:40 PM, Jeff Wheeler <jsw@inconcepts.biz> wrote:
On Sun, Mar 13, 2011 at 3:42 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
not need that info, but the edge likely does, yes? Have 100g customers today? planning on having them in the next ~8/12/18 months?
If you did your purchasing the way Bill Herrin suggests, you'd buy a box with 100GbE ports for a POP or branch that is not projected to have 100GbE customers, just because it's the biggest box.
Jeff,

No, Chris wouldn't, because that misrepresents my suggestion. What I suggested is that you spend your efforts making solid projections and then buy a box that satisfies the targeted function for the foreseeable future. That way you don't spend manpower replacing it until something materially different from the projections occurs. Which avoids some mistake-driven and defect-driven outages, and has a myriad of similar secondary effects. Circuit outages are minimized when the CWA is on strike. Why? Because nobody's futzing with the equipment. There's a lesson there: maximize reliability by minimizing change.

For your information, the ISP where I was the operations director survived the burst of the bubble. While revenues shrank significantly, it was still in the black in 2004 when I left. To the best of my knowledge it remained in the black until it was sold a few years later. There were a number of causes, but one of them was that in the key time frames we were able to crunch the capital budget to almost nothing, there being sufficient excess capacity in most of the equipment we already owned.
His position is that man-power to do an upgrade is always more costly than capital dollars for the actual equipment, and ignores the fact that the biggest box is by no means guaranteed to offer new *features* which may be required.
My position is that the terminal size of the IPv4 table is visible on the horizon. Now that it's part of the foreseeable future, I'd like to be able to buy boxes that support it. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
From: Randy Bush
Sent: Tuesday, March 08, 2011 7:44 PM
To: Mikael Abrahamsson
Cc: nanog@nanog.org
Subject: Re: estimation of number of DFZ IPv4 routes at peak in the future
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64.
randy
I suspect people will get varying experiences with that. That they will attempt to multi-home with teenie bits of v4 I don't disagree with, but that teenie bit had better be part of a larger aggregate that can reach at least one of their runs home. There will likely be considerable filtering once there is enough dual-stack pressure put on gear, and with v6 being where the growth is, it will get priority. Dual-homing to their ISP might be a compromise solution seen more often with v4 in that scenario.
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64. that teenie bit better be part of a larger aggregate that can reach at least one of their runs home.
the last serious satanic phylters died in 2001. sales&marketing pressure. when eyecandy.com is behind a /27, or your s&m folk sell to weenie.foo who wants you to announce their /26, it will be the end of the /24 barrier.
v6 being where the growth is it will get priority.
we wish. wanna start a pool on the growth of v6 announcements vs new multi-homed v4 announcements? randy
On 3/9/11 12:35 AM, Randy Bush wrote:
i am more of a pessimist. i suspect that there will be enough v4-only destinations out there that multi-homed enterprises fronting onto dual-stack backbones will announce teenie bits of v4 so they can nat64. that teenie bit better be part of a larger aggregate that can reach at least one of their runs home.
the last serious satainc phylters died in 2001. sales&marketing pressure. when eyecandy.com is behind a /27, or your s&m folk sell to weenie.foo who wants you to announce their /26, it will be the end of the /24 barrier.
v6 being where the growth is, it will get priority.
we wish. wanna start a pool on the growth of v6 announcements vs new multi-homed v4 announcements?
one of these curves is steeper than the other. http://www.cidr-report.org/cgi-bin/plota?file=%2fvar%2fdata%2fbgp%2fv6%2fas2.0%2fbgp-active%2etxt&descr=Active%20BGP%20entries%20%28FIB%29&ylabel=Active%20BGP%20entries%20%28FIB%29&with=step http://www.cidr-report.org/cgi-bin/plota?file=%2fvar%2fdata%2fbgp%2fas2.0%2fbgp-active%2etxt&descr=Active%20BGP%20entries%20%28FIB%29&ylabel=Active%20BGP%20entries%20%28FIB%29&with=step If the slope on the second stays within some reasonable bounds of its current trajectory then everything's cool, you buy new routers on schedule and the world moves on. The first one, however, will eventually kill us. The long run is a misleading guide to current affairs. In the long run we are all dead - John Maynard Keynes
randy
one of these curves is steeper than the other.
hey geoff, would you put on same graph :)
On Wed, 9 Mar 2011, Joel Jaeggli wrote:
one of these curves is steeper than the other.
If the slope on the second stays within some reasonable bounds of its current trajectory then everything's cool, you buy new routers on schedule and the world moves on. The first one, however, will eventually kill us.
A valid comparison really needs to use the same vertical scale. The first is only 2,300 new entries in the last 12 months. The other is 35,000 new entries in the same period. Antonio Querubin e-mail/xmpp: tony@lava.net
On 3/9/11 1:55 AM, Antonio Querubin wrote:
On Wed, 9 Mar 2011, Joel Jaeggli wrote:
one of these curves is steeper than the other.
If the slope on the second stays within some reasonable bounds of its current trajectory then everything's cool, you buy new routers on schedule and the world moves on. The first one, however, will eventually kill us.
A valid comparison really needs to use the same vertical scale. The first is only 2,300 new entries in the last 12 months. The other is 35,000 new entries in the same period.
No it doesn't. I'm more concerned about the percentage than the absolute numbers, and one of these things is doubling annually. I'll go out on a limb and say I need 150k ipv6 routes in gear that's supposed to last to 2016. joel
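Joel's percentage argument can be sketched with the thread's own numbers. This is a rough sketch, not a measurement: the ~4,600-entry v6 table is an assumption consistent with the ~2,300 new entries and annual doubling mentioned above.

```python
# Relative-growth comparison. All figures are assumptions taken from the
# thread, not measurements:
#   v4 table ~350k entries, ~35,000 added in the last 12 months
#   v6 table ~4,600 entries, ~2,300 added in the same period
v4_table, v4_new = 350_000, 35_000
v6_table, v6_new = 4_600, 2_300

v4_rate = v4_new / (v4_table - v4_new)   # ~11% annual growth
v6_rate = v6_new / (v6_table - v6_new)   # 100% annual growth, i.e. doubling

# If v6 keeps doubling annually, gear bought in 2011 and kept until 2016
# (five more doublings) would need room for roughly:
v6_2016 = v6_table * 2 ** 5
print(f"v4 growth ~{v4_rate:.0%}, v6 growth ~{v6_rate:.0%}")
print(f"projected v6 table in 2016: ~{v6_2016:,}")   # ~147,200, close to Joel's 150k
```

Under these assumptions the doubling curve lands almost exactly on the 150k figure Joel names for 2016.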
Antonio Querubin e-mail/xmpp: tony@lava.net
On 9 Mar 2011, at 07:18, Joel Jaeggli wrote:
one of these curves is steeper than the other.
That's what we wanted for the first one.
If the slope on the second stays within some reasonable bounds of its current trajectory then everything's cool, you buy new routers on schedule and the world moves on. The first one, however, will eventually kill us.
It won't; it will take an "S" shape eventually. Possibly around 120k prefixes; then it will follow the normal growth of the Internet as v4 did.
The long run is a misleading guide to current affairs. In the long run we are all dead - John Maynard Keynes
randy
-as
On Mar 9, 2011, at 4:06 AM, Arturo Servin wrote:
On 9 Mar 2011, at 07:18, Joel Jaeggli wrote:
one of these curves is steeper than the other.
That's what we wanted for the first one.
If the slope on the second stays within some reasonable bounds of its current trajectory then everything's cool, you buy new routers on schedule and the world moves on. The first one, however, will eventually kill us.
It won't; it will take an "S" shape eventually. Possibly around 120k prefixes; then it will follow the normal growth of the Internet as v4 did.
I think it will grow a lot slower than IPv4 because with rational planning, few organizations should need to add more prefixes annually, the way they had to in IPv4 due to scarcity based allocation policies. Owen
On Wed, Mar 9, 2011 at 9:28 AM, Owen DeLong <owen@delong.com> wrote:
On Mar 9, 2011, at 4:06 AM, Arturo Servin wrote:
On 9 Mar 2011, at 07:18, Joel Jaeggli wrote:
one of these curves is steeper than the other.
That's what we wanted for the first one.
If the slope on the second stays within some reasonable bounds of its current trajectory then everything's cool, you buy new routers on schedule and the world moves on. The first one, however, will eventually kill us.
It won't; it will take an "S" shape eventually. Possibly around 120k prefixes; then it will follow the normal growth of the Internet as v4 did.
I think it will grow a lot slower than IPv4 because with rational planning, few organizations should need to add more prefixes annually, the way they had to in IPv4 due to scarcity based allocation policies.
...which was, ultimately, a large part of the point of going to 128 bits: the most important one for networks. -- -george william herbert george.herbert@gmail.com
On Mar 9, 2011, at 7:28 AM, Owen DeLong wrote:
It won't, it will take an "S" shape eventually. Possibly around 120k prefixes, then it will follow the normal growth of the Internet as v4 did. I think it will grow a lot slower than IPv4 because with rational planning, few organizations should need to add more prefixes annually, the way they had to in IPv4 due to scarcity based allocation policies.
The implication of this statement would seem to be that the reason the routing tables are growing is due primarily to allocations and not deaggregation (e.g., for traffic engineering). Does anyone have any actual data to corroborate or refute this? Regards, -drc
http://scholar.google.com/scholar?hl=en&as_sdt=0,5&cluster=6058676534328717115

@article{cittadini2010evolution,
  title     = {{Evolution of Internet Address Space Deaggregation: Myths and Reality}},
  author    = {Cittadini, L. and Muhlbauer, W. and Uhlig, S. and Bush, R. and Fran{\c{c}}ois, P. and Maennel, O.},
  journal   = {Selected Areas in Communications, IEEE Journal on},
  volume    = {28},
  number    = {8},
  pages     = {1238--1249},
  issn      = {0733-8716},
  year      = {2010},
  publisher = {IEEE}
}

But times are changing, and IMHO future growth (in v4) will be because of deaggregation. For v6 I assume the growth is (and will be) from allocation, but I do not have research to support that. -as On 9 Mar 2011, at 16:00, David Conrad wrote:
On Mar 9, 2011, at 7:28 AM, Owen DeLong wrote:
It won't, it will take an "S" shape eventually. Possibly around 120k prefixes, then it will follow the normal growth of the Internet as v4 did. I think it will grow a lot slower than IPv4 because with rational planning, few organizations should need to add more prefixes annually, the way they had to in IPv4 due to scarcity based allocation policies.
The implication of this statement would seem to be that the reason the routing tables are growing is due primarily to allocations and not deaggregation (e.g., for traffic engineering). Does anyone have any actual data to corroborate or refute this?
Regards, -drc
The implication of this statement would seem to be that the reason the routing tables are growing is due primarily to allocations and not deaggregation (e.g., for traffic engineering). Does anyone have any actual data to corroborate or refute this?
Luca Cittadini, Wolfgang Mühlbauer, Steve Uhlig, Randy Bush, Pierre Francois, Olaf Maennel, Evolution of Internet Address Space Deaggregation: Myths and Reality, in IEEE Journal on Selected Areas in Communications, Vol. 28, No. 8, October 2010. http://archive.psg.com/jsac-deagg.pdf randy
the last serious satainc phylters died in 2001. sales&marketing pressure. when eyecandy.com is behind a /27, or your s&m folk sell to weenie.foo who wants you to announce their /26, it will be the end of the /24 barrier.
Sure, you can sell to someone who wants to announce a /26 and you can carry the route, but you can't force your peers or your peers' peers to take it. That's what I meant by the experience possibly varying: it might work from some locations, and it might not from others.
v6 being where the growth is, it will get priority.
we wish. wanna start a pool on the growth of v6 announcements vs new multi-homed v4 announcements?
I meant growth in traffic, not routing table. If traffic is growing on v6 and a network with dual-stacked routers comes under routing table pressure, v6 might win in the filtering decision.
On Tue, Mar 8, 2011 at 10:17 PM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
We had an interesting discussion the other day at work. We were speculating on how many DFZ IPv4 routes there would be at peak in the future before it starts to decline again due to less IPv4 usage. My guess therefore is a peak around 450-500k IPv4 DFZ routes and that this would happen in around 3-5 years. I wanted to record this for posterity.
What is your guess, and why?
Five million. Assuming the /24 boundary holds (this is likely) and we're only carrying global unicast and anycast routes (1. to 223. excluding RFC1918, 127, etc) the theoretical maximum number of possible IPv4 prefixes is around 28M. (2**24)*0.8~=14M /24s. Plus 7M /23s, 4M /22s, etc. However, practical issues will prevent excessive numbers of fully covered prefixes... So we won't generally see both /24s under a /23. We might see a /24 and a covering /23 but if we do we won't generally see the other /24. This drops us to an upper bound of 14M. There will also be a significant number of prefixes where there's just no gain from breaking them up. You're using the entire /20 at your site and you only have one /24 that you want routed differently than "normal." This will pull it down still further, cutting it somewhere between half and a third of the 14M upper bound. It'll take 10 to 20 years to get there. If we're actually able to start retiring IPv4 in 10 years then it'll peak lower. But if IPv4 sticks around, I think the global IPv4 BGP table will reach a steady state somewhere around 5 million prefixes. Regards, Bill Herrin -- William D. Herrin ................ herrin@dirtside.com bill@herrin.us 3005 Crane Dr. ...................... Web: <http://bill.herrin.us/> Falls Church, VA 22042-3004
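Bill's upper-bound arithmetic can be reproduced in a few lines (a sketch: the 0.8 usable-space factor and the /8../24 prefix range are taken from his post, and the exact totals come out slightly under his rounded figures):

```python
# Theoretical maximum number of IPv4 DFZ prefixes, per the argument above.
usable = 0.8                          # ~80% of the space is routable global unicast

slots_24 = int(2 ** 24 * usable)      # possible /24s: ~13.4M (his "~14M")

# Allowing every prefix length from /8 down to /24:
theoretical_max = int(sum(2 ** n for n in range(8, 25)) * usable)   # ~26.8M ("~28M")

# Dropping fully-covered prefixes (e.g. both /24s under an announced /23)
# roughly halves that, back to the ~14M upper bound in the post; the
# "no gain from breaking up" effect then cuts it to his ~5M steady state.
print(f"/24 slots: {slots_24:,}  theoretical max: {theoretical_max:,}")
```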
On Mar 8, 2011, at 10:17 PM, Mikael Abrahamsson wrote:
My guess therefore is a peak around 450-500k IPv4 DFZ routes and that this would happen in around 3-5 years. I wanted to record this for posterity.
What is your guess, and why?
I think it'll end up around the same range, mostly due to hardware with built-in route limits. Some providers will be stuck with this for ~5-7 years due to equipment lifecycle depreciation. - jared
You have ignored the probability of disaggregation due to IP trading markets, especially given the wild-west nature of the APNIC transfer policy. Many of the legacy blocks will get dramatically disaggregated in the likely market which could take the DFZ well beyond 500k routes. It will be very interesting to watch. Owen On Mar 8, 2011, at 7:17 PM, Mikael Abrahamsson wrote:
Hi.
We had an interesting discussion the other day at work. We were speculating on how many DFZ IPv4 routes there would be at peak in the future before it starts to decline again due to less IPv4 usage. The current number is around 350k, and my personal estimate is that it will grow by at least 100k more: the last 5 /8s being carved up at around /22 means each /8 ends up with 16k routes, the last "normal allocations" seen from the remaining RIRs will be smaller than before, and space will be de-aggregated as people "sell" or "lease" subspace of their allocations.
My guess therefore is a peak around 450-500k IPv4 DFZ routes and that this would happen in around 3-5 years. I wanted to record this for posterity.
What is your guess, and why?
-- Mikael Abrahamsson email: swmike@swm.pp.se
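The arithmetic behind the "+100k" in the quoted estimate checks out (a sketch; the five-/8 and /22 figures are from the original post):

```python
# Carving a /8 into /22 allocations yields 2^(22-8) routes per /8.
routes_per_8 = 2 ** (22 - 8)        # 16,384 -- the "16k routes" per /8
last_five_8s = 5 * routes_per_8     # 81,920 routes from the final five /8s
# The remaining ~20k of the "+100k" is attributed to smaller final RIR
# allocations and de-aggregation of sold or leased subspace.
print(routes_per_8, last_five_8s)   # 16384 81920
```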
On Tue, Mar 8, 2011 at 8:40 PM, Owen DeLong <owen@delong.com> wrote:
You have ignored the probability of disaggregation due to IP trading markets, especially given the wild-west nature of the APNIC transfer policy.
Many of the legacy blocks will get dramatically disaggregated in the likely market which could take the DFZ well beyond 500k routes.
It will be very interesting to watch.
Owen
On Mar 8, 2011, at 7:17 PM, Mikael Abrahamsson wrote:
Hi.
We had an interesting discussion the other day at work. We were speculating on how many DFZ IPv4 routes there would be at peak in the future before it starts to decline again due to less IPv4 usage. The current number is around 350k, and my personal estimate is that it will grow by at least 100k more: the last 5 /8s being carved up at around /22 means each /8 ends up with 16k routes, the last "normal allocations" seen from the remaining RIRs will be smaller than before, and space will be de-aggregated as people "sell" or "lease" subspace of their allocations.
My guess therefore is a peak around 450-500k IPv4 DFZ routes and that this would happen in around 3-5 years. I wanted to record this for posterity.
What is your guess, and why?
Strange, had this exact conversation with a boutique ISP owner I know earlier today... My hope is that things peak around 510k and that routers that can handle about a million routes now can handle all of IPv4Peak and IPv6 growth for their economic lifetimes. Disaggregation and leasing of space (or whatever) could easily spike that significantly, but aren't likely to further increase the rate of new announcements, merely how long we keep going with them. I predict that we can't really predict this yet; IPv6 actual adoption isn't far enough along to tell how bad that will end up being. If it's five years before it dominates things, we're screwed on IPv4 tables. People who need smallish but routeable blocks will be able to find them and pry them loose via enough funds and announce them. 5 more years of that, and current routers go "poof". Many go "poof" sooner. -- -george william herbert george.herbert@gmail.com
On Tue, 8 Mar 2011, Owen DeLong wrote:
On Mar 8, 2011, at 7:17 PM, Mikael Abrahamsson wrote:
the last "normal allocations" seen from the remaining RIRs will be smaller than before, plus de-aggregation of space as people "sell" or "lease" subspace of their allocations.
You have ignored the probability of disaggregation due to IP trading markets, especially given the wild-west nature of the APNIC transfer policy.
No, I haven't ignored it. I'm just more optimistic than others are about IPv6 deployment, and I expect a lot less de-aggregation due to trading. -- Mikael Abrahamsson email: swmike@swm.pp.se
btw, this discussion should not forget that the load on routers is churn and number of paths, not just prefix count. randy
participants (17)

- Antonio Querubin
- Arturo Servin
- Christopher Morrow
- David Conrad
- George Bonser
- George Herbert
- Jared Mauch
- Jeff Wheeler
- Joel Jaeggli
- John Curran
- Justin Krejci
- Majdi S. Abbas
- Mikael Abrahamsson
- Owen DeLong
- Randy Bush
- Vadim Antonov
- William Herrin