CIDR Aggregation Tool
Avi,

Did you not see the aggregation report by Tony Bates on cidrd posted periodically for the last couple of years? It lists aggregation gains for each origin AS, rather than the BGP neighbor in your numbers. IMHO, numbers based on origin AS are much more useful.

If you need to invent wheels, let us at least invent better wheels :-)

-- Enke

Date: Mon, 27 Nov 1995 01:34:55 EST
To: cidrd@iepg.org
From: Tony Bates <Tony.Bates@mci.net>
Subject: In the spirit...

Here's the latest top-30. Looks like UUNET-CANADA dropped a fair bit from last time.

--Tony.

ASnum    NetsNow  NetsCIDR  NetGain  % Gain  Description
AS2493      1047       576      471   45.0%  i*internet
AS174       1493      1053      440   29.5%  Performance Systems International
AS544        740       524      216   29.2%  The DataNet IP Service
AS3848       425       236      189   44.5%  WORLDLINX
AS568        759       572      187   24.6%  Milnet FIXes--144(W)/145(E)
AS4628       240        54      186   77.5%  APNIC-AS-BLOCK
AS1324       465       287      178   38.3%  ANS New York City - DNSS 35
AS97         506       338      168   33.2%  JvNCnet
AS3602       285       175      110   38.6%  Intergrated Network Services Inc.
AS4230       195        93      102   52.3%  EMBRATEL-BR
AS1717       649       556       93   14.3%  RENATER
[....]
Date: Sun, 26 Nov 1995 23:00:49 -0500
From: Avi Freedman <freedman@netaxs.com>
To: big-internet@munnari.OZ.AU, cidrd@iepg.org, nanog@merit.edu
CC: freedman@netaxs.com
Well, after observing the debates about What To Do About The Routing Table Size, I decided to work on some informal tools to examine the current routing table space for possible aggregations.
Right now, the raw data is the routing table from the Net Access MAE-East router.
It only searches for aggregates /15 < x < /25 right now - ignoring some fairly obvious aggregation-suggestions in the /8 < x < /16 range.
The results are up at http://routes.netaxs.com and some of the caveats are listed on the web page, but are:
1) When the Net Access MAE-East router has multiple identical routes that point to the same next-hop (192.41.177.x), the system only uses the one first listed in the cisco 'sho ip bgp summ' output - even though that's not always the 'best' route.
2) When an AS advertises both an aggregate and a specific, the specific is 'dropped' by the aggregator. If the input is: {205.89.10.128/17, 205.89.10.130}, the output will be: {205.89.10.128/17} (205.89.10.130 will be dropped).
3) The source of data is the Net Access MAE-East router routing table, and we don't peer with all parties at MAE-East directly - thus, it's possible that the system catches aggregations that are impossible because they're announced to a 3rd party via different paths - but an informal look around doesn't appear to indicate that.
4) The system goes by next-hop rather than by AS-Path. There seems to be a good correlation, but ...
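Caveat 2 can be illustrated with a minimal sketch. This is not the tool's actual code (which was not posted); it assumes Python's standard ipaddress module, treats a bare address as a /32 host route, and passes strict=False because 205.89.10.128/17 has host bits set. The function name drop_covered is hypothetical.

```python
import ipaddress

def drop_covered(prefixes):
    """Return the prefixes that are not covered by another prefix in the list."""
    # strict=False masks off host bits, so 205.89.10.128/17 becomes 205.89.0.0/17
    nets = [ipaddress.ip_network(p, strict=False) for p in prefixes]
    kept = []
    for net in nets:
        # A specific is "dropped" when some other, different prefix contains it
        covered = any(net != other and net.subnet_of(other) for other in nets)
        if not covered:
            kept.append(net)
    return kept

result = drop_covered(["205.89.10.128/17", "205.89.10.130"])
print([str(n) for n in result])
```

Only the aggregate survives, which is the behavior a dual-homed site may not want.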
Also, the final disclaimer: I've examined the results and they *look* reasonable. But I haven't written tester-tools to attempt to verify by another algorithm that the results are correct.
Here's the table at the end of the web page:
                             Before   After
Next-hop         Peer        Run      Agg. Run
192.41.177.110   ibm             81       68
192.41.177.115   digex           98       98
192.41.177.120   eu.net         949      775
192.41.177.125   nsn.nasa       126      112
192.41.177.140   ans           2146     1425
192.41.177.145   agis           539      304
192.41.177.150   uscyber          2        2
192.41.177.160   interpath        2        2
192.41.177.170   net99          309      246
192.41.177.180   mci              2        2
192.41.177.181   mci           8963     6828
192.41.177.190   pipex          367      308
192.41.177.210   netcom         370      348
192.41.177.235   psi              1        1
192.41.177.240   icp           4499     3712
192.41.177.241   sprintlink    6429     4694
192.41.177.245   psi           1491     1098
192.41.177.249   alternet      3971     3148
192.41.177.251   es.net          96       88
192.41.177.252   es.net         122      101
192.41.177.6     suranet        470      445
192.41.177.70    internex         1        1
192.41.177.80    ios             10        8
192.41.177.85    cais           184      158
192.41.177.86    hlc            354      243
192.41.177.89    energis          2        2
192.41.177.95    delphi           3        3
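To read the table as relative savings, the reduction per peer can be computed as (before - after) / before. The sketch below is illustrative only; it copies a few rows from the table above, and the helper name reduction is hypothetical.

```python
# peer: (routes before aggregation run, routes after), copied from the table
rows = {
    "ans": (2146, 1425),
    "agis": (539, 304),
    "mci (192.41.177.181)": (8963, 6828),
    "sprintlink": (6429, 4694),
}

def reduction(before, after):
    """Percentage of routes removed by the suggested aggregation."""
    return 100.0 * (before - after) / before

for peer, (before, after) in rows.items():
    print(f"{peer:22s} {before:5d} -> {after:5d}  {reduction(before, after):5.1f}%")
```

For example, the large MCI feed drops from 8963 to 6828 routes, a reduction of roughly 23.8%.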
A final note: We're open to additional static sources of route tables at MAE-East; additional static sources of route tables at {MAE-West, Pennsauken, or the PacBell NAP}; and dynamic sources (i.e. the vty password so we can run a sho ip bgp summ every so often) at any of the MAEs or NAPs.
If we get good feedback, we may automate this to run overnight and keep a history.
Avi Freedman freedman@netaxs.com
Avi, Did you not see the aggregation report by Tony Bates on cidrd posted periodically for the last couple of years? It lists aggregation gains for each origin AS, rather than the BGP neighbor in your numbers. IMHO, numbers based on origin AS are much more useful.
If you need to invent wheels, let us at least invent better wheels :-)
-- Enke
No, I missed the ASnum-based report by Tony on cidrd. In any case, I thought it would be useful to try to gather the data independently on real data in use by our network... Also, the report based on ASN tends to miss JVNC etc... which can be aggregated by their transit provider at the peering/exchange locations. Reports based on just next-hop seem to catch the top-level aggregation possibilities (i.e. a customer of JVNC & a customer of some other ISP might have adjacent /24s that could be aggregated)... Avi
Perhaps this is just a small error that has to be accepted in your measurements, but we are dual homed and require both the aggregate and the specific.
2) When an AS advertises both an aggregate and a specific, the specific is 'dropped' by the aggregator. If the input is: {205.89.10.128/17, 205.89.10.130}, the output will be: {205.89.10.128/17} (205.89.10.130 will be dropped).
The 205.89.10.128 was a random example. I hope that's not you :)

There are no "value" judgements made by the tool - it's just suggesting aggregates. And if we see an aggregate and a specific, both set to the same next-hop, it's quite likely that it's the same AS announcing both routes, and that they (your transit provider(s)) could do the aggregation themselves - but the tool *is* deficient in that right now it doesn't consider AS-paths.

As an example, picking an IP for branch.com (198.111.253.37):

Our route table has:

*> 198.111.252.0      192.41.177.145   <--- agis
*> 198.111.252.0/22   192.41.177.181   <--- mci
*> 198.111.253.0      192.41.177.145   <--- agis
*> 198.111.255.0      192.41.177.145   <--- agis

So if 198.111.252/23 is suggested as an aggregate for the 192.41.177.145 (AGIS) target, that's because it looks like AGIS could in fact announce 198.111.252.0/23 instead of 198.111.252.0 and 198.111.253.0.

Avi
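The per-next-hop suggestion in this example can be reproduced with a short sketch. It is an approximation, not the tool itself: it assumes Python's standard ipaddress module and assumes that a route shown without a mask carries its classful natural mask (so 198.111.252.0 is read as a /24). The helper names are hypothetical.

```python
import ipaddress
from collections import defaultdict

def parse_route(prefix):
    """Parse a prefix, applying the classful natural mask when none is shown."""
    if "/" in prefix:
        return ipaddress.ip_network(prefix)
    first_octet = int(prefix.split(".")[0])
    # Class A / B / C natural masks; class D and E are not handled in this sketch
    length = 8 if first_octet < 128 else 16 if first_octet < 192 else 24
    return ipaddress.ip_network(f"{prefix}/{length}")

def suggest_by_next_hop(routes):
    """Group routes by next hop and collapse each group into minimal prefixes."""
    groups = defaultdict(list)
    for prefix, next_hop in routes:
        groups[next_hop].append(parse_route(prefix))
    return {hop: [str(n) for n in ipaddress.collapse_addresses(nets)]
            for hop, nets in groups.items()}

routes = [
    ("198.111.252.0", "192.41.177.145"),     # agis
    ("198.111.252.0/22", "192.41.177.181"),  # mci
    ("198.111.253.0", "192.41.177.145"),     # agis
    ("198.111.255.0", "192.41.177.145"),     # agis
]
suggestions = suggest_by_next_hop(routes)
print(suggestions)
```

For the AGIS next hop this yields 198.111.252.0/23 plus the untouched 198.111.255.0/24; 198.111.254.0 is absent, so nothing wider is suggested.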
Our route table has:

*> 198.111.252.0      192.41.177.145   <--- agis
*> 198.111.252.0/22   192.41.177.181   <--- mci
*> 198.111.253.0      192.41.177.145   <--- agis
*> 198.111.255.0      192.41.177.145   <--- agis
This isn't what agis is supposed to be announcing; I'll have to ask them again to announce 198.111.252/22. There's a couple fewer routes already :-).

Once that is fixed, further aggregation of 198.111.252.0 (say into 198.111/16, as a non real example) would change our routing (in ways we don't want it changed), even with your "next hop the same" criteria because of the additional meaning that specifics have in terms of priority.

I agree that your tool is useful in identifying _potential_ savings.
[Please take any other responses just to cidrd. This is copied to big-inet and nanog so people will see the followup request. I just wanted the announcement to go out maximally, but the details and responses are of no interest to nanog as a whole...]
Our route table has:

*> 198.111.252.0      192.41.177.145   <--- agis
*> 198.111.252.0/22   192.41.177.181   <--- mci
*> 198.111.253.0      192.41.177.145   <--- agis
*> 198.111.255.0      192.41.177.145   <--- agis
This isn't what agis is supposed to be announcing; I'll have to ask them again to announce 198.111.252/22. There's a couple fewer routes already :-).
Once that is fixed, further aggregation of 198.111.252.0 (say into 198.111/16, as a non real example) would change our routing (in ways we don't want it changed), even with your "next hop the same" criteria because of the additional meaning that specifics have in terms of priority.
Well, we only see 198.111.252, 253, and 255 from AGIS, so there's no danger of AGIS over-aggregating even if they combined 252 & 253...
I agree that your tool is useful in identifying _potential_ savings.
That's all it's for. Avi
Avi,

While this is useful as a metric of the degree of aggregation, it is not sufficient to configure aggregation. This is still useful as a means to look at where improvement may be needed. Please don't get me wrong in pointing out that there are limits to the applicability, this *is* useful.

In message <199511270400.XAA05234@netaxs.com>, Avi Freedman writes:
Well, after observing the debates about What To Do About The Routing Table Size, I decided to work on some informal tools to examine the current routing table space for possible aggregations.
Right now, the raw data is the routing table from the Net Access MAE-East router.
It only searches for aggregates /15 < x < /25 right now - ignoring some fairly obvious aggregation-suggestions in the /8 < x < /16 range.
The results are up at http://routes.netaxs.com and some of the caveats are listed on the web page, but are:
1) When the Net Access MAE-East router has multiple identical routes that point to the same next-hop (192.41.177.x), the system only uses the one first listed in the cisco 'sho ip bgp summ' output - even though that's not always the 'best' route.
The routing table seen at one place in the Internet is insufficient to determine whether aggregation is possible. Part of the problem is that equally specific alternate paths are suppressed until primary connectivity is lost. You may not see alternate paths for multihomed sites that would prevent aggregation.
2) When an AS advertises both an aggregate and a specific, the specific is 'dropped' by the aggregator. If the input is: {205.89.10.128/17, 205.89.10.130}, the output will be: {205.89.10.128/17} (205.89.10.130 will be dropped).
This is often done to allow a multihomed component of an aggregate to be routed correctly while still providing the backup path, or allowing load splitting across providers (usually load split by AS path filtering or more simply by AS path length). There is no way to tell a mistake from this being done intentionally.
3) The source of data is the Net Access MAE-East router routing table, and we don't peer with all parties at MAE-East directly - thus, it's possible that the system catches aggregations that are impossible because they're announced to a 3rd party via different paths - but an informal look around doesn't appear to indicate that.
4) The system goes by next-hop rather than by AS-Path. There seems to be a good correlation, but ...
You really need to take into account AS path. As far back as 1992 we've seen grossly optimistic estimates based solely on next hop at single points (including estimates from me in June/July 1992). In effect what you have is the degree of aggregation possible if the aggregation boundary was extended around the entire rest of the world (or at least everyone that peered directly with you at the point of measurement). What can be aggregated according to such an estimate will vary according to who is making the measurement and where.
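The gap between a next-hop-only estimate and an AS-path-aware one can be sketched as follows. The routes, addresses, and private AS numbers are made up for illustration, and the helper names are hypothetical; the only difference between the two runs is the grouping key.

```python
import ipaddress
from collections import defaultdict

def aggregate(routes, key):
    """Collapse prefixes within each group selected by the key function."""
    groups = defaultdict(list)
    for route in routes:
        groups[key(route)].append(ipaddress.ip_network(route["prefix"]))
    return {k: list(ipaddress.collapse_addresses(v)) for k, v in groups.items()}

def route_count(groups):
    return sum(len(nets) for nets in groups.values())

# Hypothetical data: one next hop, three different AS paths behind it
routes = [
    {"prefix": "10.1.0.0/24", "next_hop": "192.0.2.1", "as_path": (64512, 64600)},
    {"prefix": "10.1.1.0/24", "next_hop": "192.0.2.1", "as_path": (64512, 64601)},
    {"prefix": "10.1.2.0/24", "next_hop": "192.0.2.1", "as_path": (64512, 64602)},
    {"prefix": "10.1.3.0/24", "next_hop": "192.0.2.1", "as_path": (64512, 64602)},
]

by_hop = aggregate(routes, key=lambda r: r["next_hop"])
by_path = aggregate(routes, key=lambda r: (r["next_hop"], r["as_path"]))
print(route_count(by_hop), route_count(by_path))
```

Grouping by next hop alone folds all four routes into one /22, while requiring identical AS paths leaves three routes, which is the sense in which the next-hop figure is optimistic.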
Also, the final disclaimer: I've examined the results and they *look* reasonable. But I haven't written tester-tools to attempt to verify by another algorithm that the results are correct.
A final note: We're open to additional static sources of route tables at MAE-East; additional static sources of route tables at {MAE-West, Pennsauken, or the PacBell NAP}; and dynamic sources (i.e. the vty password so we can run a sho ip bgp summ every so often) at any of the MAEs or NAPs.
You will find that while improvement is possible, and may still even be fairly substantial, your figures represent an optimistic estimate.

Another way of estimating what can be aggregated is by determining from how many places all of the components of an aggregate could be heard in all backup situations. In some cases it might be reasonable to drop some degree of alternate connectivity (fourth or fifth preferred paths) and allow a number of holes (specifically aggregated components). In principle this could be done algorithmically using the IRR. In practice, you need to check with some of the parties involved to make sure registered information (particularly aut-num AS peerings) is accurate beforehand.

Using the IRR you (or we) can select candidates for aggregation and then make sure the aggregation can really be safely done. This is a little different from your estimate in that the viewpoint is what we can aggregate, rather than what we might see better aggregated in the future. The bgp paths at major interconnects could form a useful sanity check, making sure that AS paths do not conflict with IRR AS peering information for any candidate for aggregation.
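The sanity check described here, comparing observed AS paths with registered peerings, might look like the sketch below. It is only an outline under stated assumptions: the IRR data is represented as a plain set of AS pairs (parsing real aut-num objects is not shown), the AS numbers are private-use placeholders, and the function name path_conflicts is hypothetical.

```python
def path_conflicts(as_path, irr_peerings):
    """Return adjacent AS pairs in as_path that have no registered peering."""
    # Peerings are treated as undirected for this sketch
    registered = {frozenset(pair) for pair in irr_peerings}
    hops = zip(as_path, as_path[1:])
    # a != b skips repeated ASes produced by path prepending
    return [(a, b) for a, b in hops
            if a != b and frozenset((a, b)) not in registered]

# Hypothetical registered peerings and observed paths
irr_peerings = {(64512, 64600), (64600, 64700)}
ok_path = (64512, 64600, 64600, 64700)
odd_path = (64512, 64601, 64700)

print(path_conflicts(ok_path, irr_peerings))
print(path_conflicts(odd_path, irr_peerings))
```

A candidate aggregate whose component paths return any conflicts would be set aside until the registry or the routing is corrected.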
If we get good feedback, we may automate this to run overnight and keep a history.
Avi Freedman freedman@netaxs.com
Thanks for the information.

Curtis

ps- This thread was not about ANS, but for those following NANOG activity who may want it, here is a brief update. We have not yet deployed the code needed to generate configurations that take advantage of aggregates marked in the IRR. For a rough hint at where we are headed, see:

ftp://ftp.ans.net/pub/papers/slides/nanog-sep-1995-proxy-agg.ps
Avi,
While this is useful as a metric of the degree of aggregation, it is not sufficient to configure aggregation. This is still useful as a means to look at where improvement may be needed. Please don't get me wrong in pointing out that there are limits to the applicability, this *is* useful.
I should have been much more specific. There was never any intent to suggest that this was a configuration-generator... Nor was there any intent to imply that certain providers are bad or remiss or misconfigured or underconfigured... The hope was that it might be useful as something to look at (a raw generator of aggregation-possibilities). Also, I was interested in seeing how many routes could be aggregated at one peering point (not inside a provider's network, but at one edge). For example, if the processing to compute the aggregations was low, and only aggregated entries were inserted into a cache (but the actual BGP announced more specific entries were kept in the routing table), perhaps that would help if route table and/or cache table size was/is still a problem. But it seems that the current generation of technology that is out there doesn't run into a SP-cache-size wall from either a switching-speed or memory angle.
You will find that while improvement is possible, and may still even be fairly substantial, your figures represent an optimistic estimate.
Agreed. [Suggestions re: IRR deleted for brevity.] Agreed re: the IRR as well. Next on the list is to examine the Merit tools. Sigh - my nanog notes were lost when my 200LX was stolen, but I know where to find the tools...
Thanks for the information.
Thanks for the feedback.
Curtis
Avi
I wouldn't mind if aggregation was done automatically unless the route was specifically registered somewhere as being a "don't aggregate" route. In other words, let people indicate if it wasn't a mistake.
load splitting across providers (usually load split by AS path filtering or more simply by AS path length). There is no way to tell a mistake from this being done intentionally.
In message <m0tK97D-000Nj7C@aero.branch.com>, Jon Zeeff writes:
I wouldn't mind if aggregation was done automatically unless the route was specifically registered somewhere as being a "don't aggregate" route. In other words, let people indicate if it wasn't a mistake.
load splitting across providers (usually load split by AS path filtering or more simply by AS path length). There is no way to tell a mistake from this being done intentionally.
A glance at the IRR is all it takes to find a dual homed site and determine that even though primary connectivity is through the same path as the aggregate, an alternate path goes through another path. The answer there is often to aggregate anyway and pass the dual homed exceptions as explicit component routes.

Problems can arise if there is no aut-num object for the AS in question, or the object is not accurate, or the routes in question don't have their own AS and are not registered correctly. Of course, if you can't be bothered registering in the IRR correctly, you have to accept the consequences.

In the context of ANS's scheme, after determining that something could be aggregated we would mark the aggregate with the community ANSAGGREGATE and the components with the community ANSCOMPONENTS. The rest is (not yet deployed, and not quite completely coded) magic. The aggregate would be formed in the next config run.

There is also a plan to use ANSPROXYAGGR and ANSPROXYCOMP but marking other people's route objects (the components of a proxy aggregate) with a community is a hard thing to do if not in the maintainer list for the object.

Curtis
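"Aggregate anyway and pass the dual homed exceptions as explicit component routes" can be sketched in a few lines. This is not ANS's configuration generator, which the thread describes as not yet deployed; the keep_specific set stands in for whatever the IRR (or a "don't aggregate" registration) would supply, and all names and prefixes are hypothetical.

```python
import ipaddress

def plan_announcements(components, keep_specific):
    """Collapse components into aggregates, then re-add flagged specifics."""
    nets = [ipaddress.ip_network(p) for p in components]
    keep = {ipaddress.ip_network(p) for p in keep_specific}
    aggregates = list(ipaddress.collapse_addresses(nets))
    # A flagged component is announced explicitly only if aggregation swallowed it
    exceptions = sorted(n for n in set(nets) if n in keep and n not in aggregates)
    return [str(n) for n in aggregates + exceptions]

components = ["10.2.0.0/24", "10.2.1.0/24", "10.2.2.0/24", "10.2.3.0/24"]
dual_homed = {"10.2.1.0/24"}  # hypothetical: registered as multihomed

plan = plan_announcements(components, dual_homed)
print(plan)
```

Four routes become two: the /22 aggregate plus the one specific that still needs to be visible for the alternate path.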
Curtis Villamizar (curtis@ans.net) on November 28:
There is also a plan to use ANSPROXYAGGR and ANSPROXYCOMP but marking other people's route objects (the components of a proxy aggregate) with a community is a hard thing to do if not in the maintainer list for the object.
There are three proposals to enable marking other people's routes in the RPS WG: one based on ISP tags, one on route macros and one on attachment objects. Once one (or more) of these schemes are accepted and implemented, this may not be a difficult problem anymore.

Cengiz

--
Cengiz Alaettinoglu Information Sciences Institute
(310) 822-1511 University of Southern California
http://www.isi.edu/div7/people/cengiz
Curtis Villamizar (curtis@ans.net) on November 27:
[lots deleted]

Another way of estimating what can be aggregated is by determining from how many places all of the components of an aggregate could be heard in all backup situations. In some cases it might be reasonable to drop some degree of alternate connectivity (fourth or fifth preferred paths) and allow a number of holes (specifically aggregated components). In principle this could be done algorithmically using the IRR. In practice, you need to check with some of the parties involved to make sure registered information (particularly aut-num AS peerings) is accurate beforehand.

Using the IRR you (or we) can select candidates for aggregation and then make sure the aggregation can really be safely done. This is a little different from your estimate in that the viewpoint is what we can aggregate, rather than what we might see better aggregated in the future. The bgp paths at major interconnects could form a useful sanity check, making sure that AS paths do not conflict with IRR AS peering information for any candidate for aggregation.

[lots deleted]
Actually we are working on such a tool, that we call CIDR assistant. A pre-alpha release of this tool will be available before/during the IETF, and there will be a discussion of this tool in the RPS wg.

This tool considers the topology and the policies registered in the IRR before suggesting potential aggregations. The amount of policy/topology that is considered is configurable.

Cengiz

--
Cengiz Alaettinoglu Information Sciences Institute
(310) 822-1511 University of Southern California
http://www.isi.edu/div7/people/cengiz
participants (5)
- Avi Freedman
- Cengiz Alaettinoglu
- Curtis Villamizar
- Enke Chen
- jon@branch.com