On 2/24/2015 6:35 PM, William Herrin wrote:
Anyway, I heard back from DRAGON's authors. Paraphrasing:

"An aggregate (e.g. 10.0.0.0/8) must be withdrawn if the aggregate's origin loses its direct route to the filterable disaggregate's origin (e.g. 10.2.3.0/24). The withdrawn aggregate is replaced with a synthesized set of announcements which fully cover the aggregate's address space excluding the unreachable disaggregate (e.g. 10.0.0.0/15, 10.2.0.0/23, 10.2.2.0/24, 10.2.4.0/22, 10.2.8.0/21, 10.2.16.0/20, etc.). When direct connectivity is restored, the aggregate is again announced and the synthetic announcements withdrawn."

This overcomes my objection. The aggregate's origin can reasonably be programmed to trigger on the nearby disaggregate's withdrawal. System-wide withdrawal of the aggregate route is a sufficient trigger to cancel filtering on the disaggregate, which should then fully propagate. And the overall savings should still be substantial even with transient synthetics in the table.

I look forward to seeing how the authors address the many implications of this requirement. I'm not sold just yet but I am suitably impressed.

Regards,
Bill Herrin
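As a rough illustration of the synthesis paraphrased above, here is a minimal Python sketch. It assumes the synthetic set is simply the smallest list of CIDR blocks covering the aggregate minus the unreachable disaggregate; the function name `synthesize_cover` is hypothetical and nothing here is taken from DRAGON's own specification. It relies only on the standard library's `ipaddress` module.

```python
import ipaddress


def synthesize_cover(aggregate, unreachable):
    """Return CIDR blocks covering `aggregate` minus `unreachable`.

    Hypothetical helper for illustration only; not part of DRAGON.
    """
    agg = ipaddress.ip_network(aggregate)
    hole = ipaddress.ip_network(unreachable)
    # address_exclude() raises ValueError if `hole` is not inside `agg`.
    remaining = sorted(agg.address_exclude(hole))
    return [str(net) for net in remaining]


if __name__ == "__main__":
    for prefix in synthesize_cover("10.0.0.0/8", "10.2.3.0/24"):
        print(prefix)
```

For the 10.0.0.0/8 and 10.2.3.0/24 example this produces 16 prefixes, one for each prefix length from /9 through /24, starting with the six listed in the paraphrase.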
Yippee for huge amounts of automatic updates! I guess convergence latency is better than memory?

So, how many /16 networks does a core network have which it hands out to customers that are multihomed? What is the state of flux? Normally, we'd see the transition states of the more specific routes. Now we'll see multiple updates for each of those transition states (a /24 is removed so the /16 is broken, another /24 is removed so a /17 is broken, another /24 is removed so a /18 is broken). Provider X lost 50 multihomed customers spread across 20 aggregate networks. Process!

Aggregates normally cover unassigned space as well. Do we now have to define to the router which space is supposed to be used and which is not, so it knows when to break apart an aggregate?

Removing a route ("don't come this way!") is roughly the same as breaking the aggregate except for the extra processing time. It is likely that a node choosing between 2 aggregates would make the same choice between 2 more specific routes. Until convergence is done, it'd still route the wrong way in either case. One could stipulate that convergence might be slightly longer in this case due to update processing.

Routing might be contrary to desire in cases where a more specific route is advertised one way only and an aggregate is used as a fallback. While the node filtering the more specific route may consider the path the same, so it filters, the next node is making a choice between aggregates and may choose to send the traffic the other way because it's fewer AS hops; but don't worry, the 256k line backup will do just fine!

Consider this simplistic model:

    A------B
     \    /
      \  /
       C

C is a business or ISP with its own address space. It normally advertises an aggregate /20 to A and B. A and B local-pref C's routes because that's what transit providers do. C is under a DDoS attack. They issue a covering /24 to B and a /32 to B for blackhole service.
B will propagate the /32 through its entire network because the hop is to a discard (nifty!); however, the /24 will be routed the same as the /20, so it is filtered out. We can change the local-pref (go communities) of the /24 and that will allow it to propagate to A. A will accept the /24, presumably because the /24 doesn't match the selected /20 (because of local-pref).

However!

    A--D---B
     \    /
      \  /
       C

D may or may not filter the /24 from B. It depends on their routing policy. A may only see the /20 from D and thus send all its DDoS traffic on to C due to local-pref. Sorry, C. Next time, please manually change your BGP so you no longer advertise an aggregate. Oh, and it will be simpler for you to change if you just do /24 networks from now on and don't bother with the aggregate headache.

SUMMARY:

What is the cost if aggregates start being broken apart and not used because people want to ensure their traffic does what they want? What is the cost of all these aggregate networks being broken up because their more specific routes aren't there? What is the cost of managing which smaller networks are supposed to be there and which are just currently unassigned, to prevent aggregate breakup?

Jack

P.S. I didn't delve completely into all the documents, so perhaps I misunderstood or missed something important. My concerns may be completely unjustified.