the network at the bellevue meeting was fantastic, thanks to the host, xkl, and the usual suspects (merit, tony, ...). but there was one outage.

as best i know, that outage was caused by one of the two upstream transit isps bouncing (at least) the nanog prefix on one of their routers far over the water in seattle. of course, that is why we had two upstreams. but, flap amplification and dear old route flap damping caused our prefix to be damped by a number of global isps. yes, our prefix, not just the path.

for those who want to read up on route flap damping and why this caused problems, please see

    <http://www.nanog.org/mtg-0210/flap.html>

"Route Flap Damping: Harmful?" from the 2002 nanog eugene meeting.

for those wishing historical perspective on route flap damping, document ripe-378 (may 2006) says

    1.1 Background

    In the early 1990s the accelerating growth in the number of prefixes being announced to the Internet (often due to inadequate prefix-aggregation), the denser meshing through multiple inter-provider paths, and increased instabilities started to cause significant impact on the performance and efficiency of the Internet backbone routers. Every time a routing prefix became unreachable because of a single line-flap, the withdrawal was advertised to the whole core Internet and handled by every single router that carried the full Internet routing table.

    It was soon realized that the increasing routing churn created significant processing load on routing engines, sometimes sufficiently high load to cause router crashes. To overcome this situation RFD was developed in 1993 and has since been integrated into most router BGP software implementations. RFD is described in detail in RFC 2439. RFD is now used in many service provider networks in the Internet.
for reasons described in the 2002 preso cited above, and demonstrated by the network outage in bellevue, ripe-378 goes on to conclude

    4.0 Recommendation

    This Routing Working Group document proposes that with the current implementations of BGP flap damping, the application of flap damping in ISP networks is NOT recommended. The recommendations given in ripe-229 and previous documents [2] are considered obsolete henceforth.

i.e., it's time to turn it off. you are damaging your customers and others' customers.

randy
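To make the mechanism concrete, here is a minimal sketch of penalty-based damping in the style RFC 2439 describes: each update adds a fixed penalty, the penalty decays exponentially, and the route is suppressed above one threshold until it decays below another. The parameter values and the names `DampedRoute`, `on_flap`, and `seconds_until_reuse` are assumptions chosen for illustration (the values resemble commonly cited vendor defaults); this is not any particular router's implementation.

```python
import math

# Assumed, illustrative parameters; real deployments differ.
PENALTY_PER_FLAP = 1000.0   # added for every flap/update seen
SUPPRESS_LIMIT = 2000.0     # suppress once the penalty exceeds this
REUSE_LIMIT = 750.0         # un-suppress once the penalty decays below this
HALF_LIFE_S = 15 * 60.0     # penalty halves every 15 minutes


def decay(penalty, elapsed_s, half_life_s=HALF_LIFE_S):
    """Exponential decay: the penalty halves every half_life_s seconds."""
    return penalty * 0.5 ** (max(0.0, elapsed_s) / half_life_s)


def seconds_until_reuse(penalty):
    """Time for a penalty to decay down to REUSE_LIMIT (0 if already below)."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_S * math.log2(penalty / REUSE_LIMIT)


class DampedRoute:
    """Damping state kept by one router for one prefix from one neighbor."""

    def __init__(self):
        self.penalty = 0.0
        self.last_t = 0.0
        self.suppressed = False

    def _advance(self, t):
        # Decay the stored penalty up to time t and re-check the reuse limit.
        self.penalty = decay(self.penalty, t - self.last_t)
        self.last_t = max(self.last_t, t)
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False

    def on_flap(self, t):
        # Every update counted as a flap adds a fixed penalty.
        self._advance(t)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty > SUPPRESS_LIMIT:
            self.suppressed = True

    def is_suppressed(self, t):
        self._advance(t)
        return self.suppressed


if __name__ == "__main__":
    route = DampedRoute()
    # One upstream event seen as three updates 30 s apart (hypothetical timing).
    for t in (0.0, 30.0, 60.0):
        route.on_flap(t)
    print("penalty:", round(route.penalty, 1))
    print("suppressed:", route.suppressed)
    print("minutes until reuse:", round(seconds_until_reuse(route.penalty) / 60, 1))
```

With these assumed numbers a single update never suppresses the route, but three updates arriving close together, as flap amplification can produce from one upstream event, push the penalty well past the suppress limit and keep the prefix suppressed for roughly half an hour after the last update.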
Randy Bush wrote:
> i.e., it's time to turn it off. you are damaging your customers and others' customers.

There is a growing number of "Tier 1" NSPs who do not dampen anymore (or at least they don't dampen their customers). NTT is one of them. Who are the others?

-David
>> i.e., it's time to turn it off. you are damaging your customers and others' customers.
>
> There is a growing number of "Tier 1" NSPs who do not dampen anymore (or at least they don't dampen their customers).
damping one's customers has never been very sane. they pay us to put up with their <bleep>. damping a customer is direct death to them. i wish all my competitors did that.

damping one's peers has been another matter. this is what caused the nanog meeting prefix to be widely damped, and this is the issue i am addressing.

and, if you tell us that you need to damp in order to save your routers from drowning from churn, then you had best stand up and cry "bs!" when dave and john tell us two million prefixes is just fine. and your rir attendees had best be on the very prefix-count-conservative side in rir pi space allocation discussions.

randy
On (2007-06-23 08:22 -1000), Randy Bush wrote:
> for those wishing historical perspective on route flap damping, document ripe-378 (may 2006) says

> i.e., it's time to turn it off. you are damaging your customers and others' customers.
I've always thought that damping as an idea is a good one, but the implementation is done horribly wrong. I want my customers to get the best of the stable paths, i.e. I'd like to see a method to dynamically worsen unstable routes in path selection; local-pref would be the obvious choice for me.

--
  ++ytti
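The idea of demoting unstable routes instead of suppressing them can be sketched as follows. This is only an illustration under stated assumptions: the `Path` type, the `flap_penalty` score, the threshold, and the demotion amount are all hypothetical, and path selection is reduced to local-pref followed by AS-path length rather than the full BGP decision process.

```python
from dataclasses import dataclass

# Assumed, illustrative values; not taken from any implementation.
DEMOTE_THRESHOLD = 2000.0   # instability score above which a path is demoted
DEMOTION = 50               # how much local-pref an unstable path loses


@dataclass
class Path:
    neighbor: str          # hypothetical neighbor label
    local_pref: int        # configured local preference
    as_path_len: int       # AS-path length, used as the tie-breaker here
    flap_penalty: float    # decayed instability score for this path


def effective_local_pref(path):
    """Lower the preference of an unstable path instead of suppressing it."""
    if path.flap_penalty > DEMOTE_THRESHOLD:
        return path.local_pref - DEMOTION
    return path.local_pref


def best_path(paths):
    """Pick the highest effective local-pref, then the shortest AS path."""
    return max(paths, key=lambda p: (effective_local_pref(p), -p.as_path_len))


if __name__ == "__main__":
    flapping = Path("upstream-a", local_pref=100, as_path_len=2, flap_penalty=2900.0)
    stable = Path("upstream-b", local_pref=100, as_path_len=4, flap_penalty=0.0)
    print(best_path([flapping, stable]).neighbor)
    print(best_path([flapping]).neighbor)
```

The design point is the second case: when the flapping path is the only one available it is still selected, so the prefix stays reachable, whereas conventional damping would suppress it outright.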
participants (3)
- David Ulevitch
- Randy Bush
- Saku Ytti