From: Craig Labovitz <labovit@merit.edu>

* there are an awful lot of withdraw/announcements out there
* we don't know were they are comming from, and no, it is not Cisco's fault
* some valid, and standards-compliant vendor implementation decisions are contributing a small fraction of extra withdraws (yes, Cisco is one of these vendors, but there are probably others)
You know, Craig, sometimes you are just too nice.

If "we don't know were they are comming from" [sic], you don't know whether it is or is not Cisco's fault. Why bend over backwards to give such public deference?

An implementation that propagates _extra_ withdrawals shouldn't _hide_ behind "standards compliant". In fact, I don't think it _is_ either "valid" or "standards compliant". There is no standard that says "send extra BGP withdrawals for routes that you are not currently announcing". It was just a bug in the implementation.

We _know_ (based on your research and Cisco's admission) that Cisco routers have/had this bug. Therefore, it is at least Cisco's "fault". We don't _know_ (no research, no admissions) that anybody else has this problem; so the value of "probably" is more like "possibly".

In any case, it "probably" doesn't matter if anybody else has the problem, since Cisco controls 85+% of the backbone market.
* there are differing opinions on how much of a problem all of these extra withdraws pose for the Internet
There are always differing opinions about the Internet. Heck, there are differing opinions about whether the Internet is even useful. We are not looking at or talking about opinions, we are looking at operational facts. And the facts indicate that the duplicate withdrawals comprise a large fraction of the BGP traffic, which in turn comprises a large fraction of the backbone router CPU usage.
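To make "duplicate withdrawals as a fraction of BGP traffic" concrete, here is a minimal sketch of how such a count could be taken from a log of updates. The `(peer, kind, prefix)` record format, the function name, and the sample data are all assumptions for illustration; they are not the format or method used in the measurements discussed here. A withdrawal is counted as redundant when the peer has no outstanding announcement for that prefix.

```python
from collections import defaultdict


def count_redundant_withdrawals(updates):
    """Count withdrawals for prefixes the peer is not currently announcing.

    `updates` is an iterable of (peer, kind, prefix) tuples, where kind is
    "A" for an announcement and "W" for a withdrawal (hypothetical format).
    Returns (redundant_withdrawals, total_withdrawals).
    """
    announced = defaultdict(set)  # peer -> prefixes currently announced
    redundant = 0
    total = 0
    for peer, kind, prefix in updates:
        if kind == "A":
            announced[peer].add(prefix)
        elif kind == "W":
            total += 1
            if prefix in announced[peer]:
                # Legitimate withdrawal: the route was being announced.
                announced[peer].discard(prefix)
            else:
                # Nothing to withdraw: never announced, or already withdrawn.
                redundant += 1
    return redundant, total


# Made-up sample log using documentation prefixes.
sample = [
    ("peer1", "A", "192.0.2.0/24"),
    ("peer1", "W", "192.0.2.0/24"),     # legitimate
    ("peer1", "W", "192.0.2.0/24"),     # redundant: already withdrawn
    ("peer2", "W", "198.51.100.0/24"),  # redundant: never announced by peer2
]
redundant, total = count_redundant_withdrawals(sample)
print(f"{redundant} of {total} withdrawals were redundant")
```

One caveat with this approach: a log that starts in the middle of a session will misclassify withdrawals of routes that were announced before the capture began, so the count is only meaningful from session start or after a full table dump.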
* even though it is not clear there was/is a significant problem, vendors (Cisco) have created fixes to limit some of the extraneous routing information
Have these fixes been universally deployed yet? What release? And how did that affect the withdrawals seen in the overall backbone?
* a recently noticed 30-second periodicity to updates tends to suggest a systematic problem in the infrastructure.
What you mean is a systematic problem in the routing code, or maybe even in the BGP specification. Occam's Razor. The "infrastructure" is something else entirely. The number of wires in the routing mesh is not likely to have any effect on the periodicity of the traffic sent over them.
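For readers who want to check a periodicity claim like this against their own logs, one simple approach is to bin update arrival times into per-second counts and look for the lag at which the autocorrelation peaks. The sketch below is only an illustration on synthetic data; the function names and the one-second bin size are assumptions, and it is not the analysis that produced the 30-second observation.

```python
def per_second_counts(timestamps, duration):
    """Bin arrival times (seconds from start) into one-second buckets."""
    counts = [0] * duration
    for t in timestamps:
        idx = int(t)
        if 0 <= idx < duration:
            counts[idx] += 1
    return counts


def autocorrelation(counts, lag):
    """Normalized autocorrelation of the count series at the given lag."""
    n = len(counts)
    if lag >= n:
        return 0.0
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts)
    if var == 0:
        return 0.0
    cov = sum((counts[i] - mean) * (counts[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


# Synthetic data: a burst of 5 updates every 30 seconds for 10 minutes.
timestamps = [float(t) for t in range(0, 600, 30) for _ in range(5)]
counts = per_second_counts(timestamps, 600)

# Search lags of 1..120 seconds for the strongest repetition.
best_lag = max(range(1, 121), key=lambda lag: autocorrelation(counts, lag))
print("strongest periodicity at lag", best_lag, "seconds")
```

A peak at a fixed lag across many independent peers would point toward a timer in the routing code rather than anything about the physical mesh.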
This 30-second periodicity occurs with both withdraws and announcements (and seems independent of the extra withdraw problem).
Yes, so why are we talking about it in the same breath (paragraph)? Wouldn't it be better to acknowledge it as a separate issue?

And how is the research on the source of the problem progressing?

WSimpson@UMich.edu
    Key fingerprint = 17 40 5E 67 15 6F 31 26 DD 0D B9 9B 6A 15 2C 32
BSimpson@MorningStar.com
    Key fingerprint = 2E 07 23 03 C5 62 70 D3 59 B1 4F 5E 1D C2 C1 A2