massive routes hijack at AS48400, up to 6000 AS affected?
Hi all,

Jan 24 23:20 - Jan 25 01:45 UK time, from LINX peers I have seen major performance degradation on an unusually strange route to some eastern European countries - see MTR at the bottom of this email.

If this is true, it is exactly what a few people told us (and we knew) last year. Probably AS48400, which is defined in the RIPE DB as a non-transit AS multihomed to two ISPs, announced various prefixes it had learned from its ISPs, actually from one ISP to the other, thereby becoming a transit ISP. I am sure this was not the only faulty point in this failure, but it took quite a while for ISPs to fix it. It would be interesting to know whether that was unintentional... I know that at least two countries were almost taken offline internationally that way. I wonder if some sort of action should/will be taken.

Unfortunately RIPE's asinuse doesn't tell the story about 48400, but you would find it in raw update logs and bgpplay, where I counted around 6k ASes affected.

AS PATH 8468 8732 12389 48400 48400 48400 48400 20485 1299 20965 2847

    Host                                Loss%   Snt   Last    Avg   Best   Wrst  StDev
 4. te5-1.telehouse-east.core.enta.net   0.1%  2991    1.0   12.1    0.9  282.4   37.5
 5. te5-1.telecity-hex.core.enta.net     0.1%  2991    1.2    3.5    0.9  218.9   17.7
 6. te5-3.frankfurt.core.enta.net        0.0%  2991   18.0   22.7   17.8  224.9   25.0
 7. decix.comcor.ru                      0.0%  2991   64.5   64.1   63.6   92.6    0.9
 8. 213.171.44.10                       90.5%  2991  113.6  115.9  111.9  193.5    8.6
 9. 87.226.137.162                      99.4%  2991  107.0  106.0  104.4  107.8    0.9
10. 92.50.194.254                       98.9%  2991  121.0  123.2  120.4  135.1    2.7
11. vgd15.transtelecom.net              90.4%  2991  114.8  113.4  111.2  117.6    1.1
12. adm-b1-link.telia.net               88.9%  2991  103.7  104.4  102.1  108.0    1.1
13. adm-bb2-link.telia.net              89.9%  2991  105.8  104.5  102.3  119.3    1.5
14. hbg-bb2-pos7-0-0.telia.net          89.8%  2991  116.4  115.7  112.9  122.0    1.2
15. kbn-bb2-pos1-2-0.telia.net          90.7%  2986  120.6  120.7  117.9  137.1    1.7
    kbn-bb2-link.telia.net
    kbn-bb2-link.telia.net
16. kbn-b2-pos5-0.telia.net             89.7%  2986  121.0  124.7  118.4  257.9   16.1
    kbn-b2-link.telia.net
17. dante-ic-125712-kbn-b2.c.telia.net  89.2%  2985  127.1  123.2  119.3  207.3    8.9
18. so-0-0-0.rt1.tal.ee.geant2.net      89.4%  2984  135.2  136.8  133.9  162.7    2.8
19. so-1-0-0.rt1.rig.lv.geant2.net      89.2%  2984  145.4  144.7  142.2  170.9    2.1
20. 62.40.112.169                       90.0%  2981  151.1  152.1  149.8  171.6    1.8
21. litnet-gw.rt1.kau.lt.geant2.net     90.8%  2981  152.4  152.3  149.6  158.7    1.2
22 ..

Regards,
Andrius KK
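The AS path quoted in the message above can be read mechanically. What follows is only a minimal Python sketch, assuming the path is written in the usual order (the AS nearest the observer first, the origin last). The helper names `collapse_prepends` and `neighbours_of` are illustrative and do not come from any BGP tool.

```python
from itertools import groupby

# AS path exactly as quoted in the message.
AS_PATH = "8468 8732 12389 48400 48400 48400 48400 20485 1299 20965 2847"

def collapse_prepends(path):
    """Merge consecutive repeats of an ASN into (asn, repeat_count) pairs."""
    return [(asn, len(list(run))) for asn, run in groupby(path.split())]

def neighbours_of(path, asn):
    """Return the ASNs directly before and after `asn` in the collapsed path."""
    hops = [a for a, _ in collapse_prepends(path)]
    i = hops.index(asn)
    before = hops[i - 1] if i > 0 else None
    after = hops[i + 1] if i + 1 < len(hops) else None
    return before, after

print(collapse_prepends(AS_PATH))
print(neighbours_of(AS_PATH, "48400"))
```

Under that reading, AS48400 shows up in the middle of the path, prepended four times, between AS12389 and AS20485. In other words it passed a route learned from AS20485 on to AS12389, which is what a transit network does and what a non-transit multihomed AS should not do.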
On Jan 24, 2009, at 9:47 PM, AKK wrote:
> Hi all,
>
> Jan 24 23:20 - Jan 25 01:45 UK time, from LINX peers I have seen major performance degradation on an unusually strange route to some eastern European countries - see MTR at the bottom of this email.
>
> If this is true, it is exactly what a few people told us (and we knew) last year. Probably AS48400, which is defined in the RIPE DB as a non-transit AS multihomed to two ISPs, announced various prefixes it had learned from its ISPs, actually from one ISP to the other, thereby becoming a transit ISP. I am sure this was not the only faulty point in this failure, but it took quite a while for ISPs to fix it. It would be interesting to know whether that was unintentional... I know that at least two countries were almost taken offline internationally that way. I wonder if some sort of action should/will be taken. Unfortunately RIPE's asinuse doesn't tell the story about 48400, but you would find it in raw update logs and bgpplay, where I counted around 6k ASes affected.
A cursory look suggests this was nothing more than an ordinary route leak (judging by inspection of the leaking AS, the location of that AS in the relevant paths, the preservation of the prefix lengths, as opposed to some deaggregation or re-origination, and the proximity of those leaked routes to the leaking AS's transit providers). Of course, an ordinary route leak could be the result of accidental misconfiguration, but it could have been malicious as well.

When you've got:

1) an AS multi-homed to two different ISPs, AND
2) that AS fails to scope what it announces to those ISPs (i.e., advertise only locally originated and downstream prefixes explicitly), AND
3) one or both of the ISPs employ neither per-prefix nor explicit AS path filtering on ingress, OR
4) they do, but they enable a BGP session _before_ they apply that ingress policy on the session

This is exactly what you get....

To complicate things further, common RFC 1998-style routing policy models result in most clueful ISPs preferring customer routes over peer routes (e.g., via local preference), so those leaked routes are now the preferred path to ALL those prefixes (because local preference trumps AS path length). The customer, with a T1, E1, 100M Ethernet, or whatever, is now the sole primary transit data path between the two networks in question, and all their non-customer egress traffic takes that route. The congestion and collateral damage make fixing the mistake .. challenging.

In the past we always had fallback AS path filters for peers that were automatically applied to all new sessions *before* they were turned up, in order to avert this type of problem. Basically, the AS path filters listed all the ASNs that we bi-laterally interconnected with, so that if a customer leaked any of their other transit ISPs' routes to us, they'd be dropped.
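To make the interaction concrete, here is a rough Python sketch of both effects: local preference beating AS path length, and a fallback AS path filter dropping the leak. Everything in it is an assumption for illustration only. The ASNs and prefix come from the documentation ranges, the local-preference numbers are made up, and `best_path` models just the first two steps of BGP path selection rather than any vendor's full decision process.

```python
from dataclasses import dataclass

# Hypothetical values: documentation ASNs/prefix and made-up preferences.
CUSTOMER_PREF, PEER_PREF = 200, 100
BILATERAL_ASNS = {64497, 64498}  # networks we interconnect with directly

@dataclass
class Route:
    prefix: str
    learned_from: str  # "customer" or "peer"
    as_path: list      # nearest AS first

    @property
    def local_pref(self):
        # RFC 1998-style policy: customer routes are preferred over peer routes.
        return CUSTOMER_PREF if self.learned_from == "customer" else PEER_PREF

def passes_fallback_filter(route):
    """Drop customer routes whose AS path contains an AS we interconnect with."""
    if route.learned_from != "customer":
        return True
    return not any(asn in BILATERAL_ASNS for asn in route.as_path)

def best_path(routes):
    """Simplified selection: highest local preference, then shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))

# Same prefix heard directly from a peer and leaked by a multihomed customer.
direct = Route("203.0.113.0/24", "peer", [64497, 64510])
leaked = Route("203.0.113.0/24", "customer", [64500, 64500, 64497, 64510])

unfiltered = best_path([direct, leaked])
filtered = best_path([r for r in (direct, leaked) if passes_fallback_filter(r)])
print(unfiltered.learned_from)  # the longer, leaked customer route is chosen
print(filtered.learned_from)    # with the filter in place the peer route is used
```

Without the filter the leaked route is selected despite its longer AS path, because local preference is compared first. With the filter applied before selection, the customer route carrying a peer's ASN never becomes a candidate.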
Ohh, and regarding 4) above, I experienced that first hand in 1995, when a BGP session between iMCI and Sprint was turned up before ANY ingress BGP policy was applied. The T1 customer immediately became the sole transit path for all iMCI -> Sprint traffic, and all iMCI -> non-iMCI-customer traffic, as they were taking full routes from Sprint, and we preferred those paths (because of default local preference) over all other paths. It took a while to get the router rebooted to fix the problem. I suspect you can find some evidence of this event in some dusty NANOG archives out there somewhere.

Glad to see things have evolved so little...

I'll dig a little deeper and qualify the terse look I've already had when I get some time. I suspect others will be looking as well.

-danny
participants (2)
- AKK
- Danny McPherson