2008.02.19 NANOG 42 simple effective 50ms resilience for IPTV
Gotta bop these out fast, CRG West gathering is about to start. ^_^;

Matt

2008.02.19 simple, effective O(50ms) resilience for IPTV

Dino and Clarence from Cisco talk about multicast only fast reroute.

go through problem, solution, two examples, some questions.

packet loss is greatest loss on video apps
expected human MTBA is about 2 hours
node/link failures MTBA is about 100 hours
losing I frame with 50ms reroute has same visual impact as with 400ms reroute

switching time requirements
 500-1000ms unicast routing
 with FC 100-500 unicast routing
 50-100ms

problem space for MoFRR
 multicast streams need resiliency for network outages
 need fast switchover times with near 0 packet loss
 50-100ms, definitely <<< 1 second

existing redundancy
 one source, multiple diverse network paths
 multiple sources send same stream on diverse paths
 device receives, drops dupes. (source redundancy model)

for really fast switchover, can't use messaging, takes too long
can't repair when failure occurs
need to make before break
can't depend on unicast routing, takes too long
needs to be relatively low cost
incremental deployment is good

MoFRR, depends solely on PIM, doesn't wait for unicast routing protocol to reconverge
make before break
alternative to source redundancy; don't have to provision extra sources, no dupe frames
upstream routers don't need to have MoFRR

Disadvantages;
depends on equal cost multipath
 could work with NECMP
 tweak costs
 using feasible successor technology
extensions for ring tech
redundant data in some parts of network
 not so bad with dense receivers

Allow Dest router to send alternate-join message along secondary path.
A would have 2 OIFs leading to single receiver
When RPF path is up, dupes come to D from C, but D RPF fails on packets from B
So, there's some wasted bandwidth, but only as long as there's no membership on the alternate path.
Local decision to accept packets from alternate path can go quickly, it's a local decision, no signalling needed.
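
A rough sketch of the local accept decision described above, not anything shown in the talk. The class name, field names, interface labels and addresses are all made up for illustration; which neighbor is primary and which is alternate is also an assumption. The idea is only that the destination router keeps one RPF interface plus one alternate interface per (S,G), drops whatever arrives on the interface it is not currently using, and can flip between the two without any signalling.

```python
from dataclasses import dataclass


@dataclass
class MofrrEntry:
    """Hypothetical (S,G) entry holding a primary RPF and an alternate interface."""
    source: str
    group: str
    rpf_iface: str          # interface on the shortest path toward the source
    alt_iface: str          # interface the alternate join was sent out of
    rpf_up: bool = True     # local view of whether the primary path is delivering

    def selected_iface(self) -> str:
        """Interface whose packets are currently accepted."""
        return self.rpf_iface if self.rpf_up else self.alt_iface

    def accept(self, in_iface: str) -> bool:
        """RPF-style check: accept only packets from the selected interface."""
        return in_iface == self.selected_iface()

    def fail_over(self) -> None:
        """Purely local switch to the alternate path; no messages are sent."""
        self.rpf_up = False


# Router D with its primary path via C and its alternate via B (assumed labels).
entry = MofrrEntry("10.0.0.1", "232.1.1.1", rpf_iface="via_C", alt_iface="via_B")
print(entry.accept("via_C"), entry.accept("via_B"))
entry.fail_over()
print(entry.accept("via_C"), entry.accept("via_B"))
```

Because the duplicate stream is already arriving on the alternate interface, failing over is just a change of which interface passes the check.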

Observations; more redundant data as you have more ECMP path layers. But RPF failures help reduce data, but tend to converge towards a single point of failure.
If NECMP paths are used, though longer, they may be less congested where data arrives faster, so more packet loss could occur.

Ring topology extensions
 distinguish ring interfaces to be such
 allow alt-join to go on longer path
 only two interfaces
  shortest path is RPF interface
  other is alt-interface

Only the immediate dest router sends alt-join; rest of hops upstream send regular RPF join.
Need receivers to be able to accept join messages along the alt-join path, even if it is upstream on the RPF interface, for this to work.
routers only forward to RPF interface when data is received on an alt-RPF interface because its upstream failed.
Doesn't matter if you're on repair path or main path, you still have to forward alt-join messages around the ring.
If you receive data on your RPF interface, you also send it along the alternate path.
It's like counterrotating data on FDDI rings, you're just picking which path you accept it from.

A cube implements diverse paths...wow, look at the slide, that's hard to describe.
each pop ends up having connections to each ring that way.

Failure detection--hardest part to solve.
 direct link failures detected fast.
 neighbor failures can be detected via BFD fast
 upstream router or link failures take time.
use one solution to detect all cases.
Monitor data flow on the RPF path
 constant bit-rate apps have expected packet arrival times
 use counters to see if packets have been received
 polling interval is loss budget
 if counter doesn't increment within interval, you might have a failure; switch to alternate interface, you'll get data, may be redundant, but that's ok.

MoFRR patent application filed 4/26/2007
extensions for uh...stuff...
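
A minimal sketch of the counter-based failure detection noted above, under my own assumptions; it is not the speakers' implementation. The class name, interface labels and counter values are hypothetical. The caller is assumed to invoke poll() once per polling interval (the loss budget) with the cumulative packet count seen on the RPF interface. Reverting to the primary path was not covered in the notes, so the sketch stays on the alternate once it has switched.

```python
from typing import Optional


class FlowMonitor:
    """Hypothetical per-stream monitor for a constant bit-rate flow."""

    def __init__(self, rpf_iface: str, alt_iface: str) -> None:
        self.rpf_iface = rpf_iface
        self.alt_iface = alt_iface
        self.last_count: Optional[int] = None  # no sample taken yet
        self.failed_over = False

    def poll(self, rpf_packet_count: int) -> str:
        """Call once per polling interval; returns the interface to accept from."""
        # A counter that did not move during the interval suggests a failure
        # somewhere upstream, whatever the cause was.
        stalled = rpf_packet_count == self.last_count
        self.last_count = rpf_packet_count
        if stalled:
            self.failed_over = True
        return self.alt_iface if self.failed_over else self.rpf_iface


monitor = FlowMonitor("rpf0", "alt0")
for count in (100, 200, 200, 300):
    print(count, monitor.poll(count))
```

The same check covers link, neighbor and further-upstream failures, which is the "one solution to detect all cases" point: the router only cares that data stopped arriving, not why.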

Q: Eric notes he's doubled the state in the network;
no additional s,g state in the network, no additional entries in the MRIB; you need a new field in multicast RIB for which interface is your alt interface.

Q: Anne Johnson, CPAC networks--50ms to 100ms as timing target; is there actually a timing dependency given that you're setting up repair paths in advance of actually needing them?
But what if your link is more than 200ms away?
Are there implementations of this that currently exist?
Not yet, but they're working on it.
Sounds like she'll be a beta tester. :D

Randy is back up for the IPv6.
How many people are successfully on IPv6? A bunch.
How many are successfully on IPv4? A bunch, but not quite as many.

PC reminds you to fill out your survey forms once you get your connectivity back.

1400 hours resume with lightning talks.
participants (1)
-
Matthew Petach