What is broken for one provider and fixed at another may very well break something else that was working before at the first provider, yes? Besides the difficulties of assigning a true metric to the overall reachability of a /8 or any aggregate for that matter ("ok we decreased rtt by 20ms to these 3 destinations doing 15Mbps each but we increased rtt to this other destination doing 40Mbps by 60ms so we're better right?"), do you really want to see the problems you are supposed to be solving with optimized routing popping up and going away again throughout the day?
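To make the arithmetic in that parenthetical concrete, here is a minimal sketch using only the four flows quoted above. The three scoring functions are illustrative assumptions, not any vendor's or operator's actual metric; the point is only that they disagree about whether the change helped.

```python
# Figures from the quoted example: (traffic in Mbps, RTT change in ms).
# Negative delta = RTT got better, positive delta = RTT got worse.
flows = [
    (15, -20),
    (15, -20),
    (15, -20),
    (40, +60),
]

def improved_fraction(flows):
    """Share of destinations whose RTT went down (hypothetical metric)."""
    return sum(1 for _, delta in flows if delta < 0) / len(flows)

def per_destination_mean(flows):
    """Unweighted mean RTT change across destinations, in ms."""
    return sum(delta for _, delta in flows) / len(flows)

def traffic_weighted_mean(flows):
    """Mean RTT change weighted by each destination's traffic, in ms."""
    total_mbps = sum(mbps for mbps, _ in flows)
    return sum(mbps * delta for mbps, delta in flows) / total_mbps

print(improved_fraction(flows))                 # 0.75  -> "better"
print(per_destination_mean(flows))              # 0.0   -> "no change"
print(round(traffic_weighted_mean(flows), 2))   # 17.65 -> "worse"
```

Counting destinations says the change was a win, the plain average says it was a wash, and weighting by traffic says the aggregate got about 17.65 ms worse. None of the three is obviously the "true" metric for the aggregate.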
You hit the nail on the head. Fundamentally, any route optimization technique that tries to treat an aggregate of the network as a blob which can be measured will suffer the same type of problems as an IP over ATM network. There will always be a hidden layer of complexity that affects your traffic flow and that you cannot influence. For instance, if you look at a graph of point-to-point latency over a pure IP network, it will usually be nearly flat, with the occasional minor blip caused by a packet being buffered somewhere along the path. But look at the same graph for a path that includes an ATM network in the middle and it will be jumping all over the place, and sometimes you will even see wild oscillations with a regular frequency. The IP layer can do nothing about this but suffer. The same thing will happen with any measurement system that tries to classify a path through someone else's AS. You have no control over what happens in that AS and, more importantly, you have no control over the peering points between ASes. Your measurements are as meaningless as measurements of an IP over ATM network.

--Michael Dillon