Richard A Steenbergen wrote:
The issue that you describe does indeed place some constraints on the application of route optimization technology. Within the scope of this issue, though, I think that you would agree that a network which is ALL transit would face no challenge here -- and more specifically, if there is a routing optimization decision among local transit links, that problem could be solved independently of the existence of "non-transit" links.
Just noting why it will never be anything other than a small customer transit-only solution. As long as you are guaranteed by design that your product will never be applicable to large networks or networks with any peering, you know that odds are VERY slim you'll ever have anyone with real network clue using the product. Under such conditions, snake oil sales flourish.
It appears to me that you've acknowledged that route optimization solves a problem, albeit one that is not a complete solution for your network. The claim of 'snake oil' seems inappropriate in this context. One step further: if you are running a network of this type, then there seems to be a large likelihood that you are selling transit. Thus, your customers may well be using technology of this sort to provide real solutions to THEIR problems. (Specifically, they may be directing traffic towards providers that are to _their_ advantage, and gaining detailed insight into the real quality of connectivity being provided to them.) It's not clear to me how you chose to define "real network clue", but I would not suggest that your customers are completely lacking in that area. :)
In other cases, it may be possible to define the set of destinations that are legal over a given link, and constrain measurements for that link.
Good luck making this scale. :)
Granted - it is a limited solution -- but still a solution that does solve a set of real-world problems.
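To make the idea of constraining measurements per link concrete, here is a minimal sketch. It assumes a hand-maintained table of which destination prefixes are legal over each link; the link names, the `LINK_PREFIXES` table, and the `eligible_links` helper are all hypothetical, and the prefixes are documentation ranges. It says nothing about how such a table would be kept current at scale, which is the objection raised above.

```python
import ipaddress

# Hypothetical policy table: None means "any destination is legal" (a transit link);
# a list restricts the link to the prefixes reachable over it (e.g. a peer).
LINK_PREFIXES = {
    "transit-a": None,
    "transit-b": None,
    "peer-x": [ipaddress.ip_network("198.51.100.0/24"),
               ipaddress.ip_network("203.0.113.0/24")],
}

def eligible_links(dest, link_prefixes):
    """Return the sorted names of links over which `dest` may be measured."""
    addr = ipaddress.ip_address(dest)
    return sorted(
        link
        for link, nets in link_prefixes.items()
        if nets is None or any(addr in net for net in nets)
    )

# A destination inside the peer's prefixes can be probed over all three links;
# anything else is probed over the transit links only.
peer_dest_links = eligible_links("198.51.100.7", LINK_PREFIXES)
other_dest_links = eligible_links("192.0.2.1", LINK_PREFIXES)
```

Only links returned by `eligible_links` would be candidates for measurement (and for egress selection) for a given destination.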
What is broken for one provider and fixed at another may very well break something else that was working before at the first provider, yes? Besides the difficulties of assigning a true metric to the overall reachability of a /8 or any aggregate for that matter ("ok we decreased rtt by 20ms to these 3 destinations doing 15Mbps each but we increased rtt to this other destination doing 40Mbps by 60ms so we're better right?"),
Having measurement traffic that directly correlates to actual traffic makes this problem much more manageable.
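As a worked version of the quoted example, one possible way to score an aggregate is to weight each destination's RTT change by the traffic it carries. This is only a sketch: traffic weighting is an assumption chosen for illustration, not a metric either poster proposes, and the `weighted_rtt_delta` name is hypothetical.

```python
def weighted_rtt_delta(flows):
    """Traffic-weighted mean RTT change in ms.

    `flows` is a list of (mbps, rtt_delta_ms) pairs; a negative delta is an
    improvement, a positive delta is a regression.
    """
    total_mbps = sum(mbps for mbps, _ in flows)
    if total_mbps == 0:
        return 0.0
    return sum(mbps * delta for mbps, delta in flows) / total_mbps

# Numbers from the quoted example: three destinations at 15 Mbps each improve
# by 20 ms, one destination at 40 Mbps gets worse by 60 ms.
flows = [(15, -20), (15, -20), (15, -20), (40, 60)]
delta = weighted_rtt_delta(flows)  # (-900 + 2400) / 85 ms, i.e. a net regression
```

Under this particular weighting the change is a net loss of roughly 17.6 ms per unit of traffic, so the answer to "we're better, right?" would be no. A different weighting (per flow, per customer, per destination) could give a different answer, which is the difficulty the quoted text points at.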
The problems then become:
* The quicker you try to react, the more you place yourself at risk of starting a best path flap cycle.
* Congestion does not only happen on your uplink circuit, it can happen at every point along the path, including peers, backbone circuits, and even the end user/site links. While I find the sales pitches of people touting the horrors of peering to be quite sad (from Internap to the classic MAE Dulles :P), peering capacity is largely based on the ability to predict the traffic levels far in advance. It doesn't take that many "large" customers selecting certain destinations through one provider at once to blow up a peer in one region.
Flap control is an important consideration. Note that in the described topology, changing the selection of an egress point does not affect the routing tables of external networks (as opposed to flapping of route advertisements, for inbound traffic.) I do think that it's useful to compare the behaviour of "mortal" BGP in the conditions you describe ... if BGP selects a path that is, or becomes, congested ... BGP has no feedback mechanism to make a change until the overall topology changes, or until manual intervention. An automated route optimization system can evaluate the performance, and current load, of alternate egresses, make an automated change to the egress, and then monitor the success of the change. In most cases, the overall conditions will have been improved. In the case you describe above, the route change results in suboptimal performance, and a new decision is needed. This process needs to have effective flap control. This is an area in which I've seen a fair amount of development; and have seen good results in years of production use.
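A rough sketch of what flap control on egress selection can look like, assuming two simple mechanisms: a minimum improvement threshold and a hold-down timer after each change. The class name, parameters, and default values are hypothetical and do not describe any particular product; the sketch also assumes every egress appears in each measurement sample and ignores link failure.

```python
class EgressSelector:
    """Pick an egress by measured RTT, damped to avoid best-path flapping."""

    def __init__(self, initial, min_gain_ms=10.0, hold_down_s=300.0):
        self.current = initial
        self.min_gain_ms = min_gain_ms    # ignore improvements smaller than this
        self.hold_down_s = hold_down_s    # minimum time between egress changes
        self.last_change = None           # timestamp of the last change, if any

    def update(self, now, rtt_ms):
        """`rtt_ms` maps egress name -> measured RTT in ms. Returns the egress to use."""
        best = min(rtt_ms, key=rtt_ms.get)
        if best == self.current:
            return self.current
        in_hold_down = (
            self.last_change is not None
            and now - self.last_change < self.hold_down_s
        )
        gain = rtt_ms[self.current] - rtt_ms[best]
        # Move only if the gain is worthwhile and we have not moved recently.
        if not in_hold_down and gain >= self.min_gain_ms:
            self.current = best
            self.last_change = now
        return self.current

sel = EgressSelector("A")
first = sel.update(0, {"A": 50, "B": 45})     # 5 ms gain is below threshold: stay on A
second = sel.update(10, {"A": 80, "B": 40})   # 40 ms gain: move to B
third = sel.update(20, {"A": 20, "B": 90})    # A looks better, but hold-down: stay on B
fourth = sel.update(400, {"A": 20, "B": 90})  # hold-down expired: move back to A
```

The third call is the interesting one: the change to B turned out badly, but the selector waits out the hold-down before reversing, which trades some reaction speed for stability.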
Balancing the traffic of a GigE and a couple of FastE transits to keep each one uncongested may be enough functionality to sell some boxes to some low end users, but this falls into the categories I've described above, and does nothing to address the true end to end performance.
It's not clear to me what you mean here by "true end to end performance". I don't pretend that the approach being discussed is a COMPREHENSIVE solution to all the problems that can impair performance; but I do think that for the class of performance problems that are directly observable via inspection of alternate egresses, redirecting the egress does in fact address "true end to end performance".
Thus the only real solution to the problem if you actually want to optimize traffic is:
c) Dynamically measure all of the possible deaggregations of all active space, and dynamically determine which prefixes need to be deaggregated to what level.
Note that in any of the above cases, the de-aggregated routes should be marked NO_EXPORT.
Throw away the BGP routing table completely, and build your own based on the topology and metrics you have detected. Of course, this means saying goodbye to the usual failsafe method of keeping the normal BGP routes in the table with a lower localpref so if the box falls over you just fail back to normal BGP path selection.
This alone seems to make adoption of such technology rather difficult ...
And probably more importantly, there isn't enough scale in the traffic probing system to gather the necessary topology info once for every customer... ... Maybe if you made everyone's boxes report data back to a central site, you could gather something useful from it.
IMHO, that approach has demonstrated scalability limitations. Performance, and load information, tends to get stale very quickly.

------------------------------------------------------

While it does seem obvious that a richer palette of routing policy control SHOULD be a core part of the routing fabric, I don't expect to see BGPv4, (or multihoming under IPv6,) providing real solutions for this set of problems for the foreseeable future.

cheers -- Sean