They sit on /28s and smaller in some cases. I'm certainly not going to be successful in acquiring ASNs for these people to do proper load balancing between multiple ISPs, and most major ISPs see little benefit in modifying route tables to include our small netblocks.
Right on man!
It's these cases I'm concerned with. In my mind, irrespective of the comments on the functionality of DNS for this purpose, I see little other choice.
DNS is not the droid you're looking for. Using DNS response times to predict likely TCP performance is silly for at least as many reasons as using BGP aspath lengths to predict likely TCP performance is silly.
That being said, if anyone has better ideas on how to provide for high availability to millions of web sites worldwide, please let me know.
TCP performance is affected by congestion symmetry, since TCP uses the spacing of ACK packets to control the spacing of data packets. While there's no way to guarantee congestion symmetry, one of the leading indicators of whether you will have congestion symmetry is "whether you have path symmetry." Furthermore, the leading indicator of whether you have path symmetry is "whether the outbound flow's first hop is the same as the incoming flow's last hop."

Thus http://www.vix.com/pub/vixie/ifdefault/. Try it. If you don't know how to apply patches to your kernel, then have a consultant do it. I wrote this for a pornography distributor whose pageviews-per-second went up by a factor of 1.7 peak and 1.25 average just as a result of using "interface defaults" rather than speaking BGP and trying to run defaultless.
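The "interface defaults" idea can be illustrated with a small sketch. This is not Vixie's actual kernel patch, just a hedged model of the routing decision it makes: a reply leaves via the default gateway of the interface its request arrived on, so the outbound first hop matches the inbound last hop and path symmetry (hence congestion symmetry) becomes far more likely. The interface names and gateway addresses are invented for illustration.

```python
# Toy model of "interface defaults" on a dual-homed, non-BGP-speaking host.
# Each interface carries its own default route; replies ignore any single
# global default and go back out the interface the flow came in on.

# Hypothetical interface -> default-gateway table.
IFACE_DEFAULT_GW = {
    "eth0": "192.0.2.1",     # default gateway toward ISP A
    "eth1": "198.51.100.1",  # default gateway toward ISP B
}

def reply_next_hop(arrival_iface):
    """Pick the next hop for a reply: the default gateway of the
    interface the request arrived on."""
    return IFACE_DEFAULT_GW[arrival_iface]
```

On a modern Linux box the same effect is usually achieved with policy routing (per-interface routing tables selected by source address) rather than a kernel patch.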
On 12 Mar 2000, Paul Vixie wrote:
That being said, if anyone has better ideas on how to provide for high availability to millions of web sites worldwide, please let me know.
TCP performance is affected by congestion symmetry, since TCP uses the spacing of ACK packets to control the spacing of data packets. While there's no way to guarantee congestion symmetry, one of the leading indicators of whether you will have congestion symmetry is "whether you have path symmetry." Furthermore, the leading indicator of whether you have path symmetry is "whether the outbound flow's first hop is the same as the incoming flow's last hop."
Thus http://www.vix.com/pub/vixie/ifdefault/. Try it. If you don't know how to apply patches to your kernel, then have a consultant do it. I wrote this for a pornography distributor whose pageviews-per-second went up by a factor of 1.7 peak and 1.25 average just as a result of using "interface defaults" rather than speaking BGP and trying to run defaultless.
That doesn't solve one of the growing uses of such systems, which is so-called "geographical redundancy". More and more, it simply isn't acceptable to have a single location with a bunch of network links, with an attempt being made to optimize how those links are used. A single location is a single location no matter what you do with it. You need multiple locations, with a reasonably robust and somewhat (although not necessarily completely) transparent failover between them. In these cases, any best-path benefits are secondary.

The ways to do this sort of thing are very limited. Whatever you do, you end up needing either "smart" (i.e. sometimes lame) DNS servers or to originate BGP routes from multiple locations, with the same IP address actually going to different machines depending on which route is used (which is lame even more often, although it is not as bad if one facility is normally an unused backup, but that introduces lots of other issues).

If you have a better solution for this, I'm sure the world would love to hear it. Yes, many or most or all of the current implementations do or can be configured to do some questionable things. However, your solution doesn't address the whole "distributed" aspect of it.
That doesn't solve one of the growing uses of such systems, which is so-called "geographical redundancy".
That's a strawman error. This isn't the kind of system you'd use for geographical diversity. That's a different problem and has a different solution. ifdefault solves the multihoming-without-bgp problem, and as far as I know, ifdefault solves THAT problem better than any alternative I've ever heard of.
The ways to do this sort of thing are very limited.
Yes.
Whatever you do, you end up needing either "smart" (i.e. sometimes lame) DNS servers or to originate BGP routes from multiple locations, with the same IP address actually going to different machines depending on which route is used (which is lame even more often, although it is not as bad if one facility is normally an unused backup, but that introduces lots of other issues).
Inconsistent-AS. "Ick."
If you have a better solution for this, I'm sure the world would love to hear it. Yes, many or most or all of the current implementations do or can be configured to do some questionable things. However, your solution doesn't address the whole "distributed" aspect of it.
There are three interesting aspects to the geographical diversity problem.

1. "Site down." In this event, the other site(s) can detect this by periodic monitoring and use RFC 2136 DNS Dynamic Update to remove the down site's A RR so that clients won't trip over it. This requires low TTLs, and it's not elegant, but it's better than nothing. I've used this for local redundant clusters like MX farms, too.

2. "Network partition." Both (all) mirror sites might be reachable by a lot of clients, but they can't reach each other, and there are many clients who can each only reach one of the mirrors rather than all reaching all. Monitoring and RFC 2136 won't help in this case unless there's an authoritative nameserver colocated with each content server, and you have to be willing to pay the incoherence penalty, which I'm not.

3. "Worst case." Even without a partition, there are some client/mirror pairings which will fare considerably worse than others. DNS round robin, the default for BIND since 1994 or so, spreads the pain evenly but doesn't make it stop.

The important technical thing to do in all three cases is to be able to issue HTTP redirects to a better server, if there's a reason to think that there is a better server for a given client.

Now, as some of you know, I cofounded a company several years ago to address this problem (which, as I said earlier, is different from the multihoming-without-BGP problem), but we changed direction before completing this work. All I've got to say about THAT right now is that my biggest competitive worry as CTO of Vayu Communications Inc. was a company called "rndnetworks" and their product called "radware". Nobody else, ESPECIALLY Cisco with their Distributed Director, came anywhere close to solving the right problem in the right way. rndnetworks, on the other hand, caused me to lose sleep at night. Therefore, when a customer of MFN/Abovenet asks for a recommendation, I tell them to look into the rndnetworks products in this area.
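Case 1 ("site down") can be sketched as a small health-check loop. This is a hedged illustration, not a real implementation: the actual removal would be pushed to the zone with an RFC 2136 dynamic update (via a DNS library or `nsupdate`) and depends on low TTLs; here that step is reduced to pruning an in-memory RRset. Hostnames and addresses are hypothetical.

```python
# Sketch of monitor-driven A-record failover: after each round of probes,
# addresses that fail the health check are dropped from the RRset that the
# surviving sites would publish via RFC 2136 dynamic update.

def prune_dead_mirrors(a_rrset, is_alive):
    """Return the surviving A records after a round of health checks.
    `is_alive` is the probe function (an HTTP or TCP check in real life)."""
    survivors = {addr for addr in a_rrset if is_alive(addr)}
    # Never delete the last record: an empty RRset is worse than a slow site.
    return survivors or set(a_rrset)
```

For example, if the probe reports 203.0.113.10 down, only 192.0.2.10 remains in the published RRset; if every probe fails (more likely a broken monitor than four dead sites), the RRset is left untouched.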
Inconsistent-AS. "Ick."
an as should have a single routing policy. that policy could be that all prefixes beginning with odd numbers are only announced in florida and those beginning with even are only announced in stockholm. when one is a transit customer, one gets to set one's policy as one wishes.

now peering can be a different matter. in that case, peers tend to have complex requirements, e.g. some require that, for north american peerings, all prefixes must be announced at all peerings and with the same as paths. we need to remember that there are non-trivial hosting sites which pay for transit.

randy
It's these cases I'm concerned with. In my mind, irrespective of the comments on the functionality of DNS for this purpose, I see little other choice.
DNS is not the droid you're looking for.
Using DNS response times to predict likely TCP performance is silly for at least as many reasons as using BGP aspath lengths to predict likely TCP performance is silly.
That being said, if anyone has better ideas on how to provide for high availability to millions of web sites worldwide, please let me know.
TCP performance is affected by congestion symmetry, since TCP uses the spacing of ACK packets to control the spacing of data packets. While there's no way to guarantee congestion symmetry, one of the leading indicators of whether you will have congestion symmetry is "whether you have path symmetry." Furthermore, the leading indicator of whether you have path symmetry is "whether the outbound flow's first hop is the same as the incoming flow's last hop."
Just a quick note in clarification: I am less interested in intelligently directing the traffic to the closest or most optimal server farm than I am in purely ensuring that the traffic can be balanced between sites that sit within different AS's.

-------
Peter Van Oene
Senior Systems Engineer
UNIS LUMIN Inc.
www.unislumin.com
On Mon, Mar 13, 2000, Peter A. van Oene wrote:
It's these cases I'm concerned with. In my mind, irrespective of the comments on the functionality of DNS for this purpose, I see little other choice.
DNS is not the droid you're looking for.
Using DNS response times to predict likely TCP performance is silly for at least as many reasons as using BGP aspath lengths to predict likely TCP performance is silly.
That being said, if anyone has better ideas on how to provide for high availability to millions of web sites worldwide, please let me know.
TCP performance is affected by congestion symmetry, since TCP uses the spacing of ACK packets to control the spacing of data packets. While there's no way to guarantee congestion symmetry, one of the leading indicators of whether you will have congestion symmetry is "whether you have path symmetry." Furthermore, the leading indicator of whether you have path symmetry is "whether the outbound flow's first hop is the same as the incoming flow's last hop."
Just a quick note in clarification: I am less interested in intelligently directing the traffic to the closest or most optimal server farm than I am in purely ensuring that the traffic can be balanced between sites that sit within different AS's.
How do you propose to do this? It is a non-trivial problem; if you don't believe me, go ask someone trying to do it (e.g. Akamai). People forget the magic tenet things are built on here: "The less control you have over a network, the harder you have to work to deliver a given quality of service", where quality of service is something other than 'terrible'.

Adrian
Just a quick note in clarification: I am less interested in intelligently directing the traffic to the closest or most optimal server farm than I am in purely ensuring that the traffic can be balanced between sites that sit within different AS's.
How do you propose to do this? It is a non-trivial problem; if you don't believe me, go ask someone trying to do it (e.g. Akamai).
People forget the magic tenet things are built on here: "The less control you have over a network, the harder you have to work to deliver a given quality of service", where quality of service is something other than 'terrible'.
Naturally, if it were a trivial issue, we wouldn't be debating it here. I don't necessarily propose one optimal way over another. I have simply made the statement that a DNS-oriented approach can be feasible in many situations. As far as magic goes, I'd like to meet the group that runs the entire Internet, because that is the network I'm concerned about. QoS at this point is not the issue at hand in its normal sense.

Peter

-------
Peter Van Oene
Senior Systems Engineer
UNIS LUMIN Inc.
www.unislumin.com
Naturally, if it were a trivial issue, we wouldn't be debating it here.
you're somewhat disingenuous or new to this list.
a DNS-oriented approach can be feasible in many situations.
yup. unfortunately the hack is not a good net citizen (some folk don't appreciate packets thrown at their servers), and some versions are not very accurate (as the server for foo.bar may be quite net.far from the host foo.bar). but then most bgp hacks, though better net citizens, are not brilliantly accurate either. the anycast hack really being the only one that scales and performs at all well.

ymmv and tanstaafl. job security for senior geeks.

randy
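The anycast hack referred to above can be sketched with a toy model. This is illustrative only: several sites announce the identical prefix, and ordinary BGP/IGP best-path selection, not the application, delivers each client to whichever instance is "closest". The site names, the shared prefix, and the hop counts below are all invented, and real best-path selection considers far more than hop count.

```python
# Toy model of anycast: three sites all announce 192.0.2.0/24 (hypothetical),
# and routing steers each client to the instance with the best path.
# Here "best path" is reduced to a hop count per site.

ANYCAST_SITES = ["sea", "ams", "sin"]  # all originate the same prefix

def nearest_site(hops_from_client):
    """Routing, not DNS, picks the instance: the minimum hop count wins.
    `hops_from_client` maps site name -> path length seen by this client."""
    return min(ANYCAST_SITES, key=lambda site: hops_from_client[site])
```

The scaling property randy points at falls out of this: adding a site is just another origination of the same prefix, with no per-client state anywhere.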
Yo All! On Mon, 13 Mar 2000, Randy Bush wrote:
a DNS-oriented approach can be feasible in many situations.
yup. unfortunately the hack is not a good net citizen (some folk don't appreciate packets thrown at their servers), and some versions are not very accurate (as the server for foo.bar may be quite net.far from the host foo.bar).
I have DNS servers in Singapore and Germany for hosts in California (and vice versa). I do this so that the next time NorCal has a major power outage, or Taiwan has an earthquake, I do not lose my DNS in major sections of internet space! Checking the path to my DNS servers tells you NOTHING about the path to my hosts.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 20340 Empire Ave, Suite E-3, Bend, OR 97701
gem@rellim.com Tel:+1(541)382-8588 Fax:+1(541)382-8676
a DNS-oriented approach can be feasible in many situations.
yup. unfortunately the hack is not a good net citizen (some folk don't appreciate packets thrown at their servers), and some versions are not very accurate (as the server for foo.bar may be quite net.far from the host foo.bar).
but then most bgp hacks, though better net citizens, are not brilliantly accurate either. the anycast hack really being the only one that scales and performs at all well.
Help me out here. Can we not agree that exponentially more websites will require the ability to multihome to different AS's to achieve proper redundancy / disaster recovery?

Instead of simply saying "if it ain't BGP, it's crap" like the above, or the E. Gavrons telling me that these customers should simply advertise their little netblocks out of two or more AS's, can someone suggest some viable solutions? The hard reality is that there isn't enough AS space. This is unbelievably obvious. With that in mind, how do I multihome?

I am currently engaged in a great number of projects that face this exact challenge. In lieu of more strategic solutions or co-location into proper facilities, I personally don't see a better mechanism than load distribution techniques similar to the 3DNS.

-------
Peter Van Oene
Senior Systems Engineer
UNIS LUMIN Inc.
www.unislumin.com
On Wed, Mar 15, 2000, Peter A. van Oene wrote:
a DNS-oriented approach can be feasible in many situations.
yup. unfortunately the hack is not a good net citizen (some folk don't appreciate packets thrown at their servers), and some versions are not very accurate (as the server for foo.bar may be quite net.far from the host foo.bar).
but then most bgp hacks, though better net citizens, are not brilliantly accurate either. the anycast hack really being the only one that scales and performs at all well.
Help me out here. Can we not agree that exponentially more websites will require the ability to multihome to different AS's to achieve proper redundancy / disaster recovery?
Yes.
Instead of simply saying "if it ain't BGP, it's crap" like the above, or the E. Gavrons telling me that these customers should simply advertise their little netblocks out of two or more AS's, can someone suggest some viable solutions?
Viable? Yes. Scales well? Yes, if you're willing to put the work into it.
The hard reality is that there isn't enough AS space. This is unbelievably obvious. With that in mind, how do I multihome?
Multihoming, by its very definition, refers to *network* multihoming. You are after service multihoming. With the current protocol set as it stands? You can do some neat tricks, but there is no surefire way to achieve the level of redundancy that you're after without a *lot* of work. I'm not talking initial setup work, I'm talking maintenance.
I am currently engaged in a great number of projects that face this exact challenge. In lieu of more strategic solutions or co-location into proper facilities, I personally don't see a better mechanism than load distribution techniques similar to the 3DNS.
Basically you're after a new and neat way of multihoming services without requiring it to be done at the network level. Now, you have a few options as far as I can see:

* You can pick one of the hacks and work with it. It'll work now; how well it'll work, how it will scale, and how it will handle the changing internet is, well, anyone's guess.

* You can do some research into finding an elegant solution to the problem. We will probably all love you. The trouble with this is it won't be instantaneous, and you'll have to pull some magic to get people to adopt it.

* You can colocate with a large backbone which already multihomes, and work with them to develop redundancy for your services. This means you're at their mercy for service guarantees, and trying to deal with most existing network providers to do the tricks needed to present redundant services isn't going to be easy.

At the moment, I would try 2 and 3, but that's because I'm not in a "it has to be working now, or we don't get paid" boat. Unfortunately, most of us are too busy working on other pressing issues (read: paid employment) to push research into this sort of stuff (please, if you are, stick your hand up now!).

What you and a whole heap of other people need to realise is that the way the internet is built *now* prohibits people from doing redundant services without multihoming. You can do it, but you can't do it properly. I'm all up for dreaming up some DNS-URI-object cache type hack, but unless someone is going to pay a bunch of people to do the research, you might be stuck.

It's a standard software development thing: you build something with some basic assumptions. You are then totally free to do whatever you want, as long as you don't want to change the base assumptions. To change the base assumptions isn't trivial. :-)

Until the right company releases something, of course. But then, you're at their mercy.

Adrian
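The first option ("pick one of the hacks") connects back to the HTTP-redirect approach Vixie described earlier in the thread, and can be sketched as follows. This is a hedged illustration: any mirror can answer any request, but when it believes another mirror is better for a given client it issues a 302 there. The mirror URLs are hypothetical, and the "which mirror is better" heuristic is purely illustrative; real products used actual measurement.

```python
# Sketch of the HTTP-redirect hack: a mirror 302-redirects each client to
# the mirror chosen by some per-client heuristic, rather than trying to
# steer the client at the DNS or BGP layer.

# Hypothetical mirror fleet.
MIRRORS = {
    "us": "http://us.mirror.example.com",
    "eu": "http://eu.mirror.example.com",
}

def choose_mirror(client_ip):
    # Toy heuristic: pretend 192.* clients are "near" the US mirror.
    # A real selector would use measured proximity, not address prefixes.
    region = "us" if client_ip.startswith("192.") else "eu"
    return MIRRORS[region]

def redirect_response(client_ip, path):
    """Build the (status, headers) pair for a 302 pointing the client at
    its chosen mirror; a real server would emit this as an HTTP response."""
    return 302, {"Location": choose_mirror(client_ip) + path}
```

The attraction of this hack is that it needs no DNS or routing cooperation at all; the cost is one extra round trip per session and the requirement that every mirror be reachable under its own name.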
participants (8)
- Adrian Chadd
- adrian@creative.net.au
- Gary E. Miller
- Marc Slemko
- Paul A Vixie
- Paul Vixie
- Peter A. van Oene
- Randy Bush