multi homing pressure
"Firms must defend against ISP clashes, warns Gartner Commercial row between ISPs shows vulnerability of single sourcing says analyst" http://www.computerweekly.com/Articles/Article.aspx?liArticleID=212391 Looks like it's about to enter the corporate rule book "Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed" and end "tier 1" hosting brandon
On Wed, Oct 19, 2005 at 03:48:09PM +0100, Brandon Butterworth wrote:
"Firms must defend against ISP clashes, warns Gartner Commercial row between ISPs shows vulnerability of single sourcing says analyst" http://www.computerweekly.com/Articles/Article.aspx?liArticleID=212391
Looks like it's about to enter the corporate rule book
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
it will be interesting to see if this has actual impact on ASN allocation rates globally.

- jared

--
Jared Mauch | pgp key available via finger from jared@puck.nether.net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
In a message written on Wed, Oct 19, 2005 at 11:31:32AM -0400, Jared Mauch wrote:
it will be interesting to see if this has actual impact on ASN allocation rates globally.
I have done no analysis, but I do believe this is having an effect on the number of prefixes announced by many of the players involved. Looking at the top 10-20 peers over here, all of them show prefixes announced by the peers to be growing faster than the global prefix table. The only way that makes sense is if existing prefixes are being announced through additional providers.

It would be interesting for those more into BGP routing analysis to look at that (possible) trend. It's probably causing a shift in how BGP processing occurs on both a device and a network level (more redundant paths), which could have implications for future gear.

--
Leo Bicknell - bicknell@ufp.org - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - tmbg-list-request@tmbg.org, www.tmbg.org
jared@puck.nether.net (Jared Mauch) writes:
it will be interesting to see if this has actual impact on ASN allocation rates globally.
i don't think so. multihoming without bgp isn't as hard as qualifying for PI space. i think we'll finally see enterprise-sized multihoming NAT/proxy products. -- Paul Vixie
On Wed, Oct 19, 2005 at 10:19:28PM +0000, Paul Vixie scribed:
jared@puck.nether.net (Jared Mauch) writes:
it will be interesting to see if this has actual impact on ASN allocation rates globally.
i don't think so. multihoming without bgp isn't as hard as qualifying for PI space. i think we'll finally see enterprise-sized multihoming NAT/proxy products.
If you can run Squid, you can multihome your web connections today. It's a little bit awkward to configure, but then again, so is Squid. People are welcome to poke at, fold, spindle, or mutilate:

http://nms.lcs.mit.edu/ron/ronweb/#code

(Part of my thesis work, Monet is a modification to Squid that causes it to try to open N TCP connections to a Web server that it wants to talk to. It uses the first SYN ACK to return, and closes the other connections to be a nice neighbor. It's shockingly effective at improving availability to Web sites that are themselves multihomed or otherwise good.

Warning: Often still leads to annoyance if you find yourself able to browse the web but not do anything else. We do have a NAT version of this that works with arbitrary protocols. If people are interested, I'll try to convince my former student to dig up the code and make it a bit prettier.)

-Dave
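For readers who want the flavor of the approach without reading the Squid patch, here is a minimal Python sketch of the connection-racing idea. It is not the Monet code itself; the host, port, and per-provider source addresses are invented for illustration:

    import socket
    from concurrent.futures import ThreadPoolExecutor, as_completed

    def race_connect(host, port, source_addrs, timeout=5.0):
        # One connection attempt per local (per-provider) source address;
        # the first handshake to complete wins, the rest are closed.
        def attempt(src):
            return socket.create_connection((host, port), timeout=timeout,
                                            source_address=(src, 0))
        winner = None
        with ThreadPoolExecutor(max_workers=len(source_addrs)) as pool:
            futures = [pool.submit(attempt, a) for a in source_addrs]
            for fut in as_completed(futures):
                try:
                    sock = fut.result()
                except OSError:
                    continue          # this path failed; wait for another
                if winner is None:
                    winner = sock     # first to finish the handshake wins
                else:
                    sock.close()      # be a nice neighbor: drop the extras
        if winner is None:
            raise OSError("all paths failed")
        return winner

    # e.g. two provider-assigned local addresses on a multihomed host:
    # sock = race_connect("www.example.com", 80, ["192.0.2.10", "198.51.100.10"])

Monet races the SYNs inside Squid rather than whole sockets from a thread pool, but the failover property is the same: any one working path is enough.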
On Wed, 19 Oct 2005, David G. Andersen wrote:
If you can run Squid, you can multihome your web connections today. It's a little bit awkward to configure, but then again, so is Squid. People are welcome to poke at, fold, spindle, or mutilate:
http://nms.lcs.mit.edu/ron/ronweb/#code
(Part of my thesis work,
Hehe, google for "vixie ifdefault".

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune: Why do they call it baby-SITTING when all you do is run after them?
On Thu, Oct 20, 2005 at 03:18:35AM +0100, Paul Jakma scribed:
On Wed, 19 Oct 2005, David G. Andersen wrote:
If you can run Squid, you can multihome your web connections today. It's a little bit awkward to configure, but then again, so is Squid. People are welcome to poke at, fold, spindle, or mutilate:
http://nms.lcs.mit.edu/ron/ronweb/#code
(Part of my thesis work,
Hehe, google for "vixie ifdefault".
Right. Vix was talking about the inbound path - I'm talking about the outbound path. Complementary solutions to the same problem.

-Dave
http://nms.lcs.mit.edu/ron/ronweb/#code
(Part of my thesis work,
Hehe, google for "vixie ifdefault".
Paul's use of Squid is mentioned in this NANOG posting:
http://www.cctec.com/maillists/nanog/historical/9702/msg00431.html

Here are the notes from the SF NANOG presentation:
http://www.academ.com/nanog/feb1997/multihoming.html

--Michael Dillon
On Oct 20, 2005, at 5:37 AM, Michael.Dillon@btradianz.com wrote:
http://nms.lcs.mit.edu/ron/ronweb/#code
(Part of my thesis work,
Hehe, google for "vixie ifdefault".
Paul's use of Squid is mentioned in this NANOG posting: http://www.cctec.com/maillists/nanog/historical/9702/msg00431.html Here are the notes from the SF NANOG presentation: http://www.academ.com/nanog/feb1997/multihoming.html
Right. Though the details are very sparse, this is stock Squid running in accelerator mode. The solution I described is quite different (for one, it's normal-mode squid for _outbound_ requests, and second, it actually probes the links to see if they're working).

Commercial solutions that look a lot more like the stuff we built are the products by Stonesoft ("Multi-Link Technology") and Fatpipe ("Redundant Array of Independent Lines"). RadWare's "LinkProof" has a similar style, though the actual technique they use is more link-centric instead of path-centric.

-Dave
On Wed, 19 Oct 2005, Brandon Butterworth wrote:
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
200k routes, here we come!

--
Todd Vierling <tv@duh.org> <tv@pobox.com> <todd@vierling.name>
On Wed, 19 Oct 2005, Todd Vierling wrote:
On Wed, 19 Oct 2005, Brandon Butterworth wrote:
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
200k routes, here we come!
it is just good common sense though, eh? Also, there has been some pressure through SOX and other compliance activities to push dual-carrier things in the recent past.
On Wed, 19 Oct 2005, Christopher L. Morrow wrote:
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
200k routes, here we come!
it is just good common sense though, eh?
Well, not necessarily.

Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more less routes in the global table.

It's a Good Thing(tm).

--
Todd Vierling <tv@duh.org> <tv@pobox.com> <todd@vierling.name>
tv@duh.org (Todd Vierling) wrote:
Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more less routes in the global table.
That's the operators' view, but not the customer's. The customer wants redundancy.

So we should try to find a way to tell them "Hey, it's mostly Tier-1's (or wannabes) that play such games, stick to a trustworthy Tier-2. And, hey, btw., connect redundantly to them, so you have line failure resiliency and also a competent partner that cares for everything else."

Presenting only the operators' view will do nothing for the customer's willingness to go along with the Tier-2. Eventually, it comes down to trust. And customers learn that the "big players" are not always trustworthy. Oh, and customers do not always remember names.

Yours, Elmar.

--
"Begehe nur nicht den Fehler, Meinung durch Sachverstand zu substituieren." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>)
--------------------------------------------------------------[ ELMI-RIPE ]---
On Wed, 19 Oct 2005, Elmar K. Bins wrote:
Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more less routes in the global table.
That's the operators' view, but not the customer's. The customer wants redundancy.
That's why SLAs exist.
So we should try to find a way to tell them "Hey, it's mostly Tier-1's (or wannabes) that play such games, stick to a trustworthy Tier-2. And, hey, btw., connect redundantly to them, so you have line failure resiliency and also a competent partner that cares for everything else."
Something like that, but not quite. Whenever one of these reports, which boil down to "everyone must multi-home!", appears, it typically has a stark lack of information on alternatives to *direct* multi-homing.

Many customers would rather not multihome directly, and prefer "set it and forget it" connectivity. It's much easier to maintain a multi-pipe connection that consists of one static default route than a pipe to multiple carriers. The former requires simple physical pipe management, which can be left alone for 99% of the time. The latter requires BGP feed, an ASN, and typically much more than 1% of an employee's time to keep running smoothly.

Obtaining single-homed connectivity from a Tier-2 mostly "outsources" network support, and small to medium size businesses tend to like that. It's not the only leaf end solution to the problem, but it's a viable one (and can be less costly to the rest of the world in turn).

--
Todd Vierling <tv@duh.org> <tv@pobox.com> <todd@vierling.name>
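The "one static default route" multi-pipe setup really can be this small. A minimal Cisco IOS sketch, assuming two pipes to the same carrier; the next-hop addresses are invented documentation values, and the second line is a floating static route (higher administrative distance) that is only installed when the primary route is withdrawn:

    ! primary pipe
    ip route 0.0.0.0 0.0.0.0 198.51.100.1
    ! backup pipe: administrative distance 250, used only when the
    ! primary next hop becomes unreachable (e.g. the link drops)
    ip route 0.0.0.0 0.0.0.0 198.51.100.5 250

Note the limitation that fuels the rest of this thread: a static default only tracks local link state, so this survives a cut line but not a failure deeper inside the single carrier's network.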
At 12:20 -0400 10/19/05, Todd Vierling wrote:
That's why SLAs exist.
Do SLAs say "if you are out of the water for 30 minutes we will also cover your lost business revenue"? There are times when service guarantees just are not enough (e.g., manned space flight support).

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Edward Lewis +1-571-434-5468 NeuStar

True story: Only a routing "expert" would fly London->Minneapolis->Dallas->Minneapolis to get home from a conference. (Cities changed to protect his identity.)
On Oct 19, 2005, at 12:34 PM, Edward Lewis wrote:
At 12:20 -0400 10/19/05, Todd Vierling wrote:
That's why SLAs exist.
Do SLAs say "if you are out of the water for 30 minutes we will also cover your lost business revenue"? There are times when service guarantees just are not enough (e.g., manned space flight support).
There is no such thing as an SLA on the Internet that is worth what you lose when the line goes down. Part of the problem of paying so little for a "mission critical" application. And part of the reason multi-homing is so important for a "mission critical" application. -- TTFN, patrick
On Wed, 19 Oct 2005, Todd Vierling wrote:
That's the operators' view, but not the customer's. The customer wants redundancy.
That's why SLAs exist.
SLAs exist to provide a means of allowing a vendor to 'feel your pain' when you experience some type of a service outage. They generally do not exist to act as a cost recovery mechanism for customers who lose revenue when mission critical app XYZ can't access 'the network', or people coming in from 'the network' cannot access it. Being able to deduct some percentage of your monthly spend with your carrier often doesn't balance well against a network outage that affects the mission critical app that brings in a substantial percentage of the company's revenue. Each organization's tolerance for outages (read: revenue impact) must be weighed against the costs of multihoming and making the investments in infrastructure to improve reliability.
Something like that, but not quite. Whenever one of these reports, which boil down to "everyone must multi-home!", appears, it typically has a stark lack of information on alternatives to *direct* multi-homing.
Hence Chris's earlier post about the multitude of think tanks which exist to state the obvious and make a buck while doing it :-)
Many customers would rather not multihome directly, and prefer "set it and forget it" connectivity. It's much easier to maintain a multi-pipe connection that consists of one static default route than a pipe to multiple carriers. The former requires simple physical pipe management, which can be left alone for 99% of the time. The latter requires BGP feed, an ASN, and typically much more than 1% of an employee's time to keep running smoothly.
I disagree with some of this. I've set BGP up for customers before on many occasions. Many were quite happy with a primary-backup mode of connectivity, which can be accommodated by getting an ASN, configuring BGP on your router(s) and with your upstreams, announcing your route(s) and accepting a default route from those upstreams. My experience has been that these types of setups are also pretty much 'fire and forget' as far as the customer is concerned - *once they're up and running*. If customer XYZ doesn't have the expertise in-house to set it up, they will often bring in a consultant to do it. I do agree that the process of applying for an ASN and so forth can take some time, but it's typically a one-time process. Customers who want to actually attempt to tune traffic to fit the size of their pipes are the ones who require much more time in the maintenance of their BGP environment, and often have higher capex and opex to go with it (bigger router to handle full BGP feeds, $router_vendor support contracts for same, etc).
Obtaining single-homed connectivity from a Tier-2 mostly "outsources" network support, and small to medium size businesses tend to like that. It's not the only leaf end solution to the problem, but it's a viable one (and can be less costly to the rest of the world in turn).
That depends greatly on the business need that drives the decision in the first place. Plus, some organizations are finding that locating critical services outside of their borders either violates their security policies, or can hamper compliance with outside security mandates (GLB, SOX, HIPAA, etc). Maintaining compliance + improving reliability can motivate organizations to multihome.

jms
TV> Date: Wed, 19 Oct 2005 12:20:25 -0400 (EDT)
TV> From: Todd Vierling
TV> That's why SLAs exist.

I thought SLAs existed to comfort nontechnical people into signing contracts, then denying credits via careful weasel words when the time comes for the customer to collect. Or maybe I'm just cynical.

TV> Many customers would rather not multihome directly, and prefer "set it and
TV> forget it" connectivity. It's much easier to maintain a multi-pipe
TV> connection that consists of one static default route than a pipe to multiple
TV> carriers. The former requires simple physical pipe management, which can be
TV> left alone for 99% of the time. The latter requires BGP feed, an ASN, and
TV> typically much more than 1% of an employee's time to keep running smoothly.

Single carrier + multiple POPs != difficult. Even a lowly 2500 can be loaded up with a carrier-assigned private ASN and fed a couple routes. (Maybe it's a little more complex when one needs equal-cost multipath, but it's still hardly rocket science.)

TV> Obtaining single-homed connectivity from a Tier-2 mostly "outsources"
TV> network support, and small to medium size businesses tend to like that.

See above.

TV> It's not the only leaf end solution to the problem, but it's a viable one
TV> (and can be less costly to the rest of the world in turn).

Eddy
--
Everquick Internet - http://www.everquick.net/
A division of Brotsman & Dreger, Inc. - http://www.brotsman.com/
Bandwidth, consulting, e-commerce, hosting, and network building
Phone: +1 785 865 5885 Lawrence and [inter]national
Phone: +1 316 794 8922 Wichita
________________________________________________________________________
DO NOT send mail to the following addresses:
davidc@brics.com -*- jfconmaapaq@intc.net -*- sam@everquick.net
Sending mail to spambait addresses is a great way to get blocked.
Ditto for broken OOO autoresponders and foolish AV software backscatter.
On Oct 19, 2005, at 12:20 PM, Todd Vierling wrote:
Many customers would rather not multihome directly, and prefer "set it and forget it" connectivity. It's much easier to maintain a multi-pipe connection that consists of one static default route than a pipe to multiple carriers. The former requires simple physical pipe management, which can be left alone for 99% of the time. The latter requires BGP feed, an ASN, and typically much more than 1% of an employee's time to keep running smoothly.
Hrm, people keep saying that BGP is hard and takes time. As well as my end-user-facing network responsibilities, I also have corporate network responsibilities here. All of our corporate hub locations are multi-homed (or soon will be)... and I honestly can't remember the last time I made any changes (besides IOS upgrades) to BGP configs for the 2 hubs in the US. (We're moving physical locations in the "international" hubs and taking new providers, so I'm discounting those changes as you'd have similar changes in a single homed statically routed move). If you don't have multihoming requirements other than availability then it really can be fire and forget.
John Payne wrote:
Hrm, people keep saying that BGP is hard and takes time.
As well as my end-user-facing network responsibilities, I also have corporate network responsibilities here. All of our corporate hub locations are multi-homed (or soon will be)... and I honestly can't remember the last time I made any changes (besides IOS upgrades) to BGP configs for the 2 hubs in the US. (We're moving physical locations in the "international" hubs and taking new providers, so I'm discounting those changes as you'd have similar changes in a single homed statically routed move).
If you don't have multihoming requirements other than availability then it really can be fire and forget.
Except for those pesky bogon filters.... which corporations seem to like to "fire and forget". -- Mark Radabaugh Amplex mark@amplex.net 419.837.5015
On Oct 19, 2005, at 3:31 PM, Mark Radabaugh wrote:
John Payne wrote:
[...]
If you don't have multihoming requirements other than availability then it really can be fire and forget.
Except for those pesky bogon filters.... which corporations seem to like to "fire and forget".
Perhaps that's something you should "outsource" to your provider?

Plus, you missed the part: "If you don't have multihoming requirements other than availability". Bogon filters do _not_ enhance availability. (And please don't argue about things like attacks from unallocated space - the disconnectivity caused by un-updated bogon filters is much higher than the increase from things like this.)

--
TTFN, patrick
Or you can get automated bogon feeds from our good friends at cymru..
http://www.cymru.com/BGP/bogon-rs.html

Peter Kranz
pkranz@unwiredltd.com
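As an illustration of the rot Mark and Patrick are describing, a hand-maintained ingress filter might look like the following sketch (Cisco IOS syntax; a few classic entries rather than a complete or current list):

    ! A static bogon filter is correct the day it is written and a
    ! liability the day IANA allocates one of the blocks.
    ip prefix-list BOGONS seq 5 deny 10.0.0.0/8 le 32
    ip prefix-list BOGONS seq 10 deny 192.168.0.0/16 le 32
    ! 69.0.0.0/8 was once on lists like this; after ARIN began assigning
    ! from it in 2002, un-updated copies of this entry blackholed real networks.
    ip prefix-list BOGONS seq 15 deny 69.0.0.0/8 le 32
    ip prefix-list BOGONS seq 100 permit 0.0.0.0/0 le 32

Hence the appeal of the automated feed: the route-server keeps the list current instead of waiting for a human to remember.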
On Wed, 19 Oct 2005, Mark Radabaugh wrote:

Except for those pesky bogon filters.... which corporations seem to like to "fire and forget".

If only I'd had the foresight to configure all of the customers I've set up on BGP with Bogon filters, and more complex routing policies than defaults + provider customer routes, then I would have made mountains of recurring revenue from this "maintenance", and I would be reading this thread in my mountain cabin with beleaguered amusement. Alas, I met the customers' requirement: it has to "just work"... And it does.

(and yes, on the network I administer at my day job, I bogon/rpf filter and aggressively traffic engineer.)

-ejay
On Wed, 19 Oct 2005, John Payne wrote:
Hrm, people keep saying that BGP is hard and takes time.
As well as my end-user-facing network responsibilities, I also have corporate network responsibilities here. All of our corporate hub locations are multi-homed (or soon will be)... and I honestly can't remember the last time I made any changes
It's not changes that make BGP maintenance consume time; rather, it's tracking down connectivity issues when problems do arise. If the upstream is responsible for multihoming instead, they also are responsible for keeping the knowledge resources to do that problem-hunt.

It's another side effect of the choice to outsource the multihoming responsibility, and is one of the factors to consider when choosing a redundancy approach.

--
Todd Vierling <tv@duh.org> <tv@pobox.com> <todd@vierling.name>
That's the operators' view, but not the customer's. The customer wants redundancy.
That's why SLAs exist.
No... SLAs exist to extract some compensation when the level of service doesn't meet the need. In a mission critical situation, SLAs are pretty worthless. The primary benefit of an SLA is that it (hopefully) provides incentive to the provider to avoid screwing up the service. It doesn't do anything to directly improve the service or restore service after an outage.
Many customers would rather not multihome directly, and prefer "set it and forget it" connectivity. It's much easier to maintain a multi-pipe connection that consists of one static default route than a pipe to multiple carriers. The former requires simple physical pipe management, which can be left alone for 99% of the time. The latter requires BGP feed, an ASN, and typically much more than 1% of an employee's time to keep running smoothly.
I've done simple ASN/BGP based multihoming for a number of businesses, and, it can be done on a mostly set-and-forget basis. If you have your upstreams supply 0.0.0.0/0 via BGP and no other routes, and, you advertise your networks, believe it or not, that's a pretty stable configuration. If your upstreams are reasonably reliable, that works pretty well. If not, and, you care about knowing what your upstreams can't reach at the moment, then, you need a full feed and life becomes slightly more complicated.
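For concreteness, a minimal sketch of the set-and-forget arrangement Owen describes, in Cisco IOS syntax; the ASNs, prefixes, and neighbor addresses are invented documentation values:

    router bgp 64496
     network 192.0.2.0 mask 255.255.255.0
     neighbor 198.51.100.1 remote-as 64500
     neighbor 198.51.100.1 prefix-list DEFAULT-ONLY in
     neighbor 198.51.100.1 prefix-list MY-BLOCK out
     neighbor 203.0.113.1 remote-as 64510
     neighbor 203.0.113.1 prefix-list DEFAULT-ONLY in
     neighbor 203.0.113.1 prefix-list MY-BLOCK out
    !
    ! accept nothing but a default from each upstream; announce only our block
    ip prefix-list DEFAULT-ONLY seq 5 permit 0.0.0.0/0
    ip prefix-list MY-BLOCK seq 5 permit 192.0.2.0/24
    ! pull-up route so the network statement always has something to anchor to
    ip route 192.0.2.0 255.255.255.0 Null0 254

With both defaults in the table, BGP picks one exit and fails over automatically when a session drops; day to day there is nothing to tune.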
Obtaining single-homed connectivity from a Tier-2 mostly "outsources" network support, and small to medium size businesses tend to like that. It's not the only leaf end solution to the problem, but it's a viable one (and can be less costly to the rest of the world in turn).
It's also not a complete solution to the problem. Sure, there is a class of customers whose needs that meets. However, because this meets some needs does not mean that it is legitimate to pretend that other needs do not exist.

While we're at this, I'll debunk a few common myths:

Myth: Renumbering pain is proportional to network size.
Fact: Renumbering pain is proportional to the number of devices which need to be touched over which you do not have administrative control. A /16 which is entirely under my control and which is not present in ACL and other entries outside of my control is much easier to renumber than a /24 which contains a bunch of VPN terminators and serves 10,000s of customers who all have my addresses in their VPN boxes and ACLs on their firewalls.

Myth: The need to multihome is proportional to the size of the organization.
Fact: Some large organizations have only a few critical needs for the network and those needs are easily colocated in other facilities. For the rest of their use, being single-homed or multi-piped to a single provider is quite adequate. Some small organizations, OTOH, cannot function if their access to the network is impaired. For these organizations, multihoming is not only important, it's life and death.

Myth: PI space is not needed in IPv6 because we fixed the need.
Fact: PI space in IPv6 scares people because of the potential for unconstrained routing table growth. What is needed is to fundamentally change the way we do routing. IPv6 stopped well short of this goal, and, as such, provides little benefit beyond the original TUBA proposal, having decided that all of the real problems that needed to be solved were "too hard". IPv6 does nothing to eliminate the need for PI space and ASNs at end sites that need to be truly multihomed. Shim6 is an attempt at providing some level of workaround to this deficiency, but, for full functionality, it requires significant implementation on all affected end-nodes and some of the routing infrastructure. For now, it's just a pipe dream. In the long-run, it's a very complex hack to replace what could be a relatively simple correction.

Owen

--
If it wasn't crypto-signed, it probably didn't come from me.
On Wed, 19 Oct 2005, Owen DeLong wrote:
I've done simple ASN/BGP based multihoming for a number of businesses, and, it can be done on a mostly set-and-forget basis. If you have your upstreams supply 0.0.0.0/0 via BGP and no other routes, and, you advertise your networks, believe it or not, that's a pretty stable configuration. If your upstreams are reasonably reliable, that works pretty well. If not, and, you care about knowing what your upstreams can't reach at the moment, then, you need a full feed and life becomes slightly more complicated.
There's really nothing more complicated about taking 2 (or more) full views, other than keeping an eye on available memory. The C&W/PSI incident a few years ago and the more recent Cogent/Level3 incident are perfect examples of why taking two 0/0's really doesn't cut it if you want reliable connectivity to the "whole internet".

Cisco burned a lot of people by building routers with needlessly limited RAM capacities (planned obsolescence?). Because of that, one customer wouldn't buy another cisco, and instead went Imagestream. They have 3 full views and no worries now. They were so happy with that Imagestream, they ended up buying a bunch more for internal WAN needs.

Another customer I dealt with recently was fairly typical of the "small multihomer" I'd guess. They were multihomed to two Tier1 providers and wanted to replace one of them with us. Their BGP had been done either by a consultant or former employee and was definitely set and forgot on autopilot. Their router (cisco 3640) kept "dying" and they'd just power cycle it as needed. When I got in to take a look, I found it was taking full views and had pretty much no RAM left...and it was announcing all their space deaggregated as /24s for no reason. They weren't willing to shell out the $ for a bigger router, so I ended up configuring them for full routes from us and customer routes from their other (a Tier1) provider (and fixing their advertisements). Other than expansion (more network statements), running out of RAM again, or changing providers, I doubt their BGP config will need to be touched in the foreseeable future.

----------------------------------------------------------------------
Jon Lewis | I route
Senior Network Engineer | therefore you are
Atlantic Net |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
--On October 19, 2005 11:17:02 PM -0400 Jon Lewis <jlewis@lewis.org> wrote:
On Wed, 19 Oct 2005, Owen DeLong wrote:
I've done simple ASN/BGP based multihoming for a number of businesses, and, it can be done on a mostly set-and-forget basis. If you have your upstreams supply 0.0.0.0/0 via BGP and no other routes, and, you advertise your networks, believe it or not, that's a pretty stable configuration. If your upstreams are reasonably reliable, that works pretty well. If not, and, you care about knowing what your upstreams can't reach at the moment, then, you need a full feed and life becomes slightly more complicated.
There's really nothing more complicated about taking 2 (or more) full views, other than keeping an eye on available memory. The C&W/PSI incident a few years ago and the more recent Cogent/Level3 incident are perfect examples of why taking two 0/0's really doesn't cut it if you want reliable connectivity to the "whole internet".
Yes and no. Most people that will spend the $$ for routers with enough memory to handle multiple full feeds are also looking to get a certain amount of TE capability out of the deal, and, at that point, babysitting the TE becomes more than 0.01 FTE (closer to 0.30 in my experience).
Cisco burned a lot people by building routers with needlessly limited RAM capacities (planned obsolescence?). Because of that, one customer wouldn't buy another cisco, and instead went Imagestream. They have 3 full views and no worries now. They were so happy with that Imagestream, they ended up buying a bunch more for internal WAN needs.
That's an interesting way to look at it. I think that at the time those routers were designed (I'm assuming you are talking AGS+ here), there was no concept of why anyone would ever need that much memory, and, designing a board to accommodate it would have seriously increased the size and price of the router. If you're talking about more recent, then, it's a marketing decision to not facilitate full tables on low-end routers lest they start eating into their high-end router business.
Another customer I dealt with recently was fairly typical of the "small multihomer" I'd guess. They were multihomed to two Tier1 providers and wanted to replace one of them with us. Their BGP had been done either by a consultant or former employee and was definitely set and forgot on autopilot. Their router (cisco 3640) kept "dying" and they'd just power
Lol... Yep, that happens.
cycle it as needed. When I got in to take a look, I found it was taking full views and had pretty much no RAM left...and it was announcing all their space deaggregated as /24s for no reason. They weren't willing to shell out the $ for a bigger router, so I ended up configuring them for full routes from us and customer routes from their other (a Tier1) provider (and fixing their advertisements). Other than expansion (more network statements), running out of RAM again, or changing providers, I doubt their BGP config will need to be touched in the foreseeable future.
That could be true, but, how long do you really think the RAM will last? Owen
On Wed, 19 Oct 2005, Owen DeLong wrote:
Yes and no. Most people that will spend the $$ for routers with enough memory to handle multiple full feeds are also looking to get a certain amount of TE capability out of the deal, and, at that point, babysitting the TE becomes more than 0.01 FTE (closer to 0.30 in my experience).
Some may. The one I'm talking about with the Imagestreams really doesn't. They've overprovisioned the heck out of their network after the C&W/PSI thing and really have no need for TE. In fact, no attempt at all has been made to influence their traffic. Just a simple let BGP take care of it config.
That's an interesting way to look at it. I think that at the time those routers were designed (I'm assuming you are talking AGS+ here), there
I wasn't thinking that far back. I'm talking about the 3640 and 2610/2611/2620/2621s. For many end users, these routers would be just fine for multihoming with a few T1s, if they had the RAM capacity for several full views. At the time the above customer was multihoming, their only real options with cisco were the 3660 and 7200 series, which were overkill (and overpriced if you want new gear from say Tech Data). cisco finally has come out with replacements for those "little routers" with much bigger RAM capacities...but they're a little late.
That could be true, but, how long do you really think the RAM will last?
I suppose it won't. I just checked up on them. Seems they must have canceled their other provider (I hope so anyway...it's been down at least a week), and with just 1 full view from us, they have 2.3MB free. I guess it's time to get them on the phone and see about either shutting off BGP or just sending them 0/0. Another 3640 bites the dust.

----------------------------------------------------------------------
Jon Lewis | I route
Senior Network Engineer | therefore you are
Atlantic Net |
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________
At 11:59 AM 19/10/2005, Elmar K. Bins wrote:
tv@duh.org (Todd Vierling) wrote:
Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more less routes in the global table.
That's the operators' view, but not the customer's. The customer wants redundancy.
The customer wants reliability, and BGP is not necessarily the way for them to do it. Telling a typical corporate IT department with generalized IT skills (read: no large internetworking experience) to now become BGP masters will only open up new ways to disrupt their network connectivity. There are better ways to do it, as you describe below.
So we should try to find a way to tell them "Hey, it's mostly Tier-1's (or wannabes) that play such games, stick to a trustworthy Tier-2. And, hey, btw., connect redundantly to them, so you have line failure resiliency and also a competent partner that cares for everything else."
Agreed! ---Mike
mike@sentex.net (Mike Tancsa) wrote:
The customer wants redundancy.
The customer wants reliability
That's what you know and what I know. The customer has already jumped one step ahead from "reliability" to "multiple providers", just like he does with parcel services etc.
There are better ways to do it as you describe below.
Yup, but the customer needs to be made aware of them. Elmi. -- "Begehe nur nicht den Fehler, Meinung durch Sachverstand zu substituieren." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>) --------------------------------------------------------------[ ELMI-RIPE ]---
On Wed, 19 Oct 2005, Todd Vierling wrote:
On Wed, 19 Oct 2005, Christopher L. Morrow wrote:
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
200k routes, here we come!
it is just good common sense though, eh?
Well, not necessarily.
Sorry,
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
is common sense. If you have something 'mission critical' to your business you had better have more than one link out... It'd make great sense to make sure that the links in question at least didn't end up on the same router at the far end, and while you are at it, get them to different providers (hopefully) in different telco-hotels.
Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more less routes in the global table.
I'm not such a believer in the tier-n classifications, single homing a critical resource is just dumb, regardless of the 'tier' you stick it in. gartner, as gartner normally does, is just stating the obvious and making money doing it.
On Oct 19, 2005, at 11:54 AM, Todd Vierling wrote:
"Gartner said every location that requires mission-critical internet connectivity, including externally hosted websites, should be multi-homed"
200k routes, here we come!
it is just good common sense though, eh?
Well, not necessarily.
Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more less routes in the global table.
It's a Good Thing(tm).
For you.

For the customer with an Internet "mission critical app", being tied to a Tier 2 has its own set of problems, which might actually be worse than being tied to a Tier 1.

The Internet is a business tool. If providers do not meet business requirements, providers will not be supported. Period.

If 200K routes, or more ASNs, or small customers multihoming, or stuff like that scares you, find another line of work. (Hint-hint to the v6 fanatics.)

--
TTFN, patrick
On Wed, 2005-10-19 at 12:03 -0400, Patrick W. Gilmore wrote:
For the customer with an Internet "mission critical app", being tied to a Tier 2 has its own set of problems, which might actually be worse than being tied to a Tier 1.
I think this is largely dependent on the specific topology and redundancy in the Tier-2's network and the way they provide multiple uplinks. When done well, with uplinks spread over separate physical locations, well thought out IP addressing and de-centralised exits from the Tier-2's network out to multiple Tier-n's, there's usually a benefit to multi-homed connections to a Tier-2 rather than a Tier-1, with minimum capacity and pricing being the most important ones.

--
---
Erik Haagsman
Network Architect
We Dare BV
Tel: +31(0)10-7507008
Fax: +31(0)10-7507005
http://www.we-dare.nl
On Oct 19, 2005, at 12:08 PM, Elmar K. Bins wrote:
patrick@ianai.net (Patrick W. Gilmore) wrote:
For the customer with an Internet "mission critical app", being tied to a Tier 2 has its own set of problems, which might actually be worse than being tied to a Tier 1.
Please elaborate.
I probably used poor word choice. The "Tier" of a provider is just marketing unless you are talking about the networks in the "DFZ" ("SFI club", or whatever). When I made that statement, I was thinking more about the marketing hype, meaning a "tier two" not only has transit, but is a smaller, probably regional provider. Naturally, a "tier two" might very well be a huge company with network assets on multiple continents and 10s of Gbps (or more) of traffic.

The problems with a small provider might include:

* Business viability
* Global reach
* Capacity
* Redundant architecture
* Etc., etc., etc.

Depends on the customer which of these are important. The guy with 10 Mbps of traffic all in Nashville, TN doesn't care about global reach or capacity, but might be VERY interested in redundant architecture & business viability. A mom-n-pop shop might not be the right choice for him. The guy with 12 datacenters in 6 states - or countries - might have different ideas of what is important. Maybe he's OK with one site going offline one day a year if he can save $10K on transit costs, so a mom-n-pop shop would be fine.

Personally, I think if your application really is "mission critical", then you _must_ be multi-homed with your own space and own ASN. Anything less and you've tied your business to a vendor. I might like some networks, but I don't want my corporate life & death to be tied to any of them when I can ensure my independence for what is probably a small sum of money in the grand scheme of things. The question of which providers to use as upstreams is a business decision based on the items above, cost, and lots of other stuff.

Sorry if I was unclear before.

--
TTFN, patrick
patrick@ianai.net (Patrick W. Gilmore) wrote:
The problems with a small provider might include:
* Business viability
* Global reach
* Capacity
* Redundant architecture
* Etc., etc., etc.
Thanks - understood ;-) I see, btw, a lot of Tier-3 (or -4, -5) providers that have easily survived their Tier-2 Transits going down the river. If customers can be convinced that the tier thing has nothing to do with business viability and/or stability, and that size does not always matter, we might get them on the right track. As long as they believe the marketing-speak, that Tier-1 ISPs are god (no double-o here) and Tier-2's are still quite good, but everything else is crap for "real" business, well... Always remember: For every customer, their stuff _is_ mission critical. So everyone will take the multihoming road if they can afford it. We can make it more expensive, or we can offer other solutions. Yours, Elmar (formerly with a Tier-17, quite stable though) -- "Begehe nur nicht den Fehler, Meinung durch Sachverstand zu substituieren." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>) --------------------------------------------------------------[ ELMI-RIPE ]---
Always remember: For every customer, their stuff _is_ mission critical. So everyone will take the multihoming road if they can afford it.
We can make it more expensive, or we can offer other solutions.
Why should we do either? Why not fix the way we do routing so that it's OK for everyone to multihome?

The fundamental problem is that IP addresses are serving more than one purpose, and, as a result, we have a bunch of unnecessary baggage on the routing system. Today, an IP address serves as an End System Identifier _AND_ as a Routing Location Identifier. This is sort of divided in the Network/Host portions, but, the problem is that it isn't really divided. The Host portion of the address is only part of the End System Identifier, and, the network portion is also necessary in order to uniquely identify that Host. There's the rub.

Imagine instead, a world where Routing Location Identifiers are not coupled to End System Identifiers, and Interdomain routing (AS-AS routing) occurred based on Routing Location Identifier, and only Intra-AS routing depended on the End System Identifier.

For example: Host A connected to ISP X then ISP Y to ISP Z which provides service to Host B. Today, A, X, Y, Z all need to know how to reach B. If we separated the RLI from the ESI, then, the fact that B is reached via Z only has to be available information that can be looked up by A, and, X and Y only need to know how to get to Z. Only Z needs to know how to reach B. This allows the amount of data kept by each point along the way to be much smaller.

We already have a separate RLI in IP, but, we fail to use it this way because we are missing:

- The way for A (or X) to look up the Host->RLI data.
- Routers and routing protocols that think in terms of RLI reachability instead of Host group (prefix) reachability.

Today's RLI is called an ASN. Imagine if you could look up the Origin-AS(es) for a given prefix through a query (similar to DNS, some protocol to be developed) and then, instead of sending to B, you send a packet addressed as:

DST RLI: Z
DST HOST: B
SRC HOST: A

Now, until it reaches ISP Z, nobody needs to look at anything but DST RLI Z to make a forwarding decision. Ideally, I think this should be implemented so that A sends to default and the first DFZ router does the lookup and DST RLI insertion, but, it could be done at the source host.

I realize this requires code modification and protocol modification to make it work, but, doesn't this solve the routing table size problem and allow universal multihoming? A multihomed site that was not in the DFZ could simply return multiple RLI records and the inserting router could choose what it thought was the best one.

Owen

--
If it wasn't crypto-signed, it probably didn't come from me.
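To make the moving parts concrete, here is a toy Python sketch of the lookup-and-stamp step described above. Everything in it is hypothetical: the mapping service ("similar to DNS, some protocol to be developed") is modeled as a plain dict, and all the names are invented:

    # Toy model of the ESI/RLI split. The mapping service proposed above
    # does not exist; a dict stands in for it here.
    ESI_TO_RLI = {
        "host-B": ["Z"],   # B is reached via ISP Z; a multihomed site
    }                      # would return several candidate RLIs

    DFZ_TABLE = {"Z": "next-hop-toward-Z"}   # DFZ state keyed by RLI only

    def first_dfz_router(packet):
        # Look up the destination's RLI(s) and stamp one into the header;
        # every router from here to Z then forwards on the RLI alone.
        if "dst_rli" not in packet:
            candidates = ESI_TO_RLI[packet["dst_host"]]
            packet["dst_rli"] = candidates[0]    # inserting router chooses
        return DFZ_TABLE[packet["dst_rli"]]      # no per-prefix state needed

    pkt = {"src_host": "host-A", "dst_host": "host-B"}
    print(first_dfz_router(pkt))                 # -> next-hop-toward-Z

The sketch is only about table sizes: DFZ_TABLE grows with the number of RLIs (ASNs), while the per-host detail lives in the lookup service rather than in every router.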
I wanted to answer on this, because I thought along the same lines. owen@delong.com (Owen DeLong) wrote:
For example:
Host A connected to ISP X then ISP Y to ISP Z which provides service to Host B.
Today, A, X, Y, Z all need to know how to reach B.
If we separated the RLI from the ESI, then, the fact that B is reached via Z only has to be available information that can be looked up by A, and, X and Y only need to know how to get to Z. Only Z needs to know how to reach B. This allows the amount of data kept by each point along the way to be much smaller.
My idea (somebody had it before, I'm sure, but then, it is my head that got invaded by it, so here she comes...):

Rewriting would IMHO not work easily, but encapsulation would. Admittedly, this idea has occurred before and led to MPLS implementations (which are weak at interconnecting ISPs anyway). Well, let's see what else we can do that MPLS maybe cannot.

If the end user does not determine the RLI themselves, but their ISP does (on edge routers), it looks like this:

A is the customer, Internet access provided by X
B is the customer of Z
Y is an intermediate system

A -> [Src: a.b.c.d Dst: e.f.g.h Data: ...] -> X
X -> Add envelope -> [RLI: Z [Src: a.b.c.d Dst: e.f.g.h Data: ...]]
X -> [RLI: Z [Src: a.b.c.d Dst: e.f.g.h Data: ...]] -> Y
Y -> [RLI: Z [something]] -> Z
Z -> Remove envelope -> [Src: a.b.c.d Dst: e.f.g.h Data: ...]
Z -> [Src: a.b.c.d Dst: e.f.g.h Data: ...] -> B

Routing decision is thus made by looking up paths for Z. Multihoming works the same, but we get multiple RLIs in the packet. If B is multihomed, I am not in favour of A (or X) selecting the location of B to be used. I believe the routing system should be able to determine that, like it's done right now.

We have some major points here, and one possible ballbreaker:

+ Prefixes (ESI) have gone from the routing process
+ Customers are hidden behind their ISPs
+ Packets carry their routing information (instead of ESI info)
+ Packets may as well be deeply inspected, if necessary
- Edge routers need to be extremely powerful, because they have to determine all the ESI <-> RLI information

Ballbreaker (shared with Owen's idea):

- This scheme needs the ISPs' edge routers to do the looking up, and if we do not find a way to incorporate updating the lookup table into part of the routing system, we are in violation of Par. 2.1.20 of draft-irtf-routing-reqs-03.txt, which is a very sensible requirement IMHO.

I'm not saying this solves all problems, but I did not want this idea lost in the mists of time; maybe it's a starting point, maybe it is not (I'm still not through with the draft). But at least it differentiates between DFZ (aka Internet Core) routing and edge routing.

Elmar.

--
"Begehe nur nicht den Fehler, Meinung durch Sachverstand zu substituieren." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>)
--------------------------------------------------------------[ ELMI-RIPE ]---
Rewriting would IMHO not work easily, but encapsulation would. Admittedly, this idea has occurred before and led to MPLS implementations (which are weak at interconnecting ISPs anyway).
Why wouldn't rewriting work? The "encapsulation" you show below is little different from the rewrite I propose. First, let's start with something that looks a little more like an IPv6 datagram:

[Dst: ::B Src: ::A Prot: ICMP [Type: Echo Req [Data: ...]]]

Now, let's look at what the first DFZ router would do to the packet:

[RLI: Z Dst: ::B Src: ::A Prot: ICMP [Type: Echo Req [Data: ...]]]

or

[Dst: ::B Src: ::A EXT[RLI: Z] Prot: ICMP [...]]

Then, upon arrival at the first router within AS Z, the packet is rewritten again:

[Dst: ::B Src: ::A Prot: ICMP [Type: Echo Req [Data: ...]]]

So... Nobody outside the DFZ needs to change anything, all the checksums and such at hosts that should check them still work, and even IPSec packet tampering would not detect this since the final packet arrives unchanged. Further, any router along the way that doesn't understand the Extension header doesn't have to really look at it, so, during transition, routing can occur on either RLI or Dst. If you encapsulate, you lose that transitional ability.
Well, let's see what else we can do, that MPLS maybe cannot.
Perhaps become ubiquitous implementation in the DFZ?
If the end user does not determine the RLI themselves, but their ISP does (on edge routers), it looks like this:
Actually, even that isn't necessarily an accurate characterization of what I am suggesting. The packet should not be rewritten until it reaches a DFZ router outside of AS Z. Whether that is within AS Y, or somewhere upstream (possibly more than one level upstream) of AS Y, that's where the initial rewrite should occur ideally. If the first DFZ router doesn't yet know about RLI, however, then, the first RLI aware router in the DFZ prior to reaching AS Z should do the rewrite.
A is the customer, Internet access provided by X B is the customer of Z Y is an intermediate system
A -> [Src: a.b.c.d Dst: e.f.g.h Data: ...] -> X
X -> Add envelope -> [RLI: Z [Src: a.b.c.d Dst: e.f.g.h Data: ...]]
X -> [RLI: Z [Src: a.b.c.d Dst: e.f.g.h Data: ...]] -> Y
Y -> [RLI: Z [something]] -> Z
Z -> Remove envelope -> [Src: a.b.c.d Dst: e.f.g.h Data: ...]
Z -> [Src: a.b.c.d Dst: e.f.g.h Data: ...] -> B
Routing decision is thus made by looking up paths for Z. Multihoming works the same, but we get multiple RLIs in the packet.
Um... No... You don't want multiple RLIs in the packet. You want the router inserting the RLI to have the ability to choose from multiple RLIs. If you start playing with changing RLI along the way, then, you run into serious difficulty with looping possibilities. By choosing an RLI close to the source that, at the time of selection, had a valid dynamic advertised (BGP) AS Path for reachability, you seriously reduce the likelihood of looping the packet.
If B is multihomed, I am not in favour of A (or X) selecting the location of B to be used. I believe the routing system should be able to determine that, like it's done right now.
Look... The first DFZ router selects the location of B to be used in today's world, why should this change?
We have some major points here, and one possible ballbreaker:
+ Prefixes (ESI) have gone from the routing process
That's a GOOD thing.
+ Customers are hidden behind their ISPs
I'm not sure what you mean by that.
+ Packets carry their routing information (instead of ESI info)
No. Under my proposal, packets carry both RLI and ESI information, but, in separate fields.
+ Packets may as well be deeply inspected, if necessary
That already happens, but, it is not necessary under my proposal.
- Edge routers need to be extremely powerful, because they have to determine all the ESI <-> RLI information
Nope. DFZ routers (which already need to be extremely powerful) need to be able to perform lookups for ESI->RLI (one way, btw) mapping. This could be accomplished by a protocol similar to DNS, but more secure and authenticated. Trading a lookup at first sight of a destination prefix (then caching it) against trying to manage a 32-bit routing table (4 billion routes?) is likely a scalability win. Even if it's just 1 million routes, I think it is a win.
Ballbreaker (shared with Owen's idea): - This scheme needs the ISPs' edge routers to do the looking up, and if we do not find a way to incorporate updating the lookup table into part of the routing system, we are in violation of Par. 2.1.20 of draft-irtf-routing-reqs-03.txt, which is a very sensible requirement IMHO.
No... This scheme needs DFZ routers to do the lookup. This is going to require significant changes to RFCs for full implementation anyway, and, no, the whole point of my proposal is for routers NOT to have to carry full lookup information, so, it is my intent to modify that requirement.
But at least it differentiates between DFZ (aka Internet Core) routing and edge routing.
I think that is the necessary first step. I also think that the idea of maintaining global knowledge of the entire routing data (as required in Par. 2.1.20) scales about as well as the IEN116 hosts file we all knew and loved (hint, when was the last time you FTPd /etc/hosts from SRI?) Owen -- If it wasn't crypto-signed, it probably didn't come from me.
owen@delong.com (Owen DeLong) wrote:
Why wouldn't rewriting work? The "encapsulation" you show below is little different from the rewrite I propose.
Except that it conserves the original addressing information, which I believe to be important.
First, let's start with something that looks a little more like an IPv6 datagram:
You're only talking v6? Why? Anyway, let's follow this through...
[DST: ::B Src: ::A EXT[RLI: Z] Prot: ICMP [...]]]
Then, Upon arrival at the first Router within AS Z, the packet is rewritten again:
[Dst: ::B Src ::A Prot: ICMP [Type: Echo Req [Data: ...]]]
You have used special fields in the IP header. Well, that's an elegant way to do it _if_ you have this field. You do not have this in IPv4, and that's what we'll be stuck with for the next couple of years, unfortunately (or not: I can remember v4 addresses much more easily...)
final packet arrives unchanged. Further, any router along the way that doesn't understand the Extension header doesn't have to really look at it, so, during transition, routing can occur on either RLI or Dst. If you encapsulate, you lose that transitional ability.
Good point you have here.
Actually, even that isn't necessarily an accurate characterization of what I am suggesting. The packet should not be rewritten until it reaches a DFZ router outside of AS Z. Whether that is within AS Y, or somewhere upstream (possibly more than one level upstream) of AS Y, that's where the initial rewrite should occur ideally. If the first DFZ router doesn't yet know about RLI, however, then, the first RLI aware router in the DFZ prior to reaching AS Z should do the rewrite.
I see a couple of shortcomings to your idea:

- it is limited to an IP protocol that carries an RLI header field
- you only include one RLI in the packet header

I neither believe that we'll get rid of v4 soon, nor do I think it is a good idea to let the sender decide to which RLI to route the packet. The benefit of multihoming is lost then.
Um... No... You don't want multiple RLIs in the packet. You want the router inserting the RLI to have the ability to chose from multiple RLIs.
Definitely not.
If you start playing with changing RLI along the way, then, you run into serious difficulty with looping possibilities.
That is not intended. Another way to avoid loops must be found, and I believe the danger is pretty small. The RLIs in the packet are not changed in transit. But of course every new router can choose towards which RLI to send the packet. Luckily, distance on a working path in the Internet generally decreases as you approach a target you have chosen. I do see that there is a danger of looping, but I believe a way to detect that can be found.
By choosing an RLI close to the source that, at the time of selection, had a valid dynamic advertised (BGP) AS Path for reachability, you seriously reduce the likelihood of looping the packet.
Yes, but you lose the benefit of multihoming, because the rewriting edge router may carry outdated information or simply make a "bad" choice. I'd rather have routing intelligence in the core than in the edge.
If B is multihomed, I am not in favour of A (or X) selecting the location of B to be used. I believe the routing system should be able to determine that, like it's done right now.
Look... The first DFZ router selects the location of B to be used in todays world, why should this change?
I am not sure why you believe that the first DFZ router that is being traversed makes the choice today. In paths like (from source to multihomed target):

A B C D T
A B E F T

who exactly chooses? IMHO it's AS B that does the selection. And: B is closer to the target, aka the source of the routing information. Its BGP table is more likely to be up-to-date.
+ Prefixes (ESI) have gone from the routing process
That's a GOOD thing.
Yup. Longest match sucks.
+ Customers are hidden behind their ISPs
I'm not sure what you mean by that.
Neither customer Z's ESI nor RLI (they don't need one) are visible in the core. Only their ISPs' RLIs are visible.
No... This scheme needs DFZ routers to do the lookup. This is going to require significant changes to RFCs for full implementation anyway, and, no, the whole point of my proposal is for routers NOT to have to carry full lookup information, so, it is my intent to modify that requirement.
If I understand the idea correctly, you have to distribute two types of wide-area routing information: One ESI-based and one RLI-based. This is because any DFZ box may or may not be able to RLI-route and/or, if it sees that that's not been done yet, perform the translation. Of course, not every DFZ router needs both those tables, but there are some that do. Oh, and you do of course have to distribute the mapping info.
But at least it differentiates between DFZ (aka Internet Core) routing and edge routing.
I think that is the necessary first step.
Then I do not understand why you want the DFZ routers to be able to translate.
I also think that the idea of maintaining global knowledge of the entire routing data (as required in Par. 2.1.20) scales about as well as the IEN116 hosts file we all knew and loved (hint, when was the last time you FTPd /etc/hosts from SRI?)
Alright, then please do explain how in your model the system is going to bootstrap itself... I believe 2.1.20 is there for a very good reason (to save me the hassle of flying around the world pushing DVDs full of initial routing information into my routers)...

I am not sure whether I have fully understood your idea and its implications (I've tried to describe above what I understood, but I'd like to be corrected there); I do see that your idea is based on the assumption that it only has to work with an IP protocol that has special header fields for routing info, and is not applicable to the Internet as of today (except in a very small part, called the "IPv6 world"). And what I especially do not like is the source of the packet predetermining the topological destination, because it only has a limited view.

Apart from that, I like it, because it is almost as simple as my own idea, but obviously more thoroughly thought through ;) (That's a lot of th'es there).

Elmar.

--
"Just don't make the mistake of substituting expertise for opinion." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>)
--------------------------------------------------------------[ ELMI-RIPE ]---
Because of the number of misconceptions about my idea presented, I'm posting this to the list. Those uninterested, feel free to ignore. Those interested, feel free to follow up with me directly. After this, I will not be continuing this on the list unless there is significant interest from multiple parties.

Owen

--On October 21, 2005 12:12:22 AM +0200 "Elmar K. Bins" <elmi@4ever.de> wrote:
owen@delong.com (Owen DeLong) wrote:
Why wouldn't rewriting work? The "encapsulation" you show below is little different from the rewrite I propose.
Except that it preserves the original addressing information, which I believe to be important.
Look at what I proposed again... My rewrite does NOT modify the original addressing, it ADDs data to the header.
First, let's start with something that looks a little more like an IPv6 datagram:
You're only talking v6? Why? Anyway, let's follow this through...
Because we don't really need to solve this in V4. V4 multihoming is well understood and is unlikely to hit a scaling limit on router capabilities before we hit an end of life on address space.
[Dst: ::B Src: ::A EXT[RLI: Z] Prot: ICMP [...]]
Then, upon arrival at the first router within AS Z, the packet is rewritten again:
[Dst: ::B Src: ::A Prot: ICMP [Type: Echo Req [Data: ...]]]
You have used special fields in the IP header. Well, that's an elegant way to do it _if_ you have this field. You do not have this in IPv4, and that's what we'll be stuck with for the next couple of years, unfortunately (or not: I can remember v4 addresses much more easily...)
Again... Multihoming already works in V4 and there is no real need to solve this in the V4 world.
The final packet arrives unchanged. Further, any router along the way that doesn't understand the Extension header doesn't really have to look at it, so, during transition, routing can occur on either RLI or Dst. If you encapsulate, you lose that transitional ability.
Good point you have here.
Actually, even that isn't necessarily an accurate characterization of what I am suggesting. The packet should not be rewritten until it reaches a DFZ router outside of AS Z. Whether that is within AS Y, or somewhere upstream (possibly more than one level upstream) of AS Y, that's where the initial rewrite should occur ideally. If the first DFZ router doesn't yet know about RLI, however, then, the first RLI aware router in the DFZ prior to reaching AS Z should do the rewrite.
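To make the two rewrites concrete, here is a minimal sketch in Python. The field names, and the idea of modelling the extension header as a single optional field, are illustrative assumptions only - the proposal defines no concrete encoding:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Packet:
        dst: str                    # destination ESI, e.g. "::B"
        src: str                    # source ESI, e.g. "::A"
        rli: Optional[str] = None   # locator carried in the extension header
        payload: bytes = b""

    def insert_rli(pkt: Packet, chosen_rli: str) -> Packet:
        # First RLI-aware DFZ router: add the locator, leave Dst/Src untouched.
        if pkt.rli is None:
            pkt.rli = chosen_rli
        return pkt

    def strip_rli(pkt: Packet) -> Packet:
        # First router inside the destination AS: drop the extension header.
        pkt.rli = None
        return pkt

    p = insert_rli(Packet(dst="::B", src="::A"), chosen_rli="Z")
    assert p.rli == "Z"                        # routed on RLI through the DFZ
    p = strip_rli(p)
    assert p.rli is None and p.dst == "::B"    # original addressing preserved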
I see a couple of shortcomings to your idea:
- it is limited to an IP protocol that carries a RLI header field
- you only include one RLI in the packet header
You only need one RLI in the packet header. More would actually be bad. Let me 'splain. If you are routing on RLI, then, you need to choose the best path and stick to it. If the packet doesn't make it through that way, that's OK... That's what retransmits are for. If you start rerouting it on the fly, it's likely to loop a lot before dying, but little else is achieved. Worse, it's likely to loop even in cases where it would have gotten there with the one path chosen as best by the RLI-inserting router.
I neither believe that we'll get rid of v4 soon, nor do I think it is a good idea to let the sender decide which RLI to route the packet to. The benefit of multihoming is lost then.
No, it is not. Since the RLI inserting router has up to date dynamic information about which RLIs are reachable and at what cost (BGP distance vector data), you have the same overall effect as dynamic routing today. Just instead of trading prefix routes everywhere, you trade AS reachability info everywhere and map prefixes to ASs.
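A minimal sketch of that selection step, assuming a hypothetical ESI-to-RLI map and AS-path length as the cost metric (both are stand-ins for whatever the mapping service and BGP actually provide):

    # Hypothetical tables: the first would come from the ESI mapping service,
    # the second from BGP-derived AS reachability data.
    ESI_TO_RLIS = {"::B": ["Z", "Z2"]}    # destination ESI -> candidate RLIs
    AS_PATH_LEN = {"Z": 3, "Z2": 5}       # RLI -> current AS-path length

    def choose_rli(dst_esi: str) -> str:
        # Pick the single best reachable RLI; the packet carries only this one.
        candidates = [r for r in ESI_TO_RLIS[dst_esi] if r in AS_PATH_LEN]
        if not candidates:
            raise RuntimeError("no reachable RLI; drop and rely on retransmit")
        return min(candidates, key=lambda r: AS_PATH_LEN[r])

    print(choose_rli("::B"))   # "Z"; if Z were withdrawn, a retransmit gets "Z2"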
Um... No... You don't want multiple RLIs in the packet. You want the router inserting the RLI to have the ability to choose from multiple RLIs.
Definitely not.
Definitely so.
If you start playing with changing RLI along the way, then, you run into serious difficulty with looping possibilities.
That is not intended. Another way to avoid loops must be found, and I believe the danger is pretty small. The RLIs in the packet are not changed in transit. But of course every new router can choose towards which RLI to send the packet. Luckily, distance on a working path in the Internet generally decreases as you approach a target you have chosen. I do see that there is a danger of looping, but I believe a way to detect that can be found.
Why? Why not keep it simple and recognize that when routing changes, some packets get lost during the shuffle. This is the way it is today, and, this wouldn't be any worse with this system. Also, this means that loop detection continues to work essentially the way it does today, and, it doesn't require nearly as much new code or protocol support as what you propose.
By choosing an RLI close to the source that, at the time of selection, had a valid dynamic advertised (BGP) AS Path for reachability, you seriously reduce the likelihood of looping the packet.
Yes, but you lose the benefit of multihoming, because the rewriting edge router may carry outdated information or simply make a "bad" choice. I'd rather have routing intelligence in the core than in the edge.
No. You have nearly the same advantage you have today. If the path goes away, then, hopefully by the time of retransmit, the RLI inserting router will have learned that that RLI destination is no longer reachable, and it will insert a different one in the retransmitted packet. Same as what happens today with the retransmitted packet being sent a different way.
If B is multihomed, I am not in favour of A (or X) selecting the location of B to be used. I believe the routing system should be able to determine that, like it's done right now.
Look... The first DFZ router selects the location of B to be used in today's world, why should this change?
I am not sure why you believe that the first DFZ router that is traversed makes the choice today. In paths like (from source to multihomed target):
A B C D T
A B E F T
Who exactly chooses? IMHO it's AS B that does the selection. And: B is closer to the target, aka the source of the routing information. Its BGP table is more likely to be up-to-date.
Right... B is the first DFZ router. A is not likely DFZ since A is not multihomed in your scenario. No need for A to be DFZ if A only talks to B.
+ Prefixes (ESI) have gone from the routing process
That's a GOOD thing.
Yup. Longest match sucks.
Nope, but, big routing tables suck.
+ Customers are hidden behind their ISPs
I'm not sure what you mean by that.
Neither customer Z's ESI nor RLI (they don't need one) are visible in the core. Only their ISPs' RLIs are visible.
Z's ESI is visible in the core, but, not carried in the routing table. Z does not have an RLI, but, instead uses the RLIs of their provider(s).
No... This scheme needs DFZ routers to do the lookup. This is going to require significant changes to RFCs for full implementation anyway, and, no, the whole point of my proposal is for routers NOT to have to carry full lookup information, so, it is my intent to modify that requirement.
If I understand the idea correctly, you have to distribute two types of wide-area routing information: One ESI-based and one RLI-based. This is because any DFZ box may or may not be able to RLI-route and/or, if it sees that that's not been done yet, perform the translation. Of course, not every DFZ router needs both those tables, but there are some that do.
Initially, yes, there will have to be hybrid table overlap (RLI global table and prefix-based global table). However, don't confuse prefix-based global table with ESI map. The ESI map would be a distributed database (like current A record lookups in DNS) for Query(ESI)->{RLI, RLI, RLI...} (set of destination RLIs for given ESI). In the long run (once this is ubiquitous on core routers), the global prefix-based table can be abandoned freeing router memory. Hopefully that would occur before the global table and this table grew to require significant hardware upgrades, and, would make significant room for caching ESI->RLI lookups.
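By analogy with a caching DNS resolver, the lookup side might behave like this sketch; the TTL, class name, and query interface are all assumptions for illustration:

    import time

    class EsiMapCache:
        # Query(ESI) -> {RLI, ...}, cached locally like a DNS resolver cache.

        def __init__(self, ttl_seconds=300.0):
            self.ttl = ttl_seconds
            self._cache = {}   # esi -> (fetch_time, {rli, ...})

        def lookup(self, esi):
            now = time.monotonic()
            hit = self._cache.get(esi)
            if hit and now - hit[0] < self.ttl:
                return hit[1]                    # cache hit: no remote query
            rlis = self._query_map_service(esi)  # remote lookup, not a flood
            self._cache[esi] = (now, rlis)
            return rlis

        def _query_map_service(self, esi):
            # Placeholder for the distributed-database query described above.
            return {"Z", "Z2"} if esi == "::B" else set()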
Oh, and you do of course have to distribute the mapping info.
No, you don't have to distribute it. You _CAN_ provide it for lookup instead.
But at least it differentiates between DFZ (aka Internet Core) routing and edge routing.
I think that is the necessary first step.
Then I do not understand why you want the DFZ routers to be able to translate.
I don't know what you mean by translate.
I also think that the idea of maintaining global knowledge of the entire routing data (as required in Par. 2.1.20) scales about as well as the IEN116 hosts file we all knew and loved (hint, when was the last time you FTPd /etc/hosts from SRI?)
Alright, then please do explain how in your model the system is going to bootstrap itself... I believe 2.1.20 is there for a very good reason (to save me the hassle of flying around the world pushing DVDs full of initial routing information into my routers)...
Well... Since we already have RIRs, I don't see a reason that the top level of the hierarchy for this information couldn't be managed as ANYCAST servers at well known addresses run by the RIRs and/or IANA. All space originates from there anyway, so, it is a natural point of hierarchy. In essence, the router will learn the path to the Root and Top Level RLIs which will be fixed ASNs assigned as part of this protocol deployment. Only the root is truly necessary.
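A sketch of the implied resolution walk - start at the fixed, well-known root locators and follow referrals downward, DNS-style. The root names and the query interface are invented for illustration:

    # Hypothetical well-known locators, analogous to DNS root hints.
    ROOT_RLIS = ["RIR-ROOT-1", "RIR-ROOT-2"]   # fixed ASNs set at deployment

    def resolve(esi, query, max_hops=8):
        # Walk the hierarchy: ask a root, follow referrals until a locator
        # set comes back -- the same referral pattern DNS uses from its roots.
        # `query` is an assumed callable: query(server, esi) -> (answer, referrals)
        servers = list(ROOT_RLIS)
        for _ in range(max_hops):
            if not servers:
                break
            answer, referrals = query(servers[0], esi)
            if answer:
                return answer                   # {RLI, ...} for this ESI
            servers = referrals or servers[1:]
        raise RuntimeError("mapping hierarchy unreachable")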
I am not sure whether I have fully understood your idea and its implications (I've tried to describe above what I understood, but I'd like to be corrected there); I do see that your idea is based on the assumption that it only has to work with an IP protocol that has special header fields for routing info, and is not applicable to the Internet as of today (except in a very small part, called the "IPv6 world"). And what I especially do not like is the source of the packet predetermining the topological destination, because it only has a limited view.
The source of the packet does not determine it. The first DFZ router (often many routers removed from the source) determines its best path. Just like today when the first DFZ router makes a choice, e.g. between forwarding to 701 or 3561 to get to 10565. Once the packet is handed off to 701, it's not going to come back and go via 3561 in most cases. If 701 loses its connection downstream towards 10565, it will likely drop the packet.
Apart from that, I like it, because it is almost as simple as my own idea, but obviously more thoroughly thought through ;)
Thank you.
(That's a lot of th'es there).
lol

Owen
Re Owen,

Just a short (ok, now I read it again, it's grown...) answer to the list, but you're right, we might continue this in private. (Reply-To set)

Thanks for being so patient explaining everything, and for discussing with a (still somewhat) hairy-head like myself :-)

owen@delong.com (Owen DeLong) wrote:
You're only talking v6? Why? Anyway, let's follow this through...
Because we don't really need to solve this in V4. V4 multihoming is well understood and is unlikely to hit a scaling limit on router capabilities before we hit an end of life on address space. ... Again... Multihoming already works in V4 and there is no real need to solve this in the V4 world.
I expect a strongly rising demand from end customers to multihome right now, and we still have a bunch of /24s to go. But then, it may only add another 300K prefixes to the BGP table, which is not really an order of magnitude. As to the "it works" - surely it does, but up to now I believed it wouldn't scale far enough. Maybe I'm wrong (see Moore).
You only need one RLI in the packet header. More would actually be bad. Let me 'splain. If you are routing on RLI, then, you need to choose the best path and stick to it. If the packet doesn't make it through that way, that's OK... That's what retransmits are for. If you start rerouting it on the fly, it's likely to loop a lot before dying, but little else is achieved. Worse, it's likely to loop even in cases where it would have gotten there with the one path chosen as best by the RLI-inserting router.
Actually, I don't understand the last part; why should it loop in this case? It's a matter of destination(s) look-up on the "core" routers, just like in your model. Only there is potentially more than one destination. It would of course loop anyway if it entered (the same part of) the same transit AS again, but that is independent of whether you see the ESI or not (aka RLI insertion vs. encapsulation).

I'm still not comfortable with the box in São Paulo determining whether the packet should go to ISP A in Hamburg or ISP B in Munich or ISP C in Frankfurt (from where the respective ISP would forward it to the customer in Cologne). This decision can easily be made later on and result in a "better" path.
No, it is not. Since the RLI inserting router has up to date dynamic information about which RLIs are reachable and at what cost (BGP
The inserting router is less likely to have up-to-date RLI topology information than routers closer to the packet's destination, due to the way the topology information gets distributed.
No. You have nearly the same advantage you have today. If the path goes away, then, hopefully by the time of retransmit, the RLI inserting router will have learned that that RLI destination is no longer reachable, and it will insert a different one in the retransmitted packet. Same as what happens today with the retransmitted packet being sent a different way.
I don't like "hopefully" here, but maybe that's our trade-off anyway. You are, nonetheless, giving the "RLI inserting router" somewhat "hotter" information, if it has to make the topological choice (choose destination RLI and, implicitly, select a group of possible paths over all others). If it were only to know the translation information which does not change as often, I'd be much happier. What I also do not like is the wrong analogy to today's routing mechanism. You claim implicitly that the RLI inserting router's new decision was the same as what happened in the Internet routing system today: rerouting packets. This means, in other words, you're making a global choice locally. But of course, the current system does not reroute at the packet source (only), it can do this on any hop between source and destination and thus makes only local choices locally. This is a significant difference, because it makes adaptation to changes easier, faster, and it works with only partial convergence along the path.
Who exactly chooses? IMHO it's AS B that does the selection. And: B is closer to the target, aka the source of the routing information. Its BGP table is more likely to be up-to-date.
Right... B is the first DFZ router. A is not likely DFZ since A is not multihomed in your scenario. No need for A to be DFZ if A only talks to B.
Yesyesyes, consider

A B C D E F T
A B C D G H T

What now? Is "D" necessarily the first DFZ router? I think not. So you are still using B for the RLI insertion; B has to make the choice, and that choice may be wrong or sub-optimal.
Z's ESI is visible in the core, but, not carried in the routing table. Z does not have an RLI, but, instead uses the RLIs of their provider(s).
Yup, in your "add something to the header" scenario, the ESI is still visible. In mine it is not (it is, but encapsulated). Actually, it does not matter, as long as the destination can recover this information ("destination" as in "the re-translating router").
In the long run (once this is ubiquitous on core routers), the global prefix-based table can be abandoned freeing router memory. Hopefully that would occur before the global table and this table grew to require significant hardware upgrades, and, would make significant room for caching ESI->RLI lookups.
Moving the intelligence out of the core. Well, yes, that's an advantage for the migration phase (which could take decades).
No, you don't have to distribute it. You _CAN_ provide it for lookup instead.
How do I get there? Bootstrapping? 2.1.20? That's not moot at all.
Then I do not understand why you want the DFZ routers to be able to translate.
I don't know what you mean by translate.
Translate ESI to RLI and insert that into the packet header.
Well... Since we already have RIRs, I don't see a reason that the top level of the hierarchy for this information couldn't be managed as ANYCAST servers at well known addresses run by the RIRs and/or IANA. All space originates from there anyway, so, it is a natural point of hierarchy. In essence, the router will learn the path to the Root and Top Level RLIs which will be fixed ASNs assigned as part of this protocol deployment. Only the root is truly necessary.
Special ASNs/RLIs, reserved for this? What about extensibility there? And actually, that's not bootstrapping the system, because if you're in the DFZ, you need a specific path to get there, and you have to get it from somewhere. So either there's a hole in the idea or I'm too dumb to understand. Or do you mean every RLI hosts such an anycasted server, in order for their routers to be able to reach it? And let's not forget that the anycasted RIR routing topology servers also need to get updated somehow...

Btw, yes, I like the idea of RIRs and RAs taking control over Internet routing (it's only logical, and it's necessary, albeit currently impossible); I would propose the same. Not everybody may like that, though.
The source of the packet does not determine it. The first DFZ router (often many routers removed from the source) determines its best path. Just like today when the first DFZ router makes a choice, e.g. between forwarding to 701 or 3561 to get to 10565. Once the packet is handed off to 701, it's not going to come back and go via 3561 in most cases. If 701 loses its connection downstream towards 10565, it will likely drop the packet.
BGP has always been based on the idea of contiguous ASes. That's why 3561 will refuse to accept the packet. That avoids loops, but it also makes life harder in other respects.

I think your "first DFZ router" is quite close to the packet source in most cases. It usually is the customer's own router (if they participate in DFZ routing, like we do), or it's one of the upstreams' edge routers. At least it is, as soon as prefix-based routing has disappeared from large parts of the DFZ.

Cheers,
Elmar.

--
"Just don't make the mistake of substituting expertise for opinion." (PLemken, <bu6o7e$e6v0p$2@ID-31.news.uni-berlin.de>)
--------------------------------------------------------------[ ELMI-RIPE ]---
For the customer with an Internet "mission critical app", being tied to a Tier 2 has its own set of problems, which might actually be worse than being tied to a Tier 1.
The key word is "might". In fact, I would posit that a Tier 2 with multiply redundant transit to all of the Tier 1s could theoretically have better connectivity than an actual Tier 1. The Tier 2 transit provides flexibility that the transit-free Tier 1s do not have. Just my opinion.

Anyway, it has been my experience that most (but not all) of the customers that want to "multihome" are _really_ wanting either:
A. geographic/router redundancy, or
B. easy renumbering.
Geographic redundancy can be done within a single AS and IP block. They just don't know to ask it that way. (And easy renumbering will eventually be solved with v6. Eventually.)

The demand for multi-homing might not be as great as suspected.

John
At 01:05 PM 10/19/2005, John Dupuy wrote:
For the customer with an Internet "mission critical app", being tied to a Tier 2 has its own set of problems, which might actually be worse than being tied to a Tier 1.
The key word is "might". In fact, I would posit that a Tier 2 with multiply redundant transit to all of the Tier 1s could theoretically have better connectivity than an actual Tier 1. The Tier 2 transit provides flexibility that the transit-free Tier 1s do not have. Just my opinion.
Anyway, it has been my experience that most (but not all) of the customers that want to "multihome" are _really_ wanting either:
A. geographic/router redundancy, or
B. easy renumbering.
Geographic redundancy can be done within a single AS and IP block. They just don't know to ask it that way. (And easy renumbering will eventually be solved with v6. Eventually.)
It has been my experience that most needing to multihome wish to do so to avoid failures within an ISP, failures with a circuit to the ISP, and failures with routers. I should think that, with the recent L3/Cogent issue, it is QUITE clear that multihoming requires linking to two separate backbones, or two separate regionals that buy transit from multiple backbones. Vagaries in backbone providers are high on the list, IMO, and rule out the "multihome to a single provider" approach.
The demand for multi-homing might not be as great as suspected.
John
It is not true. Many tier-2 ISPs specialize in very high quality Internet access, masking the problems of big ISPs (who in reality can never provide high quality Internet at all). A good example - Internap.

So, it is not about tier-1 vs tier-2; it is about ISPs specializing in cheap access and ISPs specializing in quality access. Is COGENT (for example only - I have nothing against them) a tier-1 ISP - maybe; are they a high quality ISP - in NO WAY (they just provide bandwidth to nowhere without any clue).
On Sun, Oct 23, 2005 at 11:23:38PM -0700, Alexei Roudnev wrote:
It is not true. Many tier-2 ISPs specialize in very high quality Internet access, masking the problems of big ISPs (who in reality can never provide high quality Internet at all). A good example - Internap.
Masking "problems" of a big ISP and yet creating problems of their own. Have you seen completely multi-transited "tier2 networks" flapping hard core?
So, it is not about tier-1 vs tier-2; it is about ISPs specializing in cheap access and ISPs specializing in quality access. Is COGENT (for example only - I have nothing against them) a tier-1 ISP - maybe; are they a high quality ISP - in NO WAY (they just provide bandwidth to nowhere without any clue).
Nonsense.

James
Well, not necessarily.
Tier-2s should be given much more credit than they typically are in write-ups like this. When a customer is single-homed to a tier-2 that has multiple tier-1 upstreams, and uses a delegated netblock from the tier-2's aggregations, that means one less ASN and one or more fewer routes in the global table.
It's a Good Thing(tm).
Not for the single-homed customer when the Tier-2 service is interrupted.

Owen

--
If it wasn't crypto-signed, it probably didn't come from me.
participants (27)
- Alexei Roudnev
- Brandon Butterworth
- Christopher L. Morrow
- Daniel Senie
- David Andersen
- David G. Andersen
- Edward B. Dreger
- Edward Lewis
- Ejay Hire
- Elmar K. Bins
- Erik Haagsman
- James
- Jared Mauch
- John Dupuy
- John Payne
- Jon Lewis
- Justin M. Streiner
- Leo Bicknell
- Mark Radabaugh
- Michael.Dillon@btradianz.com
- Mike Tancsa
- Owen DeLong
- Patrick W. Gilmore
- Paul Jakma
- Paul Vixie
- Peter Kranz
- Todd Vierling