Re: Does anyone multihome anymore?
At 10:11 AM 8/22/2007, Paul Kelly :: Blacknight Solutions wrote:
Mike Tancsa wrote:
At 03:49 AM 8/22/2007, Security Admin (NetSec) wrote:
Pardon my forwardness, but don't people just multi-home these days? If your
Multihoming is great for when there is a total outage. In the case of Cogent on Monday, it wasn't "down"... In this case, there is only so much you can do to influence how packets come back at you, as BGP doesn't know anything about "lossy" or slow connections.
---Mike
Take the carrier that is causing you issues out of your eBGP setup and all's well....
Hi, In my case, I have 6453 and 174 for transit. I want to get to 577, which is directly connected to 6453 and 174. 577 has a higher local pref on paths via 174. Short of shutting my 174 session (or some deaggregation), I don't have a way to influence how 577 gets back to me. I can easily exit out 6453, but it does nothing for the return packets. I have enough capacity on 6453 to handle all my traffic, but it's a Draconian step to take, and some traffic via 174 is fine and would be worse if I fully shut the session (i.e., peers of 174 in Toronto). ---Mike
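As an aside, the reason prepending wouldn't help here is visible in the BGP best-path algorithm itself: local preference is compared before AS-path length, and none of the tie-breakers measure loss or latency. Below is a minimal sketch in Python, using hypothetical local-pref values and a placeholder origin ASN (64512), since the actual values inside 577 aren't known.

# Simplified BGP best-path comparison (illustrative only; real routers
# also compare origin, MED, eBGP vs. iBGP, IGP metric, router ID, etc.).
# Note that nothing in the decision process measures packet loss or
# latency, which is why a lossy-but-"up" link keeps getting selected.

def best_path(routes):
    # Highest local-pref wins; AS-path length is only a tie-breaker.
    return min(routes, key=lambda r: (-r["local_pref"], len(r["as_path"])))

# Hypothetical view from inside AS 577, which prefers its 174 transit;
# 64512 is a placeholder for the origin AS, and the local-pref values
# are made up.
routes = [
    {"via": "174",  "local_pref": 120, "as_path": [174, 64512]},
    {"via": "6453", "local_pref": 100, "as_path": [6453, 64512]},
]
print(best_path(routes)["via"])  # -> 174

# Prepending toward 174 lengthens that AS path, but local-pref is
# compared first, so 577 still returns traffic via 174:
routes[0]["as_path"] = [174, 64512, 64512, 64512, 64512]
print(best_path(routes)["via"])  # -> 174

Under that model, the only knobs 577 would honor are the ones Mike mentions: shutting the 174 session or announcing more-specifics out 6453 (deaggregation), both of which change what 577 hears rather than how it scores the paths.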
We're connected to Teleglobe (6453), Telus (852), TeliaSonera (1299), MCI (701), and L3 (3356). We don't play any economic games with our traffic - our routing policy is (theoretically) designed to give the best possible product to our customers - and although we weren't dead in the water during the cable cut, we had major problems, especially to Bell (577), for the same reason as Mike. So what is the solution? Do we connect to Bell as well, even though they are the ones with the moronic routing policy? It would solve the problem, but it's certainly not a way to support "quality" carriers by purchasing only quality bandwidth...
On Wed, 22 Aug 2007, Mike Tancsa wrote:
Multihoming is great for when there is a total outage. In the case of Cogent on Monday, it wasn't "down"... In this case, there is only so much you can do to influence how packets come back at you, as BGP doesn't know anything about "lossy" or slow connections.
---Mike
Take the carrier that is causing you issues out of your eBGP setup and all's well....
Hi, In my case, I have 6453 and 174 for transit. I want to get to 577, which is directly connected to 6453 and 174. 577 has a higher local pref on paths via 174. Short of shutting my 174 session (or some deaggregation), I don't have a way to influence how 577 gets back to me. I can easily exit out 6453, but it does nothing for the return packets. I have enough capacity on 6453 to handle all my traffic, but it's a Draconian step to take, and some traffic via 174 is fine and would be worse if I fully shut the session (i.e., peers of 174 in Toronto).
I'm posting too much this week and should stop, but... Again, this is a matter of thinking about design goals. What were you trying to accomplish when you bought redundant connections? It probably wasn't "redundancy," but rather something that redundancy would give you. What redundancy gives you is a better statistical probability that not all of the redundant components will be broken at once.

It should be noted that multi-homing is just one of many areas of possible redundancy. Anything else that can break -- routers, switches, cables, etc. -- can be set up redundantly. No amount of redundancy in any of those components guarantees reliability. What it does mean is that your network can keep functioning if some components break, as long as you still have enough of whatever component it is to keep running.

So, in a redundant setup, what happens when a component breaks? In an ideal situation, it breaks cleanly, fail-over happens automatically, and nobody notices. Then you just have to hope your monitoring system is good enough that you know there's something to fix. But in an ideal situation, things wouldn't break at all, so designing your procedures around "ideal" failure scenarios doesn't make much sense.

What redundancy really gives you is the ability to keep outages from turning into major disruptions; the ability, when you see that a component is malfunctioning, to turn it off and go back to sleep. You can then do the real fix later, when it's more convenient or less disruptive. Thought about that way, there's nothing "Draconian" about turning off a connection (or a switch, or a router, or any other redundant component) that's not doing what you want it to. Instead, you're taking advantage of a main feature of your design. If your other providers are doing 95th percentile billing, you even have a day and a half per month that you can leave a connection down at no financial cost.

The alternative, as you seem to have noticed, is to spend your day stressing out about your network not working properly, and complaining about being helpless. You don't need redundancy for that.

-Steve
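For what it's worth, Steve's "day and a half" figure falls straight out of the 95th-percentile arithmetic: with 5-minute samples over a 30-day month, the top 5% of samples - roughly 36 hours' worth - are discarded before billing, so traffic shifted onto the other providers during a window that size never shows up on the bill. Here is a rough sketch, assuming 5-minute samples and a 30-day month (the helper name is mine, not any particular billing system's):

# 95th-percentile billing sketch: traffic is sampled every 5 minutes,
# the month's samples are sorted, the top 5% are thrown away, and the
# bill is based on the highest remaining sample.

SAMPLE_MINUTES = 5
DAYS = 30

samples_per_month = DAYS * 24 * 60 // SAMPLE_MINUTES           # 8640 samples
discarded = samples_per_month - int(samples_per_month * 0.95)  # 432 samples
free_hours = discarded * SAMPLE_MINUTES / 60.0                 # 36.0 hours
print(samples_per_month, discarded, free_hours)

def billable_rate(samples_mbps):
    # Hypothetical helper: return the 95th-percentile rate from a list
    # of 5-minute samples (in Mbps).  Short-lived spikes -- including
    # ~36 hours of traffic shifted from a shut-down connection -- land
    # in the discarded top 5% and never raise this number.
    ordered = sorted(samples_mbps)
    return ordered[int(len(ordered) * 0.95) - 1]

# Example: a flat 100 Mbps month with a 30-hour spike to 400 Mbps still
# bills at 100 Mbps, because the spike fits inside the discarded samples.
spike_samples = 30 * 60 // SAMPLE_MINUTES                      # 360 samples
month = [100.0] * (samples_per_month - spike_samples) + [400.0] * spike_samples
print(billable_rate(month))  # -> 100.0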
At 03:26 PM 8/22/2007, Steve Gibbard wrote:
Thought about that way, there's nothing "Draconian" about turning off a connection (or a switch, or a router, or any other redundant component) that's not doing what you want it to.
While I agree in general with what you are getting at, one point to add is cost. All these goals are constrained by the business case you have to make. In my case, I could turn off my Cogent connection, but I would have ended up punishing connectivity to other networks that are off Cogent in Toronto only. This would have forced them to get to me via Cogent's PoP in Chicago, which was overloaded. So to fix my connectivity into AS577, I would have to hose another group of users in Toronto. Now, I could of course add more diversity by adding another connection in Toronto, but I have to justify the business case to do that. Is it worth the extra money for the few times this particular type of outage happens? In my case, probably not. The cost to privately peer with 577 is quite high, and there are no good transit providers at 151 Front that have good connectivity to Bell other than via Chicago.
Instead, you're taking advantage of a main feature of your design. If your other providers are doing 95th percentile billing, you even have a day and a half per month that you can leave a connection down at no financial cost. The alternative, as you seem to have noticed, is to spend your day stressing out about your network not working properly, and complaining about being helpless. You don't need redundancy for that.
I didn't mean to sound like I was complaining. My original post to NANOG was more an attempt to get details about what was going on, beyond the rather basic info that first-level support and the Cogent status page were giving. After the original post, various questions and comments came up as to what could and could not be done in this situation. ---Mike
participants (3)
- Dan Armstrong
- Mike Tancsa
- Steve Gibbard