Hi,

I'm looking at a setup where we use BGP to announce PI space to two upstream ISPs. ISP A provides a 30Mb/s connection and ISP B provides a 10Mb/s. Originally the plan was to use ISP B's link as a backup and local pref traffic outbound via ISP A and pref inbound using AS prepend via ISP A. It has now been requested to be able to distribute traffic across both links rather than preference traffic to the higher speed link. We are going to be using Juniper SRX210s to do this. I have some questions:

- Is this really a good idea, as the BGP process won't care what the utilisation of the links are and you will see situations where the lower speed link gets used even though the high speed link utilisation is 0?
- If we are doing this, I don't want to take a full routing table, I would rather just take the ISPs routes and perhaps their connected customers. One ISP has said they will only provide full routing table or default. I really don't want to take a full table, is receiving default only going to be a problem for my setup?
- Any advice on how to avoid situations where the low bandwidth link is being used even though there is 0 utilisation on the high bandwidth link?

Thanks
Ahmed
You can just accept directly-connected peers from each network (or within 2 ASes, etc.) then point a default at each one with different preferences. You can do this with two edges if you like also: iBGP between the edges, and push default into OSPF from both.

WRT dynamic load balancing... generally if your network is large enough for two upstreams you'll have a pretty good distribution of flows, so once you get the prefs and prepends set up the way you like, things won't shift that rapidly. In my experience at least...

-Jack Carrozzo
You really limit yourself when you just take a default from a provider. If you take 2 defaults (one from each provider), for whatever reason, once you change the local pref on one of them, it's all your traffic outbound or none.

I always request a full table + default, so you can filter to best suit your needs. This way, you can just accept /8s and get some sort of balancing at least (even if you just say all even /8s are pref'd on one gateway and all odd /8s on the other provider, etc). Of course this won't be symmetrical, but that's the nature of eBGP on the internet. You'll have to watch it and adjust as needed so that you won't saturate your slower link.

Max
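The even/odd /8 split suggested above can be modelled in a few lines. This is only a sketch of the selection logic, not router configuration: the labels "ISP A" and "ISP B" come from the original question, while the function name and the choice of which parity goes to which upstream are assumptions made for illustration.

```python
import ipaddress

# Assumption: even first octets prefer ISP A, odd first octets prefer ISP B.
PREFERRED_EVEN = "ISP A"
PREFERRED_ODD = "ISP B"


def preferred_upstream(prefix: str) -> str:
    """Return the upstream that would get the higher local-pref for an IPv4 prefix."""
    network = ipaddress.ip_network(prefix, strict=False)
    # The top 8 bits of the network address are the covering /8.
    first_octet = int(network.network_address) >> 24
    return PREFERRED_EVEN if first_octet % 2 == 0 else PREFERRED_ODD


if __name__ == "__main__":
    for prefix in ("8.0.0.0/8", "17.0.0.0/8", "192.0.2.0/24"):
        print(prefix, "->", preferred_upstream(prefix))
```

A split like this ignores how much traffic each /8 actually carries, which is why the advice to watch the links and adjust still applies.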
From: Ahmed Yousuf Sent: Tuesday, January 18, 2011 10:32 AM To: nanog@nanog.org Subject: Dual Homed BGP for failover
- Is this really a good idea, as the BGP process won't care what the utilisation of the links are and you will see situations where the lower speed link gets used even though the high speed link utilisation is 0?
It is possible. But one thing, and I know it is a semantics nit but it is really important. There is no difference in the "speed" of the links. There is a difference in the capacity of the two but the traffic flows at the same "speed" across both. That said, have you actually tried seeing what the "natural" breakdown of the traffic is? Without any AS prepend or local pref adjustment, what is the natural ratio of traffic on the two links? Generally different ISPs have different connectivity and some destinations will be favored via one path and others via the other path. It might be useful to determine how BGP naturally routes things first and then you can get an idea of what needs adjusting.
- If we are doing this, I don't want to take a full routing table, I would rather just take the ISPs routes and perhaps their connected customers. One ISP has said they will only provide full routing table or default. I really don't want to take a full table, is receiving default only going to be a problem for my setup?
Interesting. Most ISPs offer "default", "full", or "customer routes". You can take a full table but simply filter out any routes that aren't from your ISP's ASN or within one hop of it, and only install the routes that meet those criteria. In addition to using AS prepending, your providers might offer communities that allow you to control redistribution of your routing information to their peers. You might want to tell the ISP on the smaller link not to announce your routes to a major peer. That major peer will now find its path to you via the larger pipe.
- Any advice on how to avoid situations where the low bandwidth link is being used even though there is 0 utilisation on the high bandwidth link?
If that happens, it would mean that the world does not see your path via the high bandwidth pipe as being an attractive path. As mentioned above, you might be able to append communities to your routes to the lower bandwidth ISP that control how they redistribute your routes. One example might be something like "don't redistribute my routes if you see them coming from another source". In that case the ISP only redistributes your routes when it doesn't see the announcement via the high bandwidth provider, so it effectively acts as a backup outside of its own AS. You would still receive traffic originated within its AS over the low bandwidth connection.
Ahmed
G
On Tue, Jan 18, 2011 at 1:32 PM, Ahmed Yousuf <ayousuf0079@gmail.com> wrote:
It has now been requested to be able to distribute traffic across both links rather than preference traffic to the higher speed link. - Is this really a good idea, as the BGP process won't care what the utilisation of the links are and you will see situations where the lower speed link gets used even though the high speed link utilisation is 0?
Hi Ahmed, This really isn't an either/or situation. You can prefer the higher speed link without excluding the lower speed link. One common way to do this (there are better ones but this one is easy) is to prepend the AS path you send and receive on the lower speed link so that it's longer.
- If we are doing this, I don't want to take a full routing table, I would rather just take the ISPs routes and perhaps their connected customers. One ISP has said they will only provide full routing table or default. I really don't want to take a full table, is receiving default only going to be a problem for my setup?
IMO, that would be a mistake. Taking significantly less than a full table severely limits your options for balancing traffic between the links.
- Any advice on how to avoid situations where the low bandwidth link is being used even though there is 0 utilisation on the high bandwidth link?
Any particular communication is either going to go through one link or the other. I'm generalizing here, ignoring some subtleties, but if packets between two particular hosts have picked the low speed link, they will take that one instead of the high speed link. So in a sense it isn't possible to prevent that situation.

However, you can adjust the preferences for one path versus the other so that you're not leaving either circuit underused overall, and the disparity between your circuits (30 and 10) is not enough to cause major performance issues in and of itself.

Regards,
Bill Herrin

--
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004
On 1/18/2011 1:00 PM, William Herrin wrote:
IMO, that would be a mistake. Taking significantly less than a full table severely limits your options for balancing traffic between the links.
It should also be noted that taking a full table doesn't mean you have to use the full table. Apply filters to smaller routes or long ASPATHs that you don't want, and then assign preferences, communities, prepends, etc. as necessary for the routes you actually accept.

This means your sync time is longer and you'll have more updates, but it will still keep the local routing table much smaller.

Jack
Someone should advise him that if he wants to take in a full BGP routing table that he makes sure his router can handle it! I would hate for him to open the floodgates and his production router shuts down. LOL....
Date: Tue, 18 Jan 2011 13:12:18 -0600 From: jbates@brightok.net To: bill@herrin.us Subject: Re: Dual Homed BGP for failover CC: ayousuf0079@gmail.com; nanog@nanog.org
On 1/18/2011 1:00 PM, William Herrin wrote:
IMO, that would be a mistake. Taking significantly less than a full table severely limits your options for balancing traffic between the links.
It should also be noted that taking a full table, doesn't mean you have to use the full table. Apply filters to smaller routes or long ASPATHs that you don't want, and then assign preferences, communities, prepends, etc as necessary for the routes you actually accept.
This means your sync time is longer and you'll have more updates, but it will still keep the local routing table much lower.
Jack
-----Original Message----- From: Brandon Kim Sent: Tuesday, January 18, 2011 11:57 AM To: jbates@brightok.net; bill@herrin.us Cc: ayousuf0079@gmail.com; nanog group Subject: RE: Dual Homed BGP for failover
Someone should advise him that if he wants to take in a full BGP routing table that he makes sure his router can handle it! I would hate for him to open the floodgates and his production router shuts down. LOL....
One can take a full feed but filter so only a subset of the routes are actually installed. For example, filter all routes that are more than one AS away from the immediate upstream.
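As a rough illustration of the "no more than one AS away from the immediate upstream" filter, the sketch below models the decision in Python rather than in router configuration. The ASNs and prefixes are hypothetical values from the documentation ranges, and treating prepended (repeated) ASNs as a single hop is an assumption of this sketch.

```python
from typing import Dict, List

UPSTREAM_ASN = 64500  # hypothetical upstream ASN, not from the thread


def within_one_hop(as_path: List[int], upstream_asn: int = UPSTREAM_ASN) -> bool:
    """Accept routes originated by the upstream or by an AS directly behind it."""
    if not as_path or as_path[0] != upstream_asn:
        return False
    # Collapse prepends: consecutive repeats of the same ASN count as one AS.
    collapsed = [asn for i, asn in enumerate(as_path) if i == 0 or asn != as_path[i - 1]]
    return len(collapsed) <= 2


# Hypothetical received routes: prefix -> AS path as seen from our router.
received: Dict[str, List[int]] = {
    "198.51.100.0/24": [64500],                # originated by the upstream
    "203.0.113.0/24": [64500, 64501, 64501],   # upstream's customer, prepended
    "192.0.2.0/24": [64500, 64501, 64502],     # two ASes away: filtered
}

installed = {prefix: path for prefix, path in received.items() if within_one_hop(path)}
```

Everything that fails the check would then be covered by the default route from the same provider.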
On 1/18/2011 2:05 PM, George Bonser wrote:
One can take a full feed but filter so only a subset of the routes are actually installed. For example, filter all routes that are more than one AS away from the immediate upstream.
You should still be careful, as most processors keep a copy of filtered routes as well, so while your forwarding table may not increase, your route processor memory most likely will. I haven't checked, but I presume IOS and Junos have a knob to disable this feature? Jack
On Tue, Jan 18, 2011 at 3:57 PM, Jack Bates <jbates@brightok.net> wrote:
You should still be careful, as most processors keep a copy of filtered routes as well, so while your forwarding table may not increase, your route processor memory most likely will.
I don't think this is the case, on IOS at least. Some years ago I was rocking some 7500s with $not_enough ram for multiple full tables, but with a prefix list to accept le 23 they worked fine. -Jack Carrozzo
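The "accept le 23" prefix list mentioned above keeps only prefixes with a mask length of 23 bits or shorter. A minimal Python sketch of that test follows; the sample prefixes are made up for illustration and the function name is hypothetical.

```python
import ipaddress

MAX_PREFIX_LEN = 23  # the "le 23" from the message above


def accept(prefix: str) -> bool:
    """Return True if the prefix is a /23 or shorter (i.e. a larger block)."""
    return ipaddress.ip_network(prefix).prefixlen <= MAX_PREFIX_LEN


# Hypothetical sample of a received feed.
feed = ["198.18.0.0/15", "100.64.0.0/10", "203.0.113.0/24", "198.51.100.0/24"]
accepted = [prefix for prefix in feed if accept(prefix)]
```

With this filter in place, destinations that are only announced as longer prefixes need a default route (or a covering aggregate) to remain reachable.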
On 1/18/2011 3:03 PM, Jack Carrozzo wrote:
I don't think this is the case, on IOS at least. Some years ago I was rocking some 7500s with $not_enough ram for multiple full tables, but with a prefix list to accept le 23 they worked fine.
On JunOS, I know I can view pre and post filtered bgp updates ingress and egress. I seem to recall seeing similar functionality introduced into IOS, though I'm less certain. It's still always advisable to be careful. :) Jack
Yep, the great thing about IOS without 'commit confirmed' is when you remove a BGP filter, it runs out of memory, reboots, brings up peers, runs out of memory, reboots... meanwhile if you're trying to get in over a public interface you're cursing John Chambers' very existence. Not that that's ever happened to me of course...

-Jack Carrozzo
Me <3's "commit confirmed" ... maybe someone from Cisco should be watching :)
Thanks to all for the responses, certainly illuminating. I'm now more aware of what I can do and what tools are available. The following makes sense to me:

- Take full routing tables and default from both ISPs and decide how I filter the routes that get installed in my routers.
- Initially apply the same filters on both and monitor the links to see what the natural distribution is when we let the BGP process decide how the traffic is routed. Need to think more about which filters to apply here; the SRX210s are quoted as having capacity for 16k routes.
- Once we have a better idea of the traffic profiles, start changing the filters to preference certain traffic over the higher speed link. One way this might be done is to filter based on RIPE or ARIN addresses. We are most concerned about maintaining capacity for European traffic, so install RIPE routes on the higher capacity link and ARIN routes on the lower capacity link.
- Accept that we are never going to get an ideal distribution of traffic and continue monitoring and adjusting local pref/prepends etc. as and when we need to change the distribution of traffic. Hopefully we don't need to do this that often.

Thoughts?

Ahmed

From: Max Pierson [mailto:nmaxpierson@gmail.com] Sent: 18 January 2011 21:30 To: Jack Carrozzo Cc: Jack Bates; ayousuf0079@gmail.com; nanog group Subject: Re: Dual Homed BGP for failover
Me <3's "commit confirmed" ... maybe someone from Cisco should be watching :)
On Tue, Jan 18, 2011 at 3:21 PM, Jack Carrozzo <jack@crepinc.com> wrote:
Yep, the great thing about IOS without 'commit confirmed' is when you remove a BGP filter, it runs out of memory, reboots, brings up peers, runs out of memory, reboots... meanwhile if you're trying to get in over a public interface you're cursing John Chambers' very existence. Not that that's ever happened to me of course...
-Jack Carrozzo
On Tue, Jan 18, 2011 at 4:19 PM, Jack Bates <jbates@brightok.net> wrote:
On 1/18/2011 3:03 PM, Jack Carrozzo wrote:
I don't think this is the case, on IOS at least. Some years ago I was rocking some 7500s with $not_enough ram for multiple full tables, but with a prefix list to accept le 23 they worked fine.
On JunOS, I know I can view pre and post filtered bgp updates ingress and egress. I seem to recall seeing similar functionality introduced into IOS, though I'm less certain. It's still always advisable to be careful. :)
Jack
On Wed, 19 Jan 2011 10:23:47 -0000, Ahmed Yousuf wrote
- Accept that we are never going to get an ideal distribution of traffic and continue monitoring and adjusting local pref/prepends etc. as and when we need to change the distribution of traffic. Hopefully we don't need to do this that often.
^ This. You're fighting a losing battle with such slow links. Given the limited route capacity of your router you might as well set up statics aimed at each link and forget about BGP shaping. Just keep a floating default pointed at each peer.

-Randy
We're doing BGP to announce our PI space and make sure that our PI space is reachable through both ISPs in case one link goes down. This is the primary reason for running BGP here. Unfortunately my boss has requested that we make use of the capacity of both links, rather than pref traffic out of the higher capacity link.

-----Original Message-----
From: Randy McAnally [mailto:rsm@fast-serv.com] Sent: 19 January 2011 14:00 To: Ahmed Yousuf; 'nanog group' Subject: RE: Dual Homed BGP for failover
On Wed, 19 Jan 2011 10:23:47 -0000, Ahmed Yousuf wrote
- Accept that we are never going to get an ideal distribution of traffic and continue monitoring and adjusting local pref/prepends etc. as and when we need to change the distribution of traffic. Hopefully we don't need to do this that often.
^ This. You're fighting a losing battle with such slow links. Given the limited route capacity of your router you might as well set up statics aimed at each link and forget about BGP shaping. Just keep a floating default pointed at each peer. -Randy
On Wed, 19 Jan 2011 14:26:32 -0000, Ahmed Yousuf wrote
We're doing BGP to announce our PI space and make sure that our PI space is reachable through both ISPs in case one link goes down. This is the primary need to do the BGP here. Unfortunately my boss has requested that we make use of the capacity of both links, rather than pref traffic out of the higher capacity link.
Understood! You would _still_ take default BGP routes. I was implying something more along these lines (in Cisco speak):

    ! Tweak as necessary to get a good balance
    ip route 0.0.0.0 128.0.0.0 <peer1>
    ip route 128.0.0.0 128.0.0.0 <peer2>

Set up SLA tracking on the peer IPs to retract the routes if either peer goes down. Either that or get more RAM on your router and go the BGP-only method.

-Randy
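To make the effect of the two half-default statics concrete, here is a small Python model of the forwarding decision. It is a sketch under assumptions: "peer1" and "peer2" stand in for the placeholders in the message, the `peers_up` set stands in for the SLA tracking state, and the fallback to any surviving peer stands in for the floating default mentioned earlier in the thread.

```python
import ipaddress
from typing import Optional, Set

# The netmask 128.0.0.0 is a /1, so the two statics split IPv4 space in half.
HALF_DEFAULTS = {
    ipaddress.ip_network("0.0.0.0/1"): "peer1",
    ipaddress.ip_network("128.0.0.0/1"): "peer2",
}


def next_hop(destination: str, peers_up: Set[str]) -> Optional[str]:
    """Pick the exit for a destination, skipping statics whose peer is down."""
    addr = ipaddress.ip_address(destination)
    for network, peer in HALF_DEFAULTS.items():
        if addr in network and peer in peers_up:
            return peer
    # The matching static was retracted: fall back to any peer still up.
    return next(iter(sorted(peers_up)), None)
```

For example, `next_hop("8.8.8.8", {"peer1", "peer2"})` selects peer1, and the same call with only peer2 up falls back to peer2. Moving the boundary (say, three /2 statics to one peer and one /2 to the other) is one way to "tweak as necessary" toward the 30/10 capacity ratio.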
On Tue, Jan 18, 2011 at 12:05 PM, George Bonser <gbonser@seven.com> wrote:
-----Original Message----- From: Brandon Kim Sent: Tuesday, January 18, 2011 11:57 AM To: jbates@brightok.net; bill@herrin.us Cc: ayousuf0079@gmail.com; nanog group Subject: RE: Dual Homed BGP for failover One can take a full feed but filter so only a subset of the routes are actually installed. For example, filter all routes that are more than one AS away from the immediate upstream.
As I remember, on IOS the BGP config should not have "soft-reconfiguration inbound" for this uplink session; otherwise the routing engine will still keep one copy of the full table in memory.

--
Michel~
I would be hesitant to do full tables on an SRX210, particularly if you only have an SRX210B with 512MB of RAM. I'm not sure what filtering would do in terms of memory usage, because I have not tried it.

I generally put a separate edge device in to handle the upstream and BGP, and use the SRX purely for firewall. You can even have completely redundant edge routers and redundant firewalls, and mesh them with iBGP. This is the setup we are using in our office (2 Cisco 2821 routers on the edge, and 2 Juniper SRX240H firewalls right behind them). Since each of the 2 uplinks we have are ethernet, I have both routers connected to both providers. This gives us ultimate redundancy at very low cost.

-Randy

--
| Randy Carpenter
| Vice President - IT Services
| Red Hat Certified Engineer
| First Network Group, Inc.
| (800)578-6381, Opt. 1
participants (10)
- Ahmed Yousuf
- Brandon Kim
- George Bonser
- Jack Bates
- Jack Carrozzo
- Max Pierson
- Michel de Nostredame
- Randy Carpenter
- Randy McAnally
- William Herrin