BGP Design question.

Bret Palsson

22 Jun 2011

10:27 p.m.

Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.

+--------+    +--------+
| Peer A |    | Peer A |   <-Many carriers. Using 1 carrier
+---+----+    +----+---+     for this scenario.
    |eBGP          | eBGP
    |              |
+---+----+iBGP+----+---+
| Router +----+ Router |   <-Netiron CERs Routers.
+-+------+    +------+-+
  |A  `.P      A.'   |P    <-A/P indicates Active/Passive
  |      `.  .'      |       link.
  |        ::        |
+-+------+'  `+------+-+
|Act. FW |    |Pas. FW |   <-Firewalls Active/Passive.
+--------+    +--------+

To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each has an eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however, if you think there is a better way to do this.

Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails, the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices.)

I am using OSPFv2 between the CERs and the firewalls. Failover works just fine; however, when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. OSPF on both Netirons is set up to advertise them as the default route.

What I'm wondering is whether OSPF is the right solution for this. How do others solve this problem?

Thanks,

Bret

Note: Since IPv6 has been a hot topic lately, I'll state that after we get the BGP all figured out and working properly, IPv6 is our next project. :)

Brant I. Stevens

22 Jun

10:33 p.m.

On 6/22/11 6:27 PM, "Bret Palsson" <bret@getjive.com> wrote:

...

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

You could also do an eBGP session through the firewall between the outside routers and routers on the inside firewall, passing only the default route to the inside routers.

Owen DeLong

10:44 p.m.

I would suggest running VRRP on the routers towards the firewalls and only use OSPF to advertise the ingress routes. Statically route default to the VRRP group. Implemented as follows:

[RA]------[switch]-----[switch]------[RB]
              |           |
            [AFW]       [PFW]

Make sense?

AFW/PFW advertise OSPF for the interior routes so that RA/RB know how to reach them, but RA/RB don't have to advertise anything, and AFW/PFW have static default routes to a VRRP group address shared between RA/RB.

If you want to make OSPF work, then try making sure you have default-information originate always on both RA and RB.

Owen

On Jun 22, 2011, at 3:27 PM, Bret Palsson wrote:

...
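
The VRRP-plus-static-default suggestion in this reply can be sketched as a small Python model. This is an illustration only, not router or firewall configuration: the addresses, priorities, and the `elect_master` helper are hypothetical. The election rule used (highest priority wins, the higher interface address breaks a tie) is the one described in the VRRP specification.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass
class VrrpRouter:
    name: str
    address: IPv4Address  # interface address on the firewall-facing segment
    priority: int         # higher value is preferred as master
    up: bool = True


def elect_master(routers: List[VrrpRouter]) -> Optional[VrrpRouter]:
    """Return the router that would own the virtual IP, or None if all are down."""
    alive = [r for r in routers if r.up]
    if not alive:
        return None
    # Highest priority wins; the higher interface address breaks a tie.
    return max(alive, key=lambda r: (r.priority, r.address))


# Hypothetical addressing, chosen from a documentation prefix.
VIP = IPv4Address("192.0.2.1")
ra = VrrpRouter("RA", IPv4Address("192.0.2.2"), priority=200)
rb = VrrpRouter("RB", IPv4Address("192.0.2.3"), priority=100)

# The firewalls carry a single static default route towards the VIP.
# It does not change when the VRRP master changes.
firewall_default_next_hop = VIP

before = elect_master([ra, rb])
ra.up = False  # simulate losing RA
after = elect_master([ra, rb])

print(before.name, "->", after.name, "| firewall default still", firewall_default_next_hop)
```

The point of the model is that failover is handled entirely on the router side: the firewall's view (one next hop) is identical before and after RA fails.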

Randy Bush

11:02 p.m.

vrrp?

Ingo Flaschberger

11:07 p.m.

Hi Bret,

...

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

Linux firewall? disabled rp-filter?

...

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

I do something similar with FreeBSD; you always make sure the backbone area 0.0.0.0 does not break into two parts, perhaps by using an extra link between the two firewalls just because of this.

Kind regards,
Ingo Flaschberger
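
For readers unfamiliar with the rp-filter question above: in strict mode, Linux reverse-path filtering drops a packet when the route back to its source does not point out of the interface the packet arrived on. The following Python sketch models that check under stated assumptions; the routing table, interface names, and helper functions are hypothetical and only illustrate why an asymmetric path can silently lose traffic.

```python
from ipaddress import ip_address, ip_network

# Hypothetical routing table on the firewall: (prefix, outgoing interface).
ROUTES = [
    (ip_network("0.0.0.0/0"), "eth0"),   # default route via the first router
    (ip_network("10.0.0.0/8"), "eth2"),  # inside networks
]


def best_interface(addr: str) -> str:
    """Longest-prefix match over ROUTES; returns the outgoing interface."""
    ip = ip_address(addr)
    matches = [(net, ifname) for net, ifname in ROUTES if ip in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def strict_rpf_accepts(src: str, arrival_interface: str) -> bool:
    """Strict mode: accept only if the route back to src uses the arrival interface."""
    return best_interface(src) == arrival_interface


# A packet from an Internet host arriving via the first router passes the check.
print(strict_rpf_accepts("198.51.100.7", "eth0"))
# The same source arriving via the second router (eth1) fails, because the
# default route still points at eth0.
print(strict_rpf_accepts("198.51.100.7", "eth1"))
```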

-Hammer-

11:11 p.m.

Another option would be to insert switches between your routers and FWs. Run OSPF from the routers to the switches (yes, switches running L3 OSPF) and then HSRP/VRRP/etc. to the FWs. This way routing changes don't affect the FWs. The FWs simply have a default route to the HSRP/VRRP/etc. VIP. Then the primary switch routes to the routers, which then route out to their eBGP peers.

The only caveat is to make sure you are redistributing only the 0/0 into OSPF, not the full route table.

-Hammer-

On 06/22/2011 05:27 PM, Bret Palsson wrote:

...

William Cooper

11:22 p.m.

Couple of questions for clarification (inline):

On Wed, Jun 22, 2011 at 6:27 PM, Bret Palsson <bret@getjive.com> wrote:

...

Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.

+--------+    +--------+
| Peer A |    | Peer A |   <-Many carriers. Using 1 carrier
+---+----+    +----+---+     for this scenario.
    |eBGP          | eBGP
    |              |
+---+----+iBGP+----+---+
| Router +----+ Router |   <-Netiron CERs Routers.
+-+------+    +------+-+
  |A  `.P      A.'   |P    <-A/P indicates Active/Passive
  |      `.  .'      |       link.
  |        ::        |
+-+------+'  `+------+-+
|Act. FW |    |Pas. FW |   <-Firewalls Active/Passive.
+--------+    +--------+

(Tony) What's behind this point?

...

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

(Tony) (Apologies for the seemingly dumb question) but by egress, do you mean from behind the FW towards your carrier?

Bret Palsson

23 Jun

1:04 a.m.

On Wed, Jun 22, 2011 at 5:22 PM, William Cooper <wcooper02@gmail.com> wrote:

...

Couple of questions for clarification (inline):

On Wed, Jun 22, 2011 at 6:27 PM, Bret Palsson <bret@getjive.com> wrote:

...

(Tony) What's behind this point?

We have a few gigs of voice (RTP) traffic at any given time of the day. We want/need hitless failover. Currently we provide this, but we use our provider's BGP mix. We will be peering with many carriers directly now and are changing our topology to do so. Before, we had an HSRP L3 hand-off to two switches in the same VLAN. On our Juniper SSGs we bonded ports and used NSRP for all the RTOs, which provided hitless failover.

...

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

(Tony) (Apologies for the seemingly dumb question) but by egress, do you mean from behind the FW towards your carrier?

Yes.

PC

22 Jun

11:33 p.m.

Who makes the firewall?

To make this work and be "hitless", your firewall vendor must support stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson <bret@getjive.com> wrote:

...

-Hammer-

11:37 p.m.

Do people really run routing protocols with their public address space on their FWs? I'm not saying right or wrong. Just curious. Seems like the last thing I would want to do would be to have my FW participate in a routing protocol unless it was absolutely necessary. Better to static the FW with a default route? I'd love to hear arguments for or against....

-Hammer-

On 06/22/2011 06:33 PM, PC wrote:

...

Matt Hite

15 Jul

5:23 a.m.

Sure. Sometimes it's nice/convenient to let firewalls advertise the external blocks they use for NAT translations, etc. Otherwise you need to statically route them to the firewall and redistribute the statics from said routers into your IGP.

Also, in some cases, people want to do network-based load balancing (ECMP) to clusters of firewalls. So routing protocols obviously come in handy with that.

Additionally, some people just want to avoid layer 2 clustering/HA technologies whenever possible and prefer layer 3 HA solutions.

-M

On Wed, Jun 22, 2011 at 4:37 PM, -Hammer- <bhmccie@gmail.com> wrote:

...

Bret Palsson

23 Jun

1:07 a.m.

On Wed, Jun 22, 2011 at 5:33 PM, PC <paul4004@gmail.com> wrote:

...

Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the Junipers to the VIP (VRRP) is probably the way to go. I think.

Jason Roysdon

3:42 a.m.

I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+    +--------+
| Peer A |    | Peer A |   <-Many carriers. Using 1 carrier
+---+----+    +----+---+     for this scenario.
    |eBGP          | eBGP
    |              |
+---+----+iBGP+----+---+
| Router +    + Router |   <- Routers. Not directly connected
+-+------+    +------+-+
  |                  |
+-+------+    +------+-+
|L2Switch|----|L2Switch|   <- Layer 2 switches, can be stacked
+--------+    +--------+
  |                  |
+-+------+    +------+-+
|Act. FW |----|Pas. FW |   <-Firewalls Active/Passive.
+--------+    +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve problems without calling you) and to test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If they have an outage, your edge routers will learn of it, and even if the circuit drops the router will know, and basically the VIP will just redirect traffic to the other router. Now all your firewalls have to do is maintain stateful session information, not OSPF.

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.

Jason Roysdon

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
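
The two-VIP arrangement described in this reply (each edge router is master for one VIP and backup for the other, with the firewalls holding one static default per VIP) can be modelled with a short Python sketch. Everything named here is hypothetical: the `VrrpGroup` class, the router names, the addresses, and the priority values are assumptions for illustration, not vendor configuration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VrrpGroup:
    vip: str
    priorities: Dict[str, int]  # router name -> priority within this group

    def master(self, router_up: Dict[str, bool]) -> Optional[str]:
        """Name of the live router with the highest priority, or None."""
        alive = {name: prio for name, prio in self.priorities.items() if router_up[name]}
        if not alive:
            return None
        return max(alive, key=alive.get)


# Crossed priorities: "left" normally owns the first VIP, "right" the second.
groups = [
    VrrpGroup("192.0.2.1", {"left": 200, "right": 100}),
    VrrpGroup("192.0.2.2", {"left": 100, "right": 200}),
]

# The firewalls keep two static default routes, one per VIP, at all times.
firewall_default_next_hops = [g.vip for g in groups]


def vip_owners(router_up: Dict[str, bool]) -> Dict[str, Optional[str]]:
    return {g.vip: g.master(router_up) for g in groups}


normal = vip_owners({"left": True, "right": True})
left_down = vip_owners({"left": False, "right": True})

print("normal:   ", normal)
print("left down:", left_down)
```

In the normal case each router answers for one next hop, so outbound traffic is shared; when one router is lost, the survivor answers for both VIPs and the firewall routes are left untouched.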

Hank Nussbacher

6:02 a.m.

At 20:42 22/06/2011 -0700, Jason Roysdon wrote:

Let me be a bit of a heretic here. How often does your router fail? Or your firewall? In the 25 years I have gone into customers I have found when they did a cross setup as proposed below by Bret and Jason, only one person truly knew the complete setup and if something broke only he was able to fix it. There is never complete printed documentation: routing design, IPs on all interfaces, subnetting schematic, etc. And if there was at one point, after 2 years it was outdated and never updated and only the *1* guy knew the changes in his head.

In that kind of situation, when something stopped working they always had to call in the "guru" to fix it. On the other hand, a simple design of only *one* path (pick either left or right side of each of the ASCII arts), made it possible that even junior network engineers as well as technicians called in on emergency with 4 hours notice, were able to fix the situation much more quickly than the "cross" design. And the MTBF on a single path solution, IMHO, is around 3-4 years. And if you need redundancy, keep a spare box on a shelf, completely loaded with the latest config so that it can be hot-swapped in within 15 minutes of failure.

This 1-path design is not for everyone. The vendors always recommend the "cross" design since they sell 2x the amount of boxes but I have found that life works fine with just a 1-path design as well.

-Hank
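The MTBF and hot-swap figures above can be turned into a rough availability number. The sketch below uses the standard steady-state formula, availability = MTBF / (MTBF + MTTR), and treats the 15-minute swap as the whole repair time, which is an assumption (it ignores detection and travel time). The function names are invented for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Average downtime per year implied by the availability figure."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR * 60


if __name__ == "__main__":
    swap_hours = 15 / 60  # the 15-minute hot swap, treated as MTTR (assumption)
    for mtbf_years in (3, 4):
        mtbf_hours = mtbf_years * HOURS_PER_YEAR
        print(
            f"MTBF {mtbf_years} y: availability "
            f"{availability(mtbf_hours, swap_hours):.6%}, "
            f"about {downtime_minutes_per_year(mtbf_hours, swap_hours):.1f} min/year"
        )
```

Averaged over time, one 15-minute repair every 3 years works out to roughly five minutes of downtime a year. The average hides the fact that the downtime arrives as a single 15-minute outage, which matters if that is longer than the service can tolerate.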

...

I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+ +--------+ | Peer A | | Peer A | <-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router + + Router | <- Routers. Not directly connected +-+------+ +------+-+ | | +-+------+ +------+-+ |L2Switch|----|L2Switch| <- Layer 2 switches, can be stacked +--------+ +--------+ | | +-+------+ +------+-+ |Act. FW |----|Pas. FW | <-Firewalls Active/Passive. +--------+ +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve without calling you) and test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If they have an outage, your edge routers will learn, and even if the circuit drops it'll know, and basically the VIP will just redirect traffic to the other router.
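The two-VIP arrangement described above can be modelled in a few lines of Python. This is only a toy sketch: each VIP has a master and a backup router, the firewall alternates packets across its two static next hops, and a VIP answers from the backup when its master is down. The router names and the 192.0.2.x addresses are hypothetical, and real VRRP elections (priorities, advertisements, preemption) are reduced here to a simple up/down flag.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List, Optional


@dataclass
class Router:
    name: str
    up: bool = True


@dataclass
class Vip:
    address: str
    master: Router
    backup: Router

    def owner(self) -> Optional[Router]:
        """Router currently answering for this VIP, if any."""
        if self.master.up:
            return self.master
        if self.backup.up:
            return self.backup
        return None


def forward(vips: List[Vip], packets: int) -> List[Optional[str]]:
    """Alternate packets across the VIPs and report which router gets each."""
    chosen = []
    for _, vip in zip(range(packets), cycle(vips)):
        owner = vip.owner()
        chosen.append(owner.name if owner else None)
    return chosen


if __name__ == "__main__":
    r1, r2 = Router("r1"), Router("r2")
    # Each router is master for one VIP and backup for the other.
    vips = [Vip("192.0.2.1", r1, r2), Vip("192.0.2.2", r2, r1)]
    print(forward(vips, 4))  # traffic alternates between the routers
    r1.up = False            # simulate losing the left leg
    print(forward(vips, 4))  # both VIPs now answer from r2
```

The firewall's configuration never changes in this model: it keeps both static routes, and only the owner of each VIP moves.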

Now all your firewalls have to do is maintain stateful session information, not OSPF.

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.
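The tracking idea above amounts to lowering a router's priority when something it depends on goes away, so that the other router takes over the VIP. A minimal Python sketch of that election follows, assuming preemption is enabled; the priorities (110 and 100) and the decrement (20) are hypothetical values chosen for the example, and real protocols break ties by interface address rather than by list order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    name: str
    base_priority: int
    decrement: int = 0       # amount subtracted while the tracked object is down
    tracked_up: bool = True  # state of the tracked WAN link or route

    def effective_priority(self) -> int:
        if self.tracked_up:
            return self.base_priority
        return self.base_priority - self.decrement


def active(members: List[Member]) -> str:
    """Name of the member with the highest effective priority."""
    return max(members, key=lambda m: m.effective_priority()).name


if __name__ == "__main__":
    r1 = Member("r1", base_priority=110, decrement=20)
    r2 = Member("r2", base_priority=100)
    print(active([r1, r2]))  # r1 holds the VIP while its WAN link is up
    r1.tracked_up = False    # WAN link (or tracked route) is lost
    print(active([r1, r2]))  # r1 drops to 90, so r2 takes over
```

The decrement has to be larger than the gap between the two base priorities, otherwise the failed router keeps the VIP.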

Jason Roysdon

...
On Wed, Jun 22, 2011 at 5:33 PM, PC <paul4004@gmail.com> wrote:

...
Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

...
stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson <bret@getjive.com> wrote:

...
Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.

+--------+ +--------+ | Peer A | | Peer A | <-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router +----+ Router | <-Netiron CERs Routers. +-+------+ +------+-+ |A `.P A.' |P <-A/P indicates Active/Passive | `. .' | link. | :: | +-+------+' `+------+-+ |Act. FW | |Pas. FW | <-Firewalls Active/Passive. +--------+ +--------+

To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each have a eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however if you think there is a better way to do this.

Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices)

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

Thanks,

Bret

Note: Since lately ipv6 has been a hot topic, I'll state that after we get the BGP all figured out and working properly, ipv6 is our next project. :)

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
...
...

Bret Palsson

6:07 a.m.

That's fine if you are running a website. When it comes to telecommunications, a 15 minute outage is pretty huge. Especially with certain types of customers: emergency services for example.

-Bret

On Jun 23, 2011, at 12:02 AM, Hank Nussbacher wrote:

...

At 20:42 22/06/2011 -0700, Jason Roysdon wrote:

Let me be a bit of a heretic here. How often does your router fail? Or your firewall? In the 25 years I have gone into customers I have found when they did a cross setup as proposed below by Bret and Jason, only one person truly knew the complete setup and if something broke only he was able to fix it. There is never complete printed documentation: routing design, IPs on all interfaces, subnetting schematic, etc. And if there was at one point, after 2 years it was outdated and never updated and only the *1* guy knew the changes in his head.

In that kind of situation, when something stopped working they always had to call in the "guru" to fix it. On the other hand, a simple design of only *one* path (pick either left or right side of each of the ASCII arts), made it possible that even junior network engineers as well as technicians called in on emergency with 4 hours notice, were able to fix the situation much more quickly than the "cross" design. And the MTBF on a single path solution, IMHO, is around 3-4 years. And if you need redundancy, keep a spare box on a shelf, completely loaded with the latest config so that it can be hot-swapped in within 15 minutes of failure.

This 1-path design is not for everyone. The vendors always recommend the "cross" design since they sell 2x the amount of boxes but I have found that life works fine with just a 1-path design as well.

-Hank

...
I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+ +--------+ | Peer A | | Peer A | <-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router + + Router | <- Routers. Not directly connected +-+------+ +------+-+ | | +-+------+ +------+-+ |L2Switch|----|L2Switch| <- Layer 2 switches, can be stacked +--------+ +--------+ | | +-+------+ +------+-+ |Act. FW |----|Pas. FW | <-Firewalls Active/Passive. +--------+ +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve without calling you) and test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If they have an outage, your edge routers will learn, and even if the circuit drops it'll know, and basically the VIP will just redirect traffic to the other router.

Now all your firewalls have to do is maintain stateful session information, not OSPF.

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.

Jason Roysdon

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
On Wed, Jun 22, 2011 at 5:33 PM, PC <paul4004@gmail.com> wrote:

...
Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

...
stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson <bret@getjive.com> wrote:

...
Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.

+--------+ +--------+ | Peer A | | Peer A | <-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router +----+ Router | <-Netiron CERs Routers. +-+------+ +------+-+ |A `.P A.' |P <-A/P indicates Active/Passive | `. .' | link. | :: | +-+------+' `+------+-+ |Act. FW | |Pas. FW | <-Firewalls Active/Passive. +--------+ +--------+

To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each have a eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however if you think there is a better way to do this.

Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices)

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

Thanks,

Bret

Note: Since lately ipv6 has been a hot topic, I'll state that after we get the BGP all figured out and working properly, ipv6 is our next project. :)

-Hammer-

12:44 p.m.

Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

-Hammer-

On 06/23/2011 01:07 AM, Bret Palsson wrote:

...

That's fine if you are running a website. When it comes to telecommunications, a 15 minute outage is pretty huge. Especially with certain types of customers: emergency services for example.

-Bret

On Jun 23, 2011, at 12:02 AM, Hank Nussbacher wrote:

...
At 20:42 22/06/2011 -0700, Jason Roysdon wrote:

Let me be a bit of a heretic here. How often does your router fail? Or your firewall? In the 25 years I have gone into customers I have found when they did a cross setup as proposed below by Bret and Jason, only one person truly knew the complete setup and if something broke only he was able to fix it. There is never complete printed documentation: routing design, IPs on all interfaces, subnetting schematic, etc. And if there was at one point, after 2 years it was outdated and never updated and only the *1* guy knew the changes in his head.

In that kind of situation, when something stopped working they always had to call in the "guru" to fix it. On the other hand, a simple design of only *one* path (pick either left or right side of each of the ASCII arts), made it possible that even junior network engineers as well as technicians called in on emergency with 4 hours notice, were able to fix the situation much more quickly than the "cross" design. And the MTBF on a single path solution, IMHO, is around 3-4 years. And if you need redundancy, keep a spare box on a shelf, completely loaded with the latest config so that it can be hot-swapped in within 15 minutes of failure.

This 1-path design is not for everyone. The vendors always recommend the "cross" design since they sell 2x the amount of boxes but I have found that life works fine with just a 1-path design as well.

-Hank

...
I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+ +--------+ | Peer A | | Peer A |<-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router + + Router |<- Routers. Not directly connected +-+------+ +------+-+ | | +-+------+ +------+-+ |L2Switch|----|L2Switch|<- Layer 2 switches, can be stacked +--------+ +--------+ | | +-+------+ +------+-+ |Act. FW |----|Pas. FW |<-Firewalls Active/Passive. +--------+ +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve without calling you) and test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If they have an outage, your edge routers will learn, and even if the circuit drops it'll know, and basically the VIP will just redirect traffic to the other router.

Now all your firewalls have to do is maintain stateful session information, not OSPF.

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.

Jason Roysdon

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
On Wed, Jun 22, 2011 at 5:33 PM, PC<paul4004@gmail.com> wrote:

...
Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

...
stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson<bret@getjive.com> wrote:

...
Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.

+--------+ +--------+ | Peer A | | Peer A |<-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router +----+ Router |<-Netiron CERs Routers. +-+------+ +------+-+ |A `.P A.' |P<-A/P indicates Active/Passive | `. .' | link. | :: | +-+------+' `+------+-+ |Act. FW | |Pas. FW |<-Firewalls Active/Passive. +--------+ +--------+

To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each have a eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however if you think there is a better way to do this.

Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices)

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

Thanks,

Bret

Note: Since lately ipv6 has been a hot topic, I'll state that after we get the BGP all figured out and working properly, ipv6 is our next project. :)

Valdis.Kletnieks@vt.edu

1:59 p.m.

On Thu, 23 Jun 2011 07:44:33 CDT, -Hammer- said:

...

Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

So what you're saying is we're more likely to take an outage due to tripping over a bug, so we should go for the simplest non-crossover config to minimize the chances of hitting a bug. ;)

-Hammer-

3:44 p.m.

HaHa! I agree with keeping it simple. I keep my routers simple. I keep my switches simple. Sometimes it's not as easy on a Layer 7 FW or a load balancer. So plan accordingly. :)

-Hammer-

On 06/23/2011 08:59 AM, Valdis.Kletnieks@vt.edu wrote:

...

On Thu, 23 Jun 2011 07:44:33 CDT, -Hammer- said:

...
Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

So what you're saying is we're more likely to take an outage due to tripping over a bug, so we should go for the simplest non-crossover config to minimize the chances of hitting a bug. ;)

Owen DeLong

7:21 p.m.

On Jun 23, 2011, at 6:59 AM, Valdis.Kletnieks@vt.edu wrote:

...

On Thu, 23 Jun 2011 07:44:33 CDT, -Hammer- said:

...
Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

So what you're saying is we're more likely to take an outage due to tripping over a bug, so we should go for the simplest non-crossover config to minimize the chances of hitting a bug. ;)

It's certainly worthy of consideration.

Owen

Owen DeLong

7:15 p.m.

Except in those (becoming less rare than hardware failure) instances where the software controlling the failover process is the actual cause of the outage.

Owen

On Jun 23, 2011, at 5:44 AM, -Hammer- wrote:

...

Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

-Hammer-

On 06/23/2011 01:07 AM, Bret Palsson wrote:

...
That's fine if you are running a website. When it comes to telecommunications, a 15 minute outage is pretty huge. Especially with certain types of customers: emergency services for example.

-Bret

On Jun 23, 2011, at 12:02 AM, Hank Nussbacher wrote:

...
At 20:42 22/06/2011 -0700, Jason Roysdon wrote:

Let me be a bit of a heretic here. How often does your router fail? Or your firewall? In the 25 years I have gone into customers I have found when they did a cross setup as proposed below by Bret and Jason, only one person truly knew the complete setup and if something broke only he was able to fix it. There is never complete printed documentation: routing design, IPs on all interfaces, subnetting schematic, etc. And if there was at one point, after 2 years it was outdated and never updated and only the *1* guy knew the changes in his head.

In that kind of situation, when something stopped working they always had to call in the "guru" to fix it. On the other hand, a simple design of only *one* path (pick either left or right side of each of the ASCII arts), made it possible that even junior network engineers as well as technicians called in on emergency with 4 hours notice, were able to fix the situation much more quickly than the "cross" design. And the MTBF on a single path solution, IMHO, is around 3-4 years. And if you need redundancy, keep a spare box on a shelf, completely loaded with the latest config so that it can be hot-swapped in within 15 minutes of failure.

This 1-path design is not for everyone. The vendors always recommend the "cross" design since they sell 2x the amount of boxes but I have found that life works fine with just a 1-path design as well.

-Hank

...
I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+ +--------+ | Peer A | | Peer A |<-Many carriers. Using 1 carrier +---+----+ +----+---+ for this scenario. |eBGP | eBGP | | +---+----+iBGP+----+---+ | Router + + Router |<- Routers. Not directly connected +-+------+ +------+-+ | | +-+------+ +------+-+ |L2Switch|----|L2Switch|<- Layer 2 switches, can be stacked +--------+ +--------+ | | +-+------+ +------+-+ |Act. FW |----|Pas. FW |<-Firewalls Active/Passive. +--------+ +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve without calling you) and test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If they have an outage, your edge routers will learn, and even if the circuit drops it'll know, and basically the VIP will just redirect traffic to the other router.

Now all your firewalls have to do is maintain stateful session information, not OSPF.

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.

Jason Roysdon

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
On Wed, Jun 22, 2011 at 5:33 PM, PC<paul4004@gmail.com> wrote:

...
Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

...
stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson<bret@getjive.com> wrote:

> Here is my current setup in ASCII art. (Please view in a fixed width
> font.) Below the art I'll write out the setup.
>
> +--------+    +--------+
> | Peer A |    | Peer A |<-Many carriers. Using 1 carrier
> +---+----+    +----+---+  for this scenario.
>     |eBGP          | eBGP
>     |              |
> +---+----+iBGP+----+---+
> | Router +----+ Router |<-Netiron CERs Routers.
> +-+------+    +------+-+
>   |A  `.P      A.'   |P<-A/P indicates Active/Passive
>   |      `.  .'      |   link.
>   |        ::        |
> +-+------+'  `+------+-+
> |Act. FW |    |Pas. FW |<-Firewalls Active/Passive.
> +--------+    +--------+
>
> To keep this scenario simple, I'm multihoming to one carrier.
> I have two Netiron CERs. Each have a eBGP connection to the same peer.
> The CERs have an iBGP connection to each other.
> That works all fine and dandy. Feel free to comment, however if you think
> there is a better way to do this.
>
> Here comes the tricky part. I have two firewalls in an Active/Passive
> setup. When one fails the other is configured exactly the same
> and picks up where the other left off. (Yes, all the sessions etc. are
> actively mirrored between the devices)
>
> I am using OSPFv2 between the CERs and the Firewalls. Failover works just
> fine, however when I fail an OSPF link that has the active default route,
> ingress traffic still routes fine and dandy, but egress traffic doesn't.
> Both Netiron's OSPF are setup to advertise they are the default route.
>
> What I'm wondering is, if OSPF is the right solution for this. How do
> others solve this problem?
>
> Thanks,
>
> Bret
>
> Note: Since lately ipv6 has been a hot topic, I'll state that after we get
> the BGP all figured out and working properly, ipv6 is our next project. :)

-Hammer-

7:19 p.m.

True True. I've seen that before as well. Actually I've seen it more with various vendors' implementations of VRRP than I have with Cisco HSRP or Juniper NSRP. But it seems to me more or less that most issues we deal with these days are software-related bugs as opposed to hardware-related issues. Maybe that's why I get a quarterly bug scrub from Cisco on my gear as opposed to a quarterly hardware analysis. :) To each their own....

-Hammer-

On 06/23/2011 02:15 PM, Owen DeLong wrote:

...

Except in those (becoming less rare than hardware failure) instances where the software controlling the failover process is the actual cause of the outage.

Owen

On Jun 23, 2011, at 5:44 AM, -Hammer- wrote:

...
Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

-Hammer-

On 06/23/2011 01:07 AM, Bret Palsson wrote:

...
That's fine if you are running a website. When it comes to telecommunications, a 15 minute outage is pretty huge. Especially with certain types of customers: emergency services for example.

-Bret

On Jun 23, 2011, at 12:02 AM, Hank Nussbacher wrote:

...
At 20:42 22/06/2011 -0700, Jason Roysdon wrote:

Let me be a bit of a heretic here. How often does your router fail? Or your firewall? In the 25 years I have gone into customers I have found when they did a cross setup as proposed below by Bret and Jason, only one person truly knew the complete setup and if something broke only he was able to fix it. There is never complete printed documentation: routing design, IPs on all interfaces, subnetting schematic, etc. And if there was at one point, after 2 years it was outdated and never updated and only the *1* guy knew the changes in his head.

In that kind of situation, when something stopped working they always had to call in the "guru" to fix it. On the other hand, a simple design of only *one* path (pick either left or right side of each of the ASCII arts), made it possible that even junior network engineers as well as technicians called in on emergency with 4 hours notice, were able to fix the situation much more quickly than the "cross" design. And the MTBF on a single path solution, IMHO, is around 3-4 years. And if you need redundancy, keep a spare box on a shelf, completely loaded with the latest config so that it can be hot-swapped in within 15 minutes of failure.

This 1-path design is not for everyone. The vendors always recommend the "cross" design since they sell 2x the amount of boxes but I have found that life works fine with just a 1-path design as well.
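
The MTBF and hot-swap figures above can be turned into a rough availability number. The following is a back-of-the-envelope sketch in Python, not a measurement: it assumes the usual steady-state formula availability = MTBF / (MTBF + MTTR), takes 3.5 years as the midpoint of the quoted 3-4 year MTBF, and treats the 15-minute swap and the 4-hour call-out as the entire repair time. All function and variable names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, leap years ignored


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single, non-redundant path."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(mtbf_hours: float, mttr_hours: float) -> float:
    """Expected downtime per year, averaged over many years."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR * 60


# Assumption: 3.5 years is the midpoint of the "3-4 years" quoted above.
mtbf = 3.5 * HOURS_PER_YEAR

hot_swap = downtime_minutes_per_year(mtbf, 0.25)  # 15-minute shelf-spare swap
call_out = downtime_minutes_per_year(mtbf, 4.0)   # 4-hour emergency call-out

print(f"hot swap: {hot_swap:.1f} min/year, availability {availability(mtbf, 0.25):.6f}")
print(f"call-out: {call_out:.1f} min/year, availability {availability(mtbf, 4.0):.6f}")
```

Under these assumptions the hot-swap case averages a little over four minutes of downtime per year. The average hides the fact that the outage arrives as one 15-minute block every few years, which is the part that matters for a telecom workload.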

-Hank

...
I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+    +--------+
| Peer A |    | Peer A |<-Many carriers. Using 1 carrier
+---+----+    +----+---+  for this scenario.
    |eBGP          | eBGP
    |              |
+---+----+iBGP+----+---+
| Router +    + Router |<- Routers. Not directly connected
+-+------+    +------+-+
  |                  |
+-+------+    +------+-+
|L2Switch|----|L2Switch|<- Layer 2 switches, can be stacked
+--------+    +--------+
  |                  |
+-+------+    +------+-+
|Act. FW |----|Pas. FW |<-Firewalls Active/Passive.
+--------+    +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve without calling you) and test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If the ISP has an outage, your edge routers will learn of it, and even if the circuit drops they will know; the router holding the VIP will simply redirect traffic to the other router.

Now all your firewalls have to do is maintain stateful session information, not OSPF.
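
The two-VIP arrangement described above can be sketched as a toy model. This Python sketch is an illustration under stated assumptions, not any vendor's VRRP implementation: the router names `r1`/`r2`, the priorities, and the addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical, mastership is reduced to "highest priority among live routers", and the firewall's per-packet balancing is modelled as plain round-robin over its two static next hops.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class VrrpGroup:
    vip: str
    priorities: dict  # router name -> VRRP priority (higher wins)

    def master(self, alive):
        """Return the live router with the highest priority, or None."""
        live = {name: prio for name, prio in self.priorities.items() if name in alive}
        return max(live, key=live.get) if live else None


# One VIP is normally owned by each router; the other router backs it up.
GROUPS = [
    VrrpGroup("192.0.2.1", {"r1": 200, "r2": 100}),
    VrrpGroup("192.0.2.2", {"r1": 100, "r2": 200}),
]


def egress_routers(alive, packets):
    """Round-robin packets over both VIPs and report which router forwards each."""
    next_hops = cycle(GROUPS)
    return [next(next_hops).master(alive) for _ in range(packets)]


both_up = egress_routers({"r1", "r2"}, 4)
r1_down = egress_routers({"r2"}, 4)
print(both_up)   # traffic alternates between the two routers
print(r1_down)   # both VIPs now live on the surviving router
```

The point of the model is that the firewall's static routes never change: only the owner of each VIP does, so the firewalls need no routing protocol to survive the loss of one router.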

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.
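
The tracking idea can be sketched the same way. This is a minimal Python model of the general mechanism (a tracked object going down lowers a router's priority, and with preemption enabled the peer takes over the VIP). It assumes preemption is on; the priority and decrement values are illustrative and are not taken from any configuration in this thread.

```python
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    base_priority: int
    wan_up: bool = True
    track_decrement: int = 20  # illustrative value, applied while the WAN is down

    @property
    def priority(self):
        """Effective priority after applying the tracking decrement."""
        return self.base_priority - (0 if self.wan_up else self.track_decrement)


def active_router(routers):
    """With preemption assumed, the highest effective priority holds the VIP."""
    return max(routers, key=lambda r: r.priority).name


r1 = Router("r1", base_priority=110)
r2 = Router("r2", base_priority=100)

before = active_router([r1, r2])
r1.wan_up = False            # tracked WAN link on r1 drops
after = active_router([r1, r2])
print(before, "->", after)
```

The decrement has to be larger than the gap between the two base priorities, otherwise the router with the dead WAN link keeps the VIP and the extra hop remains.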

Jason Roysdon

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
On Wed, Jun 22, 2011 at 5:33 PM, PC<paul4004@gmail.com> wrote:

> Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

> stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.
>
> Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.
>
> However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.
>
> In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.
>
> On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson<bret@getjive.com> wrote:
>
>> Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.
>>
>> +--------+    +--------+
>> | Peer A |    | Peer A |<-Many carriers. Using 1 carrier
>> +---+----+    +----+---+  for this scenario.
>>     |eBGP          | eBGP
>>     |              |
>> +---+----+iBGP+----+---+
>> | Router +----+ Router |<-Netiron CERs Routers.
>> +-+------+    +------+-+
>>   |A  `.P      A.'   |P<-A/P indicates Active/Passive
>>   |      `.  .'      |   link.
>>   |        ::        |
>> +-+------+'  `+------+-+
>> |Act. FW |    |Pas. FW |<-Firewalls Active/Passive.
>> +--------+    +--------+
>>
>> To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each have a eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however if you think there is a better way to do this.
>>
>> Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices)
>>
>> I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.
>>
>> What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?
>>
>> Thanks,
>>
>> Bret
>>
>> Note: Since lately ipv6 has been a hot topic, I'll state that after we get the BGP all figured out and working properly, ipv6 is our next project. :)

Jason Roysdon

24 Jun 24 Jun

5:36 a.m.

The config I propose is really not complicated beyond BGP and HSRP/VRRP. It doesn't take a CCIE for this, and the documentation isn't that hard to set up and maintain. It's just a procedural thing that any config change automatically requires a document review/update. You should have as-builds documenting all changes before they are ever made in production. If you can't manage that, you've got bigger problems.

Second, if your support bench isn't deep enough, this is why you have support contracts with people who do know how to maintain things. If it's that important, get the support lined up ahead of time with guaranteed response times and make sure everyone knows the method to contact them, and test it (monthly, quarterly, whatever makes you comfortable).

I agree that code/bugs are the biggest problems. However, you cannot just ignore updates when they affect your specific implementations, which means you will have to patch sometimes. I review each update that is released for the major.minor versions we are on and determine if the security or other bugs affect us enough to outweigh the risk of some unknown bug affecting us. It seems to be running about 50% right now (about half the time we skip an update).

With redundancy in place, you can take down one of the nodes during your slowest times, upgrade, and then watch for whatever time you feel is sufficient (couple of days, week, two weeks), and then upgrade its mate.

Further, if you really care about downtime, you'll have an identical setup in test and be vetting all changes and updates beforehand. No change or command would ever be issued on a production device that you didn't vet on the test gear. Granted, this may not catch the corner cases, but if you can simulate and stress the gear the same as the production gear, you'll catch most of the problems.

When the 9s count, you do the math and do it right.

On 06/23/2011 05:44 AM, -Hammer- wrote:

...

Agreed. At an enterprise level, there is no need to risk extended downtime to save a buck or two. Redundant hardware is always a good way to keep Murphy out of the equation. And as far as hardware failures go, it's not that common. Nowadays it's the bugs in overly complicated code on your gear that get you first. I miss IOS 11.3.....

-Hammer-

On 06/23/2011 01:07 AM, Bret Palsson wrote:

...
That's fine if you are running a website. When it comes to telecommunications, a 15 minute outage is pretty huge. Especially with certain types of customers: emergency services for example.

-Bret

On Jun 23, 2011, at 12:02 AM, Hank Nussbacher wrote:

...
At 20:42 22/06/2011 -0700, Jason Roysdon wrote:

Let me be a bit of a heretic here. How often does your router fail? Or your firewall? In the 25 years I have gone into customers I have found when they did a cross setup as proposed below by Bret and Jason, only one person truly knew the complete setup and if something broke only he was able to fix it. There is never complete printed documentation: routing design, IPs on all interfaces, subnetting schematic, etc. And if there was at one point, after 2 years it was outdated and never updated and only the *1* guy knew the changes in his head.

In that kind of situation, when something stopped working they always had to call in the "guru" to fix it. On the other hand, a simple design of only *one* path (pick either left or right side of each of the ASCII arts), made it possible that even junior network engineers as well as technicians called in on emergency with 4 hours notice, were able to fix the situation much more quickly than the "cross" design. And the MTBF on a single path solution, IMHO, is around 3-4 years. And if you need redundancy, keep a spare box on a shelf, completely loaded with the latest config so that it can be hot-swapped in within 15 minutes of failure.

This 1-path design is not for everyone. The vendors always recommend the "cross" design since they sell 2x the amount of boxes but I have found that life works fine with just a 1-path design as well.

-Hank

...
I second the static routes, especially from a simplicity standpoint. Add in a pair of layer two switches to simplify further:

+--------+    +--------+
| Peer A |    | Peer A |<-Many carriers. Using 1 carrier
+---+----+    +----+---+  for this scenario.
    |eBGP          | eBGP
    |              |
+---+----+iBGP+----+---+
| Router +    + Router |<- Routers. Not directly connected
+-+------+    +------+-+
  |                  |
+-+------+    +------+-+
|L2Switch|----|L2Switch|<- Layer 2 switches, can be stacked
+--------+    +--------+
  |                  |
+-+------+    +------+-+
|Act. FW |----|Pas. FW |<-Firewalls Active/Passive.
+--------+    +--------+

You can lose all of the left leg, or all of the right leg, and still be up. If you want to complicate things, you can add crossing links between it all, but again, beyond BGP and VRRP, this is a very simple design you can easily troubleshoot at 3AM. It's also much easier to document the troubleshooting steps (so you can go on vacation and someone else can solve without calling you) and test upgrades.

You can nearly evenly split the traffic by having a VRRP VIP on each edge router, with the other router backing up the first. The firewalls can have two static routes, one to each VIP, and this will roughly load-balance the traffic out on a packet basis. As you peer with the same ISP, this will work just fine. If the ISP has an outage, your edge routers will learn of it, and even if the circuit drops they will know; the router holding the VIP will simply redirect traffic to the other router.

Now all your firewalls have to do is maintain stateful session information, not OSPF.

If you had two different ISPs (especially if they are not roughly evenly connected), then not having intelligence of the BGP paths in your firewalls can cause an extra hop when traffic hits the router with the longer path, which will redirect it to the router with the shorter path.

Speaking from a Cisco/HSRP point of view, you could be more intelligent (read: more complicated, and complication means harder troubleshooting and more documentation needed) during problem periods by having the VIP move routers automatically based on the WAN link dropping and/or a route beyond it being lost (others can comment on whether VRRP supports this). This would save one hop to the "broken" router when the BGP path or WAN is down.

Jason Roysdon

On 06/22/2011 06:07 PM, Bret Palsson wrote:

...
On Wed, Jun 22, 2011 at 5:33 PM, PC<paul4004@gmail.com> wrote:

...
Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

...
stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.
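
To put a rough number on "re-converge": when a neighbor disappears silently (no link-down event), OSPF only declares it dead once the dead timer expires. The Python sketch below assumes the commonly used LAN values of a 10-second hello and a 40-second dead interval (the sample values given in RFC 2328); the timers actually configured on the CERs and firewalls in this thread are not stated. It covers failure detection only, not SPF runs or route installation, and the function name is made up.

```python
def detection_window(hello_interval: float = 10.0, dead_interval: float = 40.0):
    """Return (best, worst) seconds from a silent failure to neighbor-down.

    The dead timer restarts whenever a hello is received. If the neighbor
    dies just before sending its next hello, almost one hello interval of
    the dead timer has already elapsed; if it dies just after sending one,
    the full dead interval still has to run out.
    """
    if dead_interval <= hello_interval:
        raise ValueError("dead interval must exceed hello interval")
    return (dead_interval - hello_interval, dead_interval)


default_window = detection_window()          # assumed default LAN timers
tuned_window = detection_window(1.0, 4.0)    # hypothetical tuned-down timers
print("default timers:", default_window)
print("tuned timers:  ", tuned_window)
```

A failure that also takes the physical link down is normally noticed much sooner than this, which is one reason the answers asking how exactly the link is being "failed" matter.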

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson<bret@getjive.com> wrote:

> Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.
>
> +--------+    +--------+
> | Peer A |    | Peer A |<-Many carriers. Using 1 carrier
> +---+----+    +----+---+  for this scenario.
>     |eBGP          | eBGP
>     |              |
> +---+----+iBGP+----+---+
> | Router +----+ Router |<-Netiron CERs Routers.
> +-+------+    +------+-+
>   |A  `.P      A.'   |P<-A/P indicates Active/Passive
>   |      `.  .'      |   link.
>   |        ::        |
> +-+------+'  `+------+-+
> |Act. FW |    |Pas. FW |<-Firewalls Active/Passive.
> +--------+    +--------+
>
> To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each have a eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however if you think there is a better way to do this.
>
> Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices)
>
> I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.
>
> What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?
>
> Thanks,
>
> Bret
>
> Note: Since lately ipv6 has been a hot topic, I'll state that after we get the BGP all figured out and working properly, ipv6 is our next project. :)

PC

23 Jun 23 Jun

4:04 a.m.

A quick Google search says you should be OK with ScreenOS 6.0 or later for the routing protocol replication.

I'm looking at your diagram again though. You will want a switch in the middle of your firewalls and routers, as the firewalls are in an active/standby mode and do not independently run OSPF. And in this case, throw them all on one vlan, and let them peer with each other (2x1). This could actually be your problem.

Nonetheless, I agree, why involve it in OSPF and make it complex if there's no real need to? I think your static route idea is the best way to go, given the FW supports presenting itself as a "single" entity.

On Wed, Jun 22, 2011 at 7:07 PM, Bret Palsson <bret@getjive.com> wrote:

...

On Wed, Jun 22, 2011 at 5:33 PM, PC <paul4004@gmail.com> wrote:

...
Who makes the firewall?

Juniper SSG. We use NSRP and replicate all the RTOs. We have hitless on the Firewalls, have for years. We're now peering with our own carriers vs. using our datacenter's mix.

A static route from the junipers to the VIP (VRRP) is probably the way to go. I think.

To make this work and be "hitless", your firewall vendor must support

...
stateful replication of routing protocol data (including OSPF). For example, Cisco didn't support this in their ASA product until version 8.4 of code.

Otherwise, a failover requires OSPF to re-converge -- and quite frankly, will likely cause some state of confusion on the upstream OSPF peers, loss of adjacency, and a loss of routing until this occurs. It's like someone just swapped a router with the same IP to the upstream device -- assuming your active/standby vendor's implementation only presents itself as one device.

However, once this is successful your current failover topology should work fine -- even if it takes some time to fail over.

In my opinion though, unless the firewall is serving as "transit" to downstream routers or other layer 3 elements, and you need to run OSPF to it (and through it) as a result, it's often just easier to static default route out from the firewall(s) and redistribute a static route on the upstream routers for the subnets behind the firewalls. It also helps ensure symmetrical traffic flows, which is important for stateful firewalls and can become moderately confusing when your firewalls start having many interfaces.

On Wed, Jun 22, 2011 at 4:27 PM, Bret Palsson <bret@getjive.com> wrote:

...
Here is my current setup in ASCII art. (Please view in a fixed width font.) Below the art I'll write out the setup.

+--------+    +--------+
| Peer A |    | Peer A | <-Many carriers. Using 1 carrier
+---+----+    +----+---+   for this scenario.
    |eBGP          | eBGP
    |              |
+---+----+iBGP+----+---+
| Router +----+ Router | <-Netiron CERs Routers.
+-+------+    +------+-+
  |A  `.P      A.'   |P <-A/P indicates Active/Passive
  |      `.  .'      |    link.
  |        ::        |
+-+------+'  `+------+-+
|Act. FW |    |Pas. FW | <-Firewalls Active/Passive.
+--------+    +--------+

To keep this scenario simple, I'm multihoming to one carrier. I have two Netiron CERs. Each have a eBGP connection to the same peer. The CERs have an iBGP connection to each other. That works all fine and dandy. Feel free to comment, however if you think there is a better way to do this.

Here comes the tricky part. I have two firewalls in an Active/Passive setup. When one fails the other is configured exactly the same and picks up where the other left off. (Yes, all the sessions etc. are actively mirrored between the devices)

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

Thanks,

Bret

Note: Since lately ipv6 has been a hot topic, I'll state that after we get the BGP all figured out and working properly, ipv6 is our next project. :)

William Herrin

22 Jun 22 Jun

11:42 p.m.

On Wed, Jun 22, 2011 at 6:27 PM, Bret Palsson <bret@getjive.com> wrote:

...

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

Hi Bret,

I have a setup that is almost identical except there is a pair of simple switches between the routers and firewalls interconnecting all into a LAN and I'm working with Cisco 2811s instead of Netiron CERs. Can you expand on the interface addressing and what the firewalls see via OSPF during your failure scenario?

...

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

My failover firewall also connects to the switches (inside and out) and turns down ports which connect to the primary firewall. During a failure, the primary can't be depended on to completely take itself out of line. If it was in a working state that could be depended on, it wouldn't have failed.

Regards,
Bill Herrin

--
William D. Herrin ................ herrin@dirtside.com bill@herrin.us
3005 Crane Dr. ...................... Web: <http://bill.herrin.us/>
Falls Church, VA 22042-3004

George Bonser

23 Jun 23 Jun

5:04 p.m.

...

I am using OSPFv2 between the CERs and the Firewalls. Failover works just fine, however when I fail an OSPF link that has the active default route, ingress traffic still routes fine and dandy, but egress traffic doesn't. Both Netiron's OSPF are setup to advertise they are the default route.

What I'm wondering is, if OSPF is the right solution for this. How do others solve this problem?

Thanks,

Bret

Man, I would have a lot of questions. The CERs are layer 2/3 switches. What is the topology and how are you "failing" the link? Are the links to the firewalls on a vlan with the interfaces being a ve on the CERs, or are the interfaces to the firewalls "route-only"? Is that vlan trunked across on the link between the two switches?

How are you failing it over? There are lots of "failover" things you could be doing (turning off the left router, turning off the left firewall, disabling the primary port from the left router to the left firewall). When you say it doesn't work, are you saying that it doesn't work if you disable the port from the left router to the left firewall, or that it doesn't work when the right firewall takes over from the left, or what?

There are so many subtle configuration possibilities with these units that a wiring diagram alone, without the config, makes it hard to help. I am guessing that the connections to the firewalls are not MCT cluster trunks because you can't run layer 3 routing protocols with MCT (yet) on the CERs. Is it link failover or device failover that isn't working?

participants (14)

-Hammer-
Brant I. Stevens
Bret Palsson
George Bonser
Hank Nussbacher
Ingo Flaschberger
Jason Roysdon
Matt Hite
Owen DeLong
PC
Randy Bush
Valdis.Kletnieks＠vt.edu
William Cooper
William Herrin