carping about CARP

Robert E. Seastrom

30 Nov 2012

5:52 a.m.

I can't seem to recall anyone griping about this here on our august little list but google finds that I'm by no means the first to have been burned by an unholy interaction between VRRP and CARP.

Let's skip the protocol discussions (same protocol number and uses multicast) [*] and go straight to the behavioral observations.

I turned on VRRP this evening on a pair of routers. All of a sudden a CARP instance between a pair of pfSense boxes in the rack (which I didn't even know was there) invited itself to the party and started flailing all over the place and causing oscillating packet loss for anything that was going off-segment.

Note that the Ciscos didn't exhibit any untoward behavior, and there were "passwords" on the VRRP sessions too. Meanwhile, the pfSense box spazzed out and filled its dmesg logs with stuff like:

arp: 192.0.2.1 moved from 00:00:0c:xx:xx:01 to 00:00:5e:xx:xx:01 on em1
arp: 192.0.2.1 moved from 00:00:5e:xx:xx:01 to 00:00:0c:xx:xx:01 on em1

(no other hosts on the segment were logging such activity)

Looks like CARP is a bit loose about believing stuff coming in over the wire. Seems a bit out of character for OpenBSD, but maybe these days it's considered all good so long as such a malfunction only causes an outage, not a core dump.

Anyway, word to the wise, CARP and VRRP is a bit of a dangerous mixture.

-r

[*] The OpenBSD side of the story can be read at http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol#No_official_...

Seems that there is a lesson to be learned here:

"o hai, we wrote this software but can not be bothered to follow your process or formally write up the protocol, plz to be giving us a protocol number" ain't gonna fly.


Christopher Morrow

30 Nov

6:56 a.m.

On Fri, Nov 30, 2012 at 12:52 AM, Robert E. Seastrom <rs@seastrom.com> wrote:

...

Note that the Ciscos didn't exhibit any untoward behavior, and there were "passwords" on the VRRP sessions too.

case of the same situation all[1] 'software md5 tcp' implementations have? sign but never verify...

-chris

[1]: solaris's md5 and I believe the linux one do this :(

Randy Bush

9:02 a.m.

...

case of the same situation all[1] 'software md5 tcp' implementations have? sign but never verify...

and freebsd :(

Stuart Henderson

1:38 p.m.

On 2012-11-30, Randy Bush <randy@psg.com> wrote:

...

...
case of the same situation all[1] 'software md5 tcp' implementations have? sign but never verify...

and freebsd :(

openbsd verifies these, btw.

David Walker

8:59 a.m.

On 30/11/2012, Robert E. Seastrom <rs@seastrom.com> wrote:

...

[*] The OpenBSD side of the story can be read at http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol#No_official_...

Seems that there is a lesson to be learned here:

"o hai, we wrote this software but can not be bothered to follow your process or formally write up the protocol, plz to be giving us a protocol number" ain't gonna fly.

This tells me pretty much everything I need to know about this:
https://datatracker.ietf.org/ipr/19/

Theo's comments in context here:
http://marc.info/?l=openbsd-misc&m=133832434412686&w=2

The article in question:
http://queue.acm.org/detail.cfm?id=2090149
I recommend reading the comments.

...

From where I stand, the OpenBSD project has been consistent on insulating itself against future legal issues, no matter how remote, with the idea that your security should not be restrained by anyone other than you. I believe that idea has legs regardless of practical considerations and stands on its own.

Besides, I won't discount OpenBSD out of hand for forging ahead, withstanding practical issues, considering the runs they've got on the board and the many facepalm fails we see in the diametrically opposed corporate world. It might be a very good thing they've bothered to take the time on this. Best wishes.

Robert E. Seastrom

12:44 p.m.

David Walker <davidianwalker@gmail.com> writes:

...

[ patent fight recap ]

Thanks for posting those. I recall the discussions surrounding the HSRP patents well, but it's been a while and I have proportionally more gray hair (and less overall) now. My problem is not with Theo nor with the IETF. My problem is with a crappy and credulous implementation. When an outage is caused by redundancy software that comes from an organization that prides itself on well-written code, the irony meter goes off the scale.

...

From where I stand, the OpenBSD project has been consistent on insulating itself against future legal issues, no matter how remote, with the idea that your security should not be restrained by anyone other than you.

What is "security" though and what is its aim? To my way of thinking, what happened to me last night wherein a box misbehaved and caused indigestion on an entire broadcast domain was a non-trivial security and availability incident.

On the scale of badness, it's somewhat worse than a "magic packet causes this box to reboot" flaw, but not as bad as a "box gets owned, sensitive data gets divulged" incident. In my world, at least, security and availability are intimately intertwined. Were they not, one could easily "win" the security "game" by the simple expedient of turning the host off. Mission accomplished!

...

I believe that idea has legs regardless of practical considerations and stands on its own.

Besides, I won't discount OpenBSD out of hand for forging ahead, withstanding practical issues, considering the runs they've got on the board and the many facepalm fails we see in the diametrically opposed corporate world.

It might be a very good thing they've bothered to take the time on this.

The problem here is "insufficient paranoia about packets that come flying in over the transom, based on naive contemporaneous belief that a particular protocol number was not in use". I mean, gosh, who would ever send packets on an unused protocol number? And who other than us would get frustrated with the process and decide to forge ahead on their own.

Most of us here are familiar with Postel's oft-quoted RFC793 robustness principle ("be conservative in what you do, be liberal in what you accept from others"). Yet, when one is engaging in an off-label use of any protocol, identifier, etc. it is incumbent on the protocol designer to mark their traffic in a particular way so that it is easy to identify, both for themselves and for others. Sure, one could argue that this is merely abstracting away the semantics of the protocol number field (hopefully to a field with more data space) but the whole point is to not accidentally interoperate with something with which you are not prepared to interoperate.

Stated another way, nothing is keeping me from using udp/139 for something else so long as my packets aren't misinterpreted by SMB servers out there as being SMB, and so long as I don't accidentally eat someone else's SMB and do something stupid.

Would you eat food that someone left on your doorstep with no note and no hint as to who it came from? Obviously from your mom, right? I mean who else would leave food on your doorstep? How about Halloween candy with open wrappers? The comparisons in the messages you cited to a four year old may not be that far off.

-r

Henning Brauer

1:08 p.m.

* Robert E. Seastrom <rs@seastrom.com> [2012-11-30 13:46]:

...

My problem is not with Theo nor with the IETF. My problem is with a crappy and credulous implementation. When an outage is caused by redundancy software that comes from an organization that prides itself on well-written code, the irony meter goes off the scale.

vrrp and carp share the vhid space. you have to use unique vhids per network segment, that's about it.

the openbsd box was nice enough to tell you about the mac address conflict, the others didn't.

if you had looked at the carp boxes you would have seen that carp had continued to work just fine. the mac address (which is basically "fixed prefix + vhid") conflict is your "outage". there's nothing we could do about that.

and re IANA, they made it clear they would not give us a proto number no matter what; we didn't have a choice but to ignore that industry-money-driven committee.

--
Henning Brauer, hb@bsws.de, henning@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting
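To make the "fixed prefix + vhid" point concrete, here is a minimal Python sketch. It assumes the IPv4 virtual MAC layout that RFC 3768 defines for VRRP (00:00:5e:00:01 followed by the one-byte VRID) and, as several posters in this thread state, that CARP builds its virtual MAC the same way from the vhid. The name `virtual_mac` is made up for illustration and is not taken from any implementation.

```python
# Sketch only. VRRP for IPv4 (RFC 3768) uses the virtual router MAC
# 00:00:5e:00:01:{VRID}. Assumption taken from this thread: CARP uses the
# same prefix with its vhid in the last byte.
IANA_VRRP_PREFIX = (0x00, 0x00, 0x5E, 0x00, 0x01)


def virtual_mac(group_id: int) -> str:
    """Return the virtual MAC for a VRRP VRID or a CARP vhid (1..255)."""
    if not 1 <= group_id <= 255:
        raise ValueError("group id must be in 1..255")
    octets = IANA_VRRP_PREFIX + (group_id,)
    return ":".join(f"{octet:02x}" for octet in octets)


# "vrrp 1" on the routers and vhid 1 on the firewalls map to one address.
vrrp_group_mac = virtual_mac(1)
carp_vhid_mac = virtual_mac(1)
print(vrrp_group_mac, carp_vhid_mac, vrrp_group_mac == carp_vhid_mac)
```

Under those assumptions the only input is the group ID, so two unrelated redundancy groups on one segment collide whenever their IDs match, which is why unique IDs per segment are the usual advice.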

Robert E. Seastrom

2:35 p.m.

Henning Brauer <hb-nanog@bsws.de> writes:

...

* Robert E. Seastrom <rs@seastrom.com> [2012-11-30 13:46]:

...
My problem is not with Theo nor with the IETF. My problem is with a crappy and credulous implementation. When an outage is caused by redundancy software that comes from an organization that prides itself on well-written code, the irony meter goes off the scale.

vrrp and carp share the vhid space. you have to use unique vhids per network segment, that's about it.

the openbsd box was nice enough to tell you about the mac address conflict, the others didn't.

pfSense is FreeBSD, but who's counting? The problem is magnified when ill-behaved software ends up in appliances. Good thing we were able to get a shell on the box.

...

if you had looked at the carp boxes you would have seen that carp had continued to work just fine. the mac address (which is basically "fixed prefix + vhid") conflict is your "outage". there's nothing we could do about that.

and re IANA, they made it clear they would not give us a proto number no matter what; we didn't have a choice but to ignore that industry-money-driven committee.

Between choosing an Ethernet OUI which was assigned to IANA by IEEE (another "industry-money-driven committee") and choosing protocol 112 (odds of coincidence 1 in what, 120 or so at the time?), "ignore" is not the word I would have chosen here. -r

David Conrad

4:48 p.m.

On Nov 30, 2012, at 5:08 AM, Henning Brauer <hb-nanog@bsws.de> wrote:

...

and re IANA, they made it clear they would not give us a proto number

As they should have. IANA abides by the rules laid down for it by the IETF/IESG/IAB. The openbsd folks couldn't be bothered to even write up a draft and chose to squat on a protocol number.

...

no matter what;

BS. If the openbsd folks followed the rules, they'd have gotten the number(s) they requested (assuming they were justified). There is no grand persecution here. There is management of a limited resource.

...

we didn't have a choice but to ignore that industry-money-driven committee.

Which 'industry-money-driven committee' would that be? Regards, -drc

Doug Barton

6:34 p.m.

This issue came up originally during my tenure at IANA, and FWIW I concur with David. I have a vague memory of engaging directly with some folks from OpenBSD and letting them know that I was sympathetic with their situation, but IANA has strict rules to follow, and unless they followed procedure my hands were tied.

Re the "industry-money-driven committee" bit, at the time (and in fact, up until recently) I was a FreeBSD committer myself, so if anything I was *more* inclined to be sympathetic to those from the OS community who submitted applications. I can also assure you that we did assign code points to a non-trivial number of open source applicants _who followed the documented procedures_.

Doug

On 11/30/2012 10:48 AM, David Conrad wrote:

...

On Nov 30, 2012, at 5:08 AM, Henning Brauer <hb-nanog@bsws.de> wrote:

...
and re IANA, they made it clear they would not give us a proto number

As they should have. IANA abides by the rules laid down for it by the IETF/IESG/IAB. The openbsd folks couldn't be bothered to even write up a draft and chose to squat on a protocol number.

...
no matter what;

BS. If the openbsd folks followed the rules, they'd have gotten the number(s) they requested (assuming they were justified). There is no grand persecution here. There is management of a limited resource.

...
we didn't have a choice but to ignore that industry-money-driven committee.

Which 'industry-money-driven committee' would that be?

Regards, -drc

Claudio Jeker

9:01 p.m.

On Fri, Nov 30, 2012 at 08:48:48AM -0800, David Conrad wrote:

...

On Nov 30, 2012, at 5:08 AM, Henning Brauer <hb-nanog@bsws.de> wrote:

...
and re IANA, they made it clear they would not give us a proto number

As they should have. IANA abides by the rules laid down for it by the IETF/IESG/IAB. The openbsd folks couldn't be bothered to even write up a draft and chose to squat on a protocol number.

...
no matter what;

BS. If the openbsd folks followed the rules, they'd have gotten the number(s) they requested (assuming they were justified). There is no grand persecution here. There is management of a limited resource.

IETF already decided that VRRP was the way to go. So an alternative implementation would not have been accepted. The result would be a draft that would never be adopted and so it is back to start.

Still, carp packets can coexist with vrrp packets. They use different version numbers. Also you need to use a different vhid but the same thing is true if you have 2 groups of vrrp on the same lan. If you configure something like VRRP you should run a quick tcpdump first and check that no unexpected packets are showing up. This is especially important for any protocol that does a link local multicast or broadcast. This is basic network admin best practice (at least I expect that from a network engineer).
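The "look before you configure" advice can be sketched in Python. The function below takes the raw bytes of one IPv4 packet (obtained however you like, for example from a capture file) and reports whether it is IP protocol 112 and which group ID it advertises. It assumes the second byte of the payload is the VRID, as in RFC 3768, and that CARP carries its vhid in the same position; the function name and the hand-built sample packet are illustrative only.

```python
import socket
import struct
from typing import Optional, Tuple

VRRP_CARP_PROTO = 112  # IP protocol number of VRRP (and, per this thread, CARP)


def redundancy_advert(packet: bytes) -> Optional[Tuple[str, str, int]]:
    """Return (src, dst, group_id) for an IPv4 protocol-112 packet, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return None
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or packet[9] != VRRP_CARP_PROTO or len(packet) < ihl + 2:
        return None
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    group_id = packet[ihl + 1]  # VRID in VRRP; assumed to be the vhid in CARP
    return src, dst, group_id


# Hand-built sample: a minimal IPv4 header (checksum left at 0, not verified
# here) followed by two payload bytes: version/type, then group ID 1.
header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 22, 0, 0, 255, VRRP_CARP_PROTO, 0,
    socket.inet_aton("192.0.2.2"), socket.inet_aton("224.0.0.18"),
)
sample = header + bytes([0x21, 1])
print(redundancy_advert(sample))
```

On a live segment the equivalent quick look is a tcpdump run with the filter expression `ip proto 112`; any group ID that shows up there is one to avoid reusing.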

...

...
we didn't have a choice but to ignore that industry-money-driven committee.

Which 'industry-money-driven committee' would that be?

Did you ever read any of the IETF mailing lists and look at the email addresses of those people pushing the hardest? At least in the ones I'm subscribed to the bias is obvious.

--
:wq Claudio

Andrew Sullivan

11:23 p.m.

On Fri, Nov 30, 2012 at 10:01:54PM +0100, Claudio Jeker wrote:

...

implementation would not have been accepted. The result would be a draft that would never be adopted and so it is back to start.

"Adopted" by whom? The procedure, even at the time, did not require in any way IETF consensus. Getting a number requires that you tell others what is going on, not that you justify the going on itself.

...

Did you ever read any of the IETF mailing lists and looked at the email addresses of those people pushing the hardest? At least in the ones I'm subscribed to the bias is obvious.

I think that _ad hominem_ arguments are fallacious, and should be dismissed as such.

A

--
Andrew Sullivan
Dyn Labs
asullivan@dyn.com

Nick Hilliard

11:39 p.m.

On 30/11/2012 21:01, Claudio Jeker wrote:

...

Still carp packets can coexist with vrrp packets. They use a different version numbers.

And the same mac address pool, which means that if you use the same vhid as vrrp group number, you will trash both your carp and vrrp virtual IPs. Carp was coded explicitly knowing that this would happen, because it squats the VRRP mac address ranges as well as protocol 112.

This isn't documented anywhere on the openbsd web site. Not in the man pages, not in the FAQs and not in the pf documentation. It would be real nice to think that this was an oversight.

Basic developer best practice involves not deliberately creating loss-of-service style pitfalls for users to fall into, and the least I expect from a developer is that if they're going to write a "different" protocol which poops all over a similar protocol and takes down people's networks during deployment because it's squatting someone else's registered space, the least they could do is document the problem clearly.

Regardless of any faux innocent pieties expressed by the openbsd people, this protocol behaviour is astoundingly obnoxious.

Nick

...

Also you need to use a different vhid but the same thing is true if you have 2 groups of vrrp on the same lan. If you configure something like VRRP you should run a quick tcpdump first and check if there are not unexpected packets showing up. This is especially important for any protocol that does a link local multicast or broadcast. This is basic network admin best practice (at least I expect that from a network engineer).

...
...
we didn't have a choice but to ignore that industry-money-driven committee.

Which 'industry-money-driven committee' would that be?

Did you ever read any of the IETF mailing lists and looked at the email addresses of those people pushing the hardest? At least in the ones I'm subscribed to the bias is obvious.

David Walker

3:35 p.m.

Comments inline ... as best I can.

On 30/11/2012, Robert E. Seastrom <rs@seastrom.com> wrote:

...

David Walker <davidianwalker@gmail.com> writes:

...
[ patent fight recap ]

Thanks for posting those. I recall the discussions surrounding the HSRP patents well, but it's been a while and I have proportionally more gray hair (and less overall) now.

My problem is not with Theo nor with the IETF. My problem is with a crappy and credulous implementation. When an outage is caused by redundancy software that comes from an organization that prides itself on well-written code, the irony meter goes off the scale.

You should hammer on OpenBSD. However, as yet this is an unknown.
http://openbsd.org/report.html

As far as irony goes, there is some here but I'm not sure what you've got is countable yet.

...

...
From where I stand, the OpenBSD project has been consistent on insulating itself against future legal issues, no matter how remote, with the idea that your security should not be restrained by anyone other than you.

What is "security" though and what is its aim? To my way of thinking, what happened to me last night wherein a box misbehaved and caused indigestion on an entire broadcast domain was a non-trivial security and availability incident.

Of course.

...

On the scale of badness, it's somewhat worse than a "magic packet causes this box to reboot" flaw, but not as bad as a "box gets owned, sensitive data gets divulged" incident. In my world, at least, security and availability are intimately intertwined. Were they not, one could easily "win" the security "game" by the simple expedient of turning the host off. Mission accomplished!

The phrase you're looking for is denial of service, a known security phenomenon.

...

...
I believe that idea has legs regardless of practical considerations and stands on its own.

Besides, I won't discount OpenBSD out of hand for forging ahead, withstanding practical issues, considering the runs they've got on the board and the many facepalm fails we see in the diametrically opposed corporate world.

It might be a very good thing they've bothered to take the time on this.

The problem here is "insufficient paranoia about packets that come flying in over the transom, based on naive contemporaneous belief that a particular protocol number was not in use". I mean, gosh, who would ever send packets on an unused protocol number? And who other than us would get frustrated with the process and decide to forge ahead on their own.

As far as not using the same protocol number, that's neither here nor there.

Something I've noticed looking at information security is the taxonomy of Confidentiality, Integrity, Availability - which addresses your previous points. Something else I've noticed is the notion of security through obscurity and how it cedes the initiative to the attacker. Experience tells me this is not lost on the OpenBSD folks.

Translation, it's commonly understood that secure protocols shouldn't rely on trusting others to obey the rules ... and whether or not it's OpenBSD or Johnny Black Hat that's on 112 or whatever, if that causes issues then it's either down to the protocol or the administrator. I have no doubt OpenBSD understood all this. If I take Theo's word for it, he employed a mechanism available in the rfc (i.e. VRRP) to allow traffic to be differentiated.

Regardless, if a competing implementation can cause a DoS or any other issue that's either a design failure that should be addressed in a subsequent rfc or if it's a design limitation, then it's a failure to concomitantly secure the network. Blaming OpenBSD for the protocol number won't fly. If I'm to take Stuart's cue then somebody hasn't read the documentation. Simple.

...

Most of us here are familiar with Postel's oft-quoted RFC793 robustness principle ("be conservative in what you do, be liberal in what you accept from others"). Yet, when one is engaging in an off-label use of any protocol, identifier, etc. it is incumbent on the protocol designer to mark their traffic in a particular way so that it is easy to identify, both for themselves and for others. Sure, one could argue that this is merely abstracting away the semantics of the protocol number field (hopefully to a field with more data space) but the whole point is to not accidentally interoperate with something with which you are not prepared to interoperate.

At a casual reading, looking at the security considerations of for example ...
http://tools.ietf.org/rfc/rfc3768.txt
... suggests to me that there are exploitable vectors inherent to this protocol.

I'll say it again, I'm no subject matter expert. I'd be happy for you to point me in the right direction, otherwise you're going to have to wait for me to get up to speed. Otherwise see previous, if there are no mechanisms to secure VRRP or CARP then either the network or the machine needs to be secure or the protocol shouldn't be in service or relied upon.

...

Stated another way, nothing is keeping me from using udp/139 for something else so long as my packets aren't misinterpreted by SMB servers out there as being SMB, and so long as I don't accidentally eat someone else's SMB and do something stupid.

No matter what protocol we look at, ultimately that comes down to protocol design. After that is network design. If a protocol is open to attack by unauthenticated users then it's up to me to secure the network against unauthenticated users. Expecting only legitimate traffic no matter what the door or window we're looking at is not the right way to do it. The bad guys certainly don't care either way whether you want malformed packets or not or complementary-looking implementations or not.

...

Would you eat food that someone left on your doorstep with no note and no hint as to who it came from? Obviously from your mom, right? I mean who else would leave food on your doorstep? How about Halloween candy with open wrappers? The comparisons in the messages you cited to a four year old may not be that far off.

-r

Andrew Sullivan

11:19 p.m.

On Sat, Dec 01, 2012 at 02:05:14AM +1030, David Walker wrote:

...

As far as not using the same protocol number, that's neither here nor there.

Horse pucky. On the Internet, the secure and reliable players co-ordinate their protocol actions through the IANA, using the published IANA rules for how you get a protocol identifier. This case is a straightforward example of a bunch of people angry at things not going their way, and treading all over a well-defined, open process because they didn't like the actions of some of the participants. I don't like those actions either, but if proponents cannot bother to publish an Internet-Draft describing CARP, it's pretty hard to take CARP seriously as anything like a "protocol". It's just rude behaviour on someone else's well-defined port.

A

--
Andrew Sullivan
Dyn Labs
asullivan@dyn.com

Owen DeLong

8:04 p.m.

...

...
I believe that idea has legs regardless of practical considerations and stands on its own.

Besides, I won't discount OpenBSD out of hand for forging ahead, withstanding practical issues, considering the runs they've got on the board and the many facepalm fails we see in the diametrically opposed corporate world.

It might be a very good thing they've bothered to take the time on this.

The problem here is "insufficient paranoia about packets that come flying in over the transom, based on naive contemporaneous belief that a particular protocol number was not in use". I mean, gosh, who would ever send packets on an unused protocol number? And who other than us would get frustrated with the process and decide to forge ahead on their own.

Perhaps we should ask IETF/IANA to allocate a group of protocol numbers to "the wild west". A protocol-number equivalent of RFC-1918 or private ASNs. You can use these for whatever you want, but so can anyone else and if you do, you do so at your own risk.

This won't entirely solve the problem, but at least it would provide some level of shield for protocol numbers that are registered to particular purposes through the IETF/IANA process.

Owen

Adrian Farrel

2 Dec

10:28 p.m.

Far be it from me to get involved in a private pissing match, but...

Owen wrote:

...

Perhaps we should ask IETF/IANA to allocate a group of protocol numbers to "the wild west". A protocol-number equivalent of RFC-1918 or private ASNs. You can use these for whatever you want, but so can anyone else and if you do, you do so at your own risk.

This won't entirely solve the problem, but at least it would provide some level of shield for protocol numbers that are registered to particular purposes through the IETF/IANA process.

Would that be 253 and 254 "Use for experimentation and testing" per RFC 3692?

Of course, no-one likes to see their pet protocol designated as an experiment (unless they really believe it is something that should be carefully researched and tried out in a controlled environment), but the garden-walling that you describe seems to fit exactly within the 3692 definitions.

Adrian
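For anyone who wants to encode that distinction in tooling, a minimal Python sketch follows. It assumes only what is stated above, namely that IP protocol numbers 253 and 254 are the values set aside for experimentation and testing; the constant and function names are invented for illustration.

```python
# IP protocol numbers reserved for experimentation and testing, as cited
# in this thread from RFC 3692.
EXPERIMENTAL_PROTOCOLS = frozenset({253, 254})


def is_experimental_protocol(number: int) -> bool:
    """True if an 8-bit IP protocol number is one of the experimental values."""
    if not 0 <= number <= 255:
        raise ValueError("IP protocol numbers are 8-bit values")
    return number in EXPERIMENTAL_PROTOCOLS


# Protocol 112 (VRRP, and where CARP ended up) is a registered value, not an
# experimental one.
for proto in (112, 253, 254):
    print(proto, is_experimental_protocol(proto))
```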

Jay

3 Dec

2:48 a.m.

On 12/2/2012 5:28 PM, Adrian Farrel wrote:

...

Far be it from me to get involved in a private pissing match, but...

Owen wrote:

...
Perhaps we should ask IETF/IANA to allocate a group of protocol numbers to "the wild west". A protocol-number equivalent of RFC-1918 or private ASNs. You can use these for whatever you want, but so can anyone else and if you do, you do so at your own risk.

This won't entirely solve the problem, but at least it would provide some level of shield for protocol numbers that are registered to particular purposes through the IETF/IANA process.

Would that be 253 and 254 "Use for experimentation and testing" per RFC 3692?

Of course, no-one like to see their pet protocol designated as an experiment (unless they really believe it is something that should be carefully researched and tried out in a controlled environment), but the garden-walling that you describe seems to fit exactly within the 3692 definitions.

Adrian

RFC 3692, section 1.1:

"Values reserved for experimental use are never to be made permanent; permanent assignments should be obtained through standard processes. As described above, experimental numbers are intended for experimentation and testing and are not intended for wide or general deployments. When protocols that use experimental numbers are included in products, the shipping versions of the products must disable recognition of protocol experimental numbers by default -- that is, the end user of the product must explicitly "turn on" the experimental protocol functionality. In most cases, a product implementation must require the end user to configure the value explicitly prior to enabling its usage. Should a product not have a user interface for such end user configuration, the product must require explicit re-programming (e.g., a special firmware download, or installation of a feature card) to configure the experimental number(s) of the protocol(s) implicitly."

Of course the use of 'must' or 'must not' in an RFC never stopped anyone from doing the exact opposite.

Jay

Jussi Peltola

30 Nov

9:13 a.m.

The amount of detail in the original posting is rather disappointing, with absolutely no hope of anyone being able to reproduce the problem with the data given. Did the vhid and vrrp group overlap? Were there duplicate IP addresses?

Robert E. Seastrom

2:45 p.m.

Jussi Peltola <pelzi@pelzi.net> writes:

...

The amount of detail in the original posting is rather disappointing, with absolutely no hope of anyone being able to reproduce the problem with the data given.

It was not intended as a bug report, instead merely an expression of disappointment and an advisory to fellow travelers to watch their backs. Sometimes a report of muggings in a locale is useful, even without a detailed description of the attacker.

...

Did the vhid and vrrp group overlap? Were there duplicate IP addresses?

Yes, "vrrp 1" turned out to be a bad plan here. Turned off vrrp on the router and went with HSRP. There is enough documentation on HSRP vs VRRP around (heck, even Wikipedia) to surmise that something that interacted poorly with VRRP would likely not do the same to HSRP. Docs on CARP are thin on the ground. Never even an I-D. Didn't have time to read the source code when the network was acting up. -r

Nick Hilliard

12:44 p.m.

On 30/11/2012 05:52, Robert E. Seastrom wrote:

...

[*] The OpenBSD side of the story can be read at http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol#No_official_...

Seems that there is a lesson to be learned here:

"o hai, we wrote this software but can not be bothered to follow your process or formally write up the protocol, plz to be giving us a protocol number" ain't gonna fly.

Which is fair enough from the ietf's point of view. Having said that,

1. patent US5473599 is pretty general and I can't see why it wouldn't apply to any host running CARP for router NHRP - although it's not clear that it would apply to e.g. host service high availability addressing. IOW, it's not at all clear that this is a legally unencumbered protocol, despite the bleatings from the openbsd camp.

2. the original patent is due to expire in about 18 months (april 2014) and i can't immediately see any cip applications which might extend it. This will render the debate substantially redundant.

Regarding the ruckus between the openbsd camp and the ietf, the ietf's position is here:

http://www.ietf.org/mail-archive/web/vrrp/current/msg00350.html

It looks like there wasn't any serious attempt on the part of the openbsd people to engage with the ietf. There were no drafts, barely any mailing list postings to either the vrrp (now concluded) or routing discussion WGs, and apparently only a single presentation at a single ietf meeting. Maybe I've missed something though - I haven't checked the openbsd mailing lists because apparently their archives aren't publicly accessible.

It's not at all clear why the openbsd people expected that a "petition" to IANA would result in them being assigned an official protocol number for CARP. There are only 254 of these available so it's not unreasonable to decline to register them unless there is a strong written case to do so. There's a policy in place for this (rfc5237), and it's in place for a good reason.

As for the openbsd position on the choice of protocol number:

"Consequently we were forced to choose a protocol number which would not conflict with anything else of value, and decided to place CARP at IP protocol 112"

My goodness, what a co-incidence that they happened to choose the same protocol number as VRRP.

http://www.ietf.org/mail-archive/web/ietf/current/msg48988.html

Good thing this wasn't ever going to cause people trouble.

Nick

Stuart Henderson

1:37 p.m.

On 2012-11-30, Robert E. Seastrom <rs@seastrom.com> wrote:

...

I can't seem to recall anyone griping about this here on our august little list but google finds that I'm by no means the first to have been burned by an unholy interaction between VRRP and CARP.

Let's skip the protocol discussions (same protocol number and uses multicast) [*] and go straight to the behavioral observations.

I turned on VRRP this evening on a pair of routers. All of a sudden a CARP instance between a pair of pfSense boxes in the rack (which I didn't even know was there) invited itself to the party and started flailing all over the place and causing oscillating packet loss for anything that was going off-segment.

Note that the Ciscos didn't exhibit any untoward behavior, and there were "passwords" on the VRRP sessions too. Meanwhile, the pfSense box spazzed out and filled its dmesg logs with stuff like:

arp: 192.0.2.1 moved from 00:00:0c:xx:xx:01 to 00:00:5e:xx:xx:01 on em1
arp: 192.0.2.1 moved from 00:00:5e:xx:xx:01 to 00:00:0c:xx:xx:01 on em1

(no other hosts on the segment were logging such activity)

All this shows is that the IP address is flip-flopping between a Cisco MAC address and a CARP/VRRP unicast MAC address. I would double check the vrrp config and make sure that the vrrp IP address is *only* configured on vrrp, not ethernet interfaces.
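That reading of the log can be mechanised with a short Python sketch. It parses "arp: ... moved from ... to ..." lines in the format quoted in this thread and labels each MAC by its first three octets. The OUI table contains only the two prefixes that appear here (00:00:0c for Cisco, 00:00:5e for the IANA block used by VRRP/CARP virtual MACs); the regular expression and helper names are illustrative, not part of any tool.

```python
import re

# Matches kernel messages in the format quoted in this thread, e.g.
# "arp: 192.0.2.1 moved from 00:00:0c:xx:xx:01 to 00:00:5e:xx:xx:01 on em1"
ARP_MOVED = re.compile(
    r"arp: (?P<ip>\S+) moved from (?P<old>\S+) to (?P<new>\S+) on (?P<ifname>\S+)"
)

# Only the two OUIs seen in the thread; this is not a complete registry.
OUI_OWNERS = {
    "00:00:0c": "Cisco",
    "00:00:5e": "IANA (VRRP/CARP virtual MAC)",
}


def oui_owner(mac: str) -> str:
    """Label a MAC address by its first three octets."""
    return OUI_OWNERS.get(mac[:8].lower(), "unknown")


def classify(line: str):
    """Return (ip, old_owner, new_owner) for an 'arp: ... moved' line, else None."""
    match = ARP_MOVED.search(line)
    if match is None:
        return None
    return match["ip"], oui_owner(match["old"]), oui_owner(match["new"])


log = [
    "arp: 192.0.2.1 moved from 00:00:0c:xx:xx:01 to 00:00:5e:xx:xx:01 on em1",
    "arp: 192.0.2.1 moved from 00:00:5e:xx:xx:01 to 00:00:0c:xx:xx:01 on em1",
]
for entry in log:
    print(classify(entry))
```

An address that alternates between a vendor MAC and a virtual-router MAC, as in these two lines, points at the same IP being claimed from two places rather than at a parsing problem in either protocol.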

...

Looks like CARP is a bit loose about believing stuff coming in over the wire. Seems a bit out of character for OpenBSD, but maybe these days it's considered all good so long as such a malfunction only causes an outage, not a core dump.

I don't see anything here indicating that it's to do with CARP believing things sent over the wire, I suspect the problem would still occur if CARP were disabled on the pfSense box. (Do people really run CARP in the wild without authentication anyway?)

Robert E. Seastrom

2:47 p.m.

Stuart Henderson <stu@spacehopper.org> writes:

...

I don't see anything here indicating that it's to do with CARP believing things sent over the wire, I suspect the problem would still occur if CARP were disabled on the pfSense box. (Do people really run CARP in the wild without authentication anyway?)

1) it did not.

2) standard, out of the box pfSense distribution.

Haven't run that codebase lately myself, and not sufficiently interested this morning to dig through the code. Just watch your back, that's all. :)

-r
