JunOS/FRR/Nokia et al BGP critical issue
Ran across this article today and haven't seen posts about it so i figured I would share: https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling?fbclid=IwAR13ePY43Vf3u4X8PDyCDT39DtyXczAKkv6CGXOQbcQv90Y3aIAmTkJxn7k_aem_Ad0hzj2Mh_WlbFZug-vGdlJJdXr2Xo0RFIsPwAU2GviPz6xZDib76YHwFuzU7E0_sJk&mibextid=Zxz2cZ Curious if anyone on the list is running VyOS and has experienced any problems? Cheers, Mike -- Mike Lyon mike.lyon@gmail.com http://www.linkedin.com/in/mlyon
Thanks for sharing this, Mike. I saw it on lobste.rs yesterday and figured everyone would be ahead. I'm running VyOS in a volunteer WISP but not with BGP peering... I'm thinking to test it now as we'll likely swap in VyOS for it soon. I saw this PR as a reply on Mastodon: https://github.com/FRRouting/frr/pull/14290 Warm regards, Mark
FRR fix went into 9.0 and has been back ported to 8.5 and 8.4 , Cumulus 5.6 will include the fix. Cheers, Jeff
On Aug 30, 2023, at 5:32 AM, Mark Prosser <mark@zealnetworks.ca> wrote:
Thanks for sharing this, Mike. I saw it on lobste.rs yesterday and figured everyone would be ahead.
I'm running VyOS in a volunteer WISP but not with BGP peering... I'm thinking to test it now as we'll likely swap in VyOS for it soon.
I saw this PR as a reply on Mastodon:
https://github.com/FRRouting/frr/pull/14290
Warm regards,
Mark
On Wed, Aug 30, 2023 at 4:50 AM Mike Lyon <mike.lyon@gmail.com> wrote:
Ran across this article today and haven't seen posts about it so i figured I would share:
https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
Can you imagine, as the origin of a route, troubleshooting a connectivity issue in which Internet BGP routers far from your control have trouble with attributes attached by their peers and then "did their best" with your route instead of dropping the session and essentially demanding intervention by the network operator? Dumping the session may seem extreme, but there's a good reason for it. Regards, Bill Herrin p.s. you don't need to copy the Facebook tracking token (that ?fbclid= bit) when you share URLs. -- William Herrin bill@herrin.us https://bill.herrin.us/
On Wed, Aug 30, 2023 at 4:04 PM William Herrin <bill@herrin.us> wrote:
On Wed, Aug 30, 2023 at 4:50 AM Mike Lyon <mike.lyon@gmail.com> wrote:
Ran across this article today and haven't seen posts about it so i figured I would share:
https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
Can you imagine, as the origin of a route, troubleshooting a connectivity issue in which Internet BGP routers far from your control have trouble with attributes attached by their peers and then "did their best" with your route instead of dropping the session and essentially demanding intervention by the network operator?
Dumping the session may seem extreme, but there's a good reason for it.
Or do the sensible thing and just drop the announcement and log the problem. This might be a problem in a DFZ environment, but albeit a small one. Or, drop the invalid attribute and treat the announcement as a regular one.
Or do the sensible thing and just drop the announcement and log the problem.
Which is exactly what an RFC7606 compliant device will do for an unknown path attribute. https://www.rfc-editor.org/rfc/rfc7606#page-5 o Treat-as-withdraw: In this approach, the UPDATE message containing the path attribute in question MUST be treated as though all contained routes had been withdrawn just as if they had been listed in the WITHDRAWN ROUTES field (or in the MP_UNREACH_NLRI attribute if appropriate) of the UPDATE message, thus causing them to be removed from the Adj-RIB-In according to the procedures of [RFC4271 <https://www.rfc-editor.org/rfc/rfc4271>]. d. If any of the well-known mandatory attributes are not present in an UPDATE message, then "treat-as-withdraw" MUST be used. (Note that [RFC4760] reclassifies NEXT_HOP as what is effectively discretionary.) e. "Treat-as-withdraw" MUST be used for the cases that specify a session reset and involve any of the attributes ORIGIN, AS_PATH, NEXT_HOP, MULTI_EXIT_DISC, or LOCAL_PREF. On Wed, Aug 30, 2023 at 10:01 AM Eugeniu Patrascu <eugen@imacandi.net> wrote:
On Wed, Aug 30, 2023 at 4:04 PM William Herrin <bill@herrin.us> wrote:
On Wed, Aug 30, 2023 at 4:50 AM Mike Lyon <mike.lyon@gmail.com> wrote:
Ran across this article today and haven't seen posts about it so i figured I would share:
https://blog.benjojo.co.uk/post/bgp-path-attributes-grave-error-handling
Can you imagine, as the origin of a route, troubleshooting a connectivity issue in which Internet BGP routers far from your control have trouble with attributes attached by their peers and then "did their best" with your route instead of dropping the session and essentially demanding intervention by the network operator?
Dumping the session may seem extreme, but there's a good reason for it.
Or do the sensible thing and just drop the announcement and log the problem. This might be a problem in a DFZ environment, but albeit a small one. Or, drop the invalid attribute and treat the announcement as a regular one.
Mike Lyon <mike.lyon@gmail.com> writes:
Sounds familiar. https://supportportal.juniper.net/s/article/BGP-Malformed-AS-4-Byte-Transiti... You'd think a lot of thought has gone into error handling for optional transitive attributes since then, but... Bjørn
Bjørn Mork wrote on 01/09/2023 08:17:
Sounds familiar.
https://supportportal.juniper.net/s/article/BGP-Malformed-AS-4-Byte-Transiti...
You'd think a lot of thought has gone into error handling for optional transitive attributes since then, but...
A good deal of thought has gone into the problem, and this is where rfc7606 came from. Treat-as-withdraw for the NLRI in question is the default option with this approach, and should be deployed universally. Nick
Nick Hilliard <nick@foobar.org> writes:
Bjørn Mork wrote on 01/09/2023 08:17:
Sounds familiar. https://supportportal.juniper.net/s/article/BGP-Malformed-AS-4-Byte-Transiti... You'd think a lot of thought has gone into error handling for optional transitive attributes since then, but...
A good deal of thought has gone into the problem, and this is where rfc7606 came from. Treat-as-withdraw for the NLRI in question is the default option with this approach, and should be deployed universally.
Yes. But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing. This is the BGP version of select * from mytable where field = $unvalidated_user_input; I was hoping we'd moved past that point in the software development history. Bjørn
On Fri, Sep 1, 2023 at 12:56 PM Bjørn Mork <bjorn@mork.no> wrote:
Nick Hilliard <nick@foobar.org> writes:
Bjørn Mork wrote on 01/09/2023 08:17:
Sounds familiar.
https://supportportal.juniper.net/s/article/BGP-Malformed-AS-4-Byte-Transiti...
You'd think a lot of thought has gone into error handling for optional transitive attributes since then, but...
A good deal of thought has gone into the problem, and this is where rfc7606 came from. Treat-as-withdraw for the NLRI in question is the default option with this approach, and should be deployed universally.
Yes.
But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing.
Yeah, no. The logic is that if you understand them, you treat them according to whatever routing policy you have and then pass them along. If you don't, you just pass them along and that's it. Nothing more, nothing less.
This is the BGP version of
select * from mytable where field = $unvalidated_user_input;
No here as well. Because passing along a transitive attribute you don't understand does not affect you in any way. -e
Eugeniu Patrascu <eugen@imacandi.net> writes:
On Fri, Sep 1, 2023 at 12:56 PM Bjørn Mork <bjorn@mork.no> wrote:
But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing.
Yeah, no. The logic is that if you understand them, you treat them according to whatever routing policy you have and then pass them along.
That's where you get into problems, depending on your defintition of "understand" and "treat". This implies parsing unvalidated input. Understanding RFC compliant attributes is a minor part of that task. The real problem is dealing with absolutely anything, including values specifically designed to attack your parser, or policy, or whatever logic you apply to the attribute value.
If you don't, you just pass them along and that's it. Nothing more, nothing less.
This is obviously not a problem. You may be acting as a proxy for attacking your peers, but there's nothing you can do about that without breaking the protocol.
This is the BGP version of
select * from mytable where field = $unvalidated_user_input;
No here as well. Because passing along a transitive attribute you don't understand does not affect you in any way.
I didn't say so either. Those who _believe_ they understand it is the problem. And I'm slowly starting to see why we still have so many implementations where the optional transitive problem has been pretty much ignored. Bjørn
Bjørn Mork wrote on 01/09/2023 10:52:
But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing.
This is the BGP version of
select * from mytable where field = $unvalidated_user_input;
it's not really. If the receiving BGP stack understands the attribute, then it should be parsed as default, i.e. carefully. Unfortunately, junos slipped up on this and didn't validate the input correctly, which is a parsing bug. Param validation bugs happen. They shouldn't happen, but they do. If an intermediate router doesn't understand a transitive attribute, it should be ignored, and life should move on. The problems arise in two situations: 1. malformed attribute, i.e. this situation. 2. vendors squatting path attribute values which are then assigned for other purposes. This is a subset of #1, but is messy and difficult to rectify when it happens. Great for fuzzing, not so good for production networks. Nick
On Fri, Sep 01, 2023 at 11:54:57AM +0100, Nick Hilliard wrote:
it's not really. If the receiving BGP stack understands the attribute, then it should be parsed as default, i.e. carefully. Unfortunately, junos slipped up on this and didn't validate the input correctly, which is a parsing bug. Param validation bugs happen. They shouldn't happen, but they do.
If an intermediate router doesn't understand a transitive attribute, it should be ignored, and life should move on.
+1 to what Nick stated. Rob Shakir did a great job describing the various 'scopes' in which BGP errors can appear and what that means for the 'blast radius'. I recommend everyone to read section 3 of this draft document: https://datatracker.ietf.org/doc/html/draft-ietf-grow-ops-reqs-for-bgp-error... Kind regards, Job
But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing.
...
I was hoping we'd moved past that point in the software development history.
There have been plenty of nerd debates about the evilness of optional transitive attributes. Talk to any of the folks from IETF who have worked on the BGP RFCs. This isn't a 'software development problem', nor is this current thing some new, unknown 'vulnerability'. BGP speakers closing a session with this specific attribute is exactly what RFC4271 says it's supposed to do. Not closing the session and handling it differently depending on the problem is doing exactly what RFC7606 says it's supposed to do. I haven't looked at every vendor in the original post here, but I don't recall seeing any of them doing something against spec. On Fri, Sep 1, 2023 at 5:54 AM Bjørn Mork <bjorn@mork.no> wrote:
Nick Hilliard <nick@foobar.org> writes:
Bjørn Mork wrote on 01/09/2023 08:17:
Sounds familiar.
https://supportportal.juniper.net/s/article/BGP-Malformed-AS-4-Byte-Transiti...
You'd think a lot of thought has gone into error handling for optional transitive attributes since then, but...
A good deal of thought has gone into the problem, and this is where rfc7606 came from. Treat-as-withdraw for the NLRI in question is the default option with this approach, and should be deployed universally.
Yes.
But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing.
This is the BGP version of
select * from mytable where field = $unvalidated_user_input;
I was hoping we'd moved past that point in the software development history.
Bjørn
participants (9)
-
Bjørn Mork
-
Eugeniu Patrascu
-
Jeff Tantsura
-
Job Snijders
-
Mark Prosser
-
Mike Lyon
-
Nick Hilliard
-
Tom Beecher
-
William Herrin