Why do ROV-ASes announce some invalid route?
We learned from Cloudflare's https://isbgpsafeyet.com/ that some ASes have deployed RPKI Origin Validation (ROV). However, we downloaded BGP collection data from the RouteViews and RIPE RIS platforms and found that some ROV-ASes still announce invalid routes. For example, in the RIB data at 2022-10-31 00:00:00, 13 of the 17 ASes that declared they deploy ROV announced invalid routes. We list the number of related prefixes for each AS below.
ASN   3356 1299 174 2914 6939 3257 6453 3491 9002 5511 7922 13335 16509
pref# 723314361152731625617105
For comparison, we counted the invalid routes announced by the non-ROV ASes (also as declared on https://isbgpsafeyet.com/):
ASN   6762 6461 1273 12956 12389 20485 701 7473 9009
pref# 59760358711161162559492380
The ROV ASes announced noticeably fewer invalid routes than the non-ROV ASes, though they did not filter all the invalids.
AS6939 announced noticeably more invalid routes than the other ROV-ASes. We learned from the discussion two years ago (https://mailman.nanog.org/pipermail/nanog/2020-June/108309.html) that AS6939 uses reactive ROV: route collectors identify invalid routes, write them into scripts and send the scripts to the routers, which then send "withdrawals" of the invalids based on the scripts. However, for the BGP collection time 2022-10-31 00:00:00, we downloaded the following two hours of updates and found very few withdrawals from AS6939 for those invalid routes in the first hour. In the second hour, AS6939 withdrew hundreds of invalid prefixes, but most of these withdrawals were followed by another invalid announcement with the same prefix and the same invalid origin AS.
Can anyone help us interpret this case correctly? Thank you very much.
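The per-AS counts above depend on how each route in the RIB is classified. A minimal sketch of that classification, assuming RFC 6811 semantics (a route is invalid when at least one ROA covers its prefix but none matches both the origin AS and the maximum length), could look like the following. The ROA list, the RIB tuples, and the helper names `rov_state` and `invalid_prefixes_per_peer` are hypothetical illustration values; they are not the data or tooling used by the original poster.

```python
import ipaddress
from collections import Counter

# Hypothetical ROA set: (prefix, max_length, origin ASN). Documentation
# prefixes and private-use ASNs only; not real RPKI data.
ROAS = [
    ("192.0.2.0/24", 24, 64500),
    ("2001:db8::/32", 48, 64501),
]


def rov_state(prefix, origin_asn, roas=ROAS):
    """Classify a route as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_len, roa_asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # A ROA covers the route only if the route prefix lies inside it.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa_asn == origin_asn and route.prefixlen <= max_len:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


def invalid_prefixes_per_peer(rib_entries):
    """Count distinct invalid prefixes seen from each collector peer AS.

    rib_entries: iterable of (peer_asn, prefix, as_path); the origin AS is
    assumed to be the last element of as_path.
    """
    seen = set()
    counts = Counter()
    for peer_asn, prefix, as_path in rib_entries:
        if rov_state(prefix, as_path[-1]) != "invalid":
            continue
        if (peer_asn, prefix) not in seen:
            seen.add((peer_asn, prefix))
            counts[peer_asn] += 1
    return counts


if __name__ == "__main__":
    rib = [
        (64496, "192.0.2.0/25", [64496, 64500]),   # too specific for the ROA
        (64497, "192.0.2.0/24", [64497, 64511]),   # wrong origin AS
    ]
    print(dict(invalid_prefixes_per_peer(rib)))
```

Counting distinct (peer AS, prefix) pairs rather than RIB rows avoids inflating the totals when the same invalid prefix is seen over several paths from one peer.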
Dear 孙乐童,
On Mon, Nov 07, 2022 at 08:40:57PM +0800, 孙乐童 wrote:
We learned from Cloudflare's https://isbgpsafeyet.com/ that some ASes have deployed RPKI Origin Validation (ROV). However, we downloaded BGP collection data from RouteViews and RipeRis platforms and found that some ROV-ASes can announce some invalid routes. For example, from RIB data at 2022-10-31 00:00:00, 13 out of 17 ASes which declared to deploy ROV announced invalid routes, and we list the number of related prefixes for each AS below.
[snip]
As a comparison, we count the invalid routes the non-ROV ASes (also declared in https://isbgpsafeyet.com/) announces, as below:
We can see that ROV ASes announced apparently fewer invalid routes compared to the non-ROV ASes, though they did not filter all the invalids.
[snip]
Can anyone help us to correctly interpret this case? Thank you very much.
You ask great questions! I hope an answer to your questions can be found in a message I sent a year ago:
https://mailman.nanog.org/pipermail/nanog/2021-April/213346.html
The summary: in any sufficiently large network, chances are not 100% of all equipment supports RPKI-based BGP Route Origin Validation; in such cases a handful of invalid routes may still percolate through the system. Another contributing factor might be certain types of software upgrades; where ROV temporarily is disabled on one or more devices. Or perhaps an ISP made a handful of exceptions for test/beacon invalid routes to propagate.
Kind regards,
Job
Hello Job,
Thank you very much for your reply! I understand that no AS can actually filter all the invalids. However, I was trying to figure out why we could not see a reasonable number of withdrawals from AS6939 for the invalid prefixes, given their explanation of how they implement ROV (https://mailman.nanog.org/pipermail/nanog/2020-June/108309.html). Perhaps we need to learn the details of their implementation. Thank you very much!
Best wishes,
Sun Letong
At 2022-11-08 00:11:24, Job Snijders <job@fastly.com> wrote:
[snip]
<note I didn't look at the RV data for this>
There are 2 sides to the BGP conversation for any ASN, and then really 4 sides:
customer -> RAS -> peer (settlement-free)
peer (SFP) -> RAS -> customer
customer -> RAS -> transit
transit -> RAS -> customer
Depending on the RAS's capabilities or status in their journey to 'fully RAS', it's possible that they may have:
o "We OV all customer sessions" (notably not SFP peers)
o "We OV all sessions(*)" (noting not all, and maybe depending on platform specifics)
There are a bunch of ways this goes wrong :(
This also doesn't really tell what sort of peering the RAS has set up with RouteViews (customer? peer? partial peer?)
Also, also, possibly the output path on the session(s) here is not filtering in an OV fashion.
On Thu, Nov 10, 2022 at 9:13 AM 孙乐童 <slt20@mails.tsinghua.edu.cn> wrote:
[snip]
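The session-type breakdown in Christopher Morrow's message can be illustrated with a small model. This is only a sketch under stated assumptions: import-side OV is enabled per session type, there is no OV on export, and every accepted route is exported to the collector. The `Route` class, the session-type labels, and the sample routes are hypothetical; they do not describe any particular network's configuration.

```python
from dataclasses import dataclass

SESSION_TYPES = ("customer", "peer", "transit")


@dataclass(frozen=True)
class Route:
    prefix: str
    rov_state: str     # "valid", "invalid" or "not-found"
    learned_from: str  # session type the route was received on


def accepted(route, ov_sessions):
    """Import policy: drop invalids only on session types where OV is enabled."""
    return not (route.rov_state == "invalid" and route.learned_from in ov_sessions)


def exported_invalids(routes, ov_sessions):
    """Invalid prefixes that survive import and would reach a route collector.

    Assumes every accepted route is exported and no OV is applied on export.
    """
    return [r.prefix for r in routes
            if r.rov_state == "invalid" and accepted(r, ov_sessions)]


# Hypothetical routes using documentation prefixes.
ROUTES = [
    Route("192.0.2.0/25", "invalid", "customer"),
    Route("203.0.113.0/25", "invalid", "peer"),
    Route("198.51.100.0/24", "valid", "transit"),
]

if __name__ == "__main__":
    # "We OV all customer sessions": the invalid learned from a peer leaks.
    print(exported_invalids(ROUTES, {"customer"}))
    # OV on every session type: nothing invalid is left to export.
    print(exported_invalids(ROUTES, set(SESSION_TYPES)))
```

Under these assumptions, any session type missing from `ov_sessions` is a path by which invalids can still show up at a collector, even though the network legitimately describes itself as doing ROV.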
On Fri, 11 Nov 2022 at 14:00, Christopher Morrow <morrowc.lists@gmail.com> wrote:
Also, also, possibly the output path on the session(s) here is not filtering in an OV fashion.
ROV belongs on the input path, let's not ROV on the output towards customers / route collectors.
Announcing bigger, ROV valid/unknown aggregates, while really routing based on possibly ROV-invalid more specifics in the FIB is akin to actively obscuring routing security, "cheating" your way to a RAS.
Yes, there are some very specific situations where output ROV is beneficial (a peering box not supporting ROV and you ask your peer to ROV their output), but let's not normalize ROV on the output path.
Thanks,
Lukas
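The concern about aggregates versus more specifics comes down to longest-prefix matching. The sketch below assumes a toy FIB that holds a ROV-valid aggregate and a ROV-invalid more specific, with filtering applied only on export. The `FIB` contents, next-hop labels, and function names are hypothetical illustration values.

```python
import ipaddress

# Hypothetical FIB: prefix -> forwarding entry. The /25 is assumed to be
# ROV-invalid but was accepted on input because no input-side ROV is applied.
FIB = {
    "203.0.113.0/24": {"next_hop": "toward legitimate origin", "rov": "valid"},
    "203.0.113.128/25": {"next_hop": "toward AS64511", "rov": "invalid"},
}


def longest_match(fib, destination):
    """Return (network, entry) for the most specific covering prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, entry in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, entry)
    return best


def egress_filtered(fib):
    """Prefixes still announced when invalids are dropped only on output."""
    return [prefix for prefix, entry in fib.items() if entry["rov"] != "invalid"]


if __name__ == "__main__":
    print("announced:", egress_filtered(FIB))
    net, entry = longest_match(FIB, "203.0.113.200")
    print("forwarding uses:", net, entry["next_hop"])
```

In this toy case a collector would see only the aggregate, while traffic to the upper half of the /24 still follows the invalid /25, which is the mismatch between announcements and forwarding that the message warns about.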
FYI, Huawei routers support Egress ROV.
-----Original Message-----
From: NANOG [mailto:nanog-bounces+zhuangshunwan=huawei.com@nanog.org] On Behalf Of Randy Bush
Sent: Saturday, November 12, 2022 12:49 AM
To: Lukas Tribus <lukas@ltri.eu>
Cc: 孙乐童 <slt20@mails.tsinghua.edu.cn>; shixg@cernet.edu.cn; nanog@nanog.org
Subject: Re: Why do ROV-ASes announce some invalid route?
ROV belongs on the input path, let's not ROV on the output towards customers / route collectors.
RFC 8893
randy
On Fri, Nov 11, 2022 at 8:49 AM Lukas Tribus <lukas@ltri.eu> wrote:
On Fri, 11 Nov 2022 at 14:00, Christopher Morrow <morrowc.lists@gmail.com> wrote:
Also, also, possibly the output path on the session(s) here is not filtering in an OV fashion.
ROV belongs on the input path, let's not ROV on the output towards customers / route collectors.
sure. This assumes 100% coverage for all inputs to the rib-out on the customer port we're talking about, though. If you don't have 100% coverage you'll end up with the leaks seen/reported by the OP.
I don't mean to say/imply: "Hey, everyone (anyone) should do OV on output"
I mean to say that: "Hey, if you see OV failures leaking, this is probably a side effect of the behavior/design choices a network made" (not doing OV filtering on one of peer/customer/transit type peerings).
-chris
aside from technical reasons for an ROV-supporting AS (RAS) to announce an ROV invalid prefix, there is an administrative one. the RAS's customers *pay* RAS to announce the customers' prefixes. so RAS is configured to propagate their customers' announcements without dropping invalids.
randy
participants (6)
- Christopher Morrow
- Job Snijders
- Lukas Tribus
- Randy Bush
- Zhuangshunwan
- 孙乐童