deploying RPKI based Origin Validation

Job Snijders

12 Jul 2018

5:50 p.m.

Hi all,

I wanted to share with you that a ton of activity is taking place in the Dutch networker community to deploy RPKI based BGP Origin Validation. The mantra is "invalid == reject" on all EBGP sessions.

What's of note here is that we're now seeing the first commercial ISPs doing Origin Validation. This is a significant step forward compared to what we observed so far (it seemed OV was mostly limited to academic institutions & toy networks). But six months ago Amsio (https://www.amsio.com/en/) made the jump, and today Fusix deployed (https://fusix.nl/deploying-rpki/).

We've also seen an uptake of Origin Validation at Internet Exchange route servers: AMS-IX and FranceIX have already deployed. I've read that RPKI OV is under consideration at a number of other exchanges.

Other cool news is that Cloudflare launched a Certificate Transparency initiative to help keep everyone honest. Announcement at: https://twitter.com/grittygrease/status/1017224762542587907

Certificate Transparency is a fascinating tool, really a necessity to build confidence in any PKI system.

Anyone here working to deploy RPKI based Origin Validation in their network and reject invalid announcements? Anything of note to share?

Kind regards,

Job


Mark Tinka

13 Jul

12:37 p.m.

On 12/Jul/18 19:50, Job Snijders wrote:

...

Anyone here working to deploy RPKI based Origin Validation in their network and reject invalid announcements? Anything of note to share?

It's great to hear that this is catching on. To be honest, I haven't kept up with the latest goings-on in this space for almost a year now, so I hadn't heard about anyone implementing OV in an "Invalid = Drop" manner.

Between 2014 and 2016, we (SEACOM, AS37100) deployed and operated Dragon Research's RPKI CA and RP tools (what was then rpki.net). I believe the project has since moved over to GitHub: https://github.com/dragonresearch/rpki.net

Anyway, the operational issue we ran into was due to our aggressive policy to drop Invalids. The reason this became an issue was networks that ROA'd just their aggregates, but either forgot to or decided not to ROA the longer prefixes that were children of the aggregate.

So suddenly, you have a customer who is multi-homed; one connection was to us, SEACOM, and the other was to another ISP not doing OV. Our policy meant the customer was receiving far fewer routes from SEACOM vs. the other provider, which took traffic away from us (and consequently, $$); not to mention the NOC-related headache.

After 2 years of struggling to get community traction with OV based on this policy, we decided to maintain the validation, but remove any actions being run against the validation result.

OV where Invalids are dropped is something that all (major, and some regional, at the very least) ISPs need to enable for it to make a real difference in terms of:

* Encouraging all networks to participate, and
* Reaching the desired outcome, i.e., to mitigate against route leaks and hijacks

We also discovered a number of bugs that I, Randy, Rob, Philip, Fakrul, Matthias, Jay and Andreas, along with a few others at Cisco and Juniper, helped fix in code that shipped between 2016 and 2017.

I would really be keen to hear feedback from you or others that have decided to deploy OV and drop Invalids. I'm more than happy to get back on to this wagon in the interest of cleaning up the global BGP table. But it needs mass...

Mark.
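To make the "ROA'd the aggregate but not the more-specifics" problem concrete, here is a minimal Python sketch of origin validation along the lines of RFC 6811. The prefix, the AS numbers, and the names `VRP` and `validate` are illustrative assumptions, not anything taken from SEACOM's deployment or from a real validator.

```python
import ipaddress
from typing import NamedTuple, Sequence


class VRP(NamedTuple):
    """Validated ROA Payload: prefix, maximum length, authorised origin AS."""
    prefix: str
    max_length: int
    asn: int


def validate(route: str, origin_asn: int, vrps: Sequence[VRP]) -> str:
    """Return 'valid', 'invalid' or 'unknown' for one announcement."""
    net = ipaddress.ip_network(route)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # A VRP only matters if it covers the announced prefix.
        if net.version != vrp_net.version or not net.subnet_of(vrp_net):
            continue
        covered = True
        # Valid needs a matching origin AND a length within maxLength.
        if origin_asn == vrp.asn and net.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by some VRP but matched by none: invalid. Not covered: unknown.
    return "invalid" if covered else "unknown"


# Hypothetical ROA: only the aggregate is signed, with maxLength 16.
VRPS = [VRP("192.168.0.0/16", 16, 64500)]
```

With this single ROA, `validate("192.168.4.0/24", 64500, VRPS)` comes back as `"invalid"` even though the origin AS is the rightful holder, because the /24 is longer than the ROA's maximum length. That is the case that ends up dropped under an "Invalid = Drop" policy.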

Job Snijders

12:43 p.m.

On Fri, Jul 13, 2018 at 02:37:32PM +0200, Mark Tinka wrote:

...

Anyway, the operational issue we ran into was due to our aggressive policy to drop Invalids. The reason this became an issue was networks that ROA'd just their aggregates, but either forgot to or decided not to ROA the longer prefixes that were children of the aggregate.

So suddenly, you have a customer who is multi-homed; one connection was to us, SEACOM, and the other was to another ISP not doing OV. Our policy meant the customer was receiving far fewer routes from SEACOM vs. the other provider, which took traffic away from us (and consequently, $$); not to mention the NOC-related headache.

After 2 years of struggling to get community traction with OV based on this policy, we decided to maintain the validation, but remove any actions being ran against the validation result.

Have you considered applying "invalid == reject" on just transit/peering sessions rather than customer sessions as an intermediate step? I bet most misconfigurations or hijacks didn't come in via your customers.

...

I would really be keen to hear feedback from you or others that have decided to deploy OV and drop Invalids. I'm more than happy to get back on to this wagon in the interest of cleaning up the global BGP table. But it needs mass...

Cool! Kind regards, Job

Mark Tinka

12:53 p.m.

On 13/Jul/18 14:43, Job Snijders wrote:

...

Have you considered applying "invalid == reject" on just transit/peering sessions rather than customer sessions as an intermediate step? I bet most misconfigurations or hijacks didn't come in via your customers.

Yes, we did. The issue is some of our customers did ROA their aggregates, but not the more-specifics. We didn't want to get into a situation where we had to custom-design templates depending on what RPKI mood the customer was in :-). But yes, the majority of the issue was with routes learned from peers and transit. That, though, still leaves the problem where you end up providing a partial routing table to your customers, while your competitors in the same market aren't. Most customers that aren't keen on IPv6 or DNSSEC treat RPKI the same way - as a nuisance. So trying to speak sense into them would be a more treacherous road to take than just turning it off until we get wider support within the BGP operational community. Mark.

Grant Taylor

3:18 p.m.

On 07/13/2018 06:53 AM, Mark Tinka wrote:

...

But yes, the majority of the issue was with routes learned from peers and transit. That, though, still leaves the problem where you end up providing a partial routing table to your customers, while your competitors in the same market aren't.

Ouch.

...

Most customers that aren't keen on IPv6 or DNSSEC treat RPKI the same way - as a nuisance.

I can see that. :-(

...

So trying to speak sense into them would be a more treacherous road to take than just turning it off until we get wider support within the BGP operational community.

Please forgive the n00b question: But isn't that where carrying the prefixes through your network and conditionally advertising them to customers comes into play? Or does that run into complications where you must also have the prefixes which don't validate routed in your core? The reading I did on RPKI / OV yesterday made me think that it is possible to have validated routes preferred over unknown routes which are preferred over invalid routes. So I'd think that you could still have the routes through your core but conditionally advertise the prefixes to customers based on their desires. I would appreciate it if someone would be kind enough to explain what I'm misunderstanding. Or better, point me to some better documentation to read myself. Thank you from the peanut gallery. -- Grant. . . . unix || die

Christopher Morrow

4:25 p.m.

On Fri, Jul 13, 2018 at 11:19 AM Grant Taylor via NANOG <nanog@nanog.org> wrote:

...

The reading I did on RPKI / OV yesterday made me think that it is possible to have validated routes preferred over unknown routes which are preferred over invalid routes. So I'd think that you could still have the routes through your core but conditionally advertise the prefixes to customers based on their desires.

you get the option at input (from transit/peering edge say) to evaluate the 'rpki status' of a particular route, then set normal bgp attributes based on that evaluation, so yes you can:

    valid == localpref 1000 && community-A
    unknown == localpref 80 && community-B
    invalid == localpref 1 && community-Z

but given:

    192.168.0.0/16 - valid
    192.168.0.0/17 - unknown
    192.168.0.0/24 - invalid

your routing system will still forward toward the 192.168.0.0/24 prefix because 'longest prefix match'. Job's plan, I think, is that you reject/drop/do-not-accept the 'invalid' prefix(es) and hope that you follow another / proper path.

Perhaps Mark could send along ONLY the valid/unknown routes to his customer, or some mix of the set based on what type of customer:

    super-sekure-customer - valid only
    sorta-sekure-customer - valid/unknown
    wild-wild-west-customer - all

it sounded like Mark didn't want to deal with that complexity in his network, until more deployment and more requests from customers like;

    Customer: "Hey, why did my traffic get hijacked to paY(omlut)pal.com yesterday?"
    Mark: "because you didn't ask for 'super-sekure-customer' config? sorry?"

I could have misunderstood either mark or job or you.. of course.
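The point about local preference versus longest prefix match can be sketched in a few lines of Python. This is a toy model under stated assumptions: the next-hop labels, the `Route` type, and the `lookup` and `drop_invalid` helpers are made up for illustration, and a real router's decision process has many more steps.

```python
import ipaddress
from typing import List, NamedTuple, Optional

# Local preference per validation state, as in the message above.
LOCAL_PREF = {"valid": 1000, "unknown": 80, "invalid": 1}


class Route(NamedTuple):
    prefix: str
    state: str
    next_hop: str  # hypothetical label, only used to tell paths apart


ROUTES = [
    Route("192.168.0.0/16", "valid", "path-a"),
    Route("192.168.0.0/17", "unknown", "path-b"),
    Route("192.168.0.0/24", "invalid", "path-z"),
]


def local_pref(route: Route) -> int:
    """Local preference only ranks competing paths for the SAME prefix."""
    return LOCAL_PREF[route.state]


def lookup(dest: str, routes: List[Route]) -> Optional[Route]:
    """Forwarding lookup: the longest matching prefix wins, local_pref is ignored."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None
    return max(matches, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)


def drop_invalid(routes: List[Route]) -> List[Route]:
    """The 'invalid == reject' policy: invalid routes never enter the table."""
    return [r for r in routes if r.state != "invalid"]
```

Here `lookup("192.168.0.10", ROUTES)` returns the invalid /24 despite its local preference of 1, while `lookup("192.168.0.10", drop_invalid(ROUTES))` falls back to the /17. Only rejecting the invalid route changes where the packet goes.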

...

I would appreciate it if someone would be kind enough to explain what I'm misunderstanding. Or better, point me to some better documentation to read myself.

Thank you from the peanut gallery.

-- Grant. . . . unix || die

Job Snijders

4:37 p.m.

On Fri, Jul 13, 2018 at 4:25 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:

...

On Fri, Jul 13, 2018 at 11:19 AM Grant Taylor via NANOG <nanog@nanog.org> wrote:

...
The reading I did on RPKI / OV yesterday made me think that it is possible to have validated routes preferred over unknown routes which are preferred over invalid routes. So I'd think that you could still have the routes through your core but conditionally advertise the prefixes to customers based on their desires.

you get the option at input (from transit/peering edge say) to evaluate the 'rpki status' of a particular route, then set normal bgp attributes based on that evaluation, so yes you can:

    valid == localpref 1000 && community-A
    unknown == localpref 80 && community-B
    invalid == localpref 1 && community-Z

but given:

    192.168.0.0/16 - valid
    192.168.0.0/17 - unknown
    192.168.0.0/24 - invalid

your routing system will still forward toward the 192.168.0.0/24 prefix because 'longest prefix match'. Job's plan, I think, is that you reject/drop/do-not-accept the 'invalid' prefix(es) and hope that you follow another / proper path.

That is exactly what I mean. Because of the golden rule "most-specific always wins" (and parts of the AS_PATH are pretty easy to spoof) it only makes sense to me to completely reject invalid routes.

Kind regards,

Job

In Junos speak:

    [...]
    set policy-options policy-statement IMPORT:PEER term RPKI-DROP-INVALID from protocol bgp
    set policy-options policy-statement IMPORT:PEER term RPKI-DROP-INVALID from validation-database invalid
    set policy-options policy-statement IMPORT:PEER term RPKI-DROP-INVALID then validation-state invalid
    set policy-options policy-statement IMPORT:PEER term RPKI-DROP-INVALID then reject
    [...]

Mark Tinka

14 Jul

4:54 a.m.

On 13/Jul/18 18:37, Job Snijders wrote:

...

That is exactly what I mean. Because of the golden rule "most-specific always wins" (and parts of the AS_PATH are pretty easy to spoof) it only makes sense to me to completely reject invalid routes.

Exactly my preference, and exactly what we did for 2 years. But in practice, customers don't really like this, nor does your CFO. We need mass deployment for this to work effectively, and also a bit more education for those that sign aggregates but not the more-specifics. Mark.

Grant Taylor

13 Jul

4:37 p.m.

On 07/13/2018 10:25 AM, Christopher Morrow wrote:

...

you get the option at input (from transit/peering edge say) to evaluate the 'rpki status' of a particular route, then set normal bgp attributes based on that evaluation, so yes you can:

    valid == localpref 1000 && community-A
    unknown == localpref 80 && community-B
    invalid == localpref 1 && community-Z

ACK

...

but given:

    192.168.0.0/16 - valid
    192.168.0.0/17 - unknown
    192.168.0.0/24 - invalid

your routing system will still forward toward the 192.168.0.0/24 prefix because 'longest prefix match'.

*facePALM* Thank you. So the information would be carried across the network, but it still suffers from the same problem.

...

Job's plan, I think, is that you reject/drop/do-not-accept the 'invalid' prefix(es) and hope that you follow another / proper path.

Yep. You would almost need separate logical networks / VRF to be able to prevent the longest prefix match winning issue that you reminded me of.

...

Perhaps Mark could send along ONLY the valid/unknown routes to his customer, or some mix of the set based on what type of customer:

    super-sekure-customer - valid only
    sorta-sekure-customer - valid/unknown
    wild-wild-west-customer - all

Yep. That's what I was thinking of.

...

it sounded like Mark didn't want to deal with that complexity in his network, until more deployment and more requests from customers like;

Fair.

...

Customer: "Hey, why did my traffic get hijacked to paY(omlut)pal.com yesterday?" Mark: "because you didn't ask for 'super-sekure-customer config? sorry?"

I could have misunderstood either mark or job or you.. of course.

You understood me correctly. Thank you for explaining what I was missing. -- Grant. . . . unix || die

Christopher Morrow

5:28 p.m.

On Fri, Jul 13, 2018 at 12:41 PM Grant Taylor via NANOG <nanog@nanog.org> wrote:

...

On 07/13/2018 10:25 AM, Christopher Morrow wrote:

...
but given:

    192.168.0.0/16 - valid
    192.168.0.0/17 - unknown
    192.168.0.0/24 - invalid

your routing system will still forward toward the 192.168.0.0/24 prefix because 'longest prefix match'.

*facePALM*

Thank you.

So the information would be carried across the network, but it still suffers from the same problem.

well, consider the situation where Mark's mythical customer(s) are:

    custA: dual-homed + accept default (from both providers)
    custB: dual-homed (and live in the 'total sekure world' TSW (tm))

CustA may not see the invalid /24 (nor the /17) but have no other path and "randomly" choose Mark and his /17 + /24 world :(

CustB simply drops packets (aka: what Job wants - again, I think)

So... if we had more CustB and less CustA ... maybe everywhere it's OK for 'Large Mark Providers' - LMP (tm) to provide such services?

I've not looked in a 'long time', but when I worked at a large ISP ~30-35% of our customers did BGP with us, of that ~70+% just did it with us (dual / redundant links to us). I think 'almost all' took a default from us too.. whether they used that default I can't say.

I think getting to Job's world is a goal, I think living in Mark's is a reality for a bit still. (yes, you could ALSO do some game playing where the customer ports for TSW were in a VRF with no 'bad' routes, but.. complexity)
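The custA / custB contrast can be modelled as a two-hop lookup. This is a rough Python sketch, not a description of anyone's network: the table contents, the hop labels, and the `best_match` and `trace` helpers are assumptions chosen to mirror the scenario above.

```python
import ipaddress
from typing import Dict, Optional


def best_match(dest: str, table: Dict[str, str]) -> Optional[str]:
    """Longest-prefix-match lookup; returns the next-hop label or None."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in table.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Provider that validates but takes no action: the invalid /24 is still there.
PROVIDER_TABLE = {"192.168.0.0/24": "invalid-origin"}

# custA never received the invalid /24, but accepts a default route.
CUST_A = {"0.0.0.0/0": "provider"}

# custB never received it either and carries no default.
CUST_B: Dict[str, str] = {}


def trace(dest: str, customer_table: Dict[str, str]) -> str:
    """Follow a packet from the customer, through the provider if sent there."""
    hop = best_match(dest, customer_table)
    if hop is None:
        return "dropped at customer"
    if hop == "provider":
        return best_match(dest, PROVIDER_TABLE) or "dropped at provider"
    return hop
```

In this model `trace("192.168.0.10", CUST_A)` ends at `"invalid-origin"` because the default route hands the packet to a provider that still holds the /24, while `trace("192.168.0.10", CUST_B)` is `"dropped at customer"`.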

...

...
Job's plan, I think, is that you reject/drop/do-not-accept the 'invalid' prefix(es) and hope that you follow another / proper path.

Yep.

You would almost need separate logical networks / VRF to be able to prevent the longest prefix match winning issue that you reminded me of.

yup, yea... complexity though :(

...

...
Perhaps Mark could send along ONLY the valid/unknown routes to his customer, or some mix of the set based on what type of customer:

    super-sekure-customer - valid only
    sorta-sekure-customer - valid/unknown
    wild-wild-west-customer - all

Yep. That's what I was thinking of.

...
it sounded like Mark didn't want to deal with that complexity in his network, until more deployment and more requests from customers like;

Fair.

...
Customer: "Hey, why did my traffic get hijacked to paY(omlut)pal.com yesterday?" Mark: "because you didn't ask for 'super-sekure-customer config? sorry?"

I could have misunderstood either mark or job or you.. of course.

You understood me correctly.

Thank you for explaining what I was missing.

sure thing! (err, this rpki/secure-routing business isn't really super intuitive :( )

...

-- Grant. . . . unix || die

Mark Tinka

14 Jul

5:06 a.m.

On 13/Jul/18 19:28, Christopher Morrow wrote:

...

I think getting to Job's world is a goal, I think living in Mark's is a reality for a bit still. (yes, you could ALSO do some game playing where the customer ports for TSW were in a VRF with no 'bad' routes, but.. complexity)

This summarizes the current status of affairs quite accurately. I'd like to get to the point where RPKI is widely deployed so that we can all run a cleaner BGP. I don't think that waiting for all BGP operators to enable RPKI and drop Invalids will be the solution. So if the top 7 global operators decided to do it, and perhaps suffer the pain of the effects for a few months, the rest of the community will be inclined to follow suit. Kind of like how only a few major operators really use RPSL, which forces all BGP operators to keep some kind of updated IRR, even if they, themselves, may not be RPSL users.

...

sure thing! (err, this rpki/secure-routing business isn't really super intuitive :( )

As always, the difficult bit is done, i.e., the protocol spec. is clearly defined, there is code in routing software, and there are plenty of options for Route Validation software. But as always, the hard part is getting the community to implement, as we've seen with IPv6 and DNSSEC. Mark.

Baldur Norddahl

7:11 a.m.

In the RIPE part of the world there is no excuse for not getting RPKI correct because RIPE made it so easy. Perhaps the industry could agree on enabling RPKI validation on all European circuits for a start?

Regards,

Baldur

Mark Tinka

9:03 a.m.

On 14/Jul/18 09:11, Baldur Norddahl wrote:

...

In the RIPE part of the world there is no excuse for not getting RPKI correct because RIPE made it so easy. Perhaps the industry could agree on enabling RPKI validation on all european circuits for a start?

I think the first step (and what I'd consider to be a quick win) is to determine all the prefixes that are being designated Invalid, and nail down how many of those are Invalid because they are more-specifics announced without a ROA while the parent aggregate is ROA'd.

We would then ask the operators of those prefixes to either withdraw them (easier, but unlikely) or sign them in the RPKI and create ROAs for them (more work, but more likely). I'm going for the latter.

Once that is fixed, and even though the entire BGP world is not running RPKI, those that are and are dropping Invalids would be 100% certain that those Invalids are either leaks or hijacks.

I think that will get us 50% of the way there; the other 50% would now just be growing community participation in RPKI.

Thankfully, I believe all (or most) of the RIRs support a simple "click of a button" to say "All prefixes up to a /24 or a /48 of the aggregate should automatically be ROA'd if the aggregate, itself, is ROA'd". So it shouldn't be a lot of work to get what is currently broken fixed. And the beauty is, we don't need everyone to participate in the RPKI today for those that want the benefit right now to enjoy it.

Mark.
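Splitting Invalids by cause, as proposed above, could look roughly like the following Python sketch. The labels `invalid-length` and `invalid-origin`, the `classify` helper, and the sample ROA are assumptions for illustration. The idea is that an Invalid whose origin AS matches a covering ROA is most likely an unsigned more-specific rather than a leak or hijack.

```python
import ipaddress
from typing import Sequence, Tuple

# (prefix, max_length, origin ASN), a simplified stand-in for a validated ROA
Vrp = Tuple[str, int, int]


def classify(route: str, origin_asn: int, vrps: Sequence[Vrp]) -> str:
    """Return 'valid', 'unknown', 'invalid-length' or 'invalid-origin'."""
    net = ipaddress.ip_network(route)
    covering = []
    for prefix, max_length, asn in vrps:
        vrp_net = ipaddress.ip_network(prefix)
        if vrp_net.version == net.version and net.subnet_of(vrp_net):
            covering.append((max_length, asn))
    if not covering:
        return "unknown"
    if any(asn == origin_asn and net.prefixlen <= ml for ml, asn in covering):
        return "valid"
    # Right origin, but announced longer than any ROA allows:
    # the "aggregate is ROA'd, more-specific is not" case.
    if any(asn == origin_asn for _, asn in covering):
        return "invalid-length"
    # No covering ROA names this origin AS at all.
    return "invalid-origin"


# Hypothetical ROA covering only the aggregate.
SAMPLE_VRPS = [("192.168.0.0/16", 16, 64500)]
```

Counting the `invalid-length` results over a full table dump would give the size of the group that operators could fix by adjusting their ROAs, leaving `invalid-origin` as the set more likely to be leaks or hijacks.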

Job Snijders

16 Jul

3:26 p.m.

On Sat, Jul 14, 2018 at 11:03:16AM +0200, Mark Tinka wrote:

...

On 14/Jul/18 09:11, Baldur Norddahl wrote:

...
In the RIPE part of the world there is no excuse for not getting RPKI correct because RIPE made it so easy. Perhaps the industry could agree on enabling RPKI validation on all european circuits for a start?

I think the first step (and what I'd consider to be a quick win) is if we determined all the prefixes that are being designated Invalid, and nail down how many of those are Invalid due to the fact that they are more-specifics announced without a ROA, vs. the parent aggregate which is ROA'd.

I calculated this a few days ago here: http://instituut.net/~job/rpki-report-2018.07.12.txt

Markus Weber from KPN is generating a daily report here and drew similar conclusions: https://as286.net/data/ana-invalids.txt

Markus scrapes all routes from the AS 286 PEs and marks the routes for which no valid or unknown alternative exists as "altpfx=NONE".
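The "altpfx=NONE" idea can be sketched as follows. This is only a guess at the logic in Python, not the actual script behind either report: the `alternative_prefix` helper and the sample table are hypothetical, and an "alternative" is assumed to mean a covering (or equal) prefix whose validation state is valid or unknown.

```python
import ipaddress
from typing import List, Tuple

# (prefix, validation state) pairs, e.g. scraped from a router's BGP table
TABLE: List[Tuple[str, str]] = [
    ("192.168.0.0/16", "valid"),
    ("192.168.0.0/24", "invalid"),
    ("10.20.0.0/24", "invalid"),
]


def alternative_prefix(invalid_prefix: str, table: List[Tuple[str, str]]) -> str:
    """Most specific valid/unknown route covering the invalid one, or 'NONE'."""
    target = ipaddress.ip_network(invalid_prefix)
    alternatives = []
    for prefix, state in table:
        net = ipaddress.ip_network(prefix)
        if state not in ("valid", "unknown"):
            continue
        if net.version == target.version and target.subnet_of(net):
            alternatives.append(net)
    if not alternatives:
        return "NONE"
    return str(max(alternatives, key=lambda n: n.prefixlen))
```

For the sample table, `alternative_prefix("192.168.0.0/24", TABLE)` gives `"192.168.0.0/16"`, so dropping the invalid route still leaves a path, whereas `alternative_prefix("10.20.0.0/24", TABLE)` gives `"NONE"`: rejecting that route makes the destination unreachable.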

...

We would then ask the operators of those prefixes to either withdraw them (easier, but unlikely) or sign them in the RPKI and create ROA's for them (more work, but more likely). Going for the latter.

Or delete the incorrect RPKI ROA. Either way is fine.

...

Once that is fixed, and even though the entire BGP world is not running RPKI, those that are and are dropping Invalids would be 100% certain that those Invalids are either leaks or hijacks.

I think that will get us 50% of the way there, with the other 50% would now just be growing community participation in RPKI.

Thankfully, I believe all (or most) of the RIR's support a simple "click of a button" to say "All prefixes up to a /24 or a /48 of the aggregate should automatically be ROA'd if the aggregate, itself, is ROA'd". So it shouldn't be a lot of work to get what is currently broken fixed. And the beauty, we don't need everyone to participate in the RPKI today for those that want the benefit right now to enjoy it so.

Perhaps the RIRs should start an outreach program to proactively inform the owners of those 2,200 invalid route announcements to get them to either fix or delete the RPKI ROA. Kind regards, Job

Mark Tinka

17 Jul

11:27 a.m.

On 16/Jul/18 17:26, Job Snijders wrote:

...

I calculated this here few days ago http://instituut.net/~job/rpki-report-2018.07.12.txt

Markus Weber from KPN is generating a daily report here and drew similar conclusions: https://as286.net/data/ana-invalids.txt Markus scrapes all routes from the AS 286 PEs and marks the routes for which no valid or unknown alternative exists as "altpfx=NONE".

Thanks. Protein. So the numbers are not that far off from when I last checked this back in 2016, i.e., less than 1% of the total IPv4 routing table. Do you have numbers for IPv6, out of interest?

...

Or delete the incorrect RPKI ROA. Either way is fine.

That would work, but the risk with that then is trying to get those networks back into RPKI would be more difficult, and if they do, chances are that the folk that were pushing it would have since left the company, making our education efforts a lot more difficult. So I'd be for pushing these folk to ROA the more-specifics, which is just a click of a button in their RIR's system.

...

Perhaps the RIRs should start an outreach program to proactively inform the owners of those 2,200 invalid route announcements to get them to either fix...

I would be in support of this, and would certainly work very closely with AFRINIC to fix our side of things. Happy to also do a co-preso paper with you during EPF in Athens for the RIPE side of things. If you'll be in Vancouver, we can do the same for the ARIN side. I'm at MyNOG next week, and can speak to the folk from APNIC that will be showing up about this as well. That leaves LACNIC. Mark.

Job Snijders

5:55 p.m.

On Tue, Jul 17, 2018 at 01:27:09PM +0200, Mark Tinka wrote:

...

...
Markus Weber from KPN is generating a daily report here and drew similar conclusions: https://as286.net/data/ana-invalids.txt Markus scrapes all routes from the AS 286 PEs and marks the routes for which no valid or unknown alternative exists as "altpfx=NONE".

Thanks. Protein.

So the numbers are not that far off from when I last checked this back in 2016, i.e., less than 1% of the total IPv4 routing table.

Do you have numbers for IPv6, out of interest?

There are ~ 330 IPv6 invalids in the DFZ, and for 70 of those no alternative covering prefix exists. Kind regards, Job

George Michaelson

6:11 p.m.

I don't want to over-state it, but 'number of prefixes' always feels to me like a potential mis-measure. Not that you don't want to know it, but % of announced space for a given origin-as feels like it might be closer to the story, because there can be so many different ways to announce it as dis- and super aggregates.

-G

On Tue, Jul 17, 2018 at 1:55 PM, Job Snijders <job@ntt.net> wrote:

...

On Tue, Jul 17, 2018 at 01:27:09PM +0200, Mark Tinka wrote:

...
...
Markus Weber from KPN is generating a daily report here and drew similar conclusions: https://as286.net/data/ana-invalids.txt Markus scrapes all routes from the AS 286 PEs and marks the routes for which no valid or unknown alternative exists as "altpfx=NONE".

Thanks. Protein.

So the numbers are not that far off from when I last checked this back in 2016, i.e., less than 1% of the total IPv4 routing table.

Do you have numbers for IPv6, out of interest?

There are ~ 330 IPv6 invalids in the DFZ, and for 70 of those no alternative covering prefix exists.

Kind regards,

Job
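George's point about counting address space rather than prefixes can be illustrated with a small Python sketch. The announcements below are invented, and `address_count` and `invalid_share_by_space` are hypothetical helper names. Overlapping and adjacent prefixes are merged first so that de-aggregates are not counted twice.

```python
import ipaddress
from typing import Iterable


def address_count(prefixes: Iterable[str]) -> int:
    """Number of distinct addresses covered (all prefixes same IP version)."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return sum(n.num_addresses for n in ipaddress.collapse_addresses(nets))


def invalid_share_by_space(announced: Iterable[str], invalid: Iterable[str]) -> float:
    """Fraction of an origin AS's announced space that is RPKI invalid."""
    return address_count(invalid) / address_count(announced)


# Hypothetical origin AS: one aggregate plus two de-aggregated /24s.
ANNOUNCED = ["192.168.0.0/16", "192.168.0.0/24", "192.168.1.0/24"]
INVALID = ["192.168.0.0/24", "192.168.1.0/24"]

share_by_count = len(INVALID) / len(ANNOUNCED)
share_by_space = invalid_share_by_space(ANNOUNCED, INVALID)
```

In this made-up case two of three prefixes are invalid, yet they cover 512 of 65,536 addresses, under 1% of the announced space. The two measures tell quite different stories about the same origin AS.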

Mark Tinka

18 Jul

12:06 p.m.

On 17/Jul/18 19:55, Job Snijders wrote:

...

There are ~ 330 IPv6 invalids in the DFZ, and for 70 of those no alternative covering prefix exists.

Thanks, Job. Mark.

Michel Py

17 Jul

6:33 p.m.

...

Job Snijders wrote : I calculated this here few days ago http://instituut.net/~job/rpki-report-2018.07.12.txt Markus Weber from KPN is generating a daily report here and drew similar conclusions: https://as286.net/data/ana-invalids.txt Markus scrapes all routes from the AS 286 PEs and marks the routes for which no valid or unknown alternative exists as "altpfx=NONE".

If I understand this correctly, I have a suggestion : update these files at a regular interval (15/20 min) and make them available for download with a fixed name (not containing the date). Even better : have a route server that announces these prefixes with a :666 community so people could use it as a blackhole.

This would not remove the invalid prefixes from one's router, but at least would prevent traffic from/to these prefixes. In other words : a route server of prefixes that are RPKI invalid with no alternative that people could use without having an RPKI setup. This would even work with people who have chosen to accept a default route from their upstream.

I understand this is not ideal; blacklisting a prefix that is RPKI invalid may actually help the hijacker, but blacklisting a prefix that is RPKI invalid AND that has no alternative could be useful ? Should be considered a bogon.

Regards,

Michel.

TSI Disclaimer: This message and any files or text attached to it are intended only for the recipients named above and contain information that may be confidential or privileged. If you are not the intended recipient, you must not forward, copy, use or otherwise disclose this communication or the information contained herein. In the event you have received this message in error, please notify the sender immediately by replying to this message, and then delete all copies of it from your system. Thank you!

Mark Tinka

18 Jul

12:10 p.m.

On 17/Jul/18 20:33, Michel Py wrote:

...

If I understand this correctly, I have a suggestion : update these files at a regular interval (15/20 min) and make them available for download with a fixed name (not containing the date). Even better : have a route server that announces these prefixes with a :666 community so people could use it as a blackhole.

This would not remove the invalid prefixes from one's router, but at least would prevent traffic from/to these prefixes. In other words : a route server of prefixes that are RPKI invalid with no alternative that people could use without having an RPKI setup. This would even work with people who have chosen to accept a default route from their upstream.

I understand this is not ideal; blacklisting a prefix that is RPKI invalid may actually help the hijacker, but blacklisting a prefix that is RPKI invalid AND that has no alternative could be useful ? Should be considered a bogon.

Hmmh - I suppose if you want to do this in-house, that is fine. But I would not recommend this at large for the entire BGP community.

At any rate, the result is the same, i.e., the route is taken out of the FIB. The difference is you are proposing a mechanism that uses existing infrastructure within almost all ISPs (the BGP Community) in lieu of deploying RPKI.

I can't quite imagine the effort needed to implement your suggestion, but I'd rather direct it toward deploying RPKI. At the very least, one just needs reputable RV software, and router code that supports RPKI RV.

Mark.

Michel Py

7:30 p.m.

Mark,

...

...
Michel Py wrote: If I understand this correctly, I have a suggestion : update these files at a regular interval (15/20 min) and make them available for download with a fixed name (not containing the date). Even better : have a route server that announces these prefixes with a :666 community so people could use it as a blackhole. This would not remove the invalid prefixes from one's router, but at least would prevent traffic from/to these prefixes. In other words : a route server of prefixes that are RPKI invalid with no alternative that people could use without having an RPKI setup. This would even work with people who have chosen to accept a default route from their upstream. I understand this is not ideal; blacklisting a prefix that is RPKI invalid may actually help the hijacker, but blacklisting a prefix that is RPKI invalid AND that has no alternative could be useful ? Should be considered a bogon.

...

Mark Tinka wrote : Hmmh - I suppose if you want to do this in-house, that is fine. But I would not recommend this at large for the entire BGP community.

Agree; was trying to do this in the spirit of this: http://arneill-py.sacramento.ca.us/cbbc/ As with any blocklist, it should not be the default and should be left to the end user to choose whether they use it or not.

...

The difference is you are proposing a mechanism that uses existing infrastructure within almost all ISP's (the BGP Community) in lieu of deploying RPKI.

Not in lieu, but when deploying RPKI is not (yet) possible. My routers are not RPKI capable, upgrading will take years (I'm not going to upgrade just because I want RPKI). My upstreams don't do RPKI, I'm trying to convince them but I'm talking to deaf ears. What do I have left : using a subset of RPKI as a blackhole :-(

...

I can't quite imagine the effort needed to implement your suggestion,

Not much at all, I was actually trying to get you to do the RPKI part for me ;-) This script you wrote, to produce the list of prefixes that are RPKI invalid AND that do not have any alternative, make it run every x minutes on a fixed url (no date/time in name). I will fetch it, inject it in ExaBGP that feeds my iGP and voila, done. Whoever wants to use it can; I'm not trying to impose it on the entire BGP community.
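As a rough idea of the "fetch it, inject it in ExaBGP" step, the Python sketch below turns a list of prefixes into announce commands of the kind ExaBGP's text API accepts from a helper process. Several things here are assumptions and not from the thread: the input is taken to be one IPv4 prefix per line (the real report format may differ), the next hop `192.0.2.1` is a placeholder for a local discard address, `65535:666` is the RFC 7999 BLACKHOLE community standing in for "a :666 community", and `exabgp_commands` is a hypothetical helper name.

```python
import ipaddress
from typing import Iterable, List


def exabgp_commands(
    lines: Iterable[str],
    next_hop: str = "192.0.2.1",   # assumed discard next hop, site specific
    community: str = "65535:666",  # assumed blackhole community
) -> List[str]:
    """Build one ExaBGP-style 'announce route' command per usable IPv4 prefix."""
    commands = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue  # skip blanks and comments
        try:
            net = ipaddress.ip_network(text, strict=True)
        except ValueError:
            continue  # skip anything that is not a clean prefix
        if net.version != 4:
            continue  # this sketch only handles IPv4 next hops
        commands.append(
            f"announce route {net} next-hop {next_hop} community [{community}]"
        )
    return commands


if __name__ == "__main__":
    # A helper process started by ExaBGP would write these to standard output.
    sample = ["# invalid, no alternative", "192.168.0.0/24", "not-a-prefix"]
    for command in exabgp_commands(sample):
        print(command, flush=True)
```

The routers receiving this feed would still need their own policy mapping the community to a discard next hop. The sketch does nothing about the concern that blackholing could hurt the legitimate holder, which is why the input is meant to be limited to invalids with no alternative.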

...

but I'd rather direct it toward deploying RPKI. At the very least, one just needs reputable RV software, and router code that support RPKI RV.

We probably have to wait until attrition brings us routers that have said code.

Michel.

Job Snijders

7:47 p.m.

On Wed, Jul 18, 2018 at 07:30:48PM +0000, Michel Py wrote:

...

Not in lieu, but when deploying RPKI is not (yet) possible. My routers are not RPKI capable, upgrading will take years (I'm not going to upgrade just because I want RPKI).

Can you elaborate what routers with what software you are using? It surprises me a bit to find routers anno 2018 which can't do OV in some shape or form.

...

What do I have left : using a subset of RPKI as a blackhole :-(

If you implement 'invalid == blackhole', and cannot do normal OV - it seems to me that you'll be blackholing the actual victim of a BGP hijack? That would seem counter-productive. Kind regards, Job

Michel Py

8:16 p.m.

...

Job Snijders wrote : Can you elaborate what routers with what software you are using? It surprises me a bit to find routers anno 2018 which can't do OV in some shape or form.

They're not anno 2018 ! Cisco 3900 with 4 Gigs. Good enough for me, with the current growth of the DFZ I may have 10 years left before I need to upgrade. Probably will upgrade before that because of bandwidth, but as of now it works well enough for me and upgrading just to get OV is going to be a tough sell.

...

...
What do I have left : using a subset of RPKI as a blackhole :-(

If you implement 'invalid == blackhole', and cannot do normal OV - it seems to me that you'll be blackholing the actual victim of a BGP hijack? That would seem counter-productive.

I would indeed, but the intent was a subset of invalid : the invalid prefixes that nobody _but_ the hijacker announces, so blackholing does not hurt the real owner. In other words : un-announced prefixes that have been hijacked. These are not in bogon lists because they are real. Now I have no illusions : this is not going to solve the world's problems, how many of these are actually announced and how will that play in the longer term are questionable, but would not that be worth a quick shot at it ? Michel.
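
The subset Michel describes could be selected along these lines. This is a minimal sketch, not the script mentioned in the thread: the function name `blackhole_candidates` is hypothetical, the validation state of each route is assumed to be computed elsewhere, and "alternative" is assumed to mean any non-invalid route for the same prefix or a covering less-specific.

```python
import ipaddress


def blackhole_candidates(routes):
    """Return invalid prefixes for which no non-invalid alternative exists.

    `routes` is an iterable of (prefix, origin_as, state) tuples, where state
    is 'valid', 'unknown' or 'invalid' and has already been computed.
    """
    parsed = [(ipaddress.ip_network(p), state) for p, _asn, state in routes]
    usable = [net for net, state in parsed if state != "invalid"]

    candidates = set()
    for net, state in parsed:
        if state != "invalid":
            continue
        # An equal or less-specific usable route means the real owner is
        # reachable, so blackholing this prefix would hurt them.
        has_alternative = any(
            alt.version == net.version and net.subnet_of(alt) for alt in usable
        )
        if not has_alternative:
            candidates.add(str(net))
    return sorted(candidates)


ROUTES = [
    ("192.0.2.0/24", 64511, "invalid"),     # only the hijacker announces it
    ("198.51.100.0/24", 64511, "invalid"),  # hijack of an announced prefix
    ("198.51.100.0/24", 64500, "valid"),
    ("203.0.113.0/25", 64511, "invalid"),   # more-specific hijack
    ("203.0.113.0/24", 64500, "valid"),     # owner's covering aggregate
]
CANDIDATES = blackhole_candidates(ROUTES)
print(CANDIDATES)
```

Only 192.0.2.0/24 survives the filter. The other two invalid routes compete with a route from the legitimate origin, so blackholing them would produce exactly the collateral damage Job warned about.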

Randy Bush

9:55 p.m.

...

Can you elaborate what routers with what software you are using? It surprises me a bit to find routers anno 2018 which can't do OV in some shape or form.

depends on how picky you are about "some shape or form." draft-ietf-sidrops-ov-clarify was not written because it is usefully implemented by many vendors. randy

Job Snijders

11:21 p.m.

On Wed, Jul 18, 2018 at 05:55:23PM -0400, Randy Bush wrote:

...

...
Can you elaborate what routers with what software you are using? It surprises me a bit to find routers anno 2018 which can't do OV in some shape or form.

depends on how picky you are about "some shape or form."

I was thinking along the lines of "perhaps the box can't do RTR but allows for ROAs to be configured in the run-time config"

...

draft-ietf-sidrops-ov-clarify was not written because it is usefully implemented by many vendors.

@ all - It would be good if operators ask their vendors if they can get behind this I-D https://tools.ietf.org/html/draft-ietf-sidrops-ov-clarify Kind regards, Job

Mark Tinka

19 Jul

5:46 a.m.

On 19/Jul/18 01:21, Job Snijders wrote:

...

@ all - It would be good if operators ask their vendors if they can get behind this I-D https://tools.ietf.org/html/draft-ietf-sidrops-ov-clarify

I'm actually glad to see this (Randy, you've abandoned me, hehe). We actually hit and troubleshot both these issues together with Randy and a bunch of good folk in the operator and vendor community back in 2016/2017, where we discovered that Cisco were marking all iBGP routes as Valid by default, and automatically applying RPKI policy on routes without actual operator input. The latter issue was actually officially documented as part of how the implementation works over at Cisco-land, but the former was a direct violation of the RFC. These issues were eventually fixed later in 2017, but glad to see that there is an I-D that proposes this more firmly! Thanks, Randy! Mark.

Mark Tinka

5:25 a.m.

On 18/Jul/18 21:30, Michel Py wrote:

...

Not much at all, I was actually trying to get you to do the RPKI part for me ;-) This script you wrote, to produce the list of prefixes that are RPKI invalid AND that do not have any alternative, make it run every x minutes on a fixed URL (no date/time in name). I will fetch it, inject it in ExaBGP that feeds my iGP and voila, done.

Just to clarify, Job wrote that script, not me :-).

...

Who wants to use it can, not trying to impose it on the entire BGP community.

Which is fine, but I want to be cautious about encouraging a parallel stream that slows down the deployment of RPKI.

...

We probably have to wait until attrition brings us routers that have said code.

We generally use typical service provider routers to deliver services. So I'm not sure whether the 3900's you run support it or not. Mark.

Michel Py

7:47 p.m.

...

Mark Tinka wrote : but I want to be cautious about encouraging a parallel stream that slows down the deployment of RPKI.

I understand that; if there is an easier way to do RPKI, people are going to use it instead of the right way. However, I think that the blacklist targets a different kind of customer : the end user. We want the enterprise to certify their prefixes with RPKI and put pressure on their upstreams to deploy it, the more noise we make the better. What I want is my upstreams to give me clean routing tables without invalids, but it does not happen so in the meantime I'm trying to do what I can with my limited resources.

...

We generally use typical service provider routers to deliver services. So I'm not sure whether the 3900's you run support it or not.

The picture from the enterprise is quite different. There is a lot of stuff out there that does not get upgraded, that is not even under a maintenance contract to get the new software, or that is on EOL/EOS hardware. Michel.

Mark Tinka

9:04 p.m.

On 19/Jul/18 21:47, Michel Py wrote:

...

I understand that; if there is an easier way to do RPKI, people are going to use it instead of the right way. However, I think that the blacklist targets a different kind of customer : the end user. We want the enterprise to certify their prefixes with RPKI and put pressure on their upstreams to deploy it, the more noise we make the better. What I want is my upstreams to give me clean routing tables without invalids, but it does not happen so in the meantime I'm trying to do what I can with my limited resources.

The script that Job wrote is neat, but I'm sure neither he nor I would run it in production in lieu of the actual RPKI infrastructure. Even though you're my competitor, I'd caution against this. But, your network, your rules.

...

The picture from the enterprise is quite different. There is a lot of stuff out there that does not get upgraded, that is not even under a maintenance contract to get the new software, or that is on EOL/EOS hardware.

So don't re-invent this wheel; that is what Delegated RPKI is for. Several RPKI tools out there support CA functionality, as much as they support the RP side as well. Let's not create something totally out of scope to mimic specs and tools that already exist. If you really want to participate in the RPKI, then you seriously need to consider supporting software that implements it. If not, use your ISP's CA tools to sign your IP addresses, and then rely on them to have clean FIB's when you use them for transit. RPSL got complicated enough with all its good intentions, and we know how that turned out. Let's not muddy the RPKI waters. Mark.

Alex Band

26 Jul

7:09 p.m.

...

On 19 Jul 2018, at 23:04, Mark Tinka <mark.tinka@seacom.mu> wrote:

On 19/Jul/18 21:47, Michel Py wrote:

...
I understand that; if there is an easier way to do RPKI, people are going to use it instead of the right way. However, I think that the blacklist targets a different kind of customer : the end user. We want the enterprise to certify their prefixes with RPKI and put pressure on their upstreams to deploy it, the more noise we make the better. What I want is my upstreams to give me clean routing tables without invalids, but it does not happen so in the meantime I'm trying to do what I can with my limited resources.

The script that Job wrote is neat, but I'm sure neither he nor I would run it in production in lieu of the actual RPKI infrastructure.

Even though you're my competitor, I'd caution against this. But, your network, your rules.

...
The picture from the enterprise is quite different. There is a lot of stuff out there that does not get upgraded, that is not even under a maintenance contract to get the new software, or that is on EOL/EOS hardware.

So don't re-invent this wheel; that is what Delegated RPKI is for. Several RPKI tools out there support CA functionality, as much as they support the RP side as well.

To add to the genetic diversity, NLnet Labs recently committed to building a full RPKI Toolset, including a (Delegated) Certificate Authority, a Publication Server and Relying Party software. As an RP implementation was the easiest way to get going, we now have some running code – in Rust – here: https://github.com/NLnetLabs/routinator Our mission is to offer a toolset that is on par with our other projects such as NSD and Unbound, in terms of quality, feature set and update frequency. We’re looking forward to your feedback; in the meantime we’re getting started with the CA and Publication Server. Cheers, Alex

Mark Tinka

14 Jul

4:57 a.m.

On 13/Jul/18 18:37, Grant Taylor via NANOG wrote:

...

Yep.

You would almost need separate logical networks / VRF to be able to prevent the longest prefix match winning issue that you reminded me of.

Oooh, complexity - things we want to avoid :-). Then again, we don't run the Internet in a VRF, so... Mark.
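
The "longest prefix match winning" issue mentioned in the quoted text can be shown with a toy forwarding lookup. This is a minimal sketch with made-up routes and documentation prefixes; the `lookup` helper and the labels stored in the table are hypothetical.

```python
import ipaddress


def lookup(fib, destination):
    """Return the (network, label) entry that forwarding would use."""
    addr = ipaddress.ip_address(destination)
    matches = [
        net for net in fib if net.version == addr.version and addr in net
    ]
    if not matches:
        return None
    # Forwarding picks the most specific match; BGP attributes such as
    # LOCAL_PREF only break ties between paths for the SAME prefix.
    best = max(matches, key=lambda net: net.prefixlen)
    return best, fib[best]


FIB = {
    ipaddress.ip_network("203.0.113.0/24"): "valid, origin AS64500",
    ipaddress.ip_network("203.0.113.128/25"): "invalid, origin AS64511",
}

HIT = lookup(FIB, "203.0.113.200")
print(HIT)
```

As long as the invalid /25 is installed in the same table, it attracts the traffic no matter how the valid /24 is ranked. Keeping such routes away from some customers would need a separate table per policy, which is the complexity being discussed here.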

Mark Tinka

4:51 a.m.

On 13/Jul/18 18:25, Christopher Morrow wrote:

...

it sounded like Mark didn't want to deal with that complexity in his network, until more deployment and more requests from customers like; Customer: "Hey, why did my traffic get hijacked to paY(omlut)pal.com yesterday?" Mark: "because you didn't ask for 'super-sekure-customer config? sorry?"

I could have misunderstood either mark or job or you.. of course.

I didn't want to pass on Invalid routes at all, to ensure that the source operator of that route correctly signs it in the RPKI. However, one can't make the horse drink. Using LOCAL_PREF to determine the preference between Valid, Unknown and Invalid routes is just pussy-footing around the feature, if I'm being honest. What's the saying... "Go big, or go home" :-). Mark.

Mark Tinka

4:46 a.m.

On 13/Jul/18 17:18, Grant Taylor via NANOG wrote:

...

Please forgive the n00b question: But isn't that where carrying the prefixes through your network and conditionally advertising them to customers comes into play?

Or does that run into complications where you must also have the prefixes which don't validate routed in your core?

Carrying prefixes in the network is not an issue, valid or otherwise. If you act on them as they enter the network in an aggressive manner, then the other end of an eBGP session will not receive them. That's the issue. Of course, that's how RPKI is supposed to work, but when you're the only one doing it, you're shooting your own foot.

...

The reading I did on RPKI / OV yesterday made me think that it is possible to have validated routes preferred over unknown routes which are preferred over invalid routes. So I'd think that you could still have the routes through your core but conditionally advertise the prefixes to customers based on their desires.

Using LOCAL_PREF to (de)prefer routes based on their validation status is an idea that has been used since 2014. But for me, it defeats the purpose if you are going to go soft when trying to implement something that requires this much resolve to clean up the Internet. Mark.
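
The LOCAL_PREF approach discussed above amounts to something like the following. This is a minimal sketch: the preference values and the `best_path` helper are illustrative assumptions, and real BGP best-path selection has many more steps than the two modelled here.

```python
# Illustrative values only; any ordering valid > unknown > invalid would do.
LOCAL_PREF = {"valid": 200, "unknown": 100, "invalid": 50}


def best_path(paths):
    """Pick the best path among candidates for ONE prefix.

    Each path is a dict with 'peer', 'state' and 'as_path_len'. Higher
    LOCAL_PREF wins first, then the shorter AS path.
    """
    return max(
        paths,
        key=lambda p: (LOCAL_PREF[p["state"]], -p["as_path_len"]),
    )


# Two candidate paths for the same prefix: the valid one wins despite
# its longer AS path.
CONTESTED = best_path([
    {"peer": "A", "state": "invalid", "as_path_len": 1},
    {"peer": "B", "state": "valid", "as_path_len": 4},
])

# A prefix with only an invalid path: it is still selected and used.
UNCONTESTED = best_path([
    {"peer": "A", "state": "invalid", "as_path_len": 1},
])
print(CONTESTED["peer"], UNCONTESTED["state"])
```

The second case is why Mark calls this going soft: de-preferring only helps when a better path for the same prefix exists, whereas rejecting invalids removes the route outright.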

Job Snijders

12:04 p.m.

On Fri, Jul 13, 2018 at 02:53:30PM +0200, Mark Tinka wrote:

...

That, though, still leaves the problem where you end up providing a partial routing table to your customers, while your competitors in the same market aren't.

I actually view it as a competitive advantage to carry a cleaner set of routes compared to the providers with a more permissive (or lack of) filtering strategy. Sometimes less is more. Kind regards, Job

Saku Ytti

1:43 p.m.

On Sat, 14 Jul 2018 at 15:07, Job Snijders <job@ntt.net> wrote:

...

I actually view it as a competitive advantage to carry a cleaner set of routes compared to the providers with a more permissive (or lack of) filtering strategy. Sometimes less is more.

* When you consider your addressable market 'clueful customers'. -- ++ytti

Mark Tinka

7:56 p.m.

On 14/Jul/18 14:04, Job Snijders wrote:

...

I actually view it as a competitive advantage to carry a cleaner set of routes compared to the providers with a more permissive (or lack of) filtering strategy. Sometimes less is more.

Typically, I wouldn't disagree. In practice, most customers only care about reachability, and not their contribution to the Internet hygiene. A case of "They will look after it", where "they" is not "me". Mark.

Paolo Lucente

16 Jul

5 p.m.

Hi Job, All, It is definitely great to see progress on the deployment side! I realize that there may be some gaps in the network operator toolchain, and this may be something i'd like to contribute to. For network operators to better understand the impact of BGP hijacks in terms of revenue or volumes of traffic that went missing, it makes perfect sense if network monitoring tools are aware of which BGP announcements are invalid or not. I will look into adding support for the RTR protocol (RFC 6810, RFC 8210) to pmacct ( https://github.com/pmacct/pmacct , http://pmacct.net/ ) and expose the validation state through an extra field (when collecting routing tables) and primitive (when accounting traffic and correlating it with BGP data). Updating the telemetry tools to be fully aware of RPKI validation states should come in handy! Paolo

Mark Tinka

19 Jul

3:02 p.m.

On 16/Jul/18 19:00, Paolo Lucente wrote:

...

Hi Job, All,

It is definitely great to see progress on the deployment side! I realize that there may be some gaps in the network operator toolchain, and this may be something i'd like to contribute to.

For network operators to better understand the impact of BGP hijacks in terms of revenue or volumes of traffic that went missing, it makes perfect sense if network monitoring tools are aware of which BGP announcements are invalid or not.

I will look into adding support for the RTR protocol (RFC 6810, RFC 8210) to pmacct ( https://github.com/pmacct/pmacct , http://pmacct.net/ ) and expose the validation state through an extra field (when collecting routing tables) and primitive (when accounting traffic and correlating it with BGP data).

Updating the telemetry tools to be fully aware of RPKI validation states should come in handy!

Sounds great, Paolo. Many thanks for this. Mark.
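
The kind of correlation Paolo describes, traffic volume broken down by validation state, might look roughly like this once the state is available next to each flow. This is a minimal sketch and does not reflect pmacct's actual field names or output format: the record layout, the `bytes_by_state` helper and the sample numbers are all hypothetical.

```python
from collections import Counter


def bytes_by_state(flows, prefix_state):
    """Sum traffic volume per validation state.

    `flows` is an iterable of (destination_prefix, byte_count) pairs and
    `prefix_state` maps a prefix to 'valid', 'unknown' or 'invalid'.
    Prefixes missing from the map are counted as 'unknown'.
    """
    totals = Counter()
    for dst_prefix, byte_count in flows:
        totals[prefix_state.get(dst_prefix, "unknown")] += byte_count
    return dict(totals)


PREFIX_STATE = {
    "192.0.2.0/24": "invalid",
    "198.51.100.0/24": "valid",
}
FLOWS = [
    ("192.0.2.0/24", 1500),
    ("198.51.100.0/24", 40000),
    ("198.51.100.0/24", 2500),
    ("203.0.113.0/24", 900),
]
TOTALS = bytes_by_state(FLOWS, PREFIX_STATE)
print(TOTALS)
```

A report like this would let an operator estimate how much traffic an "invalid == reject" policy would affect before turning it on.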

37 comments

12 participants

participants (12)

Alex Band
Baldur Norddahl
Christopher Morrow
George Michaelson
Grant Taylor
Job Snijders
Job Snijders
Mark Tinka
Michel Py
Paolo Lucente
Randy Bush
Saku Ytti