Hello,
I noticed that we have regressed and now fail the test at
https://isbgpsafeyet.com/. While investigating, I found that some of our routes were in the validation state "unknown" when they should have been either "valid" or "invalid". This included the test prefix, which we received via NL-IX (and via Cogent on IPv6).
We do, however, have plenty of prefixes received from the same sources that are validated correctly.
This is a Juniper MX204 router running 20.1R1.11. I tried a few things, including "clear bgp neighbor xxx soft-inbound", which is supposed to rerun the import policy where the RPKI check and marking happen, but that did not fix it. Doing a "clear bgp neighbor xxx", which disconnects the peer and reconnects after a slight delay, did fix the issue. However, I have to do that for every peer we received the prefix from, and potentially every peer we have could be affected :-(
This router had its software upgraded and was rebooted two days ago. I suspect a race condition: what if the router established its BGP sessions before it could reach the RPKI validation server, or before the RPKI database had finished synchronizing?
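
To illustrate why such a race would produce exactly this symptom, here is a minimal Python sketch of origin validation in the style of RFC 6811. It is not how Junos implements it; the function name, the documentation prefixes and the private-use AS numbers are made up for the example. The point is only that a route checked against an empty (not yet synchronized) table comes out as "unknown" (RFC 6811 calls this NotFound), and stays that way unless it is evaluated again after the table has been filled.

```python
import ipaddress

def validate(prefix, origin_asn, vrps):
    """Classify a route against a list of validated ROA payloads (VRPs).

    vrps is an iterable of (roa_prefix, max_length, asn) tuples.
    Returns "valid", "invalid" or "unknown" (RFC 6811 NotFound).
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in vrps:
        roa = ipaddress.ip_network(roa_prefix)
        # A VRP covers the route only if the route lies inside the ROA prefix.
        if route.version != roa.version or not route.subnet_of(roa):
            continue
        covered = True
        # It matches if the origin AS agrees and the route is not too specific.
        if asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    # Covered but never matched: invalid. Not covered at all: unknown.
    return "invalid" if covered else "unknown"

# Hypothetical data: one ROA authorizing AS64500 to originate 192.0.2.0/24.
synced_table = [("192.0.2.0/24", 24, 64500)]
empty_table = []  # state before the validator session has synchronized

hijack = ("192.0.2.0/24", 64511)  # same prefix, wrong origin AS

print(validate(*hijack, empty_table))   # evaluated too early
print(validate(*hijack, synced_table))  # evaluated after synchronization
```

With the empty table the wrongly originated route is classified "unknown" and would pass a policy that only rejects "invalid"; with the synchronized table the same route is "invalid".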
I find it a bit disappointing that we ended up with a bad validation state this easily, and that there is apparently little I can do about it except walk through all our peers and reset each BGP session, which frankly is an unacceptable disruption of traffic flow.
Regards,
Baldur