On 1/Oct/18 10:26, John Curran wrote:
Of course, this presumes correct routing configuration by the ISP when setting up RPKI route validation; while one would hope that the vast majority handle this situation correctly, there is no assurance that will be true without exception. If RPKI routing validation is widely deployed, tens of thousands of ISPs will be setting up such a configuration, with customer impact during an RPKI CA outage occurring for those who somehow fail to fall back to using NotFound routes. If only a small percentage get this wrong, it will still represent dozens of ISPs going dark as a result.
It is equally important to understand how vendors have interpreted the RFC for default treatment of RPKI data. When we started testing IOS and IOS XE back in 2014/2015, we hit an issue where Cisco were automatically applying policy to RPKI state without configuration from the operator. This was fixed in later code, but goes to show that one should not assume that vendors are always doing the right thing, or at the very least, fully understand what their view on RPKI might be on the wider Internet, in real production. So before deploying network-wide, I encourage operators to test what their equipment will do when RPKI is enabled but without any manual policy applied.
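To make the fall-back behaviour under discussion concrete, here is a rough sketch of origin validation along the lines of RFC 6811. It is only an illustration, not any vendor's implementation: the `VRP` tuple and the `validate` and `accept` helpers are names I made up, and the prefixes and AS numbers are documentation values. The point it shows is that when no validated ROA payloads are available (for example, during a CA outage once cached data has expired), every route evaluates to NotFound, so a policy that only drops Invalid keeps forwarding.

```python
import ipaddress
from typing import List, NamedTuple


class VRP(NamedTuple):
    """Validated ROA payload (hypothetical structure for this sketch)."""
    prefix: str       # ROA prefix, e.g. "192.0.2.0/24"
    max_length: int   # longest prefix length the ROA authorises
    asn: int          # authorised origin AS


def validate(route_prefix: str, origin_asn: int, vrps: List[VRP]) -> str:
    """Return "Valid", "Invalid" or "NotFound" for one announcement."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        roa_net = ipaddress.ip_network(vrp.prefix)
        # A VRP covers the route if the route sits inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # It matches if the origin AS agrees and the route is not too specific.
        if vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "Valid"
    # Covered but never matched is Invalid; not covered at all is NotFound.
    return "Invalid" if covered else "NotFound"


def accept(state: str, drop_invalid: bool = True) -> bool:
    """Assumed operator policy: reject Invalid only, keep Valid and NotFound."""
    return not (drop_invalid and state == "Invalid")


vrps = [VRP("192.0.2.0/24", 24, 64500)]
normal = validate("192.0.2.0/24", 64500, vrps)   # ROA data available
outage = validate("192.0.2.0/24", 64500, [])     # no ROA data available
print(normal, accept(normal))
print(outage, accept(outage))
```

An operator whose policy instead accepted only Valid routes would drop everything in the second case, which is the failure mode John describes.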
Indeed… Hence the question of liability during an RIR CA outage: should the liability for misconfigured ISPs (the handful of ISPs who do not properly fall back to using NotFound routes) lie with each ISP, with those who announce ROAs, or with the RIR?
Any equipment misconfigurations should be the responsibility of the operator. Responsibility for ROAs should lie with the resource holder, who must ensure not only that the information is true, but also that all announced prefixes are covered by a ROA. An RIR CA outage would, in my mind, be the responsibility of the RIR. But this comes back to my question of how this is handled with an "all resource" TA.

Mark.