Hello,

On Wed, 28 Oct 2020 at 16:58, Randy Bush <randy@psg.com> wrote:
> tl;dr: diagnosed by comcast. see our short paper to be presented at imc tomorrow https://archive.psg.com/200927.imc-rp.pdf
>
> lesson: route origin relying party software may cause as much damage as it ameliorates

There is a myth that ROV is inherently fail-safe (it isn't if your production routers have stale VRPs), which leads to the assumption that proper monitoring can be neglected.

I'm working on a shell script using rtrdump to detect stale RTR servers (based on serial changes and the actual data). Of course this would never detect partial failures that affect only some child CAs, but it does detect a hung RTR server (or a standalone RTR server whose validator has stopped validating).

lukas
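
A rough sketch of the staleness check described above, in Python rather than shell so the logic is easier to read. It is not the script mentioned in this mail. It assumes you already collect periodic snapshots from the RTR server (for example by parsing rtrdump output) into a serial number plus a set of VRPs; the `Snapshot` type, the function names and the one-hour threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

# Hypothetical VRP representation: (prefix, max_length, origin ASN).
VRP = Tuple[str, int, int]


@dataclass(frozen=True)
class Snapshot:
    taken_at: float        # Unix timestamp of the poll
    serial: int            # RTR session serial reported by the server
    vrps: FrozenSet[VRP]   # VRPs the server handed out at that time


def last_change(snapshots: Sequence[Snapshot]) -> float:
    """Time of the newest snapshot that differs from its predecessor.

    Falls back to the first snapshot's time if nothing ever changed.
    """
    changed_at = snapshots[0].taken_at
    for prev, curr in zip(snapshots, snapshots[1:]):
        if curr.serial != prev.serial or curr.vrps != prev.vrps:
            changed_at = curr.taken_at
    return changed_at


def is_stale(snapshots: Sequence[Snapshot], now: float,
             max_age: float = 3600.0) -> bool:
    """True if neither serial nor data changed within max_age seconds.

    max_age is an assumed threshold; tune it to your validator's
    refresh interval. No snapshots at all is treated as stale.
    """
    if not snapshots:
        return True
    return now - last_change(snapshots) > max_age
```

Comparing the VRP set as well as the serial catches a server that keeps bumping its serial while serving the same old data. As noted above, it does not catch partial failures affecting only some child CAs.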