DNSSEC failing when querying zones under ca. randomly but only from some regions
Subject says it all, and I’m in a 500-mile email mystery here. Anybody want to take a guess what it could be?

Synopsis: Queries to .ca zones randomly fail with a DNSSEC validation error, but the failure appears to be both region dependent and zone dependent. Anycast verifying resolvers seem most prone to trigger the failure mode. I can’t trigger it when running a local verifying resolver (Unbound).

I have observed this since Wednesday and raised it with CIRA on Friday morning, but have heard nothing back yet. Since it is geo-dependent, my first guess was that the resolver instance on the shortest path to me has an issue, but I can usually trigger it on multiple services (Google and Cloudflare at least), so I can’t see that being the cause.

I can trigger this from the Google DNS web page as well, but not reliably. I can trigger it on domains I am not authoritative for (random domains I found while browsing), but I’ll use two domains I am authoritative for here.

$ dig seattle.mediadrive.ca @8.8.8.8

; <<>> DiG 9.10.6 <<>> seattle.mediadrive.ca
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 56149
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; OPT=15: 00 07 45 78 70 69 72 65 64 20 52 52 53 49 47 20 66 6f 75 6e 64 20 66 6f 72 20 70 75 66 35 32 6b 70 68 36 75 30 71 35 67 68 73 69 6c 72 33 68 63 31 64 37 65 6c 62 61 68 67 33 2e 63 61 2f 6e 73 65 63 33 20 28 6b 65 79 74 61 67 3d 35 36 38 31 36 29 ("..Expired RRSIG found for puf52kph6u0q5ghsilr3hc1d7elbahg3.ca/nsec3 (keytag=56816)")
;; QUESTION SECTION:
;seattle.mediadrive.ca.         IN      A

;; Query time: 67 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Jul 27 16:26:37 EDT 2024
;; MSG SIZE  rcvd: 136

$ dig adamdaniels.ca @1.1.1.1

; <<>> DiG 9.10.6 <<>> adamdaniels.ca @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4638
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; OPT=15: 00 0a 66 61 69 6c 65 64 20 74 6f 20 76 65 72 69 66 79 20 73 69 67 6e 61 74 75 72 65 73 20 66 6f 72 20 61 64 61 6d 64 61 6e 69 65 6c 73 2e 63 61 2e 20 6f 70 74 2d 6f 75 74 20 70 72 6f 6f 66 ("..failed to verify signatures for adamdaniels.ca. opt-out proof")
;; QUESTION SECTION:
;adamdaniels.ca.                IN      A

;; Query time: 50 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Sat Jul 27 17:27:28 EDT 2024
;; MSG SIZE  rcvd: 110

If I let it run long enough, I can trigger it on canada.ca, but only rarely.

Querying from my home reproduces this very reliably, but I can also trigger it from a facility I colocate at in Toronto.

== MTR from Toronto, Canada
Start: 2024-07-27T17:30:07-0400
HOST: manager                                     Loss%  Snt   Last    Avg   Best   Wrst  StDev
 1.|-- _gateway                                    0.0%   10    1.2    1.0    0.8    1.2    0.1
 2.|-- 198.55.53.14                               90.0%   10    0.5    0.5    0.5    0.5    0.0
 3.|-- i.cr003.ca1-01.yyz.as1100.net               0.0%   10    0.3    0.4    0.2    0.8    0.2
 4.|-- i.rogers.ca1-01.yyz.as1100.net              0.0%   10    0.4    0.5    0.3    0.7    0.2
 5.|-- 99.209.203.17                               0.0%   10    0.6    0.6    0.5    0.9    0.1
 6.|-- 24.153.31.130                               0.0%   10    1.4    1.6    1.3    1.9    0.2
 7.|-- 3021-cgw01.mtnk.asr9k.rmgt.net.rogers.com   0.0%   10    1.5    1.8    1.5    2.1    0.2
 8.|-- 209.148.235.222                             0.0%   10    2.9    4.7    2.6   14.6    3.9
 9.|-- ???                                        100.0   10    0.0    0.0    0.0    0.0    0.0
10.|-- 192.178.99.39                               0.0%   10    2.4    2.2    2.0    2.4    0.1
11.|-- 216.239.50.119                              0.0%   10    3.2    3.1    2.9    3.3    0.1
12.|-- dns.google                                  0.0%   10    2.0    2.0    1.9    2.2    0.1

== MTR from my home (Niagara region, Canada)
Start: 2024-07-27T17:30:21-0400
HOST: Adams-MacBook-Air.local   Loss%  Snt   Last    Avg   Best   Wrst  StDev
 1.|-- 192.168.1.1               0.0%   10    2.7    3.1    2.7    3.4    0.2
 2.|-- 10.202.100.1              0.0%   10   74.9   20.2    9.0   74.9   21.7
 3.|-- ???                      100.0   10    0.0    0.0    0.0    0.0    0.0
 4.|-- c8.tpia.start.ca          0.0%   10   43.5   27.1   14.8   58.8   16.8
 5.|-- 72.14.198.214             0.0%   10   41.7   22.9   16.4   41.7    7.6
 6.|-- 192.178.99.31             0.0%   10   37.8   20.0   14.2   37.8    7.6
 7.|-- 216.239.41.175            0.0%   10   41.2   19.3   15.3   41.2    7.8
 8.|-- dns.google                0.0%   10   19.2   17.8   14.7   21.5    2.2

I’ve tried the same queries from NYC and Seattle but could not trigger any failures there.

Thoughts?

Adam
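A note on reading the dig output above: EDNS option code 15 is the Extended DNS Error (EDE) option defined in RFC 8914. Its payload is a two-byte info code in network byte order followed by optional UTF-8 text, which is why dig 9.10.6 prints raw hex with a quoted string after it. The sketch below decodes the two payloads from this message. The function name `decode_ede` and the lookup table are illustrative assumptions rather than part of any library, and the table holds only a subset of the registered info codes.

```python
# Minimal sketch: decode an Extended DNS Error (EDE) option payload
# as printed by dig in its "OPT=15: ..." line.
# decode_ede and EDE_INFO_CODES are hypothetical names used for illustration.

# Subset of the info codes registered by RFC 8914.
EDE_INFO_CODES = {
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    8: "Signature Not Yet Valid",
    9: "DNSKEY Missing",
    10: "RRSIGs Missing",
    12: "NSEC Missing",
}


def decode_ede(hex_bytes: str):
    """Return (info_code, code_name, extra_text) for a hex-encoded EDE payload."""
    raw = bytes.fromhex(hex_bytes)  # spaces between byte pairs are skipped
    if len(raw) < 2:
        raise ValueError("EDE payload must contain at least a 2-byte info code")
    info_code = int.from_bytes(raw[:2], "big")  # network byte order
    extra_text = raw[2:].decode("utf-8", errors="replace")
    return info_code, EDE_INFO_CODES.get(info_code, "unknown"), extra_text


# Payload returned by 8.8.8.8 for seattle.mediadrive.ca (copied from the output above).
GOOGLE_EDE = (
    "00 07 45 78 70 69 72 65 64 20 52 52 53 49 47 20 66 6f 75 6e 64 20 66 6f 72 20 "
    "70 75 66 35 32 6b 70 68 36 75 30 71 35 67 68 73 69 6c 72 33 68 63 31 64 37 65 "
    "6c 62 61 68 67 33 2e 63 61 2f 6e 73 65 63 33 20 28 6b 65 79 74 61 67 3d 35 36 "
    "38 31 36 29"
)

# Payload returned by 1.1.1.1 for adamdaniels.ca (copied from the output above).
CLOUDFLARE_EDE = (
    "00 0a 66 61 69 6c 65 64 20 74 6f 20 76 65 72 69 66 79 20 73 69 67 6e 61 74 75 "
    "72 65 73 20 66 6f 72 20 61 64 61 6d 64 61 6e 69 65 6c 73 2e 63 61 2e 20 6f 70 "
    "74 2d 6f 75 74 20 70 72 6f 6f 66"
)

for label, payload in (("8.8.8.8", GOOGLE_EDE), ("1.1.1.1", CLOUDFLARE_EDE)):
    code, name, text = decode_ede(payload)
    print(f"{label}: EDE {code} ({name}): {text}")
```

Read this way, the Google response carries info code 7 (Signature Expired) and the Cloudflare response carries info code 10 (RRSIGs Missing). Both point at the signed denial-of-existence records served for the .ca zone rather than at the queried domains themselves.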
We were actually having this same issue earlier tonight, though across .com and .is domains as well. Interesting to see it's not just us.

Kind regards,
Peter Potvin

On Sun, Jul 28, 2024 at 9:32 PM Adam Daniels <adam@mediadrive.ca> wrote:
Subject says it all, and I’m in a 500-mile email mystery here. Anybody want to take a guess what it could be?
On Jul 28, 2024, at 9:34 PM, Peter Potvin <peter.potvin@accuristechnologies.ca> wrote:
We were actually having this same issue earlier tonight, though across .com and .is domains as well. Interesting to see it's not just us.
That’s an interesting observation, Peter. I hadn’t seen it outside of the .ca zone personally, but I just received a response back from CIRA that tracks with what you saw.
This was actually a legit and ongoing issue with a third-party anycast DNS provider for .CA. It also affected DNSSEC validation for 100+ other top-level domains in the Toronto-Niagara region.
Anyways, it’s nice to finally have a resolution to this. Cheers.