We were actually having this same issue earlier tonight, though across .com and .is domains as well. Interesting to see it's not just us.

Kind regards,
Peter Potvin


On Sun, Jul 28, 2024 at 9:32 PM Adam Daniels <adam@mediadrive.ca> wrote:
The subject says it all, and I’m in 500-mile-email mystery territory here. Does anybody want to take a guess at what it could be?

Synopsis: Queries to .ca zones randomly fail with a DNSSEC validation error, but the failure appears to be both region-dependent and zone-dependent. Anycast validating resolvers seem most prone to trigger the failure mode. I can’t trigger it when running a local validating resolver (Unbound).

I have observed this since Wednesday and raised it with CIRA on Friday morning, but have heard nothing back from CIRA yet. Since it is geo-dependent, my first guess was that the resolver instance with the shortest path to me has an issue, but I can usually trigger it on multiple services (Google and Cloudflare at least), so I can’t see a single resolver being the cause.

I can trigger this from the Google DNS web page as well, but not reliably. I can also trigger it on domains I am not authoritative for (random domains I found while browsing), but I’ll use two domains I am authoritative for here.

$ dig seattle.mediadrive.ca @8.8.8.8
; <<>> DiG 9.10.6 <<>> seattle.mediadrive.ca
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 56149
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; OPT=15: 00 07 45 78 70 69 72 65 64 20 52 52 53 49 47 20 66 6f 75 6e 64 20 66 6f 72 20 70 75 66 35 32 6b 70 68 36 75 30 71 35 67 68 73 69 6c 72 33 68 63 31 64 37 65 6c 62 61 68 67 33 2e 63 61 2f 6e 73 65 63 33 20 28 6b 65 79 74 61 67 3d 35 36 38 31 36 29 ("..Expired RRSIG found for puf52kph6u0q5ghsilr3hc1d7elbahg3.ca/nsec3 (keytag=56816)")
;; QUESTION SECTION:
;seattle.mediadrive.ca. IN A
;; Query time: 67 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Jul 27 16:26:37 EDT 2024
;; MSG SIZE rcvd: 136

$ dig adamdaniels.ca @1.1.1.1
; <<>> DiG 9.10.6 <<>> adamdaniels.ca @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4638
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; OPT=15: 00 0a 66 61 69 6c 65 64 20 74 6f 20 76 65 72 69 66 79 20 73 69 67 6e 61 74 75 72 65 73 20 66 6f 72 20 61 64 61 6d 64 61 6e 69 65 6c 73 2e 63 61 2e 20 6f 70 74 2d 6f 75 74 20 70 72 6f 6f 66 ("..failed to verify signatures for adamdaniels.ca. opt-out proof")
;; QUESTION SECTION:
;adamdaniels.ca.                        IN      A

;; Query time: 50 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Sat Jul 27 17:27:28 EDT 2024
;; MSG SIZE  rcvd: 110
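
For anyone reading the raw OPT=15 lines above: option code 15 is the Extended DNS Error (EDE) option from RFC 8914, whose payload is a 2-byte info-code followed by optional UTF-8 text. The sketch below decodes a hex dump like the ones dig printed; `decode_ede` and the `EDE_CODES` table are illustrative names, not part of any tool, and the table only covers a DNSSEC-related subset of the registry. Under that reading, `00 07` is "Signature Expired" and `00 0a` is "RRSIGs Missing".

```python
# Minimal sketch: decode the payload of an EDNS option 15 (Extended DNS
# Error, RFC 8914) from the space-separated hex dump that dig prints.

# Subset of the RFC 8914 info-code registry (DNSSEC-related entries only).
EDE_CODES = {
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    8: "Signature Not Yet Valid",
    9: "DNSKEY Missing",
    10: "RRSIGs Missing",
    12: "NSEC Missing",
}


def decode_ede(hex_dump: str):
    """Return (info_code, code_name, extra_text) for an EDE payload."""
    raw = bytes.fromhex(hex_dump)  # fromhex ignores the spaces between bytes
    if len(raw) < 2:
        raise ValueError("EDE payload must contain a 2-byte info-code")
    code = int.from_bytes(raw[:2], "big")  # network byte order
    text = raw[2:].decode("utf-8", errors="replace")
    return code, EDE_CODES.get(code, "unknown"), text
```

Pasting the full hex string from either response into `decode_ede` should give back the same text dig shows in parentheses, plus the numeric code that the leading ".." hides.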

If I let it run long enough, I can trigger it on canada.ca but not with any frequency.
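
To put a number on "not with any frequency", something like the following could tally response statuses over repeated queries. This is only a sketch: it assumes `dig` is on the PATH, the helper names (`parse_status`, `run_dig`, `sample`) are made up for illustration, and it classifies a reply purely by the `status:` field in dig's header line.

```python
import re
import subprocess
import time

# Matches the rcode in dig's header line, e.g. "status: SERVFAIL".
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def parse_status(dig_output: str):
    """Extract the rcode name from dig output, or None if absent."""
    match = STATUS_RE.search(dig_output)
    return match.group(1) if match else None


def run_dig(name: str, resolver: str) -> str:
    """Run one dig query (single try, 3 s timeout) and return its stdout."""
    result = subprocess.run(
        ["dig", name, f"@{resolver}", "+tries=1", "+time=3"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def sample(name, resolver, count, runner=run_dig, delay=0.0):
    """Query `count` times and return a {status: occurrences} tally."""
    tally = {}
    for _ in range(count):
        status = parse_status(runner(name, resolver)) or "NO_RESPONSE"
        tally[status] = tally.get(status, 0) + 1
        if delay:
            time.sleep(delay)  # spacing between queries
    return tally
```

For example, `sample("canada.ca", "8.8.8.8", 500, delay=1.0)` would return a dictionary of counts per status, which makes it easier to compare failure rates between vantage points and resolvers.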

Queries from my home reproduce this very reliably, but I can also trigger it from a facility I colocate with in Toronto.

== MTR from Toronto, Canada
Start: 2024-07-27T17:30:07-0400
HOST: manager                                   Loss%   Snt   Last   Avg  Best  Wrst StDev
 1.|-- _gateway                                   0.0%    10    1.2   1.0   0.8   1.2   0.1
 2.|-- 198.55.53.14                              90.0%    10    0.5   0.5   0.5   0.5   0.0
 3.|-- i.cr003.ca1-01.yyz.as1100.net              0.0%    10    0.3   0.4   0.2   0.8   0.2
 4.|-- i.rogers.ca1-01.yyz.as1100.net             0.0%    10    0.4   0.5   0.3   0.7   0.2
 5.|-- 99.209.203.17                              0.0%    10    0.6   0.6   0.5   0.9   0.1
 6.|-- 24.153.31.130                              0.0%    10    1.4   1.6   1.3   1.9   0.2
 7.|-- 3021-cgw01.mtnk.asr9k.rmgt.net.rogers.com  0.0%    10    1.5   1.8   1.5   2.1   0.2
 8.|-- 209.148.235.222                            0.0%    10    2.9   4.7   2.6  14.6   3.9
 9.|-- ???                                       100.0    10    0.0   0.0   0.0   0.0   0.0
10.|-- 192.178.99.39                              0.0%    10    2.4   2.2   2.0   2.4   0.1
11.|-- 216.239.50.119                             0.0%    10    3.2   3.1   2.9   3.3   0.1
12.|-- dns.google                                 0.0%    10    2.0   2.0   1.9   2.2   0.1

== MTR from my home (Niagara region, Canada)
Start: 2024-07-27T17:30:21-0400
HOST: Adams-MacBook-Air.local Loss%   Snt   Last   Avg  Best  Wrst StDev
 1.|-- 192.168.1.1              0.0%    10    2.7   3.1   2.7   3.4   0.2
 2.|-- 10.202.100.1             0.0%    10   74.9  20.2   9.0  74.9  21.7
 3.|-- ???                     100.0    10    0.0   0.0   0.0   0.0   0.0
 4.|-- c8.tpia.start.ca         0.0%    10   43.5  27.1  14.8  58.8  16.8
 5.|-- 72.14.198.214            0.0%    10   41.7  22.9  16.4  41.7   7.6
 6.|-- 192.178.99.31            0.0%    10   37.8  20.0  14.2  37.8   7.6
 7.|-- 216.239.41.175           0.0%    10   41.2  19.3  15.3  41.2   7.8
 8.|-- dns.google               0.0%    10   19.2  17.8  14.7  21.5   2.2

I’ve tried the same queries from NYC and Seattle, but they did not trigger any failures.

Thoughts?

Adam