On Tue, Oct 5, 2021 at 8:57 AM Kain, Becki (.) <bkain1@ford.com> wrote:
> Why ever would you have a card reader on your external-facing network, if that was really the case why they couldn't get in to fix it?
Let's hypothesize for a moment.

Let's suppose you've decided that certificate-based authentication is the cat's meow, and so you've got dot1x authentication on every network port in your corporate environment, all your users are authenticated via certificates, all properly signed all the way up the chain to the root trust anchor. Life is good.

But then you have a bad network day. Suddenly, you can't talk to upstream registries/registrars, you can't reach the trust anchor for your certificates, and you discover that all the laptops plugged into your network switches are failing to validate their authenticity; sure, you're on the network, but you're in a guest vlan, with no access. Your user credentials aren't able to be validated, so you're stuck with the base level of access, which doesn't let you into the OOB network. Turns out your card readers were all counting on dot1x authentication to get them into the right vlan as well, and with the network buggered up, the switches can't validate *their* certificates either, so the door badge card readers just flash their LEDs impotently when you wave your badge at them.

Remember, one attribute of certificates is that they are designated as valid for a particular domain, or set of subdomains with a wildcard; that is, an authenticator needs to know where the certificate is being presented to know if it is valid within that scope or not. You can do that scope validation through several different mechanisms, such as through a chain of trust to a certificate authority, or through DNSSEC with DANE--but fundamentally, all certificates have a scope within which they are valid, and a means to identify in which scope they are being used.
And whether your certificate chain of trust is being determined by certificate authorities or DANE, they all require that trust to be validated by something other than the client and server alone--which generally makes them dependent on some level of external network connectivity being present in order to properly function.

[yes, yes, we can have a side discussion about having every authentication server self-sign certificates as its own CA, and thus eliminate external network connectivity dependencies--but that's an administrative nightmare that I don't think any large organization would sign up for.]

So, all of the client certificates and authorization servers we're talking about exist on your internal network, but they all counted on reachability to your infrastructure servers in order to properly authenticate and grant access to devices and people. If your BGP update made your infrastructure servers, such as DNS servers, become unreachable, then suddenly you might well find yourself locked out both physically and logically from your own network.

Again, this is purely hypothetical, but it's one scenario in which a routing-level "oooooops" could end up causing physical-entry denial, as well as logical network access level denial, without actually having those authentication systems on external-facing networks. Certificate-based authentication is scalable and cool, but it's really important to think about even generally "that'll never happen" failure scenarios when deploying it into critical systems.
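
To make that failure chain a bit more concrete, here's a rough Python sketch that models the hypothetical above as a dependency map and works out what stops functioning once something underneath it goes away. Every name in it (DEPENDS_ON, failed_services, and all of the service labels) is made up for illustration, and it assumes a service fails as soon as any one of its dependencies fails; it's a thinking aid, not a description of any real network.

```python
from collections import deque

# Hypothetical dependency map: each service lists what it needs in order to work.
# All names here are illustrative assumptions, not taken from any real deployment.
DEPENDS_ON = {
    "door_card_reader": ["dot1x_port_auth"],
    "laptop_network_access": ["dot1x_port_auth"],
    "oob_management_access": ["laptop_network_access"],
    "dot1x_port_auth": ["radius_server", "certificate_validation"],
    "certificate_validation": ["dns_servers", "trust_anchor_reachability"],
    "radius_server": ["dns_servers"],
    "dns_servers": ["routing"],
    "trust_anchor_reachability": ["routing"],
    "routing": [],
}


def failed_services(depends_on, initially_down):
    """Return the set of services that are down once failures propagate.

    Assumes a service fails as soon as any one of its dependencies fails.
    """
    # Invert the map so we can walk from a failed service to its dependents.
    dependents = {name: [] for name in depends_on}
    for service, needs in depends_on.items():
        for need in needs:
            dependents.setdefault(need, []).append(service)

    down = set(initially_down)
    queue = deque(initially_down)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in down:
                down.add(dependent)
                queue.append(dependent)
    return down


if __name__ == "__main__":
    # A routing-level failure takes out everything stacked on top of it.
    for name in sorted(failed_services(DEPENDS_ON, ["routing"])):
        print("DOWN:", name)
```

In this toy model, marking only "dns_servers" as down is already enough to take "door_card_reader" with it, even though nothing about the card reader itself sits on an external-facing network.
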
It's always good to have the "break glass in case of emergency" network that doesn't rely on dot1x, that works without DNS, without NTP, without RADIUS, or any other external system, with a binder with printouts of the IP addresses of all your really critical servers and routers in it which gets updated a few times a year, so that when the SHTF, a person sitting at a laptop plugged into that network with the binder next to them can get into the emergency-only local account on each router to fix things.

And yes, you want every command that local emergency-only user types into a router to be logged, because someone wanting to create mischief in your network is going to aim for that account access if they can get it; so watch it like a hawk, and the only time it had better be accessed and used is when the big red panic button has already been hit, and the executives are huddled around speakerphones wanting to know just how fast you can get things working again. ^_^;

I know nothing of the incident in question. But sitting at home, hypothesizing about ways in which things could go wrong, this is one of the reasons why I still configure static emergency accounts on network devices, even with centrally administered account systems, and why there's always a set of "no dot1x" ports that work to get into the OOB/management network even when everything else has gone toes-up. :)

So--that's one way in which an outage like this could have locked people out of buildings. ^_^;

Thanks!

Matt

[ready for the deluge of people pointing out I've overly simplified the validation chain for certificates in order to keep the post short and high-level. ^_^; ]