On 6/28/06, Phillip Vandry <vandry@tzone.org> wrote:
We all know that the weakest link of SSH is key management: if you do not confirm by a secure out of band channel that the public host key of the device you are connecting to is correct, then SSH's crypto will not help you.
SSH's crypto won't help you accomplish what? Using host-based auth between "trusted" (secured, point-of-entry) hosts and hosts that don't have a public-facing facility for authentication/login simplifies my life: I don't _need_ root passwords, and my points of entry are much easier to monitor for foul play. Requiring everybody who needs superuser access to use their own ssh private keys lets us manage access efficiently without having to hand out passwords to geographically dispersed locations.
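The key-only superuser policy I describe can be enforced in sshd_config; a minimal sketch, assuming OpenSSH's sshd (the option names are OpenSSH's, the policy choice is the assumption here):

```
# /etc/ssh/sshd_config fragment (sketch): root may log in with a key,
# never with a password, and ordinary password auth is disabled too.
PermitRootLogin without-password
PasswordAuthentication no
```

With that in place there is simply no root password to hand out to remote sites; access is granted or revoked by managing individual public keys.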
SSH implements neither a CA hierarchy (like X.509 certificates) nor a web of trust (like PGP) so you are left checking the validity of host keys yourself. Still, it's not so bad if you only connect to a small handful of well known servers. You will either have verified them all soon enough and not be bothered with it anymore, or system administrators will maintain a global known_hosts file that lists all the correct ones.
But it's quite different when you manage a network of hundreds or thousands of devices. I find myself regularly connecting to devices I've never connected to before and being prompted to verify the public host keys they are offering up. This happens in the course of something else that I am doing and I don't necessarily have the time to check a host key. Even if I did have time, it's hard to check anyway: the device is just one of a huge number of network elements of no special significance to me, and I didn't install it or generate its key and I don't know who did. From time to time I also get warning messages from my SSH client about a changed host key; it's probably just that someone swapped out the router's hardware since the last time I connected and a new key got generated. But I'm not sure. Worst of all, the problem is repeated for every user, because each user is working with their own private ssh_known_hosts database into which they accept host keys.
This seems like it can easily be fixed via security policy.
A possible solution is:
- Maintain a global known_hosts file. Make everyone who installs a new router or turns up SSH on an existing one contribute to it. Install it as the global (in /etc/) known_hosts file on all the ssh clients you can.
Pro: The work to accept a new host key is done once, by the person who installed the router, who is in the best position to confirm that no man-in-the-middle attack is taking place.
Con: You need to make sure updates to this file are authentic (its benefit is lost if untrusted people are allowed to contribute), and you need to be sure it gets installed on the ssh clients people use to connect to the network elements.
Con: If a host key changes but the change is found to be benign (as in the scenario I describe above), users can't do much about it until the global file is corrected and redeployed (aside from complicated openssh options that most users won't know about for bypassing the problem).
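The "installer contributes the key" step could itself be scripted; a rough sketch, with placeholder names throughout (the TEST-NET address 192.0.2.10 and the file paths are stand-ins for your own environment):

```shell
#!/bin/sh
# Sketch: the person who installed a router appends its host key to the
# shared known_hosts file. All names here are placeholders.
NEW_HOST=${NEW_HOST:-192.0.2.10}      # placeholder address
GLOBAL=${GLOBAL:-./ssh_known_hosts}   # the shared file, e.g. /etc/ssh/ssh_known_hosts
touch "$GLOBAL"

# Grab the key in known_hosts format. Doing this from the same session
# used to install the router is what rules out a man in the middle.
ssh-keyscan -t rsa -T 3 "$NEW_HOST" 2>/dev/null > new_key

# If a key came back, show its fingerprint so it can be read back
# against the device itself, then append the entry for redistribution.
if [ -s new_key ]; then
    ssh-keygen -lf new_key
    cat new_key >> "$GLOBAL"
fi
rm -f new_key
```

The appended file is then what gets pushed out as the global /etc/ssh/ssh_known_hosts on the clients.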
I'm looking for information on best practices that are in use to tackle this problem. What solutions are working for you?
I'm assuming you have procedures that run on a regular basis, a finite set of IP addresses that any and all technicians will connect to, and access to the standard openssh tools. Why not, on a regular basis, use ssh-keyscan and diff (or something similar) to scan your range of hosts that DO have ssh on them (maybe nmap subnet scans for port 22 to find them), retrieve the host keys, compare them to the last time the scan was run, see if anything changed, cross-reference that with work orders by IP or any other identifying information, and let the tools do the work for you? Cron is your friend.

Using rsync, scp, nfs, or something similar, it wouldn't be very difficult to automate distributing such a list once per day across your entire organization. If you're worried about password security and having a rogue device sniff a globally-used password, perhaps it's time to start looking into private ssh keys and/or a single-point-of-entry policy. If you're capable of preventing chaotic access, that might even keep you from having to deploy your /etc/ssh/ssh_known_hosts to every machine in your organization.

It really comes down to getting lazy and seeing what automation is there for you to exploit for the sake of your sanity. I've only got 20 machines, but I've got very well-defined paths from an untrusted (home) network to a trusted (internal cluster) network, and on top of not being required to type 100 passwords to get from point A to point B, I've now got a simple way to see who's been doing what.
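The scan-and-diff idea above can be sketched in a few lines of shell; the state directory and the host list contents are placeholders, and unreachable hosts simply drop out of the snapshot:

```shell
#!/bin/sh
# Sketch of the periodic scan-and-diff. Paths and addresses are
# placeholders -- adapt to your own environment.
STATE_DIR=${STATE_DIR:-./hostkey-state}
HOSTS_FILE=${HOSTS_FILE:-$STATE_DIR/hosts.txt}
mkdir -p "$STATE_DIR"

# Seed an example host list if none exists (one address per line).
[ -f "$HOSTS_FILE" ] || printf '%s\n' 192.0.2.1 192.0.2.2 > "$HOSTS_FILE"

# Collect the current host keys in known_hosts format. -T bounds the
# per-host timeout; sorting makes the diffs stable run to run.
ssh-keyscan -t rsa -T 3 -f "$HOSTS_FILE" 2>/dev/null | sort > "$STATE_DIR/current"

# Any diff output means a key appeared, disappeared, or changed --
# that's what gets cross-referenced against the work orders.
if [ -f "$STATE_DIR/previous" ]; then
    if ! diff "$STATE_DIR/previous" "$STATE_DIR/current"; then
        echo "Host keys changed since last scan -- investigate"
    fi
fi
mv "$STATE_DIR/current" "$STATE_DIR/previous"
```

Dropped into cron (e.g. a crontab line piping the script's output to mail for your NOC alias), this turns the per-user "do you trust this key?" prompt into one centrally reviewed daily diff.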
Thanks
-Phil
HTH, Allen