Ever since first using it I've always liked tacacs+. Having said that I've grown to dislike some things about it recently. I guess there have always been problems but I've been willing to leave them alone.
I don't have time to give the code a real deep inspection, so I'm interested in others' thoughts about it. I suspect people have just left it alone because it works. Also I apologize if this is too verbose or technical, or not technical enough, or just hard to read.
History:
TACACS+ was proposed as a standard to the IETF. They never adopted it and let the standards draft expire in 1998. Since then there have been no official changes to the code. Much has happened between now and then. I specifically was interested in parsing tac_plus logs correctly. After finding idiosyncrasies I decided to look at the source and the RFC to see what was really happening.
Logging, or why I got into this mess:
In the accounting log, fields are sometimes logged in different order. It appears the client is logging whatever it receives without parsing it or modifying it. That means the remote system is sending them in different orders, so technically the fault lies with them. However, it seems too trusting to take in data and log it without looking at it. This can also cause issues when you send a command like (Cisco) "dir /all nvram:" on a box with many files. The device expands the command to include everything on the nvram (important because you might want to deny access to that command based on something it expanded), but it gets truncated somewhere (not sure if it's the device buffer that is full, tac_plus, or the logging part. I might tcpdump for a while to see if I can figure out what it looks like on the wire). I'm not sure if there are security implications there.
Encryption:
The existing security consists of md5 XOR <content> with the md5 being composed of a running series of 16-byte hashes, taking the previous hash as part of the seed of the next hash. A sequence number is used so simple replay shouldn't be a factor. Depending on how vulnerable iterative md5 is to it, and how much time you had to sniff the traffic, I would think this would be highly vulnerable to chosen plaintext if you already have a user-level login, or at least partial known plaintext (with the assumption they make backups, you can guess that at least some of the packets will have "show running-config" and other common commands). They also don't pad the encrypted string so you can guess the command (or password) based on the length of the encrypted data.
For a better description of the encryption you can read the draft: http://tools.ietf.org/html/draft-grant-tacacs-02 I found an article from May, 2000 which shows that the encryption scheme chosen was insufficient even then. http://www.openwall.com/articles/TACACS+-Protocol-Security
For new crypto I would advise multiple cipher support with negotiation so you know what each client and server is capable of. If the client and server supported multiple keys (with a keyid) it would be easier to roll keys frequently, or if it isn't too much overhead they could use public key.
Clients:
As for clients, Wikipedia lists several that seem to be based on the original open-source tac_plus from Cisco. shrubbery.net has the "official" version that debian and freebsd use. I looked at some of the others and they all seemed to derive from Cisco's code directly or shrubbery.net code, but they retained the name and started doing their own versioning. All the webpages look like they're from 1995. In some cases I think it's intentional but in some ways it shows a lack of care for the code, like it's been dropped since 2000.
Documentation is old:
This only applies to shrubbery.net's version. I didn't look at the other ones that closely. While all of it appears valid, one Q&A in the FAQ was about IOS 10.3/11.0. Performance questions use the sparc 2 as a target machine. There isn't an INSTALL or README, just the FAQ/CHANGES/COPYING (and a tac_plus.conf manpage), so the learning curve for new users is probably pretty steep. Also there isn't a clear maintainer. The best email address I found was listed in the tacacs+.spec file, for packaging on rpm systems.
If you hit the website they give some hints with some outdated, though still functional links. And they list the official email as tac_plus@shrubbery.net
Conclusion:
Did everyone already know this but me? If so have you moved to Kerberos? Can Kerberos do everything TACACS+ was doing for router authorization? I've got gear that only supports radius and tacacsplus, so in some cases I have no choice but to use one of those, neither of which I would trust over an unencrypted wire. If TACACS+ isn't a dead end then it needs a push to bring the protocol to a new version. There are big name vendors involved in making supported clients and servers. There should be someone invested in keeping it secure and adding features.
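On the logging point above: if the attribute order can't be trusted, a log parser can at least stop depending on it. The following is only a rough sketch. It assumes an accounting line is tab-separated, with six positional fields followed by attribute=value pairs; that layout, the field names, and the two sample records are assumptions for illustration and should be checked against real tac_plus output.

```python
from typing import Dict, List, Tuple

# Assumed layout (not confirmed by the post): tab-separated, six positional
# fields, then attribute=value pairs in whatever order the device sent them.
FIXED_FIELDS = ("timestamp", "nas", "user", "port", "remote_addr", "record_type")


def parse_accounting_line(line: str) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Return (fixed fields, attributes); attributes are keyed by name so the
    order on the wire no longer matters. Repeated names keep every value."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) < len(FIXED_FIELDS):
        raise ValueError("too few fields: %r" % line)
    fixed = dict(zip(FIXED_FIELDS, parts[:len(FIXED_FIELDS)]))
    attrs: Dict[str, List[str]] = {}
    for item in parts[len(FIXED_FIELDS):]:
        name, sep, value = item.partition("=")
        if not sep:
            # Not attribute=value; keep it visible instead of dropping it.
            attrs.setdefault("_unparsed", []).append(item)
            continue
        attrs.setdefault(name, []).append(value)
    return fixed, attrs


# Two hypothetical records with the same attributes in a different order.
record_a = ("Dec 30 05:06:01\t192.0.2.1\trdrake\ttty1\t198.51.100.7\tstop"
            "\ttask_id=42\tservice=shell\tpriv-lvl=15\tcmd=show running-config <cr>")
record_b = ("Dec 30 05:06:01\t192.0.2.1\trdrake\ttty1\t198.51.100.7\tstop"
            "\tcmd=show running-config <cr>\tpriv-lvl=15\tservice=shell\ttask_id=42")

fixed_a, attrs_a = parse_accounting_line(record_a)
fixed_b, attrs_b = parse_accounting_line(record_b)
```

Both records parse to the same dictionaries, so downstream reporting does not care which order a given device used. This does nothing about the truncation problem, which needs a packet capture to pin down.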
I don't understand why vendors and operators keep turning to TACACS. It seems like they're often looking to Cisco as some paragon of best security practices. It's a vulnerable protocol, but some times the only thing to choose from. One approach to secure devices that can support only TACACS or RADIUS: Deploy a small embedded *nix machine (Soekris, Raspberry Pi, etc.) that runs a RADSEC (for RADIUS) or stunnel (for TACACS) proxy. Attach it to a short copper with 802.1q, take weak xor'ed requests in on one tag, wrap the requests with TLS, and forward out another tag towards your central AAA box. Kerberos or more certificate-based SSH on routers would be super. SSH with certificates is nice in that it allows authenticators out in the field to verify clients "offline", without needing a central AAA server. However, the tradeoff is that you must then make sure all the clocks are correct and in-sync, and root certificates are verified.
On (2013-12-30 05:06 -0500), Robert Drake wrote:
TACACS+ was proposed as a standard to the IETF. They never adopted it and let the standards draft expire in 1998. Since then there
If continued existence of TACACS+ can be justified at IETF level, in parallel with radius and diameter, I have some interest in the subject and would be ready to work with draft.
Encryption:
For new crypto I would advise multiple cipher support with negotiation so you know what each client and server is capable of. If the client and server supported multiple keys (with a keyid) it
It seems encryption is your only/major woe? Personally I don't like how we need to keep reimplementing crypto at the per-application level. We're living in a world where crypto should be standard for all connections, not an application issue. There are some solutions to this, like the BEEP framework or a new L4 protocol like QUIC or MinimaLT, any of which I think would be workable as a mandatory transport for TACACS.
Clients:
"official" version that debian and freebsd use. I looked at some of the others and they all seemed to derive from Cisco's code directly
There is also commercial server 'radiator' which does radius and tacacs amongst others.
Did everyone already know this but me? If so have you moved to
I think I missed the key revelation. The naive encryption? The limited amount of software available?
Kerberos? Can Kerberos do everything TACACS+ was doing for router
I think from networker point of view, it's radiator or tacacs, if it has to work today without new software. And if it can require new software, it can be pretty much arbitrary new protocol, if sound justification can be found. -- ++ytti
I don't think radius nor kerberos nor ssh with certificates supports command authorization, do they?
Nor accounting...
On Dec 30, 2013 8:48 AM, "Christopher Morrow" <christopher.morrow@gmail.com> wrote:
I don't think radius nor kerberos nor ssh with certificates supports command authorization, do they?
On (2013-12-30 08:49 -0500), Christopher Morrow wrote:
Nor accounting...
I think this is probably sufficient justification for TACACS+. I'm not sure if command authorization is sufficient, as you can deliver group via radius which maps to authorized commands. But if you must support accounting, per-command authorization comes as free gift more or less. -- ++ytti
On Dec 30, 2013 9:01 AM, "Saku Ytti" <saku@ytti.fi> wrote:
On (2013-12-30 08:49 -0500), Christopher Morrow wrote:
Nor accounting...
I think this is probably sufficient justification for TACACS+. I'm not sure if command authorization is sufficient, as you can deliver group via radius which maps to authorized commands. But if you must support accounting, per-command authorization comes as free gift more or less.
Yes. Per-command auth and accounting is needed.
So what we need is tacacs over TLS (sctp / ipv6).
I agree tacacs is long in the tooth and needs to be revisited and invested in. Please take my money (serious).
CB
-- ++ytti
Hi, On Mon, 30 Dec 2013, Christopher Morrow wrote:
I don't think radius nor kerberos nor ssh with certificates supports command authorization, do they?
it is with radius afaik ...
Greetings
Christian
On Dec 30, 2013, at 9:01 AM, Christian Kratzer <ck-lists@cksoft.de> wrote:
Hi,
On Mon, 30 Dec 2013, Christopher Morrow wrote:
I don't think radius nor kerberos nor ssh with certificates supports command authorization, do they?
it is with radius afaik ...
RADIUS does not support command authorization or accounting. -jav
On Mon, Dec 30, 2013 at 8:11 AM, Javier Henderson <javier@kjsl.org> wrote:
Given the problem of remote auth, the choice of protocols is dictated by what protocols the relying party device supports. This is the problem: you are at the mercy of your router vendor to support the authentication protocol functionality. Things are workable, but in a sad state. Obviously, providing highly robust, highly secure remote authentication is not a high priority among the router vendors. They pay lip service to the whole thing. In many cases you might be better off with local auth.
How do you feel about having to wait 30 seconds between every command you enter to troubleshoot, to fail to the second server, if the TACACS or RADIUS system is nonresponsive, because the dumb router can't remember which TACACS servers are up and which ones are down, and always tries the first one in the list first? At least RADIUS has the concept of a "dead timer" :)
By all rights, routers should be implementing authorization using LDAP over TLS, with a locally cached persistent copy of the directory and credentials (so users can still log in, and their command exec rights are cached, in case of network outages), and authentication with either a user SSH public key published in LDAP, Kerberos/GSSAPI with Smartcard and other 2factor auth/OTP support, or LDAP BIND using SASL.
RADIUS and TACACS+ are what you get, because they've been there forever, and are frequently enough deemed "good enough". Some routers have limited Kerberos support, although usually not support for Kerberos ticket forwarding SPNEGO / Negotiate authentication using GSSAPI over SSH. (Over encrypted Telnet, yes.)
RADIUS and TACACS+, without IPSEC or TLS encapsulation of all the traffic, are both highly insecure by today's standards, and in theory should not be used. Unfortunately, on many network devices these are your only native central authentication options!
Fallback plan: the network should be designed so such connections are not allowed to cross an untrusted Layer 2 domain. If an attacker can sniff auth traffic, TACACS+ is particularly susceptible to decryption of the entire session including user credentials, whereas RADIUS is particularly susceptible to the possibility of authentication replay.
Depending on the router vendor, the available functionality with each protocol varies. Cisco is most noted for providing rich functionality over TACACS+ for shell authorization and accounting, and providing very limited RADIUS support. It is not that RADIUS is limited --- it's that your device vendor's RADIUS featureset is limited -- which, for all intents and purposes, means the features available to you are more limited if you use such gear.
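For anyone who hasn't read the draft, the sniffing concern is easier to see with the obfuscation written out. This is a minimal sketch of the scheme as the draft describes it (an MD5 pad built from the session id, shared key, version and sequence number, each 16-byte block chained to the previous one, XORed over the body). The key, session id and command are made-up values, and header parsing is left out entirely.

```python
import hashlib
import struct


def pseudo_pad(session_id: int, key: bytes, version: int, seq_no: int, length: int) -> bytes:
    """Build the chained MD5 pad: every block hashes the header fields and the
    shared key, and every block after the first also hashes the previous block."""
    prefix = struct.pack("!I", session_id) + key + bytes([version, seq_no])
    pad = b""
    prev = b""
    while len(pad) < length:
        prev = hashlib.md5(prefix + prev).digest()
        pad += prev
    return pad[:length]


def obfuscate(body: bytes, session_id: int, key: bytes, version: int, seq_no: int) -> bytes:
    """XOR the body with the pad. The same call reverses the operation."""
    pad = pseudo_pad(session_id, key, version, seq_no, len(body))
    return bytes(b ^ p for b, p in zip(body, pad))


# Hypothetical values, for illustration only.
plain = b"show running-config"
cipher = obfuscate(plain, session_id=0x1A2B3C4D, key=b"not-a-real-key", version=0xC0, seq_no=1)

# Someone who can guess the plaintext of a sniffed packet recovers its pad.
recovered_pad = bytes(c ^ p for c, p in zip(cipher, plain))
```

Two things fall out of this directly: the ciphertext is exactly as long as the plaintext, so lengths leak, and a correct guess at a packet's contents yields that packet's pad without knowing the key. Nothing here checks integrity either, since the body is only XORed.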
On Dec 30, 2013, at 9:01 AM, Christian Kratzer <ck-lists@cksoft.de> wrote:
Hi, it is with radius afaik ... RADIUS does not support command authorization or accounting.
RADIUS protocol supports accounting, and there is no reason RADIUS start-stop accounting events cannot be sent for every shell command --- this is not a protocol limitation, this is a device implementation limitation. Some devices can provide per-command authorization by embedding the command being run in an Access-Request.
RADIUS protocol response messages can encapsulate any attribute-value pair that can be sent in a TACACS response, using vendor-specific attributes. There is a restriction on IOS devices that arbitrarily forbids certain vendor-specific attribute-value pairs from being encapsulated in the RADIUS reply message; per-command authorization is among the prevented software capabilities of the router, not a limitation of the RADIUS protocol.
http://wiki.freeradius.org/vendor/Cisco#Command-Authorization
' cisco-avpair = "shell:cmd=show" would do the trick to authorize the "show" command. except that there is a tiny note for the commands "cmd" and "cmd-arg" saying that they cannot be used for encapsulation in the Vendor-Specific space. These two are the ONLY ones.'
-jav
-- -JH
On Dec 30, 2013, at 6:42 PM, Jimmy Hess <mysidia@gmail.com> wrote:
How do you feel about having to wait 30 seconds between every command you enter to troubleshoot, to fail to the second server, if the TACACS or RADIUS system is nonresponsive, because the dumb router can't remember which TACACS servers are up and which ones are down, and always tries the first one in the list first? At least RADIUS has the concept of a "dead timer" :)
Are you talking about Cisco routers? The default timeout value for TACACS+ is five seconds, so I’m not sure where you’re coming up with thirty seconds, unless you have seven servers listed on the router and the first six are dead/unreachable. -jav
On Mon, Dec 30, 2013 at 6:05 PM, Javier Henderson <javier@kjsl.org> wrote:
Are you talking about Cisco routers? The default timeout value for TACACS+ is five seconds, so I’m not sure where you’re coming up with thirty seconds, unless you have seven servers listed on the router and the first six are dead/unreachable.
Even 5 seconds extra for each command may hinder operators, to the extent it would be intolerable; shell commands should run almost instantaneously.... this is not a GUI, with an hourglass. Real-time responsiveness in a shell is crucial --- which remote auth should not change. Sometimes operators paste a buffer with a fair number of commands, not expecting a second delay between each command --- a repeated delay, may also break a pasted sequence.
It is very possible for two of three auth servers to be unreachable, in case of a network break, but that isn't necessary. The "response timeout" might be 5 seconds, but in reality, there are cases where you would wait longer, and that is tragic, since there are some obvious alternative approaches that would have had results that would be more 'friendly' to the interactive user.
(Like remembering which server is working for a while, or remembering that all servers are down -- for a while, and having a 50ms timeout, with all servers queried in parallel, instead of a 5 seconds timeout)
-jav
-- -JH
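To put numbers on the exchange above: with a 5 second timeout and servers tried strictly in list order, six dead servers ahead of a live one cost 6 x 5 = 30 seconds per command, and a single dead first server costs 5 seconds every time. The "remember which server is working" idea can be sketched as below. This is a hypothetical client-side helper, not how any router implements it; the class name, the `send` callback and the server names are invented, and the parallel-query and 50ms timeout parts are not covered.

```python
import time
from typing import Callable, Dict, List, Optional


class ServerPool:
    """Remember which AAA servers recently failed and try those last."""

    def __init__(self, servers: List[str], dead_time: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.servers = list(servers)
        self.dead_time = dead_time
        self.clock = clock
        self.dead_until: Dict[str, float] = {}

    def candidates(self) -> List[str]:
        """Configured order, but servers still inside their dead timer go last."""
        now = self.clock()
        alive = [s for s in self.servers if self.dead_until.get(s, 0.0) <= now]
        dead = [s for s in self.servers if s not in alive]
        return alive + dead

    def query(self, send: Callable[[str], Optional[str]]) -> Optional[str]:
        """send(server) returns a reply, or None if the server timed out."""
        for server in self.candidates():
            reply = send(server)
            if reply is not None:
                self.dead_until.pop(server, None)
                return reply
            self.dead_until[server] = self.clock() + self.dead_time
        return None


# Demo with a fake clock and a fake transport: only "tac2" answers.
now = [0.0]
attempts: List[str] = []


def fake_send(server: str) -> Optional[str]:
    attempts.append(server)
    return "PASS" if server == "tac2" else None


pool = ServerPool(["tac1", "tac2", "tac3"], dead_time=60.0, clock=lambda: now[0])
first = pool.query(fake_send)   # pays for tac1's timeout, then reaches tac2
second = pool.query(fake_send)  # tac1 is remembered as dead, tac2 goes first
```

Only the first command pays for the dead server; later ones go straight to the one that answered, until the dead timer expires and the original order is tried again.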
Picking back up where this left off last year, because I apparently only work on TACACS during the holidays :) On 12/30/2013 7:28 PM, Jimmy Hess wrote:
Even 5 seconds extra for each command may hinder operators, to the extent it would be intolerable; shell commands should run almost instantaneously.... this is not a GUI, with an hourglass. Real-time responsiveness in a shell is crucial --- which remote auth should not change. Sometimes operators paste a buffer with a fair number of commands, not expecting a second delay between each command --- a repeated delay, may also break a pasted sequence.
It is very possible for two of three auth servers to be unreachable, in case of a network break, but that isn't necessary. The "response timeout" might be 5 seconds, but in reality, there are cases where you would wait longer, and that is tragic, since there are some obvious alternative approaches that would have had results that would be more 'friendly' to the interactive user.
(Like remembering which server is working for a while, or remembering that all servers are down -- for a while, and having a 50ms timeout, with all servers queried in parallel, instead of a 5 seconds timeout)
I think this needs to be part of the specification.
I'm sure the reason they didn't do parallel queries was because of both network and CPU load back when the protocol was drafted. But it might be good to have local caching of authentication so that can happen even when servers are down or slow. Authorization could be updated to send the permissions to the router for local handling. Then if the server dies while a session is open only accounting would be affected.
That does increase the vendors/implementors work but it might be doable in phases and with partial support with the clients and servers negotiating what is possible. The biggest drawback to making things like this better is you don't gain much except during outages and if you increase complexity too much you make it wide open for bugs.
Maybe there is a simpler solution that keeps you happy about redundancy but doesn't increase complexity that much (possibly anycast tacacs, but the session basis of the protocol has always made that not feasible). It's possible that one of the L4 protocols Saku Ytti mentioned, QUIC or MinimaLT would address these problems too. It's possible that if we did the transport with BEEP it would also provide this, but I'm reading the docs and I don't think it goes that far in terms of connection assurance.
So, here is my TACACS RFC christmas list:
1. underlying crypto
2. ssh host key authentication - having the router ask tacacs for an authorized_keys list for rdrake. I'm willing to let this go because many vendors are finding ways to do key distribution, but I'd still like to have a standard (https://code.google.com/p/openssh-lpk/ for how to do this over LDAP in UNIX)
3. authentication and authorization caching and/or something else
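Item 3 is the easiest to picture in code. The sketch below is one possible shape for authorization caching, not anything a vendor ships: ask the server first, remember its answer, and fall back to a recent cached answer only when the server can't be reached. The class, the `remote` callback, the TTL and the fail-closed choice for commands never seen before are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class CachedAuthorizer:
    """Ask the AAA server first; reuse a recent answer only if it is unreachable."""

    def __init__(self, remote: Callable[[str, str], bool], ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.remote = remote
        self.ttl = ttl
        self.clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def authorize(self, user: str, command: str) -> bool:
        key = (user, command)
        try:
            allowed = self.remote(user, command)
        except OSError:
            # Covers ConnectionError and TimeoutError: the server is down or slow.
            cached = self._cache.get(key)
            if cached is not None and self.clock() - cached[1] <= self.ttl:
                return cached[0]
            return False  # nothing fresh to go on: fail closed
        self._cache[key] = (allowed, self.clock())
        return allowed


# Demo with a fake clock and a fake server that permits only "show" commands.
now = [0.0]
state = {"up": True}


def remote(user: str, command: str) -> bool:
    if not state["up"]:
        raise ConnectionError("AAA server unreachable")
    return command.startswith("show ")


authz = CachedAuthorizer(remote, ttl=600.0, clock=lambda: now[0])
```

Failing closed on unseen commands is the conservative choice; a real design would also have to decide how cached denials, revoked users and accounting during the outage are handled, which is where the extra complexity mentioned above comes in.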
On Sun, Dec 28, 2014 at 6:02 PM, Robert Drake <rdrake@direcpath.com> wrote:
Picking back up where this left off last year, because I apparently only work on TACACS during the holidays :)
avoiding relatives? :)
On 12/30/2013 7:28 PM, Jimmy Hess wrote:
Even 5 seconds extra for each command may hinder operators, to the extent it would be intolerable; shell commands should run almost instantaneously.... this is not a GUI, with an hourglass. Real-time responsiveness in a shell is crucial --- which remote auth should not change. Sometimes operators paste a buffer with a fair number of commands, not expecting a second delay between each command --- a repeated delay, may also break a pasted sequence.
It is very possible for two of three auth servers to be unreachable, in case of a network break, but that isn't necessary. The "response timeout" might be 5 seconds, but in reality, there are cases where you would wait longer, and that is tragic, since there are some obvious alternative approaches that would have had results that would be more 'friendly' to the interactive user.
(Like remembering which server is working for a while, or remembering that all servers are down -- for a while, and having a 50ms timeout, with all servers queried in parallel, instead of a 5 seconds timeout)
I think this needs to be part of the specification.
I'm sure the reason they didn't do parallel queries was because of both network and CPU load back when the protocol was drafted. But it might be good to have local caching of authentication so that can happen even when servers are down or slow. Authorization could be updated to send the permissions to the router for local handling. Then if the server dies while a session is open only accounting would be affected.
Juniper, at least, does the authorization cache on the device trick... (or really scoping of commands/areas a user is permitted via a local cache file in /var/tmp)
That does increase the vendors/implementors work but it might be doable in phases and with partial support with the clients and servers negotiating what is possible. The biggest drawback to making things like this better is you don't gain much except during outages and if you increase complexity too much you make it wide open for bugs.
and I wonder what percentage of 'users' a vendor has actually USE tac+ (or even radius). I bet it's shockingly low...
Maybe there is a simpler solution that keeps you happy about redundancy but doesn't increase complexity that much (possibly anycast tacacs, but the session basis of the protocol has always made that not feasible). It's
does it really? :)
possible that one of the L4 protocols Saku Ytti mentioned, QUIC or MinimaLT would address these problems too. It's possible that if we did the transport with BEEP it would also provide this, but I'm reading the docs and I don't think it goes that far in terms of connection assurance.
So, here is my TACACS RFC christmas list:
1. underlying crypto
juniper, cisco, arista, sun, linux, freebsd still can't get TCP-AO working... they don't all have ssl libraries in their "os" either...
Getting to some answer other than: "F-it, put it in clear text" for new protocols on routers really is a bit painful... not to mention ITAR sorts of problems that arise.
-chris
2. ssh host key authentication - having the router ask tacacs for an authorized_keys list for rdrake. I'm willing to let this go because many vendors are finding ways to do key distribution, but I'd still like to have a standard (https://code.google.com/p/openssh-lpk/ for how to do this over LDAP in UNIX)
3. authentication and authorization caching and/or something else
On Sun, Dec 28, 2014 at 9:21 PM, Christopher Morrow <morrowc.lists@gmail.com> wrote:
On Sun, Dec 28, 2014 at 6:02 PM, Robert Drake <rdrake@direcpath.com> wrote:
[snip]
Juniper, at least, does the authorization cache on the device trick...
That seems nice...
and I wonder what percentage of 'users' a vendor has actually USE tac+ (or even radius). I bet it's shockingly low...
Well, the percentage of users doing per-command authorization is probably much lower than the percentage simply using Tac+ for login authentication and accounting only or accounting and exec authorization. What happens in this case in terms of failure handling is probably OK for the common scenario.
For many use cases it should probably be a workable tradeoff to simply have AAA server reply with the shell:priv-lvl=1 or shell:priv-lvl=10, and make the choice to authorize commands locally by customizing which commands different privilege level numbers have, and make sure all devices have the same scheme; limiting AAA usage to once per shell.
The cases where that's no solution, are most likely PCI or other higher security environments where the usability problems with TACACS+ failover simply have to be accepted, use a dedicated OOB network for AAA servers, and a HA clustered pair of AAA servers dedicated to each and every site --- sharing a virtual service IP address.
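A rough illustration of the priv-lvl tradeoff: the AAA server is consulted once per shell and returns a level, and every command after that is checked against a table held on the device. The table contents and function names below are invented for the example; on real gear the mapping lives in the device configuration, and this sketch assumes a level also unlocks everything assigned to lower levels.

```python
from typing import Dict, List, Optional

# Hypothetical local scheme: command prefixes unlocked at each privilege level.
LEVEL_COMMANDS: Dict[int, List[str]] = {
    1: ["show version", "ping"],
    10: ["show running-config", "clear counters"],
    15: ["configure terminal", "reload"],
}


def parse_priv_lvl(avpairs: List[str]) -> Optional[int]:
    """Pull the level out of attribute-value pairs such as 'shell:priv-lvl=10'."""
    for pair in avpairs:
        name, sep, value = pair.partition("=")
        if sep and name == "shell:priv-lvl" and value.isdigit():
            return int(value)
    return None


def allowed_locally(level: int, command: str) -> bool:
    """A command is allowed if it matches a prefix assigned to this level or a lower one."""
    for min_level, prefixes in LEVEL_COMMANDS.items():
        if min_level > level:
            continue
        if any(command == p or command.startswith(p + " ") for p in prefixes):
            return True
    return False


# One AAA round trip at login (reply shown here as a literal), local checks afterwards.
level = parse_priv_lvl(["service=shell", "shell:priv-lvl=10"])
```

After login no further server contact is needed for authorization, so a dead AAA server only delays new logins. The cost is that the command table has to be kept identical on every device.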
So, here is my TACACS RFC christmas list:
1. underlying crypto

RADIUS over TCP and DIAMETER have underlying crypto. Rfc6613: TLS or IPsec transport is shown as mandatory for RADIUS over TCP.
Getting to some answer other than: "F-it, put it in clear text" for new protocols on routers really is a bit painful... not to mention ITARs sorts of problems that arise.

The average cheap-o smartphone ships with a TLS library; I think it's safe to say your router should have one. They shouldn't have too many problems... after all, this type of equipment already includes the SSH protocol. So why not have an option for setting up an SSH session to tunnel authentication requests over?
-chris
2. ssh host key authentication - having the router ask tacacs for an authorized_keys list for rdrake. I'm willing to let this go because many
I would be content for them to just support OpenSSH CA certificate-based authorization of a user's SSH key. If the key is signed by a trusted SSH CA, valid and not expired, and the session would be valid according to the certificate, then they can authenticate using one of their listed principals. Authenticate using key signed by valid certificate as first factor, perform second factor authentication against Kerberos server, authorize against LDAP or Tacacs server.
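The acceptance rule described above can be sketched in a few lines. This is only an illustration of the decision logic, not OpenSSH code: the `UserCert` record and the `cert_allows` helper are hypothetical names, and the sketch assumes the certificate's signature has already been verified cryptographically against the CA key. The second factor and the LDAP/TACACS authorization step are left out.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class UserCert:
    """Hypothetical, already signature-checked view of an SSH user certificate."""
    ca_fingerprint: str          # fingerprint of the CA key that signed it
    principals: FrozenSet[str]   # login names the certificate is valid for
    valid_after: int             # seconds since the epoch
    valid_before: int            # seconds since the epoch


def cert_allows(cert, login, trusted_cas, now):
    """Return True if the certificate may be used to log in as `login`."""
    if cert.ca_fingerprint not in trusted_cas:
        return False             # signed by a CA the device does not trust
    if not (cert.valid_after <= now < cert.valid_before):
        return False             # not yet valid, or expired
    return login in cert.principals
```

With a rule like this the device only needs the CA public key, so no per-user key list has to be distributed to it.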
vendors are finding ways to do key distribution, but I'd still like to have a standard (https://code.google.com/p/openssh-lpk/ for how to do this over LDAP in UNIX)
SSSD is handling this on Redhat. It's probably best to consider that how to use an "openssh public ssh key" is specific to the OpenSSH application. It makes sense that if the public key is for use with GPG/PGP to authenticate, etc, then the LDAP attribute should be something different, again specific to the application and the key format that application uses.

http://docs.fedoraproject.org/en-US/Fedora/18/html/FreeIPA_Guide/user-keys.h...

AuthorizedKeysCommand or PubKeyAgent is used on the openssh server. But within the single-signon daemon SSSD-Ldap, the LDAP attribute for a user object's SSH key is a configurable setting. Within the IPA LDAP schema, there is an added ipaSshPubKey user attribute. I think this is as close as you get to a 'standard' for now.

dn: cn=schema
add:attributeTypes: ( 2.16.840.1.113730.3.8.11.31 NAME 'ipaSshPubKey' DESC 'SSH public key' EQUALITY octetStringMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.40 X-ORIGIN 'IPA v3' )
add:objectClasses: ( 2.16.840.1.113730.3.8.12.11 NAME 'ipaSshGroupOfPubKeys' ABSTRACT MAY ipaSshPubKey X-ORIGIN 'IPA v3' )
add:objectClasses: ( 2.16.840.1.113730.3.8.12.12 NAME 'ipaSshUser' SUP ipaSshGroupOfPubKeys AUXILIARY X-ORIGIN 'IPA v3' )
add:objectClasses: ( 2.16.840.1.113730.3.8.12.13 NAME 'ipaSshHost' SUP ipaSshGroupOfPubKeys AUXILIARY X-ORIGIN 'IPA v3' )
3. authentication and authorization caching and/or something else
-- -JH
On Mon, Dec 29, 2014 at 04:25:56PM +0900, Randy Bush wrote:
Rfc6613: TLS or IPsec transport is shown as mandatory for RADIUS over TCP.
sweet. can you ref conforming implementations?
FreeRADIUS and Radiator can do RADSEC, as well as radsecproxy, so it can be used to protect e.g. site-to-site proxying. I don't know whether any switches/NASes can do it at present, though.

Matthew

--
Matthew Newton, Ph.D. <mcn4@le.ac.uk>
Systems Specialist, Infrastructure Services,
I.T. Services, University of Leicester, Leicester LE1 7RH, United Kingdom
For IT help contact helpdesk extn. 2253, <ithelp@le.ac.uk>
On 12/28/2014 10:21 PM, Christopher Morrow wrote:
and I wonder what percentage of 'users' a vendor has actually USE tac+ (or even radius). I bet it's shockingly low...

true.. even in large-ish environments centralized authentication presents problems and can have a limited merit. Up to some arbitrary size, nobody really can be bothered unless some business case comes up like splitting responsibilities between groups. Accounting is probably the best early reason to turn it on in small networks. Being able to see who made a change makes it easier to figure out why.
Maybe there is a simpler solution that keeps you happy about redundancy but doesn't increase complexity that much (possibly anycast tacacs, but the session basis of the protocol has always made that not feasible).

does it really? :)

Well, the chance of two geographically close servers getting load-balanced made it not feasible for us to do. Not to mention the fact that we had only two tacacs servers and the use-case for anycasting wasn't worth the hassle of implementation.
juniper, cisco, arista, sun, linux, freebsd still can't get TCP-AO working... they don't all have ssl libraries in their "os" either...

With it being a TCP extension, my guess is that it's harder to find someone at those companies willing to change things inside the kernel because it's used by too many people, and if nobody is asking for it then they don't want to build it just to advertise they're first to market.

Even the ISPs who probably asked for it ultimately don't put money on getting it done, because the engineer who says they need it still doesn't turn down the new chassis that lacks support. The money is all flowing through the hardware guys now, and if it's not directly related to moving packets quickly then they don't care.
Getting to some answer other than: "F-it, put it in clear text" for new protocols on routers really is a bit painful... not to mention ITARs sorts of problems that arise.
Now you're making me depressed. :) The question is should we be trying to move things along or just leave it as it is? There are certainly more important things on everyone's TODO list right now, but I'd rather the vendors have an open ticket in their queue saying "secure-tacacs+-rfc unimplemented" rather than letting them off the hook.
-chris
Robert
We are able to implement TACACS+. It is my understanding this is a fairly old protocol, so are you saying there are numerous bugs that still need to be fixed?

A question I have: TACACS+ is usually hosted on a server, and networking devices are configured to reach out to the server for authentication. What happens if the device can't reach the server because the device's network connection is offline? Our goal with TACACS+ is to not have any default/saved passwords. Every employee will have their own username and password. That way, when an employee is hired or fired, we can enable or disable their account. We are trying to avoid having any organization-wide or network-wide default username or password. Is this possible? Do the devices keep a log of the last successful username/password combinations that worked, in case the device goes offline?

On Sun, Dec 28, 2014 at 5:02 PM, Robert Drake <rdrake@direcpath.com> wrote:
Picking back up where this left off last year, because I apparently only work on TACACS during the holidays :)
On 12/30/2013 7:28 PM, Jimmy Hess wrote:
Even 5 seconds extra for each command may hinder operators, to the extent it would be intolerable; shell commands should run almost instantaneously.... this is not a GUI, with an hourglass. Real-time responsiveness in a shell is crucial --- which remote auth should not change. Sometimes operators paste a buffer with a fair number of commands, not expecting a second delay between each command --- a repeated delay, may also break a pasted sequence.
It is very possible for two of three auth servers to be unreachable, in case of a network break, but that isn't necessary. The "response timeout" might be 5 seconds, but in reality, there are cases where you would wait longer, and that is tragic, since there are some obvious alternative approaches that would have had results that would be more 'friendly' to the interactive user.
(Like remembering which server is working for a while, or remembering that all servers are down -- for a while, and having a 50ms timeout, with all servers queried in parallel, instead of a 5 seconds timeout)
I think this needs to be part of the specification.
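A rough sketch of the alternative described above: query every server in parallel with a short timeout, and remember for a while which servers were down. Nothing here is TACACS+ code. `ServerHealth`, `ask_all`, the 30 second hold-down and the `query` coroutine are all assumptions made for illustration; only the 50ms figure comes from the text.

```python
import asyncio
import time


class ServerHealth:
    """Remembers, for a short while, which servers recently failed."""

    def __init__(self, hold_down=30.0):
        self.hold_down = hold_down   # seconds a failed server is skipped
        self.down_until = {}         # server -> monotonic time it may be retried

    def usable(self, server):
        return self.down_until.get(server, 0.0) <= time.monotonic()

    def mark_down(self, server):
        self.down_until[server] = time.monotonic() + self.hold_down


async def ask_all(servers, query, health, timeout=0.05):
    """Ask every usable server at once and return the first answer.

    `query` is a hypothetical coroutine function, query(server) -> reply,
    that raises an exception when the server cannot be reached.
    Returns (server, reply), or None if nobody answered within `timeout`.
    """
    candidates = [s for s in servers if health.usable(s)]
    if not candidates:
        return None                  # everything is known to be down
    loop = asyncio.get_running_loop()
    tasks = {asyncio.create_task(query(s)): s for s in candidates}
    pending = set(tasks)
    deadline = loop.time() + timeout
    winner = None
    while pending and winner is None:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        done, pending = await asyncio.wait(
            pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            if task.cancelled():
                continue
            if task.exception() is not None:
                health.mark_down(tasks[task])      # hard failure
            elif winner is None:
                winner = (tasks[task], task.result())
    for task in pending:
        task.cancel()
        if winner is None:
            health.mark_down(tasks[task])          # too slow to be useful
    await asyncio.gather(*pending, return_exceptions=True)
    return winner
```

The cost is extra load on the servers, since every request goes to all of them; the benefit is that one dead server no longer adds a multi-second delay to each command.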
I'm sure the reason they didn't do parallel queries was because of both network and CPU load back when the protocol was drafted. But it might be good to have local caching of authentication so that can happen even when servers are down or slow. Authorization could be updated to send the permissions to the router for local handling. Then if the server dies while a session is open only accounting would be affected.
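To make the caching idea concrete, here is a minimal sketch of what a device-side authentication cache might look like. It is an assumption-laden illustration, not anything from the TACACS+ draft: `AuthCache`, `login`, the one hour lifetime and the `ask_server` callback (which is assumed to raise `ConnectionError` when no server is reachable) are all made up for the example.

```python
import hashlib
import hmac
import os
import time


class AuthCache:
    """Keeps salted hashes of recently accepted passwords for a limited time."""

    def __init__(self, ttl=3600.0, iterations=100_000):
        self.ttl = ttl
        self.iterations = iterations
        self._entries = {}           # user -> (salt, digest, expiry time)

    def _digest(self, password, salt):
        return hashlib.pbkdf2_hmac(
            "sha256", password.encode(), salt, self.iterations)

    def remember(self, user, password, now=None):
        now = time.time() if now is None else now
        salt = os.urandom(16)
        self._entries[user] = (salt, self._digest(password, salt), now + self.ttl)

    def forget(self, user):
        self._entries.pop(user, None)

    def check(self, user, password, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(user)
        if entry is None:
            return False
        salt, digest, expires = entry
        if now >= expires:
            del self._entries[user]  # stale entries are never honoured
            return False
        return hmac.compare_digest(digest, self._digest(password, salt))


def login(user, password, ask_server, cache):
    """Prefer the server's answer; use the cache only when it is unreachable."""
    try:
        accepted = ask_server(user, password)
    except ConnectionError:
        return cache.check(user, password)
    if accepted:
        cache.remember(user, password)
    else:
        cache.forget(user)           # a rejection also clears the cached entry
    return accepted
```

The tradeoff is visible in the lifetime: an account disabled on the server stays usable on a device that cannot reach the server until its cached entry expires.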
That does increase the vendors/implementors work but it might be doable in phases and with partial support with the clients and servers negotiating what is possible. The biggest drawback to making things like this better is you don't gain much except during outages and if you increase complexity too much you make it wide open for bugs.
Maybe there is a simpler solution that keeps you happy about redundancy but doesn't increase complexity that much (possibly anycast tacacs, but the session basis of the protocol has always made that not feasible). It's possible that one of the L4 protocols Saku Ytti mentioned, QUIC or MinimaLT would address these problems too. It's possible that if we did the transport with BEEP it would also provide this, but I'm reading the docs and I don't think it goes that far in terms of connection assurance.
-- -JH
So, here is my TACACS RFC christmas list:
1. underlying crypto
2. ssh host key authentication - having the router ask tacacs for an authorized_keys list for rdrake. I'm willing to let this go because many vendors are finding ways to do key distribution, but I'd still like to have a standard (https://code.google.com/p/openssh-lpk/ for how to do this over LDAP in UNIX)
3. authentication and authorization caching and/or something else
Colton,

Yes, that's the 'normal' way of setting it up. Basically you still have to configure a root user, but that user name and password is kept locked up and only accessed in case of catastrophic failure of the remote authentication system. An important note is to make sure that the fail safe password can't be accessed without having several people engaged so it can't be used without many people knowing.

Scott Helms
Vice President of Technology
ZCorum
(678) 507-5000
--------------------------------
http://twitter.com/kscotthelms
--------------------------------

On Mon, Dec 29, 2014 at 10:15 AM, Colton Conor <colton.conor@gmail.com> wrote:
We are able to implement TACACS+. It is my understanding this is a fairly old protocol, so are you saying there are numerous bugs that still need to be fixed?

A question I have: TACACS+ is usually hosted on a server, and networking devices are configured to reach out to the server for authentication. What happens if the device can't reach the server because the device's network connection is offline? Our goal with TACACS+ is to not have any default/saved passwords. Every employee will have their own username and password. That way, when an employee is hired or fired, we can enable or disable their account. We are trying to avoid having any organization-wide or network-wide default username or password. Is this possible? Do the devices keep a log of the last successful username/password combinations that worked, in case the device goes offline?
Scott,

Thanks for the response. How do you make sure the failsafe and/or root password that is stored in the device in case remote auth fails can't be accessed without having several employees engaged? Are there any mechanisms for doing so?

My fear is that we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse, we are using an outsourced firm for some network management, so hiring and firing is fairly constant.

On Mon, Dec 29, 2014 at 9:22 AM, Scott Helms <khelms@zcorum.com> wrote:
Colton,
Yes, that's the 'normal' way of setting it up. Basically you still have to configure a root user, but that user name and password is kept locked up and only accessed in case of catastrophic failure of the remote authentication system. An important note is to make sure that the fail safe password can't be accessed without having several people engaged so it can't be used without many people knowing.
Scott Helms Vice President of Technology ZCorum (678) 507-5000 -------------------------------- http://twitter.com/kscotthelms --------------------------------
In the Cisco world the AAA config is typically set up to try tacacs first, and local accounts second. The local account is only usable if tacacs is unavailable. Knowledge of the local username/password does not equate to full time access with that credential. Also, you would usually filter the incoming SSH sessions to only permit a particular management IP range; the local credential, or tacacs credential, shouldn't be usable from any arbitrary network.

On Mon, Dec 29, 2014 at 10:32 AM, Colton Conor <colton.conor@gmail.com> wrote:
Scott,
Thanks for the response. How do you make sure the failsafe and/or root password that is stored in the device in case remote auth fails can't be accessed without having several employees engaged? Are there any mechanisms for doing so?

My fear is that we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse, we are using an outsourced firm for some network management, so hiring and firing is fairly constant.
Glad to know you can make local access work only if TACACS+ isn't available. However, that still doesn't prevent an employee who knows the local username and password from unplugging the device from the network and then using the local password to get in. Still better than our current setup of having one default username and password that everyone knows.

On Mon, Dec 29, 2014 at 9:38 AM, Michael Douglas <Michael.Douglas@ieee.org> wrote:
In the Cisco world the AAA config is typically set up to try tacacs first, and local accounts second. The local account is only usable if tacacs is unavailable. Knowledge of the local username/password does not equate to full time access with that credential. Also, you would usually filter the incoming SSH sessions to only permit a particular management IP range; the local credential, or tacacs credential, shouldn't be usable from any arbitrary network.
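The "tacacs first, local second" behaviour described above can be sketched as a walk over an ordered method list, where the next method is consulted only when the previous one gave no answer at all. This is an illustration of the idea rather than any vendor's implementation; `Result` and `authenticate` are hypothetical names.

```python
from enum import Enum


class Result(Enum):
    PASS = "pass"     # the method answered and accepted the login
    FAIL = "fail"     # the method answered and rejected the login
    ERROR = "error"   # the method gave no answer (server unreachable)


def authenticate(user, password, methods):
    """Walk an ordered list of methods, e.g. [tacacs, local].

    Each method is a callable (user, password) -> Result. A definite
    PASS or FAIL ends the walk; only ERROR moves on to the next method.
    """
    for method in methods:
        result = method(user, password)
        if result is Result.ERROR:
            continue
        return result is Result.PASS
    return False      # no method could give an answer
```

Because a rejection from the server is final, knowing the local password is only useful to someone who can also make the server unreachable.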
If someone has physical access to a Cisco router they can initiate a password recovery; tacacs vs local account doesn't matter at that point. On Mon, Dec 29, 2014 at 12:28 PM, Colton Conor <colton.conor@gmail.com> wrote:
Glad to know you can make local access work only if TACACS+ isn't available. However, that still doesn't prevent an employee who knows the local username and password from unplugging the device from the network and then using the local password to get in. Still better than our current setup of having one default username and password that everyone knows.
Making the TACACS+ server unavailable is fairly easy: a small LAN-based DDoS would do it, or a firewall rule change somewhere in the middle. Either would cause the router to fail over to its local account. This assumes the attacker already has some sort of access and wants to elevate their privileges.
On Tue, Dec 30, 2014 at 2:38 AM, Michael Douglas <Michael.Douglas@ieee.org> wrote:
If someone has physical access to a Cisco router they can initiate a password recovery; tacacs vs local account doesn't matter at that point.
Change the root password when any senior person leaves. It shouldn't be known to a large set of staff members. During the bubble-burst RIFs we were changing them on 40k+ devices every week. Make sure you verify the new password before disconnecting the login account making the change. Also make sure you understand the AAA process well when trying to do this so that you don't lock yourself out.
On December 29, 2014 10:32:51 AM EST, Colton Conor <colton.conor@gmail.com> wrote:
Scott,
Thanks for the response. How do you make sure the failsafe and/or root password that is stored in the device in case remote auth fails can't be accessed without having several employees engaged? Are there any mechanisms for doing so?
My fear would be we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse we are using an outsourced firm for some network management, so hiring and firing are fairly constant.
On Mon, Dec 29, 2014 at 9:22 AM, Scott Helms <khelms@zcorum.com> wrote:
Colton,
Yes, that's the 'normal' way of setting it up. Basically you still have to configure a root user, but that user name and password is kept locked up and only accessed in case of catastrophic failure of the remote authentication system. An important note is to make sure that the fail safe password can't be accessed without having several people engaged so it can't be used without many people knowing.
Scott Helms Vice President of Technology ZCorum (678) 507-5000 -------------------------------- http://twitter.com/kscotthelms --------------------------------
On Mon, Dec 29, 2014 at 10:15 AM, Colton Conor <colton.conor@gmail.com> wrote:
We are able to implement TACACS+. It is my understanding this is a fairly old protocol, so are you saying there are numerous bugs that still need to be fixed?
A question I have: TACACS+ is usually hosted on a server, and networking devices are configured to reach out to the server for authentication. What happens if the device can't reach the server because the device's network connection is offline? Our goal with TACACS+ is to not have any default/saved passwords. Every employee will have their own username and password. That way if an employee gets hired/fired, we can enable or disable their account. We are trying to avoid having any organization-wide or network-wide default username or password. Is this possible? Do the devices keep a log of the last successful username/password combinations that worked in case the device goes offline?
On Sun, Dec 28, 2014 at 5:02 PM, Robert Drake <rdrake@direcpath.com> wrote:
Picking back up where this left off last year, because I apparently only work on TACACS during the holidays :)
On 12/30/2013 7:28 PM, Jimmy Hess wrote:
Even 5 seconds extra for each command may hinder operators, to the extent it would be intolerable; shell commands should run almost instantaneously.... this is not a GUI, with an hourglass. Real-time responsiveness in a shell is crucial --- which remote auth should not change. Sometimes operators paste a buffer with a fair number of commands, not expecting a second delay between each command --- a repeated delay, may also break a pasted sequence.
It is very possible for two of three auth servers to be unreachable, in case of a network break, but that isn't necessary. The "response timeout" might be 5 seconds, but in reality, there are cases where you would wait longer, and that is tragic, since there are some obvious alternative approaches that would have had results that would be more 'friendly' to the interactive user.
(Like remembering which server is working for a while, or remembering that all servers are down -- for a while, and having a 50ms timeout, with all servers queried in parallel, instead of a 5 seconds timeout)
I think this needs to be part of the specification.
I'm sure the reason they didn't do parallel queries was because of both network and CPU load back when the protocol was drafted. But it might be good to have local caching of authentication so that can happen even when servers are down or slow. Authorization could be updated to send the permissions to the router for local handling. Then if the server dies while a session is open only accounting would be affected.
That does increase the vendors'/implementors' work, but it might be doable in phases and with partial support, with the clients and servers negotiating what is possible. The biggest drawback to making things like this better is that you don't gain much except during outages, and if you increase complexity too much you leave it wide open for bugs.
Maybe there is a simpler solution that keeps you happy about redundancy but doesn't increase complexity that much (possibly anycast tacacs, but the session basis of the protocol has always made that not feasible). It's possible that one of the L4 protocols Saku Ytti mentioned, QUIC or MinimaLT would address these problems too. It's possible that if we did the transport with BEEP it would also provide this, but I'm reading the docs and I don't think it goes that far in terms of connection assurance.
-- -JH
So, here is my TACACS RFC Christmas list:
1. underlying crypto
2. ssh host key authentication - having the router ask tacacs for an authorized_keys list for rdrake. I'm willing to let this go because many vendors are finding ways to do key distribution, but I'd still like to have a standard (https://code.google.com/p/openssh-lpk/ for how to do this over LDAP in UNIX)
3. authentication and authorization caching and/or something else
On Mon, Dec 29, 2014 at 09:32:51AM -0600, Colton Conor wrote:
Scott,
Thanks for the response. How do you make sure the failsafe and/or root password that is stored in the device in case remote auth fails can't be accessed without having several employees engaged? Are there any mechanisms for doing so?
Yes, this is possible: you can prevent the last-resort username from being used by having your AAA try tacacs+ first and giving the local account a non-overlapping username, so it's rejected while t+ is operational. You should also use "username blah secret magic" rather than "password", to get the MD5 hash instead of the reversible encoding.
My fear would be we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse we are using an outsourced firm for some network management, so hiring and firing are fairly constant.
You can automate the login/change with scripting, using the clogin tool that is part of rancid. If you have a proper inventory of these devices and they are in rancid, it's easy to do
clogin -x /tmp/commands `cat routerlist`
- Jared
-- Jared Mauch | pgp key available via finger from jared@puck.nether.net clue++; | http://puck.nether.net/~jared/ My statements are only mine.
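Jared's clogin suggestion could be wrapped in a small script along these lines. This is a rough sketch under some assumptions: the IOS commands written to the command file ("configure terminal", "username ... secret ...", "end", "write memory") are my guess at what /tmp/commands would contain for a Cisco device, the account name and file paths are placeholders, and build_rotation and rotate are hypothetical helper names. Only the "clogin -x file routers..." invocation comes from the message above.

```python
import os
import subprocess
from pathlib import Path


def build_rotation(username, new_secret, routers, command_file="/tmp/commands"):
    """Return (command file text, clogin argv) for a local-password change."""
    commands = [
        "configure terminal",
        f"username {username} secret {new_secret}",  # assumed IOS syntax
        "end",
        "write memory",
    ]
    argv = ["clogin", "-x", command_file, *routers]
    return "\n".join(commands) + "\n", argv


def rotate(username, new_secret, routerlist="routerlist",
           command_file="/tmp/commands"):
    """Write the command file and run clogin against every listed router."""
    routers = Path(routerlist).read_text().split()
    text, argv = build_rotation(username, new_secret, routers, command_file)
    # The file holds a cleartext password, so create it owner-only.
    fd = os.open(command_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    try:
        return subprocess.run(argv, check=True)
    finally:
        os.remove(command_file)  # do not leave the password lying around
```

As noted earlier in the thread, verify the new password on a device before closing the session that made the change, so a typo doesn't lock you out.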
Colton,
The best thing is to create the password with a random generator so it's impossible for most people to memorize in a short amount of time. It should be ~14 characters long with mixed cases, numbers, and special characters. That password should be tested once and then put in an envelope that is put in a safe. For all new routers/switches the encrypted form can be pasted in. The envelope should be pretty much impossible to open without it being obvious. You can get even more paranoid/security conscious and put the envelope in a safe deposit box, which would log and tape anyone retrieving it, but that keeps you from getting to the password if you need it when the bank isn't open.
Scott Helms Vice President of Technology ZCorum (678) 507-5000 -------------------------------- http://twitter.com/kscotthelms --------------------------------
On Mon, Dec 29, 2014 at 10:32 AM, Colton Conor <colton.conor@gmail.com> wrote:
Scott,
Thanks for the response. How do you make sure the failsafe and/or root password that is stored in the device in case remote auth fails can't be accessed without having several employees engaged? Are there any mechanisms for doing so?
My fear would be we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse we are using an outsourced firm for some network management, so hiring and firing are fairly constant.
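Scott's advice on the failsafe password (randomly generated, about 14 characters, mixed case, numbers, and special characters) could be implemented with something like the sketch below. The function name and the particular set of special characters are assumptions for illustration; the set deliberately leaves out "?" and spaces, which tend to be awkward to type at a router prompt.

```python
import secrets
import string

# Assumed character classes; adjust the specials to what your devices accept.
CLASSES = (
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    "!#$%&*+-=@^_",
)


def generate_failsafe_password(length=14):
    """Return a random password containing at least one char of each class."""
    if length < len(CLASSES):
        raise ValueError("length too short to cover every character class")
    alphabet = "".join(CLASSES)
    while True:
        # secrets draws from the operating system's CSPRNG.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if all(any(ch in cls for ch in candidate) for cls in CLASSES):
            return candidate
```

Generating whole candidates and rejecting those that miss a class keeps every position uniformly random, instead of forcing a fixed character type into fixed slots.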
On 12/29/2014 10:32 AM, Colton Conor wrote:
My fear would be we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse we are using an outsourced firm for some network management, so hiring and firing are fairly constant.
You can set up your AAA in most devices so tacacs+ is tried first and the local password is only usable if tacacs+ is unreachable. In that case, even if you fire someone you can just remove them from tacacs and they can't get in. At that point you will want to do a global password change of the local password since it's compromised, but it's not an immediate concern. You should also have access lists or firewall rules on all your devices which only allow login from specific locations. If you fire someone then you remove their access to that location (their VPN credentials, username and password for UNIX login, etc), which also makes it harder for them to log back into your network even if they know the local device password.
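The "tacacs+ first, local only if unreachable" behaviour described above can be modelled in a few lines. This is a toy model of the decision logic as the thread describes it, not any vendor's code; Reply, login, and the sample account are hypothetical, and real devices may differ in how they treat a server error versus a reject.

```python
from enum import Enum


class Reply(Enum):
    PASS = "pass"     # server answered and accepted the login
    FAIL = "fail"     # server answered and rejected the login
    ERROR = "error"   # no server could be reached


def login(username, password, tacacs_reply, local_users):
    """Decide a login with TACACS+ tried first and local as last resort."""
    if tacacs_reply is Reply.PASS:
        return True
    if tacacs_reply is Reply.FAIL:
        # The server answered, so the local list is never consulted.
        return False
    # Only an unreachable server lets the local account be used.
    return username in local_users and local_users[username] == password


local = {"failsafe": "s3cret"}
# A fired tech who knows the local password, while TACACS+ is up:
print(login("failsafe", "s3cret", Reply.FAIL, local))
# The same attempt once the device cannot reach any server:
print(login("failsafe", "s3cret", Reply.ERROR, local))
```

The second case is why the thread also recommends access lists and changing the local password after someone leaves: anyone who can make the servers unreachable gets the local fallback.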
At 11:06 AM 12/29/2014, you wrote:
On 12/29/2014 10:32 AM, Colton Conor wrote:
My fear would be we would hire an outsourced tech. After a certain amount of time we would have to let this part-timer go, and would disable his or her username and password in TACACS+. However, if that tech still knows the root password they could still remotely log in to our network and cause havoc. The thought of having to change the root password on hundreds of devices every time an employee is let go doesn't sound appealing either. To make matters worse we are using an outsourced firm for some network management, so hiring and firing are fairly constant.
You can set up your AAA in most devices so tacacs+ is tried first and the local password is only usable if tacacs+ is unreachable. In that case, even if you fire someone you can just remove them from tacacs and they can't get in.
At that point you will want to do a global password change of the local password since it's compromised, but it's not an immediate concern.
You should also have access lists or firewall rules on all your devices which only allow login from specific locations. If you fire someone then you remove their access to that location (their VPN credentials, username and password for UNIX login, etc), which also makes it harder for them to log back into your network even if they know the local device password.
Umm...what do you guys do when the network is down? All of our engineers know the 'default' username/pw - but it is not usable unless the AAA server is unreachable. I don't know of a way we could do circuit troubleshooting with that password locked up in a safe somewhere. Yes, it's a pain to change when people leave - but it would be a much larger pain to do deployments without it, I think.
Berry
participants (19): Berry Mobley, cb.list6, Christian Kratzer, Christopher Morrow, Christopher Morrow, Colton Conor, Jared Mauch, Javier Henderson, Jimmy Hess, Jonathan Lassoff, joseph.snyder@gmail.com, Matthew Newton, Michael Douglas, Randy Bush, Robert Drake, Saku Ytti, Scott Helms, Tim Raphael, Tony Varriale