Jay Ashworth informs me that NTP security and risks have recently been a hot topic on NANOG, and that NTPsec was mentioned. Therefore I've written a bit of a background briefing on the project, which follows.
The NTPsec project was initially funded in late 2014 by NSF when authorities there became concerned about frequent incidents of DDoS amplification involving ntpd. I was hired in as tech lead early on (in part because of my work on GPSD, a technically similar project) and retained that position when the Linux Foundation picked up funding the project. It's now managed by Mark Atwood, who some of you may know from HPE and OpenStack.
I'm going to pass over the politics around the fork from what our team now calls "NTP Classic" because it wasn't my decision or desire, merely observing that I reluctantly acknowledged the necessity and wish things could have been otherwise.
The goal of NTPsec is to achieve high security and high assurance through systematic application of modern best practices. Though not yet at release 1.0, our progress can be judged by the fact that when the last batch of 11 CVEs against NTP was released, we were not vulnerable to 8 of them because we had previously removed or successfully hardened the code they exploited.
This was no fluke. Over the last 11 months, we have compiled a significantly - I think it's fair to say "dramatically" - better security record than Classic. We've earned some trust in the infosec research community, working effectively with (among others) Sharon Goldberg's group at BU. We were the first to develop a verified fix for the now infamous off-path KOD bug (CVE-2015-7979).
You can read more about our code-hardening practices here:
http://esr.ibiblio.org/?p=6881
In brief, we've thrown out a lot of cruft and archaisms. The code has been lifted to C99/POSIX-2001 conformance; other than a few warts near Mac OS X and some unused Windows code probably destined for removal itself, the port shims that used to infest the codebase are nearly gone. Mode 7 has been removed, as has Autokey; these were nests of bugs too risky to leave in.
We've also done a lot of code auditing using tools like Coverity and cppcheck, and worked hard on improving our test coverage (that part has been more difficult than any of the code changes, actually, and is still very much a work in progress).
Here's a figure that should tell you a lot: we removed 57% of the original codebase in the process of cleaning it up. No, that's not a typo; the NTPsec codebase is *less than half* the size of NTP Classic. And much, much easier to read. That's even without counting the huge simplification win from ditching autotools.
Nevertheless, sysadmins will find it very familiar. The largest speedbump you will encounter in normal operation is that we've changed the names of some auxiliary tools so everything has an "ntp" prefix. The only thing I expect to actually surprise you is the documentation, which has been greatly improved, specifically by removing duplications and inconsistencies and distracting references to equipment that has been dead since the Late Cretaceous.
So far this is a deliberately conservative fork. We haven't yet tried to add protocol features for security because there is plenty of useful work to be done before tackling that very hard problem. We're actively cooperating with the IETF NTP WG (we've committed to supplying second interop for some upcoming draft RFCs) and we're watching the work on NTS closely. It is likely, though not yet certain, that we'll be second interop on that.
Finally, I note some criticism that NTPsec is short on people who understand all the subtleties of time service in the field. This is partly justified. The tech lead admits to being something of a newbie; though I know a lot about some adjacent technical areas from ten years of leading GPSD, I was not a time-service expert before being engaged for this project and am still coming up to speed.
The team does already include one time-service old hand and a really good crypto/infosec specialist. NANOG listmember Gary E. Miller was an early team member who remains a friend of the project. We would certainly welcome more engagement and advice from time-service experts, and the sort of experienced sysadmins who frequent NANOG.
In a future post I may have a bit to say about Stratum 1s based on the RPi and other hackerboards. I've been working in that area as well.
I'll be happy to answer technical and procedural questions about NTPsec. Any questions about politics and policy should go to Mark Atwood.
See www.ntpsec.org for more information.
-- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>
Eric,
Thanks for this really helpful insider look into NTPsec.
Does your project have anything like a portable regression test suite that the rest of us could use for NTP product evaluations? And would I be correct in guessing that all of your work is FOSS?
When you say that nothing has been done to add security mechanisms to NTP, are you saying that all the work so far has been code hardening exclusively?
Finally, do you want to weigh in on the necessity for highly accurate local RT clocks in NTP servers? That seems to be the big bugaboo in cost limiting right now.
-mel beckman
On May 13, 2016, at 6:40 AM, Eric S. Raymond <esr@thyrsus.com> wrote:
Mel Beckman <mel@beckman.org>:
Does your project have anything like a portable regression test suite that the rest of us could use for NTP product evaluations?
We do not, yet. Testing NTP at above the level of unit tests for individual functions is *quite* difficult - I say that as the person who successfully implemented a very rigorous regression-test suite in GPSD. The NTP version of this problem is, unfortunately, much less tractable. We have some ideas and a partial implementation, but this is the technical area in which we have had the least success so far. We will persevere. We're going to need good end-to-end testing to maintain provable functional stability through some of the large changes I have in mind. I cannot, however, promise that our test framework will be applicable to other implementations.
And would I be correct in guessing that all of your work is FOSS?
Yes. NTP and 2-clause BSD licenses.
When you say that nothing has been done to add security mechanisms to NTP, are you saying that all the work so far has been code hardening exclusively?
Yes. There remains a considerable amount of this to be done. We have our eyes on several risky and only marginally useful features that should probably be excised. The recently-acquired ability of Windows to run many Linux binaries probably means all the Windows port shims can be thrown out. And so forth. The official motto of our project, front and center on www.ntpsec.org, is the Saint-Exupery quote: "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." I must say that the effectiveness of ruthlessly cutting away bloat as a security-hardening strategy has actually exceeded our initial expectations. We were hoping for "successful" and seem to have achieved "wildly successful" - I think dodging 8 of 11 CVEs in the last batch counts as that.
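For readers unfamiliar with the "Mode 7" mentioned in the original post: the mode is a 3-bit field in the first octet of every NTP packet, alongside the leap indicator and version number (layout per RFC 5905). The following is a minimal sketch of decoding that octet, not code from NTPsec; the names `NTP_MODES`, `decode_first_octet`, and `is_mode7` are hypothetical and chosen only for illustration.

```python
# Sketch only: decode the first octet of an NTP packet header (RFC 5905 layout).
# All names here are illustrative assumptions, not NTPsec identifiers.

NTP_MODES = {
    0: "reserved",
    1: "symmetric active",
    2: "symmetric passive",
    3: "client",
    4: "server",
    5: "broadcast",
    6: "control message (used by ntpq)",
    7: "private use (used by ntpdc; removed in NTPsec)",
}


def decode_first_octet(octet: int) -> dict:
    """Split the first header octet into leap indicator, version, and mode."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in range 0..255")
    return {
        "leap": (octet >> 6) & 0x3,     # top 2 bits
        "version": (octet >> 3) & 0x7,  # middle 3 bits
        "mode": octet & 0x7,            # low 3 bits
    }


def is_mode7(packet: bytes) -> bool:
    """Return True if the packet's mode field is 7."""
    return len(packet) >= 1 and decode_first_octet(packet[0])["mode"] == 7
```

A daemon that has dropped Mode 7 support can discard such packets after inspecting a single octet, which is part of why removing the feature outright is simpler than hardening every request handler behind it.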
Finally, do you want to weigh in on the necessity for highly accurate local RT clocks in NTP servers? That seems to be the big bugaboo in cost limiting right now.
I'll reply to this in a separate thread.
-- <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>