Mel Beckman <mel@beckman.org>:
Finally, do you want to weigh in on the necessity for highly accurate local RT clocks in NTP servers? That seems to be the big bugaboo in cost limiting right now.
Yes, this is a topic on which I have some well-developed judgments, due to having collected (and, where practical, tested) a pretty comprehensive set of figures on components of the NTP error budget. I've even invented some hardware that simplifies the problem.

The background to my findings is laid out in my "Introduction to Time Service" HOWTO: http://www.catb.org/gpsd/time-service-intro.html

I find that an effective way to learn my way into a new application domain is to first do knowledge capture on the assumptions its experts are using, then document those. "Introduction to Time Service" was written to do that, and as a white paper for my project management. Criticism and corrections are, of course, welcome.

In order to discuss the value of accurate clocks intelligently, we need to split apart two issues: accuracy and availability. Of course we want the most accurate time our networks can deliver; we also want to hedge against sporadic or systemic failure of single time sources.

The most important simplification of either issue is that the clock accuracy worth paying for is bounded both by user expectations and by the noise floor defined by our network jitter. According to RFC 5905, the expected accuracy of NTP time is "several tens of milliseconds." User expectations seem to have evolved to on the close order of 10ms. I think it's not by coincidence that this is pretty close to the jitter in ping times I see when I bounce ICMP off a well-provisioned site like (say) google.com through my Verizon FIOS connection.

It's good rule-of-thumb engineering that if you want to be metrologically accurate you should pay for precision an order of magnitude below your feature size *and not more than that*. Thus, with a feature size of 10ms, the economic sweet spot is a clock with accuracy of about 1ms. One reason discussions of how to budget for WAN timeservice clocks have tended to become heated in the past is that nothing inexpensive hit this sweet spot.
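To make the rule of thumb concrete, here's a back-of-the-envelope sketch. The RTT samples are made up for illustration (stand-ins for real ping measurements against a well-provisioned site); the logic is just "clock accuracy target = network jitter divided by ten":

```python
import statistics

def accuracy_target_ms(rtt_samples_ms):
    """Apply the rule of thumb: buy clock precision about one order of
    magnitude below the feature size set by your network jitter."""
    jitter = statistics.pstdev(rtt_samples_ms)  # jitter = spread of RTTs
    return jitter / 10

# Hypothetical ping RTTs in milliseconds (illustrative, not measured):
samples = [22.1, 31.7, 18.9, 40.2, 27.5, 35.0, 24.8, 30.3]
print(round(accuracy_target_ms(samples), 2))  # a fraction of a millisecond
```

With jitter on the close order of 10ms, this lands you at a target in the sub-millisecond-to-1ms range, which is the sweet spot argued for above.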
The world was largely divided between cheap time sources with too much jitter (e.g. GPS in-band data with a wander of 100ms or worse) and expensive high-precision clocks designed for PTP over Ethernet that deliver three or more orders of magnitude more precision than WAN time service can actually use.

However... add the 1PPS signal your GPS engine ships (top of UTC second, accurate to about 50ns) and the picture changes completely. With that delivered over RS232, your delivered accuracy rises to single-digit numbers of microseconds, which is two orders of magnitude better than you need for your 1ms goal.

Now we have a historical problem, though: RS232 and the handshake lines you could use to signal 1PPS are dying, being replaced by USB, which doesn't normally bring 1PPS out to where the timeserver OS can see it.

In 2012, nearly three years before being recruited for NTPsec, I solved this problem as part of my work on GPSD. The key to this solution is an obscure feature of USB, and a one-wire patch to the bog-standard design for generic USB GPSes that exploits it. Technical details on request, but what it comes down to is that with this one weird trick(!) you can mass-produce primary time sources with a jitter bounded by the USB polling interval for about $20 a pop.

The USB 1 polling interval is 1ms. Bingo. We're done. If we're only weighting accuracy and not availability, a USB GPS is as much clock as you need for WAN timeservice *provided it exposes 1PPS*. These devices exist, because I designed them and found a shop in Shenzhen to build them. They're called the Navisys GR-601W, GR-701W, and GR-801W.

(A viable, only slightly more expensive alternative is to mate a GPS daughterboard to a hackerboard like the Raspberry Pi and run NTP service on that. I'll have much, much more to say about that in a future post.)

Of course, now we have to talk about availability. GPS sometimes loses lock.
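The two 1PPS delivery paths can be compared as simple error budgets. A sketch, with assumed figures: the 50ns pulse-edge accuracy is from the text, the few-microseconds serial interrupt latency is my illustrative placeholder, and the 1ms USB 1.x polling interval is the bound discussed above:

```python
# Back-of-the-envelope 1PPS error budgets, in nanoseconds.
# All figures are order-of-magnitude estimates, not measurements.
TARGET_NS = 1_000_000  # the 1 ms sweet spot for WAN time service

rs232_path = {
    "gps_pps_edge": 50,           # top-of-second accuracy of the 1PPS pulse
    "serial_irq_latency": 5_000,  # assumed: a few us to service the line
}
usb1_path = {
    "gps_pps_edge": 50,
    "usb_polling": 1_000_000,     # jitter bounded by the 1 ms USB 1.x poll
}

for name, budget in (("RS232", rs232_path), ("USB 1.x", usb1_path)):
    total = sum(budget.values())
    print(f"{name}: worst case ~{total} ns against a {TARGET_NS} ns target")
```

The point of the comparison: the RS232 path is two orders of magnitude inside the target, while the USB 1.x path sits right at it, which is exactly enough for WAN time service and no more.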
There are sporadic and systemic availability risks, due to jamming and system failures like the 2012 hiccup, and extreme scenarios like a giant meteorite hitting GPS ground control in Colorado. What you should be willing to pay for a hedge against this is proportional to your 95%-confidence estimate of the maximum outage interval.

At the low end, this is simple; put your antenna on a mast that guarantees unobstructed skyview. At the high end it's effectively impossible, in that anything that takes down GPS and Galileo permanently (giant meteor impact, war in space, collapse of the U.S. and Europe) is likely to be in the you-have-much-bigger-problems-than-inaccurate-time department.

Traditionally, dedicated time-source hardware like rubidium-oscillator GPSDOs is sold on accuracy, but for WAN time service their real draw is long holdover time, with lower frequency drift than you get from the cheap, non-temperature-compensated quartz crystals in your PC. There is room for debate about how much holdover you should pay for, but you'll at least be thinking more clearly about the problem if you recognize that you *should not* buy expensive hardware for accuracy. For WAN time service, in that price range, you're either buying holdover and knowing you're doing so, or wasting your money.
-- 
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Everything you know is wrong. But some of it is a useful first approximation.