Wednesday, April 05, 2017

Tick

A common source of inaccuracy in timepieces is an error in the period of the device's tick, which causes the clock to run too fast or too slow. This is sometimes described in dimensionless units like parts per million or PPM: a one part per million error in a clock with a tick of one second is equivalent to an error of about thirty-two seconds each year, because there are approximately thirty-two million seconds in a year. One PPM is equivalent to a relative error of 0.000001 or 0.0001%.
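
Here is that arithmetic as a few lines of C, just as a sanity check (the numbers are illustrative, not measurements of any particular timepiece):

#include <stdio.h>

int main(void)
{
    /* Roughly 365.25 days of 86,400 seconds each: about 31.6 million seconds. */
    double seconds_per_year = 365.25 * 24.0 * 60.0 * 60.0;
    double error_ppm = 1.0; /* a one part per million error */
    double drift_seconds = seconds_per_year * (error_ppm / 1000000.0);

    printf("%.0f PPM drifts about %.0f seconds per year\n", error_ppm, drift_seconds);

    return 0;
}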

More ticks per second - so each tick has a shorter period - are better. The shorter the period of the tick, the less important a relative error in that period is, because that error will be smaller in absolute terms.

The inverse of the period of the tick is, of course, its frequency, what a scientist or engineer would measure in units of Hertz or cycles per second, but what horologists refer to as beats per second.

Rolex Oyster Perpetual Date GMT Master II Automatic Chronometer

My beloved Rolex GMT Master II wristwatch - originally designed by Rolex for Pan American flight crews on long haul flights - ticks at a frequency of eight beats per second: if you listen very carefully, for every second the second hand marks off, you can hear it tick eight times. That's the sound of the mechanical escapement and balance wheel - you can think of them as a kind of tiny pendulum - inside my Rolex, driven by a mainspring that is automatically wound every time I move my arm. Eight beats per second is pretty typical of a good mechanical watch. The mechanism, or movement, of the watch handles the translation of the 8Hz beat into the motion of all of the hands on the watch.

A one PPM error rate in my Rolex would equate to it being off about four seconds a year. In practice, no mechanical watch is anywhere near that good. Rolex aims for a few seconds a day, perhaps in the neighborhood of fifty PPM. Still, my Rolex is a certified Swiss chronometer, having passed a test for stability, accuracy, precision, and reliability that can be traced back to the invention of chronometers for celestial navigation in the days of sail. My Rolex is perfectly adequate for navigating by sextant, should the need arise. Also, it doesn't need batteries.

Quartz watches work like this too. But instead of a mechanical escapement, they use the vibration of a quartz crystal, called an oscillator, driven by an electric current, to count off ticks. (The audible click you may hear in an analog quartz watch isn't the quartz crystal, which is vibrating much too fast to discern, but the electro-mechanical actuator that moves the watch's hands.) The quartz watch movement exploits the piezoelectric effect: a minute electrical current applied to a quartz crystal causes it to vibrate at a known rate. (And vice versa: applying mechanical stress to a quartz crystal generates a minute electrical current. Physics!)

Todd Snyder Timex Mod Quartz Watch

The quartz crystal in my affordable yet oh so stylish Todd Snyder Timex Mod wristwatch vibrates at a frequency of 32,768 beats per second, 4,096 times more often than my Rolex. So the period of the Timex tick is a tiny fraction of that of my Rolex. A relative error in the period of the Timex tick would be far smaller in absolute terms than the same relative error in my Rolex. Which is why my Timex is typically going to be more accurate than my Rolex.

Even so, quartz crystals vary in their exact rate of vibration according to their temperature, which can vary widely in a wristwatch. My Timex may have an error rate of perhaps four PPM. In applications which require a higher degree of precise timing, a crystal oscillator may be embedded in an oven that thermostatically controls its temperature. Such oven-controlled crystal oscillators (OCXO) can be surprisingly small surface mount components.

Casio F-91W Quartz Digital Watch

The use of crystal oscillators as timing sources, or frequency standards, is ubiquitous. There is at least one crystal oscillator in every digital device you own. The accuracy, precision, and reliability of the oscillator in this US$10 Casio F-91W digital watch - that, plus its back is easily removed, giving access to its innards, and, of course, it's cheap as dirt and easily available - makes it a favorite amongst fabricators of improvised explosive devices (IED), or so I've read.

A rubidium atomic clock - an atomic clock is really an extraordinarily high precision oscillator - has a frequency of 6,834,682,610.904 beats per second. That astonishingly high frequency is why atomic clocks are so good: we are hard pressed to even measure a relative error in a period that short, except by comparing two atomic clocks. (Which, by the way, is one way in which time dilation in the Theory of Relativity has been experimentally verified.)

A cesium atomic clock has a frequency of exactly 9,192,631,770 beats per second. Why exactly? Because 9,192,631,770 beats of a cesium atomic clock is the definition of a second in the International System of Units.

And a strontium atomic clock - the most accurate timekeeping device ever constructed by humankind - beats about 430 trillion times a second. It will lose a second perhaps every fifteen billion years. That's an error rate better than any number for which we have a name.

Tuesday, April 04, 2017

Some Stuff That Has Worked For Me In C

Diminuto (Spanish for "tiny") is a library of C functions and macros that support the kind of low level embedded, real-time, and systems programming work I am typically called upon to do on the Linux/GNU platform. It's no longer so tiny. I started it way back in 2008 and it is still going strong. Its open-source (FSF LGPL 2.1) code has found its way into a number of shipping commercial products for more than one of my clients. Diminuto can be found on GitHub along with eighteen (so far) other repositories (some private) of my work.

My forty-year career has been a long, strange, and marvelous trip. Along the way, I've found a number of techniques of architecture, design, and implementation in C and C++ that have worked well for me, solving a number of recurring problems. I don't claim that these techniques are the best, or that I am the only, or even the first, developer to have thought of them. Some of them I shamelessly borrowed, sometimes from other languages or operating systems, because I try to know a good idea when I see one.

This article documents some of those ideas.

Standard Headers and Types

Just about every big closed-source project I've worked on defined its own integer types. Even the Linux kernel does this. Types like u32 for unsigned 32-bit integers, or s8 for signed 8-bit integers. Historically - and I'm talking ancient history here - that was a good idea, way back when C only had integer types like int and char, and there weren't dozens and dozens of different processors with register widths that might be 8, 16, 32, or 64 bits.

But this kind of stuff is now available in standard ANSI C headers, and Diminuto leverages the heck out of them.

stdint.h defines types like uint32_t and int8_t, and useful stuff like uintptr_t, which is guaranteed to be able to hold a pointer value, which on some platforms is 32 bits and on others 64 bits.

stddef.h defines size_t, which is used by a variety of POSIX and Linux system calls and functions, and which is guaranteed to be able to hold the value returned by the sizeof operator. (Its signed counterpart ssize_t comes from the POSIX header sys/types.h.)

stdbool.h defines the bool type and constant values for true and false (although, more about that in a moment).

In at least one client project, I talked them into redefining their own proprietary types via typedef to use these ANSI types, which simplified porting their code to new platforms.
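
A compatibility shim like that might look something like this; the legacy names here are made up for the sake of illustration, not any particular client's actual types:

#ifndef LEGACY_TYPES_H
#define LEGACY_TYPES_H

#include <stdint.h>

/* Map the hypothetical legacy names onto the standard ANSI C types. */
typedef uint32_t u32;   /* unsigned 32-bit integer */
typedef int32_t  s32;   /* signed 32-bit integer */
typedef uint16_t u16;   /* unsigned 16-bit integer */
typedef int16_t  s16;   /* signed 16-bit integer */
typedef uint8_t  u8;    /* unsigned 8-bit integer */
typedef int8_t   s8;    /* signed 8-bit integer */

#endif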

Boolean Values

Even though I may use the bool type defined in stdbool.h, I don't actually like the constants true and false. For sure, false is always 0. But what value is true? Is it 1? Is it all ones, e.g. 0xff? Is 9 true? How about -1?

In C, true is anything that is not 0. So that's how I code a true value: !0. I don't care how the C standard or the C compiler encodes true, because I know for sure that !0 represents it.

I normalize any value I want to use as a boolean using a double negation. For example, suppose I have two variables alpha and beta that are booleans. Do I use
(alpha == beta)
to check if they are equal? What if alpha is 2 and beta is 3? Both of those are true, but the comparison will fail. Do I use
(alpha && beta)
instead? No, because I want to know if they are the same, both true, or both false, not if they are both true. If C had a logical exclusive OR operator, I'd use that - or, actually, the negation of that - but it doesn't. I could do something like
((alpha && beta) || ((!alpha) && (!beta)))
but that hurts my eyes.

Double negation addresses this: !!-1 equals !0 equals whatever the compiler uses for true. I use
((!!alpha) == (!!beta))
unless I am positive that both alpha and beta have already been normalized.

I also do this if I am assigning a value to a boolean
alpha = !!beta
unless I am very sure that beta has already been normalized.
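
Here is a tiny, runnable illustration of why the double negation matters:

#include <stdio.h>

int main(void)
{
    int alpha = 2;  /* both are "true" in the C sense ... */
    int beta = 3;   /* ... but they are numerically different */

    printf("(alpha == beta) yields %d\n", (alpha == beta));                 /* 0: not what I meant */
    printf("((!!alpha) == (!!beta)) yields %d\n", ((!!alpha) == (!!beta))); /* 1: what I meant */

    int gamma = !!beta; /* gamma is now normalized to !0, i.e. 1 */
    printf("gamma is %d\n", gamma);

    return 0;
}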

Inferring Type

If I want to check if a variable being used as a boolean - no matter how it was originally declared - is true, I code the if statement this way.
if (alpha)
But if the variable is a signed integer and I want to know if it is not equal to zero, I code it this way.
if (alpha != 0)
And if it's an unsigned integer, I code it this way.
if (alpha > 0)
When I see a variable being used, I can usually infer its intended type, no matter how it might be declared elsewhere.

Parentheses and Operator Precedence

You may have already noticed that I use a lot of parentheses, even where they are not strictly speaking necessary. Can I remember the rules of operator precedence in C? Maybe. Okay, probably not. But here's the thing: I work on big development projects, hundreds of thousands or even millions of lines of code, with a dozen or more other developers. And if I do my job right, the stuff I work on is going to have a lifespan long after I leave the project and move on to something else. Even if I can remember the operator precedence of C, the next developer who comes along may not. So I want to make my assumptions explicit and unambiguous when I write expressions.

Also: sure, I may be writing C right now. But this afternoon, I may be elbow deep in C++ code. And tomorrow morning I may be writing some Python or Java. Tomorrow afternoon, sadly, I might be hacking some JavaScript that the UI folks wrote, even though I couldn't write a usable line of JavaScript if my life depended on it. Even if I knew the rules of operator precedence for C, that's not good enough; I need to know the rules for every other language I may find myself working in.

But I don't. So I use a lot of parentheses.

This is the same reason I explicitly code the access permissions - private, protected, public - when I define a C++ class. Because the default rules in C++ are different for class versus struct (yes, you can define access permissions in a struct in C++), and different yet again for Java classes.

Exploiting the sizeof Operator

I never hard-code a value when it can be derived at compile time. This is especially true of the size of structures or variables, or values which can be derived from those sizes.

For example, let's suppose I need two arrays that must have the same number of elements, but not necessarily the same number of bytes.
int32_t alpha[4];
int8_t beta[sizeof(alpha)/sizeof(alpha[0])];
The number of array positions of beta is guaranteed to be the same as the number of array positions in alpha, even though they are different types, and hence different sizes. The expression
(sizeof(alpha)/sizeof(alpha[0]))
divides the total number of bytes in the entire array by the number of bytes in a single array position.

This has been so useful that I defined a countof macro in a header file to return the number of elements in an array, providing it can be determined at compile time. I've similarly defined macros like widthof, offsetof, memberof, and containerof.
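
The actual Diminuto macros are prefixed and differ in the details, but the general shape is something like this:

#include <stddef.h>

/* Number of elements in a statically sized array. */
#define countof(array) (sizeof(array) / sizeof((array)[0]))

/* Recover a pointer to the enclosing structure from a pointer to one of
 * its members, in the style of the Linux kernel's container_of(). */
#define containerof(pointer, type, member) \
    ((type *)(((char *)(pointer)) - offsetof(type, member)))

struct record {
    int key;
    int value;
};

static struct record table[16];

static size_t slots = countof(table); /* 16, computed at compile time */

static struct record * enclosing(int * valuep)
{
    /* Given a pointer to the value field, recover the enclosing record. */
    return containerof(valuep, struct record, value);
}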

Doxygen Comments

Doxygen is a documentation generator for C and C++, inspired by Java's documentation generator Javadoc. You write comments in a specific format, run the doxygen program across your source code base, and it produces an API document in HTML. You can use the HTML pages directly, or convert them using other tools into a PDF file or other formats.

/**
 * Wait until one or more registered file descriptors are ready for reading,
 * writing, or accepting, a timeout occurs, or a signal interrupt occurs. A
 * timeout of zero returns immediately, which is useful for polling. A timeout
 * that is negative causes the multiplexer to block indefinitely until either
 * a file descriptor is ready or one of the registered signals is caught. This
 * API call uses the signal mask in the mux structure that contains registered
 * signals.
 * @param muxp points to an initialized multiplexer structure.
 * @param timeout is a timeout period in ticks, 0 for polling, <0 for blocking.
 * @return the number of ready file descriptors, 0 for a timeout, <0 for error.
 */
static inline int diminuto_mux_wait(diminuto_mux_t * muxp, diminuto_sticks_t timeout)
{
    return diminuto_mux_wait_generic(muxp, timeout, &(muxp->mask));
}

Even if I don't use the doxygen program to generate documentation, the format enforces a useful discipline and consistent convention in documenting my code. If I find myself saying "Wow, this API is complicated to explain!", I know I need to revisit the design. I document the public functions in the .h header file, which is the de facto API definition. If there are private functions in the .c source file, I put Doxygen comments for those functions there.

Namespaces (updated 2017-04-25)

Most of the big product development projects I've worked on have involved writing about 10% new code, and 90% integrating existing closed-source code from prior client projects. When that integration involved using C or C++ code from many (many) different projects, name collisions were frequently a problem, either with symbols in the source code itself (both libraries have a global function named log), or in the header file names (e.g. both libraries have header files named logging.h).

In my C++ code, such as you'll find in my Grandote project on GitHub (a fork from Desperadito, itself a fork from Desperado, both since deprecated), there is an easy fix for the symbol collisions: C++ provides a mechanism to segregate compile-time symbols into namespaces.

namespace com {
namespace diag {
namespace grandote {

Condition::Condition()
{
 ::pthread_cond_init(&condition, 0);
}

Condition::~Condition() {
 ::pthread_cond_broadcast(&condition);
 ::pthread_yield();
 ::pthread_cond_destroy(&condition);
}

int Condition::wait(Mutex & mutex) {
 return ::pthread_cond_wait(&condition, &mutex.mutex); // CANCELLATION POINT
}

int Condition::signal() {
 return ::pthread_cond_broadcast(&condition);
}

}
}
}

It doesn't matter that another package defines a class named Condition, because the fully-qualified name of my class is actually com::diag::grandote::Condition. Any source code that is placed within the namespace com::diag::grandote will automatically default to using my Condition, not the other library's Condition, eliminating the need for me to specify that long name every time.

Note too that the namespace incorporates the domain name of my company in reverse order - borrowing an idea from how Java conventionally organizes its source code and byte code files. This eliminates any collisions that might otherwise occur because another library has the same library, class, or module name as part of its own namespace.

This works great for C++, but the problem remains for C. So there, I resort to just prepending the library name and the module name to the beginning of every function. You already may have noticed that in the diminuto_mux_wait function in the Doxygen example above: the wait operation in the mux module of the diminuto library.

But one problem that neither C++ namespaces nor my C naming convention solves is header file collisions. So in both C and C++ I organize header file directories in a hierarchical fashion so that #include statements look like this, from my Assay project.

#include <stdio.h>
#include <errno.h>
#include "assay.h"
#include "assay_parser.h"
#include "assay_fixup.h"
#include "assay_scanner.h"
#include "com/diag/assay/assay_scanner_annex.h"
#include "com/diag/assay/assay_parser_annex.h"
#include "com/diag/diminuto/diminuto_string.h"
#include "com/diag/diminuto/diminuto_containerof.h"
#include "com/diag/diminuto/diminuto_log.h"
#include "com/diag/diminuto/diminuto_dump.h"
#include "com/diag/diminuto/diminuto_escape.h"
#include "com/diag/diminuto/diminuto_fd.h"
#include "com/diag/diminuto/diminuto_heap.h"

The first two header files, stdio.h and errno.h, are system header files. The next four header files whose names begin with assay_ are private header files that are not part of the public API and which are in the source directory with the .c files being compiled. The next two header files are in a com/diag/assay subdirectory that is part of the Assay project. The remaining header files are in a com/diag/diminuto subdirectory that is part of the Diminuto project.

Here's another example, from the application gpstool that makes use of both my Hazer and Diminuto projects.

#include <assert.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include "com/diag/hazer/hazer.h"
#include "com/diag/diminuto/diminuto_serial.h"
#include "com/diag/diminuto/diminuto_ipc4.h"
#include "com/diag/diminuto/diminuto_ipc6.h"
#include "com/diag/diminuto/diminuto_phex.h"

This convention is obviously not so important in my C header files, where the project name is part of the header file name. But it becomes a whole lot more important in C++, where my convention is to name the header file after the name of the class it defines, as this snippet from Grandote illustrates.

#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "com/diag/grandote/target.h"
#include "com/diag/grandote/string.h"
#include "com/diag/grandote/Platform.h"
#include "com/diag/grandote/Print.h"
#include "com/diag/grandote/DescriptorInput.h"
#include "com/diag/grandote/DescriptorOutput.h"
#include "com/diag/grandote/PathInput.h"
#include "com/diag/grandote/PathOutput.h"
#include "com/diag/grandote/ready.h"
#include "com/diag/grandote/errno.h"
#include "com/diag/grandote/Grandote.h"

Once I came upon this system of organizing header files, not only did it become much easier to integrate and use multiple projects in a single application, but the source code itself became more readable, because it was completely unambiguous where everything was coming from. This strategy worked so well that I use it not just for C and C++, but to organize my Python, Java, and other code too. I also organize my GitHub repository names, and, when I use Eclipse, my project names, in a similar fashion:

  • com-diag-assay,
  • com-diag-grandote,
  • com-diag-diminuto,
  • com-diag-hazer

etc.

This is more important than it sounds. Digital Aggregates Corporation (www.diag.com) is my consulting company that has been around since 1995. It holds the copyright on my open-source code that may find its way into my clients' products. The copyright of my closed-source code is held by another of my companies, Cranequin LLC (www.cranequin.com). So if I see the header file directory com/cranequin or a project or repository with the prefix com-cranequin, I am reminded that I am dealing with my own proprietary code under a different license.

Time

The way time is handled in POSIX - whether you are talking about time of day, a duration of time, or a periodic event with an interval time - is a bit of a mess. Let me show you what I mean.

gettimeofday, which returns the time of day, uses the timeval structure that has a resolution of microseconds. Its cousin time uses an integer that has a resolution of seconds.

clock_gettime, which when used with the CLOCK_MONOTONIC_RAW argument returns an elapsed time suitable for measuring duration, uses the timespec structure that has a resolution of nanoseconds.

setitimer, which can be used to invoke a periodic interval timer, uses the itimerval structure that has a resolution of microseconds.

select, which can be used to multiplex input/output with a timeout, uses the timeval structure that has a resolution of microseconds. Its cousin pselect uses the timespec structure that has a resolution of nanoseconds.

poll, which can also be used to multiplex input/output with a timeout, uses an int argument that has a resolution of milliseconds. Its cousin ppoll uses the timespec structure that has a resolution of nanoseconds.

nanosleep, which is used to delay the execution of the caller, uses the timespec structure that has a resolution of nanoseconds. Its cousin sleep uses an unsigned int argument that has a resolution of seconds. Its other cousin usleep uses a useconds_t argument that is a 32-bit unsigned integer and which has a resolution of microseconds.

Seconds, milliseconds, microseconds, nanoseconds. Six different types. It's too much for me.

Diminuto has two types in which time is stored: diminuto_ticks_t and diminuto_sticks_t. Both are 64-bit integers, the only difference being one is unsigned and the other is signed. (I occasionally regret even that.) All time is maintained in a single unit of measure: nanoseconds. This unit is generically referred to as a Diminuto tick.
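
This is not the actual Diminuto implementation, just a sketch of the idea: funnel every POSIX representation of time into a single signed 64-bit count of nanoseconds.

#include <stdint.h>
#include <time.h>

typedef int64_t sticks_t; /* signed ticks */

static const sticks_t TICKS_PER_SECOND = 1000000000LL; /* nanoseconds */

/* Convert a POSIX timespec into ticks. */
static sticks_t ticks_from_timespec(const struct timespec * tp)
{
    return (((sticks_t)tp->tv_sec) * TICKS_PER_SECOND) + (sticks_t)tp->tv_nsec;
}

/* The time of day since the POSIX epoch, in ticks. */
static sticks_t ticks_clock(void)
{
    struct timespec now = { 0 };
    (void)clock_gettime(CLOCK_REALTIME, &now);
    return ticks_from_timespec(&now);
}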

diminuto_time_clock returns the time of day in Diminuto ticks. There is another function in the time module to convert the time of day from ticks since the POSIX epoch into year, month, day, hour, minute, second, and fraction of a second in - you guessed it - Diminuto ticks.

diminuto_time_elapsed returns the elapsed time in Diminuto ticks suitable for measuring duration.

diminuto_timer_periodic invokes a periodic interval timer, and diminuto_timer_oneshot invokes an interval timer that fires exactly once, both using Diminuto ticks to specify the interval.

diminuto_mux_wait uses pselect for I/O multiplexing, and diminuto_poll_wait does the same but uses ppoll, both specifying the timeout in Diminuto ticks. (I believe pselect is now implemented in the Linux kernel using ppoll, but that wasn't true when I first wrote this code.)

diminuto_delay delays the caller for the specified number of Diminuto ticks.

Floating Point

If you really need to use floating point, you'll probably know it. But using floating point is problematic for lots of reasons. Diminuto avoids using floating point except in a few applications or unit tests where it is useful for reporting final results.

Even though Diminuto uses a single unit of time, that doesn't mean the underlying POSIX or Linux implementation can support all possible values in that unit. So every module in Diminuto that deals with time has an inline function that the application can call to find out what resolution the underlying implementation supports. And every one of those functions returns that value in a single unit of measure: Hertz. Hertz - cycles per second - is used because it can be expressed as an integer. Its inverse is the smallest time interval supported by the underlying implementation.

diminuto_frequency returns 1,000,000,000 ticks per second, the base frequency used by the library. The inverse of this is one billionth of a second or one nanosecond.

diminuto_time_frequency returns 1,000,000,000, the inverse being the resolution of timespec in seconds.

diminuto_timer_frequency returns 1,000,000, the inverse being the resolution of setitimer.

diminuto_delay_frequency returns 1,000,000,000, the inverse being the resolution of timespec.

With just some minor integer arithmetic - which can often be optimized out at compile-time - the application can determine what is the smallest number of ticks the underlying implementation can meaningfully support in each module. (Note that this is merely the resolution representable in the POSIX or Linux API; the kernel or even the underlying hardware may support an even coarser resolution.)
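
The arithmetic is trivial. A sketch, with the frequencies hard-coded here purely for illustration (a real application would get them from the frequency functions described above):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int64_t base_frequency  = 1000000000LL; /* library ticks per second */
    int64_t timer_frequency = 1000000LL;    /* what the timer module can actually resolve, in Hertz */

    /* The smallest interval the timer can meaningfully express, in ticks. */
    int64_t granularity = base_frequency / timer_frequency;

    printf("timer granularity is %lld ticks\n", (long long)granularity);

    return 0;
}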

Remarks

I first saw the countof technique in a VxWorks header file perhaps twenty years ago while doing embedded real-time development in C++ at Bell Labs. Similarly, a lot of these techniques have been picked up - or learned the hard way - while taking this long strange trip that has been my career. I have also benefitted greatly from having had a lot of mentors, people smarter than I am, who are so kind and generous with their time. Many of these techniques gestated in my C++ library Desperado, which I began putting together in 2005.

I started developing and testing the Diminuto C code for a number of reasons.
  • I found myself solving the same problems over and over again, sometimes for different clients, sometimes even for different projects for the same client. For one reason or another - sometimes good, sometimes not so good - the proprietary closed-source code I developed couldn't be shared between projects. But open-source code could easily be integrated into the code base.
  • I wanted to capture some of the design patterns I had found useful in my closed-source work. Working, unit tested, open-source code, developed in a clean room environment, and owned by my own company, was a useful way to do that.
  • I needed a way to get my head around some of the evolving C, C++, POSIX, and Linux APIs. Since I can only really learn by doing, I had to do, which for me meant writing and testing code.
  • I wanted to make an API that was more consistent, more easily interoperable, and less prone to error than was offered by raw POSIX, Linux, and GNU.
  • In the past I have been a big proponent of using C++ for embedded development. I have written hundreds of thousands of lines of C++ while doing such work over the years. But more recently, I have seen a gradual decline in the use of C++ by my clients, with a trend more towards segregating development into systems code in C and application code in languages like Python or Java. Although I miss C++, I have to agree with the economics of that trend. And I fear that C++ has evolved into a language so complex that it is beyond the ken of developers that my clients can afford.
I am not a "language lawyer". I get paid to develop product that ships and generates revenue. I believe I have been successful in part because I am a very hands-on and evidence based person, but also because I am extremely pragmatic about what works and what doesn't.

This is some stuff that, over the span of many years, has worked.

Update (2017-04-14)

I recently forked Desperadito (which itself is a mashed up fork of both Desperado and Hayloft) into Grandote (https://github.com/coverclock/com-diag-grandote). What's different is that Grandote uses the Diminuto C library as its underlying platform abstraction. It requires that you install Diminuto, Lariat (a Google Test helper framework), and Google Test (or Google Mock). That's some effort, but at least for me it wasn't overly burdensome when I built it all this morning on a Ubuntu 16.04 system. The other improvement over Desperadito is that both the old hand-coded Desperado unit tests and the newer Hayloft Google Test unit tests work. Grandote would be my C++ framework going forward, if I were to need my own C++ framework. (And Grandote doesn't preclude using STL or Boost as well.)

Friday, March 31, 2017

My Stratum-1 Desk Clock

Long time readers who remember my article Does anybody really know what time it is? from way back in 2006 will appreciate that I have a deep and abiding interest in timekeeping. So it might come as a surprise to those folks that it has taken me this long to build a stratum-1 desk clock. This is a desk clock so accurate and precise that it is within a few milliseconds of atomic clocks, which aren't really clocks in the usual sense but oscillators that make super precise frequency standards, the best sources of clock ticks that humankind knows how to make.

My Stratum-1 Desk Clock

I cobbled together this little project from a handful of commodity parts: a Raspberry Pi, a GPS receiver board with a 1PPS output, a battery-backed real-time clock board, and an LCD display.
All of the heavy lifting is done by open source software written by other folks. I followed Eric S. Raymond's Stratum-1-Microserver HOWTO and used his clockmaker Python script to do most of the software build and install. As a result, the Pi runs the gpsd Global Positioning System (GPS) daemon and the ntpd Network Time Protocol (NTP) daemon. I administered the WiFi radio on the desk clock so that it can be queried by NTP daemons running on other systems on my local area network. As far as I know, this is the only desk clock I've ever owned that has virtual memory and supports ssh.

The GPS daemon reads the National Marine Electronics Association (NMEA) sentences output from the GPS board via the Pi's serial port. These sentences contain the computed time and date based on timestamps received from both the U. S. NAVSTAR GPS and the Russian GLONASS satellites in mid-earth orbit. The daemon also synchronizes with the GPS board's One Pulse Per Second (1PPS) output via a General Purpose Input/Output (GPIO) pin. This 1Hz heartbeat is derived from the GNSS signals, and is the high technology digital equivalent to the "when you hear the tone, the time will be..." from the old time-of-day telephone service.

The Real-Time Clock (RTC) saves the current time in non-volatile memory when the Pi is shut down, and keeps ticking off the seconds even when the Pi is powered off, thanks to a battery backup. The time is restored from the RTC when the Pi boots up. This allows the desk clock to know the time and date (within a fraction of a second anyway) before the GPS board can achieve a lock on enough satellites (four at least) to compute the time.

The LCD display is driven by a modest little Python script that I wrote that just queries the Linux system clock five times a second. The NTP daemon keeps the system clock synchronized to GPS Time (GPST) plus the appropriate number of leap seconds offset from Coordinated Universal Time (UTC) as received from the satellites. The GNU library handles the conversion from UTC to local time, depending on the administered time zone, as well as any arcane conversions necessary for Daylight Saving Time (DST).
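
My actual script is in Python, but the same idea in C is just a few library calls; the C library does all of the time zone and Daylight Saving Time heavy lifting:

#include <stdio.h>
#include <time.h>

int main(void)
{
    char buffer[64];
    time_t now = time((time_t *)0);
    struct tm local;

    /* Convert UTC to local time using the administered time zone. */
    (void)localtime_r(&now, &local);
    (void)strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S %Z", &local);
    puts(buffer);

    return 0;
}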

This is remarkably complex, but all handled by the open source software. The Python script that I wrote is perhaps a dozen lines of code at most. By virtue of the fact that the time maintained by the GNSS satellites is stratum-0 - we'll get to what that means in a moment - and the desk clock is constantly being corrected to follow that clock source, my desk clock is effectively a stratum-1 clock.

There was a time when owning a stratum-1 clock would have been a major financial investment. Today, it's a hobby.

You can find the additional files I modified or added on the Raspberry Pi on GitHub: https://github.com/coverclock/com-diag-hourglass .

Too Much Information

A little background might be called for to really appreciate my desk clock. I haven't discovered the optimal order to present these factlets, so you will have to forgive a few forward references.

GNSS. There are two fully operational Global Navigation Satellite Systems in medium Earth Orbit (about 12,000 miles altitude). The Navigation Satellite Timing and Ranging (NAVSTAR) Global Positioning System (GPS) constellation was the first one, launched and supported by the U. S. Department of Defense. The Globalnaya Navigazionnaya Sputnikovaya Sistema (GLONASS) is the Russian constellation. BeiDou-2 is the constellation that China is building. Galileo is the constellation the European Union is building. Some GPS receivers only receive NAVSTAR signals; some (like the one in my desk clock) receive both NAVSTAR and GLONASS signals. (Henceforth I shall use the initialism GPS as a generic term.)

Accuracy versus precision. Accuracy is a measure of systematic error. Precision is a measure of random error. The classic example is if you fire arrows at a target and the arrows scatter in and around the bullseye, you are accurate but not precise; if the arrows all hit dead on the same spot that is not the bullseye, you are precise but not accurate. Clocks can be described the same way.

Clocks that are synchronized to GPS are accurate because of the timestamps transmitted by the GPS satellites, which can be read from the NMEA data stream presented by the GPS receiver. They are precise because of the 1Hz 1PPS pulse derived from the GPS transmissions that indicates exactly when the timestamp is correct. Not all GPS receivers emit the 1PPS pulse.
(Update 2017-04-27) I have recently learned that folks who think about timekeeping a lot more deeply than I do prefer the term "stability" to the more statistical "precision". I personally find that term to be a little ambiguous in my line of work. But it's important to know the lingo.

Clock strata. Clocks in information and telecommunication systems are conventionally classified as to their precision, accuracy, and stability into strata. (This classification predates GPS and its use as a clock source.) Stratum-0 clocks are the most accurate and precise clocks we know how to construct: "atomic" clocks that discipline the duration of their ticks by measuring the physical properties of streams of cesium or rubidium atoms. Stratum-1 clocks are those that derive their timing from stratum-0 clocks, and so forth.

(Updated 2017-04-11) Although I don't see it referred to much, ANSI T1.101 (6.1, p. 24, 1987) specified the minimum accuracy for clocks of each stratum in dimensionless units, omitting stratum-0; this was in the days of copper (e.g. DS-1) and occasionally optical (e.g. OC-3) telecommunications channels.

  • Stratum-1: 1 x 10^-11
  • Stratum-2: 1.6 x 10^-8
  • Stratum-3: 4.6 x 10^-6
  • Stratum-4: 3.2 x 10^-5

Clocks of each stratum are kept accurate and precise by a combination of continual adjustment to their higher-quality lower-stratum clock source, and by their own physical properties. For example, digital electronic clocks discipline the period of their ticks on the vibration of a crystal, which is more consistent if it is in a temperature controlled environment. I have worked on embedded systems which had surface mount clock crystals contained in a tiny oven - a thermostatically controlled chamber that kept the crystal at a constant operating temperature to ensure consistent behavior - called an Oven Controlled Crystal Oscillator (OCXO). The inherent quality of a clock is a measure of its ability to holdover - to maintain the correct time during any period when it temporarily loses contact with its lower-stratum clock source.

Time of day, duration, and frequency. When we talk about clocks we are really talking about three fundamentally different kinds of things: an instrument which measures the time of day; one which measures time duration; and one that serves as a frequency standard. Sometimes the same device can be used for all three, but in digital information and telecommunications systems, typically not.

When we think of a clock, we probably first think of one that tells us the time of day. In fact, some computer systems refer to such a device as the Time Of Day (TOD) clock. (POSIX likes to refer to this as the real-time clock, but this time is no more, and sometimes a lot less, real than other ways of measuring time.) A TOD clock is useful for knowing when to meet your friends for lunch. But its usefulness as a measurement of time duration breaks down when it is

  • sprung forward or fallen backward for Daylight Saving Time, or
  • the NTP daemon adjusts the clock to synchronize it with a remote time source, or
  • a leap second is inserted (or subtracted) to re-align the TOD clock with the solar day.

Time Of Day clocks are useful for time stamping events in log files, especially if the clocks on different systems are more or less synchronized, and are even more useful if such timestamps are kept in a common time zone, like Coordinated Universal Time, so that related events in different systems can be easily correlated. But such time stamps are at best only approximately useful for measuring duration, since the TOD clock may be routinely adjusted forwards or backwards while the system is running. The POSIX standard defines the gettimeofday(2) system call to acquire the time of day, but this call is not appropriate for measuring duration for just the reasons cited.

When we think of measuring time duration, we might think of a stopwatch. The POSIX standard defines just such a capability in the clock_gettime(2) system call when it is invoked using the CLOCK_MONOTONIC_RAW argument. This provides access to a monotonically increasing hardware clock with which duration can be measured by subtracting prior values from successive values. It is useless for telling time of day, but it is the right way to measure a time duration since it does not change as the TOD clock is adjusted.
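
A minimal stopwatch built on that call looks something like this on Linux:

#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    struct timespec before = { 0 };
    struct timespec after = { 0 };

    (void)clock_gettime(CLOCK_MONOTONIC_RAW, &before);
    sleep(1); /* stand-in for the work being measured */
    (void)clock_gettime(CLOCK_MONOTONIC_RAW, &after);

    long long nanoseconds =
        ((long long)(after.tv_sec - before.tv_sec)) * 1000000000LL
        + (after.tv_nsec - before.tv_nsec);

    printf("elapsed: %lld nanoseconds\n", nanoseconds);

    return 0;
}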

All conventional digital electronic systems are synchronous: actions in integrated circuits are triggered by digital pulses generated, directly or indirectly, by the tick of a common frequency standard. Typically this frequency standard is a quartz crystal oscillator - basically the same hardware as in your quartz wristwatch. (It is this oscillator function - the thing that provides the tick of the clock - that the atomic clock provides.) Sometimes it is a periodic digital pulse arriving from an external source, like a communication channel. The entire global telecommunication infrastructure - landlines, internet connections, cellular systems - is in effect all synchronized using a shared frequency standard. A Bell Labs colleague once described this as "getting everyone in the world to jump up and down at the same time". Before GPS was available to provide such a shared standard, using a common constellation of satellites containing atomic clocks, AT&T owned several ground-based atomic clocks, which were used as high quality frequency standards for their network and others. The POSIX standard defines the setitimer(2) system call that can be used to access an interval timer that fires with a caller-defined period.
Update (2017-04-25): Although frequency, duration, and time of day are three very different things, they may be derived from one another. Time duration may be derived from counting ticks from a frequency standard. And time of day can be thought of as a nomenclature for naming a duration from a fixed point in time, such as midnight or noon. But in hardware and software systems, they are typically all treated very differently.

Offset, jitter, and drift. These are terms used to describe variation between two clocks, or between successive ticks of the same clock. Offset is easy: it is the absolute difference between two clocks. The hard part is that the offset is likely to change as time goes by; that's where jitter and drift come in. The exact definitions of jitter and drift are somewhat open to debate, but here is how I have come to think about them: jitter is a high-frequency, short-term, statistical variation in a clock; drift is a low-frequency, long-term, systematic bias in a clock. Or: jitter is about precision, drift is about accuracy.

Jitter and Drift

Jitter in real-time data streams like audio or voice is a small variation in the expected inter-arrival time of the packets, either a little too early or a little too late. With jitter, there is no correlation between successive packets; the next packet is just as likely to be early as late, so that the mean arrival time is correct. As long as jitter is small, packets arriving early can be accommodated by using jitter buffers; the receiver just holds the data until its expected arrival time occurs, then plays it out. However, if a packet of data arrives too late, it may have to be discarded because its "best if used before" time has passed and the next packet in the stream is right on its tail, and the user will notice an audible or visible artifact in the playback of the stream.

Drift is most typically unidirectional, although it can also be bidirectional. Let's say you have an interval timer that is supposed to fire once per second - 1Hz. So you expect a timer event to occur at 1 second, 2 seconds, 3 seconds, etc. on the timeline. But the clock that is driving the interval timer is running at a frequency that is 10% too slow, so its period is 10% too long; the first timer event arrives at 1.1 second, the second at 2.2 seconds, the third at 3.3 seconds. The timeline of the events is shifting in the increasing direction. Something similar happens if the clock is 10% too fast, except the events on the time line shift in the decreasing direction. 10% is a big error; if the error in the clock is really tiny, you might not notice the shift in the timeline until a lot of them go by. This kind of thing can happen with the old school RS-232 serial connections because the transmitter and the receiver had no common reference clock. (My digital design friends would say that each end of the serial connection is in a different clock domain.) The received data would seem just fine - because the receiver was still sampling the incoming pulses within the window of correctness - until its clock drifted its sampling into the next character frame, at which point that character would be read incorrectly. That's why RS-232 has start and stop bits at the beginning and end of each character frame, whose purpose is to identify such a framing error so that the bad character can be discarded.
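
Here is that (wildly exaggerated) ten percent example as a few lines of C, just to show how the error accumulates along the timeline:

#include <stdio.h>

int main(void)
{
    double period = 1.0 * 1.10; /* a nominal 1Hz timer whose clock runs 10% slow */
    double actual = 0.0;
    int event;

    for (event = 1; event <= 5; ++event) {
        actual += period;
        printf("event %d: expected %d.0s, actual %.1fs\n", event, event, actual);
    }

    return 0;
}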

Navigation and time. Global navigation satellite systems know nothing about position. What they know about is time. Every GPS satellite has multiple redundant atomic clocks - cesium and/or rubidium - which are the most precise timepieces humankind knows how to construct. Every GPS satellite is accurate because the onboard atomic clocks are synchronized to within a few nanoseconds of each other by commands from ground stations.

Each GPS satellite continually transmits its unique identity number, a timestamp, and its ephemeris (a description of its orbit). By receiving this information from four different satellites, the receiver can solve four equations in four unknowns - its x, y and z position in space, and its t position in time -  because it can compute its exact distance from each of the four satellites. The receiver then uses an abstract model of the earth - the World Geodetic System - to convert the x, y and z spatial coordinates into something humans find more useful: latitude, longitude, and altitude above mean sea level.

Each satellite also transmits the offset between GPS time (which does not recognize leap seconds) and UTC (which does). This offset changes each time the International Earth Rotation Service (IERS) adds a leap second to UTC. This is done to keep our wall clocks in sync with the rotation of the Earth as it gradually slows down due to the drag of the tidal forces of the Sun and the Moon.

The use of time for navigation in this way is a very old idea. The technology of mechanical clocks was itself driven by the need for an accurate and precise portable time standard with which to compute longitude when using celestial navigation during the age of sail. Later, in the age of steam, ships' captains noted when they heard foghorns, which ran off mechanical clockwork automation, and could use the offset from their ship's clock plus knowledge of the speed of sound to compute their distance from shore. In WWII, Long Range Navigation (LORAN) used radio beacons to similar effect, although before digital computers it required that the ship carry an oscilloscope - which in those days was the size of a dorm refrigerator and took three men and a little boy to carry - tied to the LORAN receiver; the operator would measure distances between pulses from LORAN transmitters at known locations to triangulate the ship's position.

Distributed systems and time. Whether you are running a vast distributed network of devices, a few computers in an organization, or just a laptop, there are a lot of advantages to synchronizing the Time Of Day clock on a computer with the Internet at large. That, and the fact that a lot of other network protocols depend on it. That's what the Network Time Protocol (NTP), and (on Linux/GNU systems anyway) the NTP daemon (ntpd), does. NTP on the client periodically exchanges time stamps with a remote server, and from those time stamps it computes a clock offset from the server and a round trip time between it and the server. This allows NTP to gradually synchronize the clock on the local client with that on the remote server, often to within just a few milliseconds, by inserting or removing ticks of the TOD clock. It can do this with a number of remote servers and, by measuring the quality of the received timestamps in terms of offset, jitter, and drift, decide which remote server has the most accurate and precise clock and the best network connection.

While the statistical filters and algorithms used by NTP are beyond my ken, the protocol is not in principle complicated. Measurements are based on four time stamps:

  • T1 is when the client transmitted the request packet,
  • T2 is when the server received the request packet,
  • T3 is when the server transmitted the response packet, and
  • T4 is when the client received the response packet.

The time offset (which can be positive or negative) between the client and the server clocks is
Toffset = ((T2 - T1) + (T3 - T4)) / 2
(the trick to this algebra is the assumption that the travel time for the packet is the same in either direction) and the round trip time between the client and server is
Trtt = (T4 - T1) - (T3 - T2).
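
And here is that calculation as a few lines of C, with timestamps that are entirely made up for the sake of the example:

#include <stdio.h>

int main(void)
{
    /* Timestamps in seconds; any common unit and epoch will do. */
    double t1 = 100.000; /* client transmitted the request */
    double t2 = 100.060; /* server received the request */
    double t3 = 100.061; /* server transmitted the response */
    double t4 = 100.021; /* client received the response */

    double offset = ((t2 - t1) + (t3 - t4)) / 2.0; /* +0.050s: the client is 50ms behind */
    double rtt = (t4 - t1) - (t3 - t2);            /* 0.020s round trip */

    printf("offset %.3f s, round trip %.3f s\n", offset, rtt);

    return 0;
}
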
Buying Instead of Building

It took a bit of effort to put this desk clock together. Besides the basic hardware assembly and minor software hacking, the Adafruit LCD board was a kit, so I had to break out the Weller soldering station.

Some Assembly Is Required

But the principle of operation of my desk clock is so simple, you could be forgiven for wondering "Hey, why can't I just buy one of these devices?" Indeed, you can. And for a lot less money than my desk clock cost me to build, taking my hourly rate into account. Cheap enough that you might be tempted, as I was, to actually purchase a couple of inexpensive GPS-based NTP servers and try them out on your home network, comparing them to your home brew device and to each other.

Communication Systems Solutions TM 1000A. The CSS TM 1000A (the TM stands for Time Machines) is a US$300 GPS-disciplined NTP server in (literally) a black box.

CSS TM 1000A Time Machine NTP Server

Its front panel has three LEDs indicating power, satellite lock, and the 1PPS heartbeat.

CSS TM 1000A Time Machine NTP Server (back)

The rear panel has a round jack for its dedicated power brick, an RJ45 Ethernet jack, a DB9S serial port for debugging, and an SMA connector for the GPS antenna.

The SMA connector provides a voltage bias that is used to power the amplifier in the remote GPS antenna.

The TM 1000A is (mostly) easily administered using a password protected web page at its initial address of 192.168.1.15. I say "mostly" because it took about an hour of fiddling on my part to understand that the device's embedded web server didn't work with the Safari browser on my desktop Mac. It wasn't until I plugged a USB-to-serial adaptor into the TM 1000A's debug serial port and watched the diagnostic output that I switched to using Firefox and everything worked just fine. Had I started with Firefox, device setup would have been a matter of five minutes or so.

One peculiarity of the web page is that it reports the signal strength of three satellites it uses to compute its time and space solution. My understanding is that four satellites are necessary for a solution since there are four unknowns: x, y, z and t. Not a big deal, but it does call into question my own understanding of the theory behind all of this.

The TM 1000A responds just fine to requests from NTP daemons on other computers on my LAN. It does not however seem to respond to queries from ntpq, the NTP administration tool. That's not a deal breaker, but it can make debugging NTP problems more difficult. I fired off a query to the CSS tech support email address, but have not heard back yet.
Update 2017-04-03: The CSS folks have verified that the Safari browser is not supported, and neither is ntpq.

I haven't tried this, but the TM 1000A can be configured to emit its 1PPS digital pulse on one of the pins on the serial connector. This means the device can in principle be used to discipline digital clocks on other devices using (for example) a phase-locked loop (PLL).

The best part about the TM 1000A is that you can order it from Amazon.com and have it at your front door in a couple of days.

Leo Bodnar Electronics LeoNTP. The Leo Bodnar LeoNTP is a US$325 GPS-disciplined NTP server that would fit in your shirt pocket.

Leo Bodnar LeoNTP NTP Server

Its front panel has an LCD display and a simple turn/push control whose use is reminiscent of the scroll wheel on the original Apple iPod.

Leo Bodnar LeoNTP NTP Server (back)

The rear panel has a BNC connector, an RJ45 Ethernet jack, a female USB type C connector for power, and an SMA connector for the GPS antenna.

I didn't test it, but the BNC connector can be administered to output the 1PPS digital pulse, allowing it in principle to be used to discipline other digital clocks.

I didn't test it, but the Ethernet jack supports Power over Ethernet (PoE), a common feature of Ethernet switches used in many telecommunications applications, which eliminates the need for a power brick (lots of VoIP phones for example use PoE, requiring only the Ethernet cable be run to the device from a PoE-capable switch).

The SMA connector provides a voltage bias that is used to power the amplifier in the remote GPS antenna.

Leo Bodnar LeoNTP Stratum-1 NTP Server (operational)

The LCD display and the scroll button are the sole means of administering the device. It is so simple to use that it makes you wonder what all the fuss about web pages is about. Administering the device's static IP address was a matter of just a few seconds work. The downside is anyone walking by the device can administer it; there is no security. That's not a problem for me (unless the cats figure it out), but it might be for you. The color of the display, white or orange, indicates whether the device has a satellite lock (white is good). A green dot in the upper left corner of the display blinks at 1Hz to indicate the 1PPS signal. A bar graph indicates the network load on the device as it processes NTP requests.

In theory, the front panel can be administered to display the local time by setting a time zone. However, not my time zone. Which is just as well, since to actually be correct, the device would have to understand Daylight Saving Time in my neck of the woods, which the powers that be are apt to change, requiring a firmware upgrade of the device. I'm happy with it displaying UTC.

The LeoNTP slowly responds to ntpq queries, which is useful for debugging. The LeoNTP reports that it contains a 32-bit ARMv7E-M microcontroller. This suggests that the LeoNTP runs some flavor of real-time operating system (RTOS), or possibly even just a task loop, instead of Linux, which (as we shall see) is the likely reason for its higher precision and lower jitter when reporting the time via NTP.

I ordered my LeoNTP from Uputronics in England (and had it at my front door near Denver Colorado via UPS in just a few days - yet more evidence that I am living in the future), but it's sold in the U.S. and U.K. by other dealers as well.

Results

I now have both off-the-shelf NTP servers wired up to the Ethernet switch in my home office to which everything but my Mac laptop is connected. Both of these servers are plugged into a small UPS so that they can continue to keep time in the event of a brief power outage. The home brew desk clock communicates over WiFi and has a real-time clock with a battery backup.

The TM 1000A (left) is named "waterclock". The LeoNTP (right) is named "sundial". My home brew desk clock is named "hourglass".

I ran coaxial cable behind the bookcases and whatnot in my home office so that I can place three GPS antennas - two patch antennas and an outdoor marine antenna - in my second story office window where they can get a clear view of the sky. (This stage of the project may require an understanding spousal unit.) The transmissions from the GPS and GLONASS satellite constellations are remarkably low power - I'm told they have about the same received power as a night light - and their transmissions can be obstructed by as little as heavy tree cover. I was concerned about the glass window (and now I'm concerned about replacing this window with a more energy efficient model), but this setup seems to work fine: all three NTP servers quickly achieve a satellite lock. Early testing with the GPS board on the Raspberry Pi with a small patch antenna revealed it routinely can see as many as twelve satellites amongst the GPS and GLONASS constellations. The LeoNTP with the big waterproof marine antenna reports it can see fifteen satellites.

I have the NTP daemon on my Ubuntu development system "mercury" configured to query all three NTP servers, plus some other publicly accessible NTP servers on the Internet.

Here is the output of an ntpq query on mercury. Using its own algorithms to measure the quality of each NTP server as a timing source, the NTP daemon on mercury has chosen the LeoNTP as its primary time reference, with the TM 1000A and the Pi as its backups. You can see the wired LeoNTP and TM 1000A have substantially lower round trip delay than the wireless Pi, and all three are a lot lower than any of the Internet servers. The LeoNTP wins in terms of measured jitter; that's why I suspect that the LeoNTP is running an RTOS with a lot less software latency and scheduling variation than you see with a typical Linux system. The use of stratum-1 time sources on the local network causes the NTP daemon on mercury to declare itself a stratum-2 time source.

All three NTP servers have been running for days now with no issues.

Conclusions

I'd be happy with either the TM 1000A or the LeoNTP as a time server on my home network. But the LeoNTP won me over with its clever user interface design, its pretty front panel LCD display, its support of ntpq, and its marginally better performance. Its support of a BNC connector could be a plus in the future; BNC and coax is a common mechanism used for exporting precision timing pulses to other devices. Its lack of security wasn't a concern for me.

There can be no doubt that it is a lot cheaper to go with either of the off-the-shelf NTP servers than building your own. But maybe not as fun. Plus, now, when asked "Does anybody really know what time it is?" I can definitively say "I do!"

Updates

2017-04-10: added the jitter and drift diagram.

Sources

ANSI, Telecommunications - Synchronization Interface Standards for Digital Networks, T1.101, 1987

E. Kaplan (ed.), Understanding GPS Principles and Applications, Artech House Publishers, 1996

G. Miller, E. Raymond, "GPSD Time Service HOWTO", http://www.catb.org/gpsd/gpsd-time-service-howto.html

D. Mills, Computer Network Time Synchronization (2nd ed.), CRC Press, 2011

D. Mills, J. Martin, J. Burbank, W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithm Specification", RFC 5905, 2010-06

J. Mogul, D. Mills, J. Brittenson, J. Stone, U. Windl, "Pulse-Per-Second API for UNIX-like Operating Systems" (Version 1.0), RFC 2783, 2000-03

C. Overclock, "Does anybody really know what time it is?", http://coverclock.blogspot.com/2006/11/does-anybody-really-know-what-time-it.html

Rakon Limited, "Timekeeping with Quartz Crystals", 2009

Raltron Electronics Corporation, "Stratum Levels Defined", http://www.raltron.com/products/pdfspecs/sync_an02-stratumleveldefined.pdf

E. Raymond, "Stratum-1-Microserver HOWTO", https://www.ntpsec.org/white-papers/stratum-1-microserver-howto/

D. Sobel, W. Andrewes, The Illustrated Longitude, Walker & Company, 2003

Wikipedia, "Accuracy and precision", https://en.wikipedia.org/wiki/Accuracy_and_precision

Wikipedia, "Atomic clock", https://en.wikipedia.org/wiki/Atomic_clock

Wikipedia, "Error analysis for the Global Positioning System", https://en.wikipedia.org/wiki/Error_analysis_for_the_Global_Positioning_System

Wikipedia, "Foghorn", https://en.wikipedia.org/wiki/Foghorn

Wikipedia, "Global Positioning System", https://en.wikipedia.org/wiki/Global_Positioning_System

Wikipedia, "GPS signals", https://en.wikipedia.org/wiki/GPS_signals

Wikipedia, "History of longitude", https://en.wikipedia.org/wiki/History_of_longitude

Wikipedia, "John Harrison", https://en.wikipedia.org/wiki/John_Harrison

Wikipedia, "LORAN", https://en.wikipedia.org/wiki/LORAN

Wikipedia, "Marine chronometer", https://en.wikipedia.org/wiki/Marine_chronometer

Wikipedia, "Network Time Protocol", https://en.wikipedia.org/wiki/Network_Time_Protocol

Wednesday, March 29, 2017

John Sloan and Internet Protocol version 6

Once again my good friends at Gogo Business Aviation in Broomfield Colorado have been following my work, this time on the Internet Protocol version 6 (IPv6) testbed that I wrote about in Buried Treasure. They invited my alter ego John Sloan to give a talk about it. The fact that they fed us all pizza may have had more to do with the packed house than the topic at hand. But Gogo has a vast system of cell towers - whose antennas point up - and an associated ground network to talk to aircraft that use their Air To Ground (ATG) network connectivity product. So they know a thing or two about networking in the cloud.

The folks at Gogo recorded the talk, made it available on YouTube, and generously allowed me to share it. You can find the slides for this talk on GitHub.



Thanks once again to my colleagues at Gogo Business Aviation for sponsoring this talk. It was fun!

Thursday, February 23, 2017

Clean Up In Aisle Three!

I typically use feedly.com to peruse my favorite blogs while I'm eating lunch. Yesterday I noticed that I (or Blogger) somehow managed to botch the RSS feed for chipoverclock.com so that my latest article either doesn't show up at all, or it returns a 404. So this is an attempt at a bit of a manual update: yesterday I posted my most recent epic article The Need for Non-Determinacy in which I lay out the case for adding a hardware entropy generator (which you may already have and not even know it) to your system. I review a number of such devices that I've tested, and describe the tests I put them through. There are pictures. It'll be fun. And it expands on the video of the talk on this topic that I posted in John Sloan and Hardware Entropy Generators. Cheers!

Monday, February 20, 2017

The Need for Non-Determinacy

Slot machine outcomes are controlled by programs called pseudorandom number generators that produce baffling results by design. Government regulators, such as the Missouri Gaming Commission, vet the integrity of each algorithm before casinos can deploy it. 
But as the “pseudo” in the name suggests, the numbers aren’t truly random. Because human beings create them using coded instructions, PRNGs can’t help but be a bit deterministic. (A true random number generator must be rooted in a phenomenon that is not manmade, such as radioactive decay.) PRNGs take an initial number, known as a seed, and then mash it together with various hidden and shifting inputs—the time from a machine’s internal clock, for example—in order to produce a result that appears impossible to forecast. But if hackers can identify the various ingredients in that mathematical stew, they can potentially predict a PRNG’s output. 
- Brendan Koerner, “Russians Engineer a Brilliant Slot Machine Cheat—And Casinos Have No Fix”, Wired.com
Tilting At Windmills

Around 2007 I was working on an embedded Linux/GNU product. I was perusing the installation notes for GNU Privacy Guard, which was an open source implementation of the popular (at the time anyway) encryption software package Pretty Good Privacy (PGP). I had this crazy idea for a mechanism by which I could securely automate certain aspects of system maintenance on the laboratory prototypes we were developing, by using encrypted shell scripts. (I eventually did an open source implementation of this which I wrote about in Automating Maintenance on Embedded Systems with Biscuits.)

The GnuPG README talked about the implications of choosing /dev/urandom versus /dev/random for acquiring random numbers with which to generate the encryption keys used by the software. I went off to read the random(4) manual page, after which I began to get the vague notion that I had no idea what I was doing. That realization launched me on a quest that I have been on for the better part of a decade now. And it led to my owning what surely must be one of the world's greatest private collections of hardware entropy generators.

This article is the epic tale of that quest (and it expands on my talk John Sloan and Hardware Entropy Generators).

Ancient History

If you were to read the manual page for rand(3), one of several random number generators available in the standard C library, you would find two functions as part of that API.
void srand(unsigned int seed); 
int rand(void);
The srand function is used to initialize the random number generator with a seed, after which the rand function returns a different "random" number upon every subsequent call.
I write "random" because there was actually nothing random about it. If you initialize the generator by calling srand with exactly the same value for the seed, you will get exactly the same sequence of integer numbers with every call to rand.

In fact, not only will the same seed yield the same sequence of numbers from the generator, the sequence would eventually start to repeat, if you made enough calls to rand. It would take a lot of calls. But still.

So here's the thing: there is no randomness in the standard C - or indeed any - random number generator algorithm. The randomness, if there is any, comes from the value you choose for the seed.
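
Here is a minimal sketch of my own (not from the manual page, beyond the two functions named above) demonstrating that claim: seed the generator twice with the same value and you get exactly the same sequence both times.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    for (int run = 0; run < 2; ++run) {
        srand(42);                      /* identical seed on both passes */
        printf("pass %d:", run);
        for (int i = 0; i < 5; ++i) {
            printf(" %d", rand());      /* identical "random" sequence on both passes */
        }
        printf("\n");
    }
    return 0;
}

Both lines of output will be identical, every time, on the same platform.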

It has to work this way, because there is nothing random about digital computer systems. We work really hard to eliminate any randomness, or non-determinism, from them. Determinism, or predictability, is a good thing. For the most part. Usually.

The period of the repeating cycle is also completely deterministic. If the random number generator has an internal state of N bits, then its output can have a period of no more than 2^N values - and sometimes a lot less - after which it will begin to repeat. This is an inevitable result from information theory. But if you think about it, the algorithm, which is a kind of state machine, has to work this way. When there is no source of external stimulus, the same input - its internal state - yields the same output.

So if you were to see a long enough - perhaps a very, very, very long - sequence of values from a particular random number generator, it might be possible - expensive, but possible - for you to identify that sequence just from its pattern, correlate it with a particular seed value, and then predict what every single subsequent value from that random number generator would be.

That's why the people that actually worry about these things call these algorithms pseudo random number generators (PRNGs). So if you want truly random results, you have to figure out some way to introduce it using a random seed value. It is the seed that contains all the randomness, or unpredictability, or what information theorists, borrowing a term from thermodynamics, call entropy.
The rand(3) API has a long history. Way back in 1977 I was writing simulation code in Fortran IV for an IBM mainframe. The random number generator in the standard Fortran IV library had the same two functions with the same names, albeit in upper case of course (I don't think punch card machines even had a shift key): SRAND and RAND. And I got the bright idea of using the value of the mainframe's Time Of Day (TOD) hardware clock as the seed. Today, when I write simulations for a UNIX system, I might use the value returned by time(2), the system call that returns the number of seconds since the start of the UNIX epoch, as the seed to rand(3). Of course, that means if someone knew exactly when I ran my simulation, they could guess my seed. But for simulation code that needs to have different results every time it is run, it works pretty well.
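
A minimal sketch of that practice - adequate for simulations, not for cryptography - might look like this; the five draws are just for illustration.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    srand((unsigned int)time(NULL));    /* seconds since the UNIX epoch as the seed */
    for (int i = 0; i < 5; ++i) {
        printf("%d\n", rand());         /* differs from run to run (once per second anyway) */
    }
    return 0;
}
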
Roll On Big O

If you think about the properties you want in an encryption algorithm, the one right at the top of the list would be: you want it to be computationally cheap to encrypt and decrypt, but almost impossible to brute force decrypt. Functions that are easy in one direction but almost impossible in another are called trapdoor functions, or sometimes one-way functions.

A good example of a trapdoor function is multiplying two big numbers together, then trying to brute force deduce what those original two numbers were by factoring the product into its prime factors. It turns out that if the two numbers were N bits long, then multiplying them together (or dividing the product by either original number) takes an amount of work proportional to N^2, or O(N^2) in "big O" notation. But factoring the resulting product into its prime factors and trying every combination of all those prime factors to reproduce the original two numbers takes O(2^N) work.

We can make a little table showing how the amount of work for factoring relative to multiplying grows as N gets larger.

Scaling (Table)

The log10 columns tell you how many decimal digits are in the results, which is just a quick way to see the magnitude of those values. So if you were to multiply two 512 bit numbers together, the work involved would be proportional to a value more than five digits long. But if you were to brute force try to recover the original two numbers from their product, it would take an amount of work proportional to a value that is more than one hundred and fifty four digits in length.

Just to make this clear: brute force factoring for numbers eight bits long is 256/64, or four times as expensive as multiplication or division; for numbers 512 bits long it is within shouting distance of 10^148 times more expensive than the original multiplication or division operation. That value is a 1 followed by one hundred and forty-eight zeros, a number so astoundingly big we don't have a word for it, but it is a lot larger than the estimated number of hydrogen atoms in the universe. (corrected 2017-03-03)

If it's not obvious what's going on, here is a graph of the log10 columns.

Scaling (Graphic)

Keep in mind that we're graphing the logarithm of the values. Otherwise the 2^N graph would increase so quickly it would be unusable. The growth of the work required for multiply and divide is said to be polynomial. But the growth of the work for factoring is exponential, which grows far, far faster.
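
If you want to reproduce those magnitudes yourself, here is a small sketch of my own (the specific bit widths are just the ones from the table) that computes the number of decimal digits in N^2 and in 2^N using logarithms, so nothing overflows.

#include <math.h>
#include <stdio.h>

int main(void)
{
    static const double sizes[] = { 8, 16, 32, 64, 128, 256, 512 };

    printf("%8s %12s %12s\n", "N", "log10(N^2)", "log10(2^N)");
    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); ++i) {
        double n = sizes[i];
        printf("%8.0f %12.2f %12.2f\n", n, 2.0 * log10(n), n * log10(2.0));
    }
    return 0;
}

Compile with -lm; the last row should show roughly 5.4 versus 154.1, the same magnitudes discussed above.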

You could be forgiven if the thought occurred to you: hey, if we could just encrypt a message by multiplying it by a secret key (the message and the key being just numbers inside a computer), and decrypt the encrypted message by dividing it by that same secret key, the bad guys would find it almost impossible to brute force decrypt our encrypted message.
This is an issue even for very well funded government intelligence organizations. Thanks to Wikileaks and Edward Snowden, we now know that the U. S. National Security Agency (NSA) reads secrets by tapping into the unencrypted data stream, not by breaking the encryption, often with the cooperation of the service provider. (Updated 2017-03-10)
Similarly, while the NSA pursues mass surveillance, thanks to Wikileaks and an as-yet-unknown source we now also know that the U. S. Central Intelligence Agency (CIA) pursues individual surveillance by hacking a mobile phone and accessing its data before it is encrypted (or after it is decrypted), not by breaking the encryption on the device. (Added 2017-03-10)
With a large enough secret key, all the computing power in the world might not be enough to break our encryption. And as computers grow more powerful, if we get worried, we can just make our key a few bits longer, eat the (relatively) small increment in computing power necessary to encrypt (multiplying) and decrypt (dividing), and the computing power necessary for brute force cracking (factoring) of our encryption becomes so large that all the computing power in the universe wouldn't be enough.

And that's exactly what happens.
Almost all modern encryption algorithms are based on multiplication/factoring. But there are other trapdoor functions, and encryption algorithms based on them - elliptic curve cryptography is one you'll read about. Some Microsoft researchers discovered what is widely believed to be a backdoor deliberately engineered into one elliptic curve based algorithm - the Dual Elliptic Curve Deterministic Random Bit Generator or Dual_EC_DRBG - that was endorsed by the U.S. National Institute of Standards and Technology (NIST) as a Federal Information Processing Standard (FIPS). Which is why studying this area is a bit like reading a spy thriller. Because it is a spy thriller.
Why We Care

It is obvious now that the security of your encryption mechanism all revolves around keeping your secret key secret. If the bad guy has your secret key, you might as well be publishing your secret messages on Facebook.

You have to choose a secret key that the bad guy isn't likely to guess. Your birthday, your mobile phone number, and whatnot are really bad ideas.

And you have to have some secure way to communicate your secret key to the person who is to receive your encrypted messages. If the bad guy taps your communication channel and reads your secret key, he has the means to trivially break your encryption.

You also need a lot of secret keys. Every time you click on a web link whose Uniform Resource Locator (URL) uses the https protocol, like in https://www.google.com - the s standing for "secure" - you need a new secret key with which to encrypt the data going to the web server and coming back to your web browser.

So your web browser generates a new secret key for every such transaction, and it does so in a way that the bad guy is not likely to guess: it generates a random number to use as the key.
What I have described is symmetric key encryption - the key used for encryption is the same one used for decryption. Public key encryption - where the key used for encryption is different from the one used for decryption - is asymmetric. Public key encryption is beyond the scope of this article. (That's what guys like me say when they don't really understand something.) But public key encryption is so computationally expensive that it is typically used just for key exchange - securely transmitting the random number used for symmetric key encryption to the recipient of your message, after which the message or data stream is encrypted using the symmetric key mechanism I just described. Public key encryption solves the problem of how to transmit your secret symmetric key securely to your message recipient.
The security of your encryption now all depends on the quality of your keys. The quality of your keys depends on the quality of your random numbers. And the quality - or unguessability - of your random numbers now depends on [1] the quality of the seed used to generate those random numbers (the more entropy the better), and [2] the period of the pseudo random number generator you are using (the longer the better).

And now we are on a mission to find a good seed.

The random Device Driver

If the option CONFIG_ARCH_RANDOM was enabled when the Linux kernel was configured and built, there is a device driver that generates random numbers. An application can acquire random numbers just by doing a read(2) against this driver. The driver exposes two interfaces in the /dev device directory: /dev/random and /dev/urandom. The random(4) manual page distinguishes between the two interfaces by telling us that /dev/random blocks (makes the application wait on the read) if the available entropy in the driver is exhausted, while /dev/urandom endlessly delivers as many values as the application cares to read. This raises the obvious question as to what exactly /dev/urandom is returning if /dev/random is blocked because entropy is exhausted. And where does that entropy come from?
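
Here is a minimal sketch of what "doing a read(2) against this driver" looks like from an application; the sixteen byte buffer size is arbitrary.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    unsigned char buffer[16];
    int fd = open("/dev/urandom", O_RDONLY);    /* or "/dev/random", which may block */
    if (fd < 0) { perror("open"); return 1; }
    ssize_t got = read(fd, buffer, sizeof(buffer));
    if (got < 0) { perror("read"); close(fd); return 1; }
    for (ssize_t i = 0; i < got; ++i) { printf("%02x", buffer[i]); }
    printf("\n");
    close(fd);
    return 0;
}
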

Linux Random

We can find the source code for this driver in drivers/char/random.c in the kernel source tree (3.10.105 is the version I perused). It's not that big. We'll start at the application interface and work backwards to discover where these random numbers come from.

/dev/urandom and /dev/random each have their own 512 bit pool that is used to store Cryptographically Secure Pseudo Random Numbers (CSPRNs). "Cryptographically Secure" means the random numbers are of high enough quality to be useful for cryptography, that is, for encryption. "Pseudo" is our first clue that they are the result of a pseudo-random number generator algorithm.

And indeed they are: they are generated by SHA-1, a well known cryptographic hash algorithm invented by the U. S. National Security Agency (NSA). SHA-1 takes an input - a seed - of any size and outputs a 160 bit number or cryptographic hash value. This hash value is said to be "cryptographic" because (you guessed it) it has properties useful for cryptography, one of the main ones being that given a SHA-1 hash value it is so expensive to try to brute force recover the original input value as to be computationally infeasible.
Whether this is still true for SHA-1 today is a matter of some debate in certain circles. Some flaws that have come to light in the SHA-1 design have shown that its effective internal state is not as large as once thought, reducing its period to within brute force attack range of at least state actors with sufficient computational resources. People who are deeply concerned about security no longer consider SHA-1 adequate. This means that the quality of the random bits produced by the random device driver in Linux depends even more now on the quality of the entropy sources feeding it.
When an application reads from /dev/urandom or /dev/random, the driver removes bits from the appropriate pool, as long as there is entropy there to be had. The difference is that when the driver determines that the level of entropy, as measured in bits, is insufficient, it blocks the application reading from /dev/random until the entropy is replenished. (We'll get to exactly what that means in a moment.) But when an application reads from /dev/urandom under the same circumstances, the driver cycles back through SHA-1, rehashing to generate a new output value. This can go on forever, as long as applications continue to read from /dev/urandom while there is insufficient entropy.
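
As a toy illustration of that rehashing behavior - emphatically not the kernel's actual construction, just the general shape of it - here is a sketch using the SHA1() function from OpenSSL's libcrypto that stretches one fixed seed into an arbitrarily long, but entirely deterministic, stream of output blocks.

#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

int main(void)
{
    unsigned char state[SHA_DIGEST_LENGTH];       /* stand-in for the driver's pool */
    unsigned char digest[SHA_DIGEST_LENGTH];
    const char *seed = "not very much entropy";

    SHA1((const unsigned char *)seed, strlen(seed), state);
    for (int block = 0; block < 4; ++block) {
        SHA1(state, sizeof(state), digest);       /* rehash the previous output */
        memcpy(state, digest, sizeof(state));
        for (int i = 0; i < SHA_DIGEST_LENGTH; ++i) { printf("%02x", digest[i]); }
        printf("\n");
    }
    return 0;
}

Build with -lcrypto. Feed it the same seed and you get the same four blocks every time, which is exactly why the entropy feeding the pool matters so much.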

We have enough background now to know that the quality of the random numbers from this driver must depend on [1] the quality of the entropy being fed as a seed to SHA-1 for either /dev/random or /dev/urandom, and [2] the period of the SHA-1 algorithm when it gets used iteratively by /dev/urandom.

Note that there is still no source of entropy here. SHA-1 is a deterministic algorithm.

The SHA-1 function takes its input from a 4096 bit pool of entropy. The bits stored in this pool, no matter from whence they came, flow through a Cyclic Redundancy Check (CRC) polynomial that is used to merely mix bits from different entropy sources together, rendering the bits stored in the entropy pool anonymous, that is, unidentifiable as to their origin or meaning. This prevents a bad guy from learning anything useful about the internal state of the system from perusing the contents of this entropy pool.

Nope, still no source of entropy. The CRC is completely deterministic.

Measuring, and Consuming, Entropy

When the driver blocks a read on /dev/random, it is not measuring entropy based on the number of bits in either the 512 bit CSPRN pool used by /dev/random or the 4096 bit entropy pool that feeds the SHA-1 algorithm. Neither is that kind of First In First Out (FIFO) pool. Instead the driver tries to measure the effective randomness of the bits flowing into the CRC. It does this by measuring the first and second derivatives of the values for each of the entropy sources feeding the CRC.

This sounds a lot more high falutin' than it is. The driver merely looks to see if the difference between successive values of the original entropy data is changing (the first derivative), and if the difference between successive values of the first derivative is changing (the second derivative). From these values that are the result of simple subtractions, it computes an approximate measure of the entropy, or amount of randomness, in the data feeding the entropy pool. It is when this measure reaches zero that the behavior of both /dev/random and /dev/urandom changes.
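
Here is a simplified sketch of that idea - my own simplification, not the kernel's exact arithmetic - using a handful of hypothetical event timestamps. Evenly spaced events score zero surprise; irregular ones score higher.

#include <stdio.h>
#include <stdlib.h>

static long last = 0;
static long last_delta = 0;

static long surprise(long sample)
{
    long delta = sample - last;           /* first derivative */
    long delta2 = delta - last_delta;     /* second derivative */
    last = sample;
    last_delta = delta;
    long d1 = labs(delta);
    long d2 = labs(delta2);
    return (d1 < d2) ? d1 : d2;           /* small values suggest little entropy */
}

int main(void)
{
    long samples[] = { 100, 200, 300, 400, 417, 530 };    /* hypothetical event times */
    for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); ++i) {
        printf("sample %ld -> surprise %ld\n", samples[i], surprise(samples[i]));
    }
    return 0;
}
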

You can read this measure of entropy by just looking at the value of the /proc file system entry /proc/sys/kernel/random/entropy_avail. You can do this from the command line by just using the cat command. But if you do so, you'll notice something interesting: every time you run the command, the amount of entropy is reduced!

This is because the kernel itself is a consumer of entropy, even if the entropy is not being used to generate encryption keys. The Linux kernel implements a feature called Address Space Layout Randomization (ASLR). Whenever the kernel dynamically allocates memory, including to run processes like the cat command, it reads bits from /dev/urandom to use to randomize where in the virtual memory address space the allocated memory block is placed.

This makes it much more difficult for the bad guy to attack the kernel by, for example, using kernel exploits - also known as bugs - to deliberately fool the kernel into overflowing buffers, resulting in overwriting other kernel data structures or even executable instructions.

As a result, the pool of available entropy in the kernel driver is continuously being consumed.
Remarkably, some researchers have recently found a way to defeat ASLR, or at least reduce its effectiveness, by examining values stored in the cache and virtual memory management hardware of the microprocessor.
Sources of Entropy, and Lack of Them

So exactly where is the entropy in the kernel random driver coming from? As I mentioned before, digital computer systems kinda suck by design at non-determinacy. So entropy in Linux comes from natural sources of non-determinism: measured latencies of physical I/O devices like spinning disks; activity by users of human interface devices (HID) like keyboards and mice; inter arrival time of network packets and interrupts; variation in the firing of timers; and other sources of unpredictable behavior.

The random driver contains a kernel API by which other device drivers in the system can report these values and add them to the entropy pool. You can find calls to this API littered throughout the Linux kernel. Here are some to look for, using your favorite source code browser.
add_disk_randomness 
add_device_randomness
add_timer_randomness
add_input_randomness
add_interrupt_randomness
add_hwgenerator_randomness
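
For illustration, here is a hedged sketch of how a driver might contribute a sample through one of these calls. The add_device_randomness() function is real and appears in the list above, but this trivial module and its jiffies-based sample are purely hypothetical stand-ins for a real device reading.

#include <linux/module.h>
#include <linux/init.h>
#include <linux/jiffies.h>
#include <linux/random.h>

static int __init entropy_example_init(void)
{
    u64 sample = get_jiffies_64();                  /* stand-in for a real hardware reading */
    add_device_randomness(&sample, sizeof(sample)); /* mix it into the entropy pool */
    pr_info("entropy_example: contributed %zu bytes\n", sizeof(sample));
    return 0;
}

static void __exit entropy_example_exit(void)
{
}

module_init(entropy_example_init);
module_exit(entropy_example_exit);
MODULE_LICENSE("GPL");
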
It was at this point in my fiddling with GnuPG, way back in 2007, that I began to get concerned. I was working on an embedded Linux product that ran completely from RAM disk. It had no interactive users. It only rarely used the network. Most of the time it spent dealing with an FPGA whose activity was based on a highly precise clock synchronized to GPS time using a phase locked loop.

I came to realize that there was almost no non-determinacy in the system. My product spent most of its time with little or no entropy. Which meant that /dev/random was effectively unusable, and the quality of the random numbers from /dev/urandom kinda sucked.

I realized that this could be an issue with all embedded Linux systems. Over time, I was to discover that this situation was hardly unique to me, or even to embedded systems.

Virtual machines running Linux - the backbone of cloud based computing - frequently ran short of usable entropy, because their I/O devices were for the most part nothing more than software abstractions on top of the system on which they ran, abstractions that exhibited little or no random behavior.

And even an honest to goodness actual Linux server may have little or no entropy available when it first boots up, which is exactly when it may be establishing secure socket layer (SSL) connections to other servers, long lived connections that were made using poor quality seeds to their encryption algorithms.

What we needed was a reliable source of true entropy that could be easily integrated into products built around the Linux/GNU stack, a source that would not rely on largely hypothetical non-determinacy.

Note that the implementations of /dev/random and /dev/urandom are virtually identical. Both deliver random bits from the same entropy pool, filled from the same entropy sources, mixed with the same CRC polynomial, and conditioned by the same SHA-1 algorithm. The only difference is what they do when the estimated entropy in the driver is exhausted. So the answer to my original question back in 2007 about whether to use /dev/random or /dev/urandom is:
  • always use /dev/urandom,
  • but make sure there is a high quality source of entropy in the system,
  • and use /dev/urandom to frequently reseed a pseudo random number generator to avoid detectable repeating patterns.
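
A minimal sketch of that third recommendation might look like the following; the reseed interval here is arbitrary, and random(3) stands in for whatever PRNG your application actually uses.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

static void reseed(int fd)
{
    unsigned int seed = 0;
    if (read(fd, &seed, sizeof(seed)) == (ssize_t)sizeof(seed)) {
        srandom(seed);                               /* fresh entropy into the PRNG */
    }
}

int main(void)
{
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    for (unsigned long draws = 0; draws < 1000000UL; ++draws) {
        if ((draws % 100000UL) == 0) { reseed(fd); } /* arbitrary reseed interval */
        long value = random();
        (void)value;                                 /* consume the value here */
    }
    close(fd);
    return 0;
}
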
The good news is that so many folks are concerned about this, true hardware entropy sources are becoming widely available. So widely, in fact, that they may already be all around you, if you care to look for them: in the form of external devices, hardware components on the motherboard, a peripheral built into a System on a Chip, or even a machine instruction. And most of them are easily integrated into your Linux/GNU-based system.
The issue about reseeding a PRNG before it repeats is not an idle concern. Security folks are starting to become worried about long lived Transport Layer Security (TLS) sessions, TLS being an encryption mechanism used for Internet traffic. It has been demonstrated that the TLS encryption can be brute force cracked if enough blocks from the encrypted TLS stream are collected. "Enough" is a big number - billions - but it is within the realm of practicality for adversaries with sufficient resources.
Entropy is a Natural Resource

There are a number of physical processes that exhibit behavior that is effectively unpredictable and which can be digitized. These sources of entropy can be broadly divided into two categories: quantum or classical.

Quantum entropy sources are, as far as we know today, wired right into the physical nature of reality. Our current understanding of physics leads us to believe that there is no way to externally influence such entropy sources. Examples of quantum entropy include whether a photon passes through or is reflected by a partially silvered mirror; when a radioactive particle decays; when an electron tunnels through the band gap of a transistor to its base; when a photon falls on a photodiode and creates an electrical current.

Classical entropy sources may be said to be more chaotic than random, that is, influenced by environmental factors that are typically beyond our control. Examples of classical entropy include thermal noise from a resistor; electronic noise from an avalanche diode or from a zener diode in breakdown; atmospheric noise from a radio receiver; analog/digital convertor noise. These are all noise sources that a competent electrical engineer would normally work hard to eliminate in any design.

Each of these entropy sources can be amplified, instrumented, and digitized to produce a string of ones and zeros. However, even though they are random processes, such processes may not be unbiased: they may be more likely to produce ones than zeros, or vice versa. The bias can arise from practically unmeasurable variations in materials, or from minuscule differences in the sampling rate of the digitization, so that even in a modern semiconductor fabrication process, successive devices may yield very slightly different results.

Whitening the bit stream - turning it into true white noise where ones and zeros occur with equal probability - is a problem solved long ago by none other than mathematician, computer scientist, information theorist, and Cold War strategist John von Neumann, with a remarkably simple algorithm.
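
His whitening algorithm is easy to sketch: look at the raw bits in pairs, emit the first bit of a 01 or 10 pair, and throw away 00 and 11 pairs. The biased_bit() source below is a hypothetical stand-in for a real, biased, digitized noise source.

#include <stdio.h>
#include <stdlib.h>

static int biased_bit(void)
{
    return (rand() % 4) != 0;            /* stand-in source: emits 1 about 75% of the time */
}

int main(void)
{
    int emitted = 0;
    srand(1);
    while (emitted < 64) {
        int first = biased_bit();
        int second = biased_bit();
        if (first != second) {           /* 01 emits 0, 10 emits 1 */
            putchar(first ? '1' : '0');
            ++emitted;
        }                                /* 00 and 11 pairs are discarded */
    }
    putchar('\n');
    return 0;
}

As long as successive raw bits are independent, the emitted bits are unbiased, at the cost of throwing most of the raw stream away.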

Every hardware entropy generator consists of (at least) three stages: a hardware process that generates true entropy, a stage that digitizes the entropy into a stream of ones and zeros, and a whitening stage that conditions the binary stream into white noise. These stages are typically implemented in hardware, or microcontroller firmware, embedded inside the hardware entropy generator.

Testing Entropy in the Short Run

Some hardware entropy generators are destined for use in applications which must conform to U.S. Federal Information Processing Standards (FIPS). These include not just U.S. government systems, or systems used by U.S. government contractors, but also systems run by commercial companies or regional agencies who just think such conformance is a good idea (or, as it turns out, by manufacturers of electronic games of chance destined for gambling casinos that don't want to be ripped off). For such generators, the U. S. National Institute of Standards and Technology (NIST) defines a set of statistical tests, FIPS 140-2, to be continuously run on the output of the generator to ensure that it remains unbiased and has not failed in some non-obvious way. This is a fourth stage of such generators.

Some of the generators I have tested implement these FIPS tests themselves, but most take advantage of the rngd(8) user space daemon that is part of the rng-tools package available for most Linux/GNU distributions. rngd implements these tests in software against any hardware entropy generator from which it is configured to extract entropy, entropy which it then injects into the random device driver via ioctl.

For most of the hardware entropy generators that I have tested, it is a very simple matter to integrate them into the system using rngd: usually a matter of installing the rng-tools package, doing some minor editing to the configuration file /etc/default/rng-tools, and sometimes a little udev rule hacking so that the generator shows up as a serial device in /dev.

Verifying Entropy in the Long Run

In addition to the FIPS tests, which are fast enough to be done in real-time, several folks who are interested in such things have developed suites of tests to be run against hardware entropy generators, and pseudo random number generators in general, to verify their broad long-term statistical properties. Each of these tests or suites of tests consume anywhere from kilobytes to terabytes of random bits, and require anywhere from seconds to weeks to run, even on a fast processor with a high-speed interface to the generator.

Scattergun is the code-name for my project to test consumer-grade hardware entropy generators and some software pseudo random number generators. It consists of the test script scattergun.sh, a Makefile to drive it, and a smattering of supporting software required to use the dozen or so generators that I own. (The supporting software included in Scattergun might be really useful to anyone using the more exotic generators, independent of the test script.) Scattergun is open source and can be found on GitHub. The raw results of all the testing are part of that repository as well. A link to the repository can be found near the end of this article.

Here is a brief description of the tests that the Scattergun script runs. The Scattergun script runs all of them, serially, accessing the hardware random number generator directly, without requiring that the generator be integrated into the system.

rngtest. Like rngd, rngtest is installed as part of the rng-tools package. It splits out the FIPS 140-2 tests that rngd runs in real-time into a command line tool where they can be run on demand. If a generator can't pass rngtest, it's not going to integrate into the system using rngd.

bitmap. This clever test ("clever" because I didn't think of it) exploits the ability of the human eye and brain to perceive patterns in complex data. It extracts sufficient bits from a hardware entropy generator to create a 256 x 256 pixel Portable Network Graphics (PNG) file in 8-bit color. A successful generator should produce a PNG file that, in the words of one of my favorite authors William Gibson, "is the color of television, tuned to a dead channel". It is interesting to compare PNG files from various entropy sources: you can detect, for example, that a source only produces positive numbers, because the most significant bit of each number being zero creates a distinct pattern, or that it limits the dynamic range of its values in some way that creates higher contrast colors.

bcm2708 rawtoppm (Pi 3)

This bitmap is from the hardware random number generator built into the Broadcom chip used on the Raspberry Pi 3.

crandom rawtoppm

This bitmap is from the random(3) C function that produces positive long integers.

cmrand48 rawtoppm

This bitmap is from the mrand48(3) C function that produces positive and negative long integers.
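
The idea behind the bitmap test is simple enough to sketch; this is not the Scattergun script itself, just a toy that reads 256 x 256 bytes from a generator (defaulting to /dev/urandom) and writes a plain grayscale PGM image you can eyeball for patterns.

#include <stdio.h>

int main(int argc, char *argv[])
{
    const char *path = (argc > 1) ? argv[1] : "/dev/urandom";
    FILE *in = fopen(path, "rb");
    if (in == NULL) { perror(path); return 1; }

    printf("P2\n256 256\n255\n");              /* plain PGM header: width, height, max value */
    for (int i = 0; i < 256 * 256; ++i) {
        int byte = fgetc(in);
        if (byte == EOF) { fclose(in); return 1; }
        printf("%d\n", byte);                  /* one gray level per generated byte */
    }
    fclose(in);
    return 0;
}

Run it as, say, ./bitmap /dev/urandom > sample.pgm and open the result in an image viewer; anything other than uniform noise is a bad sign.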

ent. This is a quick "pseudorandom number sequence test program" written by John Walker. Some Linux/GNU distributions provide ent as an installable package. For those that don't, it can be easily downloaded from his web site, built, and installed. I like ent because it quickly provides an estimate of the effective entropy per byte, e.g. 7.999958 bits.

SP800-90B. This is a large suite of tests written in Python, developed by NIST, and made available on GitHub. It takes one or more hours to run on the typical generator. During this long duration side project, the repo on GitHub has been occasionally updated, so the SP800-90B tests have evolved as I've been using them. Since this suite is a NIST standard, its results probably carry more weight in certain circles.

dieharder. This is an enormous suite of tests developed by Robert Brown, a physicist at Duke University. Its name was inspired by the earlier diehard suite of statistical tests developed by George Marsaglia (and of course the Bruce Willis movie franchise). This dieharder suite consumes about a terabyte of entropy and can run for days or even weeks, depending on the speed of the hardware entropy generator, the interface to it, and the processor on which the test is running. It even drove one of the generators (OneRNG) I tested into failure due to overheating. I like that about it.

If you examine the results of these tests in the Scattergun GitHub repo, you will notice that some of the tests identify the generator as "weak", and some tests may even fail. This is not unusual. Statistically speaking, a sequence of numbers that doesn't look random is just as likely to be generated as any other sequence, so an occasional poor result is to be expected. It is when a lot of tests fail that you need to be concerned.

Shapes and Sizes (added 2017-03-03)

Hardware devices that generate true entropy, no matter what physical process they use to do so, come in a lot of shapes and sizes, with a variety of interfaces and features. Most of these can be integrated into your Linux/GNU system using the rngd(8) user space daemon from the rng-tools package. Some are already built right into the random(4) driver and just require the correct kernel configuration option (which may already be there).

External devices. Most of the generators I tested are external devices that connect to the host computer via the Universal Serial Bus (USB). Even so, there was a wide variety of software interfaces to these devices. The easiest enumerated as a serial port, typically requiring some minor udev rule hacking, and were trivial to use by just reading random bits from the port (TrueRNG2, TrueRNG3, TrueRNGpro, FST-01/NeuG). Some enumerated as a serial port, but required software to write commands to that port to initialize the device (OneRNGs). Some USB generators required a vendor-provided user-space utility to read from them (BitBabblers). One required a substantial software development effort (that I've done for you, yay!) using a vendor-provided library (Quantis).

Internal devices. The OneRNG comes in a form factor made to attach to an internal USB connector now present (as I was to learn) on many motherboards.

Internal cards. I didn't test it, but the Quantis is available in a version with multiple entropy sources on a PCI-bus card.

Motherboard components. Although I didn't test it, Apple iPhones that use the A7 processor (the iPhone 5S and later) have a co-processor called the Secure Enclave Processor (SEP). The SEP has its own ARM core and has as a peripheral a hardware entropy generator. This generator is used to seed the encryption used by the iPhone.

Many personal computers intended for the enterprise or government markets include a Trusted Platform Module (TPM). The TPM, an industry standard created by the Trusted Computing Group (TCG), and examples of which are manufactured by several vendors, is similar in function to the SEP, and includes a hardware entropy generator. Even if you don't use the other features of the TPM, you can easily incorporate the TPM's generator into your Linux/GNU system as an entropy source. I've tested TPMs from two different vendors, in a laptop and in a server, belonging to one of my clients. Because that work was covered under non-disclosure, I don't include those results here. But you may have a TPM with a hardware entropy generator on your system already.

System peripherals. Some Systems on a Chip (SoCs) feature entropy generators as a built-in peripheral. The Broadcom SoCs used by the Raspberry Pi 2 and 3 boards (I haven't tested the Raspberry Pi 1) have such a peripheral. It's not enabled by default, but enabling it is easy: just load the appropriate device driver module and use rngd(8).

Machine instructions. Recent Intel processors have implemented a hardware entropy source, and a PRNG that is automatically periodically reseeded by that source, as machine instructions: rdseed (Broadwell family) and rdrand (Ivy Bridge family) respectively. The latest versions of the Linux kernel destined for these processors automatically make use of rdrand to augment the legacy entropy sources in the random device driver.
There is some controversy about these instructions thanks to Intel's collaboration in the past with the U. S. National Security Agency and the closed nature of their implementation. The Linux kernel maintainers have said that this is why they don't use rdrand as their sole entropy source. However, by this time you will be concerned that it may effectively be the only entropy source on an embedded system using an Intel processor. This may be justification enough to use an additional entropy source. One of my clients took the wise precaution on an Intel-based embedded product of also using the entropy generator on the non-Intel TPM chip that was available on the ComExpress module they used in their design.
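
If you want to poke at rdrand directly from C, here is a hedged sketch using the compiler intrinsic; it assumes a GCC or Clang toolchain and a processor that supports the instruction, and it is not how the kernel itself consumes it.

#include <stdio.h>
#include <immintrin.h>

int main(void)
{
    unsigned long long value = 0;
    if (_rdrand64_step(&value)) {                /* returns 1 on success, 0 on failure */
        printf("rdrand: %016llx\n", value);
        return 0;
    }
    fprintf(stderr, "rdrand unavailable or temporarily exhausted\n");
    return 1;
}

Build with something like gcc -mrdrnd; the analogous _rdseed64_step() intrinsic (with -mrdseed) exposes the underlying entropy source instead of the reseeded PRNG.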
Aspects (updated 2017-04-23)

Remarkably, there are a lot of hardware entropy generators on the market. Making an informed decision as to which one you want to use requires that you consider what your priorities are. It is a surprisingly multi-dimensional decision matrix.

Correctness. This is the easy one. I'll just jump right to the punch line and tell you that all of the hardware entropy generators that completed the Scattergun script did fine. One device failed during testing, and a new version (said to be more tolerant of heating) will start testing soon. Another is being tested now, and looks good. I noted the number of entropy bits (relative to eight bits) reported by ent, and the minimum entropy (ditto) reported by SP800-90B for its Independent and Identically Distributed (IID) test.

Price. For hardware entropy generators, I've paid as little as US$50 and as much as US$1100 (definitely an outlier on the price axis). There is no correctness difference, and not as much performance difference as you might expect, between the far ends of that range. You may be able to purchase a perfectly adequate hardware entropy generator for, if not lunch money, at least good dinner money, depending on what your other priorities are.

Acquisition. One of my generators I purchased by clicking the one-click ordering button on Amazon.com, had US$50 or so deducted off my credit card, and thanks to Amazon Prime had the device delivered to my front door in a couple of days. For others, I had to exchange encrypted email, send FAXes to Australia, and then wait for a shipment from China. All of the other devices fall somewhere on that range of effort to acquire.

Performance. The performance measurements should probably be taken with at least a small grain of salt, since there were a number of buffers between the actual device and the test suite. Even so, there is probably some merit in comparing the numbers. There was a fair amount of performance difference, in terms of entropy bits per second, among the generators. Some of the generators are peripherals built into an SoC (Raspberry Pi 2 and 3). Some are machine instructions built directly into the processor (rdrand and rdseed). Most are external USB devices. There was little correlation between the price of the device and its performance. But the interface typically made a difference. As you would expect, generally the internal devices performed better. But the best performer was a US$40 device with a USB interface (Chaos Key); the second best was the US$1100 device that also interfaced by USB (Quantis); the third place performer was a US$99 USB device (TrueRNGpro).

Robustness. One of the devices (OneRNG) failed during testing, probably due to overheating, which was a common complaint by users of this device. The vendor is now shipping a new version that I have received but have not tested yet. None of the other devices failed during testing, although I did manage to destroy one (OneRNG) designed for an internal USB connection on a motherboard, by hooking it up incorrectly.

Integration. Some of these devices were trivial to integrate into the system, requiring at most just a little udev hacking to get their USB interface recognized as a serial device. Some required use of a vendor-supplied user-space daemon or script (OneRNG, BitBabbler). And some (Quantis) required a substantial effort in software development using a vendor-supplied library; its performance might be worth it.

Mechanism. Most of the hardware entropy generators use one of the cost effective mechanisms based on solid state electronics to generate entropy. Some have multiple entropy sources of different types which were mixed together. Some have multiple entropy sources of the same type used in parallel for more performance. And the Quantis has a beam splitter: a semiconductor photon emitter aimed at a photon detector behind a partially silvered mirror, for full on Heisenberg quantum uncertainty multiverse action. There probably are reasons why organizations that have an unusual sensitivity to the unimpeachability of their random numbers might use such a device.

Openness. This is one domain where it may pay to be a foil-hat-wearing member of the conspiracy theory club. Some of the devices were completely open, their firmware source code and hardware schematics published by the vendor. Most were completely closed and proprietary. Some of them (TrueRNG2, TrueRNG3, TrueRNGpro) were designed by folks within commuting distance of the headquarters of the National Security Agency; this is probably not a coincidence: that neck of the woods would tend to attract engineers who know how to design stuff like this. You'll have to decide how paranoid you are about these kinds of things.

Hardware Entropy Generators I Have Known (updated 2017-04-23)

Here is a rundown of the hardware entropy generators that I own, listed in the order that I tested them. Or am testing them. Or will be testing them.

Unfortunately, the Scattergun script doesn't lend itself to precise performance measurements since the script is fed data from the device through a UNIX pipe. Nevertheless, I did note the speed at which random bits were consumed from the pipe, figuring that over large amounts of data it would even out to more or less match the rate at which data was introduced into the pipe. All speeds are in kilobits or megabits per second.

TrueRNG v2 and TrueRNGpro

TrueRNGpro (right).
Vendor: ubld.it (U.S.).
Correctness: 7.999958 entropy bits per byte, 7.94464 minimum entropy bits per byte.
Price: US$99.
Acquisition: easy.
Performance: 3.2Mbps claimed, 3.4Mbps measured.
Robustness: high.
Integration: easy (USB) (and the LEDs are comforting too).
Mechanism: semiconductor avalanche effect (two parallel sources).
Openness: closed.
Source: http://ubld.it/products/truerngpro

Quantum Entropy Generator

Quantis-USB-4M (left).
Vendor: ID Quantique (Switzerland).
Correctness: 7.999962 entropy bits per byte, 7.94601 minimum entropy bits per byte.
Price: US$1100 (€990).
Acquisition: medium.
Performance: 4Mbps claimed, 3.9Mbps measured.
Robustness: high.
Integration: difficult (USB but requires software development using vendor library).
Mechanism: beam splitter with photon counter (quantum effect).
Openness: closed.
Source: http://www.idquantique.com/random-number-generation/quantis-random-number-generator/

Raspberry Pi 2: "Bronze" formerly "Betty"

Raspberry Pi 2 (32-bit) BCM2708 HWRNG.
Vendor: Broadcom (BCM2836 SoC).
Correctness: 7.999954 entropy bits per byte, 7.94918 minimum entropy bits per byte.
Price: US$35 (Raspberry Pi 2).
Acquisition: easy.
Performance: 824Kbps measured.
Robustness: high.
Integration: easy (device driver).
Mechanism: undocumented.
Openness: open software, closed hardware.
Source: https://www.raspberrypi.org/products/raspberry-pi-2-model-b/

GNU Press FST-01 NeuG

FST-01 NeuG.
Vendor: Flying Stone Technology via GNU Press (Free Software Foundation).
Correctness: 7.999959 entropy bits per byte, 7.94114 minimum entropy bits per byte.
Price: US$50.
Acquisition: easy.
Performance: 602 Kbps claimed, 653 Kbps measured.
Robustness: high.
Integration: easy (USB).
Mechanism: analog/digital convertor noise.
Openness: open (both software and hardware).
Source: https://shop.fsf.org/storage-devices/neug-usb-true-random-number-generator

Intel rdrand and rdseed

rdrand (machine instruction in the Ivy Bridge processor family and later).
Vendor: Intel (i7-6700T in my test).
Correctness: 7.999957 entropy bits per byte, 7.94404 minimum entropy bits per byte.
Price: N/A.
Acquisition: N/A.
Performance: 1600Mbps claimed,  228Mbps measured.
Robustness: high.
Integration: easy (integrated into random device driver in later Linux kernels).
Mechanism: AES PRNG periodically reseeded by the rdseed mechanism (see below).
Openness: open software, closed hardware.
Source: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide

rdseed (machine instruction in the Broadwell processor family and later).
Vendor: Intel (i7-6700T in my test).
Correctness: 7.999958 entropy bits per byte, 7.9408 minimum entropy bits per byte.
Price: N/A.
Acquisition: N/A.
Performance: 3.5 Mbps measured.
Robustness: high.
Integration: easy (integrated into random device driver in later Linux kernels).
Mechanism: thermal noise.
Openness: open software, closed hardware.
Source: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide

Raspberry Pi 3: "Copper" formerly "Shortcake"

Raspberry Pi 3 (64-bit) BCM2708 HWRNG.
Vendor: Broadcom (BCM2837 SoC).
Correctness: 7.999956 entropy bits per byte, 7.9334 minimum entropy bits per byte.
Price: US$35 (Raspberry Pi 3).
Acquisition: easy.
Performance: 1Mbps measured.
Robustness: high.
Integration: easy (device driver).
Mechanism: undocumented.
Openness: open software, closed hardware.
Source: https://www.raspberrypi.org/products/raspberry-pi-3-model-b/

TrueRNG v2

TrueRNG v2.
Vendor: ubld.it (U.S.).
Correctness: 7.999958 entropy bits per byte, 7.94464 minimum entropy bits per byte.
Price: US$50.
Acquisition: easy.
Performance: 350 Kbps claimed, 367 Kbps measured.
Robustness: high.
Integration: easy (USB).
Mechanism: semiconductor avalanche effect.
Openness: closed.
Source: http://ubld.it/products/truerng-hardware-random-number-generator/

Moonbase Otago OneRNG (External)

OneRNG V2.0 External.
Vendor: Moonbase Otago (New Zealand).
Correctness: N/A.
Price: US$40.
Acquisition: medium (takes a while).
Performance: 350 Kbps claimed.
Robustness: failed during testing (overheating a common complaint, may be fixed in V3.0).
Integration: medium (USB but requires some initialization).
Mechanism: semiconductor avalanche effect and atmospheric noise.
Openness: open (both hardware and software).
Source: http://onerng.info

Moonbase Otago OneRNG (Internal)

OneRNG v2.0 Internal.
Vendor: Moonbase Otago (New Zealand).
Correctness: N/A.
Price: US$40.
Acquisition: medium (takes a while).
Performance: 350 Kbps claimed.
Robustness: I managed to fry my one unit by hooking the internal USB cable up incorrectly.
Integration: medium (internal USB but requires some initialization).
Mechanism: semiconductor avalanche effect and atmospheric noise.
Openness: open (both hardware and software).
Source: http://onerng.info

Voicetronix BitBabbler Black and White

BitBabbler White (bottom).
Vendor: VoiceTronix Pty. Ltd. (Australia).
Correctness: 7.999951 entropy bits per byte, 7.94498 minimum entropy bits per byte.
Price: AUS$199.
Acquisition: difficult (required a fair amount of effort).
Performance: 2.5 Mbps claimed, 1.2 Mbps measured.
Robustness: high.
Integration: medium (USB but requires vendor supplied user-space daemon).
Mechanism: shot noise, Johnson-Nyquist noise, flicker noise, RF noise (multiple sources).
Openness: open software and hardware.
Source: http://bitbabbler.org

BitBabbler Black (top).
Vendor: VoiceTronix Pty. Ltd. (Australia).
Correctness: 7.999963 entropy bits per byte, 7.93873 minimum entropy bits per byte.
Price: AUS$49.
Acquisition: difficult (required a fair amount of effort).
Performance: 625 Kbps claimed, 307 Kbps measured.
Robustness: high.
Integration: medium (USB but requires vendor supplied user-space daemon).
Mechanism: shot noise, Johnson-Nyquist noise, flicker noise, RF noise.
Openness: open software and hardware.
Source: http://bitbabbler.org

TrueRNG v3

TrueRNG v3.
Vendor: ubld.it (U.S.).
Correctness: 7.999956 entropy bits per byte, 7.94505 minimum entropy bits per byte.
Price: US$50.
Acquisition: easy.
Performance: 400 Kbps claimed, 404 Kbps measured.
Robustness: high.
Integration: easy (USB).
Mechanism: semiconductor avalanche effect.
Openness: closed.
Source: http://ubld.it/products/truerng-hardware-random-number-generator/

Moonbase Otago OneRNG V3 (External)

OneRNG V3.0 External.
Vendor: Moonbase Otago (New Zealand).
Correctness: N/A (awaiting testing).
Price: US$40.
Acquisition: medium (takes a while).
Performance: 350 Kbps claimed.
Robustness: N/A.
Integration: medium (USB but requires some initialization).
Mechanism: semiconductor avalanche effect and atmospheric noise.
Openness: open (both hardware and software).
Source: http://onerng.info

Chaos Key

Chaos Key 1.0.
Vendor: Altus Metrum via Garbee and Garbee.
Correctness: 7.999956 entropy bits per byte, 7.94565 minimum entropy bits per byte.
Price: US$40.
Acquisition: easy.
Performance: 10Mbps claimed, 6.7Mbps measured (still remarkably fast).
Robustness: High.
Integration: easy (USB with device driver incorporated into kernel).
Mechanism: transistor noise.
Openness: open (both hardware and software).
Source: http://altusmetrum.org/ChaosKey/

Infinite Noise True Random Number Generator

Infinite Noise.
Vendor: Wayward Geek via Tindie.
Correctness: 7.999955 entropy bits per byte, 7.97759 minimum entropy bits per byte.
Price: US$35.
Acquisition: easy.
Performance: 259 Kbps claimed, 357 Kbps measured.
Robustness: (now being tested).
Integration: medium (requires building infnoise utility that must be run as root).
Mechanism: thermal noise.
Openness: open (both hardware and software).
Source: https://www.tindie.com/products/WaywardGeek/infinite-noise-true-random-number-generator/

Conclusions

If you're not concerned about openness, the US$99 TrueRNGpro delivers great performance at a reasonable price, is dirt simple to integrate into your Linux/GNU system, and is easy to acquire.

If you are concerned about openness, the US$50 FST-01 NeuG available from the Free Software Foundation is reasonably priced, fairly easy to acquire, and the FSF publishes its firmware source code and hardware schematics.

If you want to feel like you're living in a William Gibson novel, or have extraordinary requirements for an entropy mechanism beyond reproach, or need high performance, don't mind a bit of work to integrate it into your system, and have very deep pockets, the US$1100 Quantis is the way to go.

If you want something like the TrueRNGpro, but one that is maybe a little easier to carry in your laptop bag and use at the coffee shop, and you can live with a little less performance, it's hard to beat the US$50 TrueRNG (the version 3 of which I'm still testing, but so far it looks great). You can one-click this from Amazon.com and have it the day after tomorrow.

If you live in Australia or New Zealand, I couldn't blame you for checking out the devices designed by folks there. My luck with the OneRNG hasn't been good. But the BitBabbler White worked great, is IMNSHO worth every penny, and is a great open alternative to the FST-01 NeuG for anyone.

If you are using a platform that already incorporates a hardware entropy generator - as a component, a peripheral, or even a machine instruction - it may just be a matter of installing the rng-tools package and doing a bit of configuration. Or maybe just updating your kernel. Or it may be present in your system right now.

It's worth a look.

Repositories

The Scattergun repository, which includes the scattergun.sh test script and Makefile, software I wrote to use some of the generators, and the results of running the test script on each device, can be found on GitHub.
https://github.com/coverclock/com-diag-scattergun
The photographs, diagrams, and PNG images that are results of testing can be found on Flickr.
https://www.flickr.com/photos/johnlsloan/albums/72157660632647214 
Acknowledgements

One never sets out on an epic quest without some traveling companions. There are a couple I can acknowledge. My office mate from my Bell Labs days, Doug Gibbons, was coincidentally working in this area and gave me much invaluable guidance. My occasional colleague Doug Young invited me to give a talk on this topic at Gogo Business Aviation, which incentivized me to firm up my thinking on this topic.

Update 2017-02-28
Cryptographic hash functions like SHA-1 are a cryptographer’s swiss army knife. You’ll find that hashes play a role in browser security, managing code repositories, or even just detecting duplicate files in storage. Hash functions compress large amounts of data into a small message digest. As a cryptographic requirement for wide-spread use, finding two messages that lead to the same digest should be computationally infeasible. Over time however, this requirement can fail due to attacks on the mathematical underpinnings of hash functions or to increases in computational power. 
Today, more than 20 years after of SHA-1 was first introduced, we are announcing the first practical technique for generating a collision.
-- M. Stevens et al., "Announcing the first SHA1 collision", Google Security Blog, 2017-02-23
Sources

Apple, "iOS Security", Apple Inc., 2016-05

Atmel, "Trusted Platform Module LPC Interface", AT97SC3204, Atmel, 2013-03-20

Atmel, "Atmel Trusted Platform Module AT97SC3204/AT97SC3205 Security Policy", FIPS 140-2, level 1, version 4.3, Atmel, 2014-04-03

E. Barker, J. Kelsey, "Recommendation for Random Number Generation Using Deterministic Random Bit Generators", NIST Special Publication 800-90A, Revision 1, National Institute of Standards and Technology,  2015-06

E. Barker, J. Kelsey, "Recommendation for the Entropy Sources Used for Random Bit Generation", NIST Special Publication 800-90B, second draft, National Institute of Standards and Technology,  2016-01-27

R. Brown, D. Eddelbuettel, D. Bauer, "Dieharder: A Random Number Test Suite", version 3.31.1, http://www.phy.duke.edu/~rgb/General/dieharder.php, 2016-02-06

W. Gibson, Neuromancer, Ace, 1984

M. Green, "Attack of the week: 64-bit ciphers in TLS", https://blog.cryptographyengineering.com/2016/08/24/attack-of-week-64-bit-ciphers-in-tls/, 2016-08-24

M. Gulati, M. Smith, S. Yu, "Secure Enclave Processer for a System on a Chip", U. S. Patent 8,832,465 B2, Apple Inc., 2014-09-09

G. Heiser, K. Elphinstone, "L4 Microkernels: The Lessons from 20 Years of Research and Deployment", ACM Transactions on Computer Systems, 34.1, 2016-04

T. Hühn, "Myths about /dev/urandom", http://www.2uo.de/myths-about/urandom/, 2016-11-06

Intel, "Intel Digital Random Number Generator (DRNG) Software Implementation Guide", revision 2.0, Intel, 2014-05-15

Intel, "Intel Trusted Platform Module (TPM module-AXXTPME3) Hardware User's Guide", G21682-003, Intel

B. Koerner, "Russians Engineer a Brilliant Slot Machine Cheat - and Casinos Have No Fix", wired.com, 2014-06

NIST, "Security Requirements for Cryptographic Modules", FIPS PUB 140-2, National Institute of Standards and Technology,  2001-05-25

NIST, "Minimum Security Requirements for Federal Information and Information Systems", FIPS PUB 200, National Institute of Standards and Technology,  2006-03

NIST, "Derived Test Requirements for FIPS PUB 140-2", National Institute of Standards and Technology,  2011-01-04

NIST, "Digital Signature Standard (DSS)", FIPS PUB 186-4, National Institute of Standards and Technology, 2013-07

NIST, "NIST Removes Cryptography Algorithm from Random Number Generator Recommendations", National Institute of Standards and Technology, 2014-04-21

NIST, "Implementation Guidance for FIPS PUB 140-2 and the Cryptographic Module Validation Program", National Institute of Standards and Technology,  2016-08-01

NIST, "SP800-90B Entropy Assessment", https://github.com/usnistgov/SP800-90B_EntropyAssessment, 2016-10-24

C. Overclock, "John Sloan and Hardware Entropy Generators", http://chipoverclock.com/2016/10/john-sloan-and-hardware-entropy.html, 2016-10-18

J. Schiller, S. Crocker, "Randomness Requirements for Security", RFC 4086, 2005-06

B. Schneier, "SHA1 Deprecation: What You Need to Know", Schneier on Security, https://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html, 2005-02-18

B. Schneier, "The Strange Story of Dual_EC_DRBG", Schneier on Security, https://www.schneier.com/blog/archives/2007/11/the_strange_sto.html, 2007-11

CWI, "SHAttered", http://shattered.it/, 2017-02-23

R. Snouffer, A. Lee, A. Oldehoeft, "A Comparison of the Security Requirements for Cryptographic Modules in FIPS 140-1 and FIPS 140-2", NIST Special Publication 800-29, National Institute of Standards and Technology, 2001-06

M. Stevens, E. Bursztein, P. Karpman, A. Albertini, Y. Markov, "The first collision for full SHA-1", CWI Amsterdam, Google Research, 2017-02-23

M. Stevens, E. Bursztein, P. Karpman, A. Albertini, Y. Markov, A. Petit Bianco, C. Baisse, "Announcing the first SHA1 collision", Google Security Blog, https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html, 2017-02-23

TCG, "TCG Specification Architecture Overview", revision 1.4, Trusted Computing Group, 2007-08-02

TCG, "TCG PC Client Specific TPM Interface Specification (TIS)", version 1.2, revision 1.00, Trusted Computing Group, 2005-07-11

M. Turan, "IID Testing in SP 800 90B", Random Bit Generation Workshop 2016, National Institute of Standards and Technology, 2016-05

J. Walker, "ENT: A Pseudorandom Number Sequence Test Program", https://www.fourmilab.ch/random/, 2016-02-27

Wikipedia, "Hardware random number generator", https://en.wikipedia.org/wiki/Hardware_random_number_generator, 2016-12-30

Wikipedia, "Comparison of hardware random number generators", https://en.wikipedia.org/wiki/Comparison_of_hardware_random_number_generators, 2017-02-19

Wikipedia, "Dual_EC_DRBG", https://en.wikipedia.org/wiki/Dual_EC_DRBG, 2017-01-02

Wikipedia, "SHA-1", https://en.wikipedia.org/wiki/SHA-1, 2017-02-15

Wikipedia, "Trusted Platform Module",  https://en.wikipedia.org/wiki/Trusted_Platform_Module, 2017-02-17