Monday, October 30, 2017

xkcd: Digital Resource Lifespan

Today's excellent (as usual) xkcd cartoon by Randall Munroe reminds us of the ephemerality of digital information.


Once again, this being a topic I have raised over and over in this blog, I'd like to mention that the definitive work on this topic is
Jeff Rothenberg, "Ensuring the Longevity of Digital Documents", Scientific American, vol. 272, no. 1, January 1995 .
A digital copy of Rothenberg's article can, at least at the moment, be found at
https://www.clir.org/pubs/archives/ensuring.pdf .
It's well worth reading. And appropriately terrifying, given the season.

It's not just the digital media itself that has a (sometimes remarkably short) lifespan, but also the technology - both hardware and software - to read the media, and even the specifications that describe the format and encoding of the media. I've seen a vase from ancient Greece inscribed with art and symbols that are perfectly legible millennia later. I've been up close to a first edition of a book written by Galileo in the early 18th century. We can still read the letters John Adams and his wife Abagail wrote to one another during the time of the American war of independence. I have CD-ROMs that are just a few years old which are no longer usable. And trying to read the eight-inch floppy disks for the CP/M file system that have the text of my master's thesis in WordStar format? Good luck. (The thirty-four year old hardbound version on acid-free paper is completely legible.)

The only way to insure the longevity of digital information is to continuously refresh it onto new media using current technology. This becomes less and less practical as the amount of digital information continues to grow an a rate that can only be described as alarming.

Friday, October 27, 2017

You Cannot Cheat What You Do Not Measure

The international edition of the German magazine Spiegel has a two part article about the three graduate students who uncovered the huge emissions fraud by Volkswagen. This is of interest to motorheads and techies alike. The case is another great example of measurement dysfunction (and its kin incentive distortion, moral hazard, perverse incentives, unintended consequences, etc.) albeit on a enormous scale.

http://www.spiegel.de/international/business/the-three-students-who-discovered-dieselgate-a-1173686.html

http://www.spiegel.de/international/business/the-three-students-who-discovered-dieselgate-a-1173686-2.html

Saturday, October 21, 2017

Thought Experiment

A long long time ago, in a university far far away, I was thinking about how the classic ring-shaped rotating Von Braun space station, like the one with the Hilton hotel in the movie 2001: A Space Odyssey, uses centrifugal forces to simulate gravity. I began to wonder: if you took that space station way way out into deep space, presumably its occupants would still feel pushed towards the floor that was the outer portion of the ring. But if they looked out the window, there is no way they could tell relative to what they were rotating. In fact, other than the fact they could still sip coffee out of open-topped mugs, they couldn't tell they were rotating at all, visually.

(image credit: Wikimedia)

When faced with such conundrums, I did what I always did at the time, I asked my friend and colleague David Hemmendinger. David and I were both graduate students in computer science at the time. But while I had a freshly minted bachelors degree in CS, David already had a Ph.D. in philosophy. At the time he was the smartest guy I knew. (He may still be the smartest guy I know. But in the more than three decades hence, I've continued my habit of working with people a lot smarter than I am, so it's not a given.)

Dave explained that this was a question that troubled lots of people. He told me about Mach's Principle - the same Mach as in Mach Numbers, for describing supersonic speeds. Mach proposed that the rotation that results in centripetal force is relative to absolute space (take that, Relativity!). It's the same principle that causes gimbaled gyroscopes to try to maintain their orientation as they spin no matter how you move and rotate them; they are trying to remain in the same position relative to the entire mass of the universe.

Seriously?

That sounded pretty darn fishy to me. I'd had several physics courses as an undergraduate and knew the math behind Relativity well enough. I understood that rotation was a form of acceleration, and that accelerated inertial reference frames were different from unaccelerated inertial frames of reference.

But how exactly does the distant mass in the universe affect the gyroscope on my desk? Apparently instantaneously? I didn't buy it. I could tell from Dave's clouded expression and half smile that he didn't quite buy it either. He made it clear that he was just reporting what he had read.

This question stayed with me for the next thirty-plus years. I just figured I was kinda stupid, and that someday I would come across the answer.

So this morning I read Tony Rothman's article in American Scientist, "The Forgotten Mystery of Inertia".
https://www.americanscientist.org/article/the-forgotten-mystery-of-inertia
It's not long. And it's an easy read. Here's is what I learned.

No one understands this.

This same basic thought experiment troubled Newton, Einstein, Mach, Gödel, and lots of physicists and philosophers right up into the present day.

Everyone knows the what of inertia; the math we use to describe it is completely accessible. But no one can explain the how.

How do rotating space stations simulate gravity? How do gyroscopes strive maintain their original orientation - RELATIVE TO WHAT, FOR CROM'S SAKE?!?!?!? - once they start spinning, and do so with such reliability that we use them inside gyrocompasses as part of inertial navigation systems in everything from nuclear submarines to intercontinental ballistic missiles?

I'm not stupid. (Well, not because of this, anyway.)

The world we live in is. Just. That. Weird.
"If you don't find reality fascinating, you just aren't paying attention." - Me, often.

Thursday, October 19, 2017

My WWVB Radio Clock

Obelisk, a.k.a. O-3, is the third (and probably final) digital clock I've designed and built that is ultimately synchronized to the atomic clocks that serve as our standard for time in the U.S. It synchronizes to the sixty kilohertz transmission from radio station WWVB in Fort Collins Colorado. WWVB is operated by the National Institute for Standards and Technology (NIST). NIST has one of their big facilities in Boulder Colorado, where the F2 master atomic clock, the source of ultimate time and frequency reference for the United States, is located. Fort Collins is just a hour or so drive north of where I am located near Denver Colorado. But the WWVB time signal reaches most of the continental U.S. and beyond, typically at night when the sun doesn't interfere with the long-wave transmission.

Untitled

The hardware of Obelisk is very similar to my prior two clocks. It is based on a Raspberry Pi 3 Model B running Raspbian, a Linux distro based on Debian. It uses an Adafruit LCD display. And it includes a real-time clock board with a battery backup, based on the DS1307 chip, to maintain the time when the system isn't running.

What sets this clock apart is that it uses a tiny SYM-RFT-60 AM radio receiver configured out-of-the-box to pick up the WWVB sixty kilohertz transmission. The receiver provides just six pins: one for power, one for ground, two for the antenna, one for reset, and the last for the digitized and inverted data derived from the modulated radio signal. The reset and data pins are connected to GPIO pins on the Pi.

The WWVB radio signal is amplitude modulated: the reference state is 0 dBr and the asserted state is -17 dBr.  The receiver inverts this so that the reference state is read by the GPIO as a zero, and the asserted state is read as a one. The transmitter emits one pulse per second, with the leading edge of each pulse aligned to the beginning of a second as determined by an atomic clock located at the transmitter. The duration of the pulse is used to encode three symbols: 200 milliseconds is a binary zero, 500 milliseconds is a binary one, and 800 milliseconds is a special marker used to identify the boundaries in the data frame.

A frame consists of exactly sixty pulses, so there is always one frame per minute. The leading edge of the first pulse in the frame indicates the beginning of the first second, :00, in the current minute, and the leading edge of the last pulse of the frame indicates last second, :59.

Values in fields are encoded as binary coded decimal (BCD) digits. So, for example, the minute of the hour is encoded as two BCD fields, the tens portion, and the ones portion. The UTC minute and hour, the day of the year (Julian day), and the last two digits of the year, are encoded in fields in the frame, along with other useful stuff like whether the current year is a leap year, whether a leap second is going to be added at the end of the current month, whether Daylight Saving Time is in effect, and the difference between the UTC time and UT1 time. (UTC is kept within +/- 0.9 seconds of UT1, a time reference based on celestial observations that is relative to the Earth's rotation, through the introduction of leap seconds.)

Obelisk merely samples the GPIO pin in user-space using a periodic timer, and measures the pulse width to determine the three symbols. Most everything else is just the bookkeeping involved with decoding the fields in the frame and turning them into a useful timestamp. Not that that was completely trivial.

The hardest part of this process was designing the state machine to parse the language whose grammar is implied by the layout of the symbols in the frame. While the AM radio signal can be received throughout all or most of the continental United States, it is extremely susceptible to interference. (This is what Steely Dan was alluding to when they sang "no static at all" in their 1978 song FM.) The WWVB signal can be corrupted by the placement of the radio antenna, by other electronic devices (particularly LCD displays), and even by strong sunlight. So the Obelisk parser tends towards paranoia, with a lot of stuff for error detection and recovery.

One technique I have found useful to measure the quality of my clocks is to run the Network Time Protocol daemon on them and use my implementation as an NTP reference clock. Obelisk emulates a GPS time source by emitting standard National Marine Electronics Association (NMEA) RMC sentences containing a UTC time stamp every second once it is disciplined to WWVB. It also generates a Pulse Per Second (PPS) on another GPIO pin; the PPS is synchronized, subject to some software delay and jitter, to the leading edge of the WWVB pulses that are timed to begin at the start of every second.

NTP continuously compares the reference clock with clocks from other NTP servers on the internet. Querying the NTP daemon on Obelisk for its statistics often reveals that my implementation is pretty usable. It will never be as good as a GPS-disciplined NTP server; the WWVB signal is just not that reliable. But as a desk clock, it's completely serviceable. Below you can see the output from the ntpq command that includes my radio clock masquerading as PPS and GPS reference clocks. This snapshot was taken while NTP was pretty happy with the output of my software.

Untitled

WWVB radio clocks are inexpensive and common; there are several such off-the-shelf devices scattered around the Palatial Overclock Estate. And WWVB radio clocks can be quite small. One of the wristwatches in my collection is a Casio Wave Ceptor with a built in WWVB radio receiver which it uses to set itself. (It's also solar powered.)

Casio Wave Captor Multi Band 6 Tough Solar

My first clock, Hourglass (O-1), synchronizes to the signals from the satellites in the Global Positioning System. Like other radio navigation systems I have studied - for example, LORAN, for Long Range Navigation - GPS is based on time: the accuracy of the GPS position fix depends on the precision of the timestamps generated by the GPS system. Each GPS satellite has multiple atomic clocks, each of which is synchronized to the U.S. Air Force master atomic clock at the GPS ground station near Colorado Springs Colorado. I described Hourglass in My Stratum-1 Desk Clock. It was a relatively straightforward project, using off the shelf hardware and open source software. If you want an inexpensive stratum-1 NTP server, this is the approach I would recommend; reasonably priced commercial products are also available and which I describe in this same article.

Hourglass on a Stand

My second clock, Astrolabe (O-2), similarly synchronized to GPS, but uses a miniature cesium atomic clock to maintain time if and when the GPS signal was lost or compromised. I described Astrolabe in My Stratum-0 Atomic Clock. The software for Astrolabe was almost identical to that of Hourglass, although I found the hardware design much more challenging; there was a fair amount of hardware hacking (for a software guy like me anyway) to integrate the board containing the chip-scale atomic clock with the rest of the system. The biggest problem was that the Raspberry Pi is a 3.3 volt device and the CSAC board is a 5 volt device with standard RS-232 12 volt serial output. The design was complex enough that I felt the need to design in test points through which I could examine both the cesium atomic clock and its uBlox GPS receiver independently of the rest of the system.

Untitled

Obelisk differs from these two clocks in that I had to write a substantial amount of software to decode the WWVB signal and put it in a form usable by the open source software in the clock. But that's what learning experiences are all about. Obelisk provides a user-space application I call wwvbtool. The lower level details of decoding the WWVB transmission is handled by functions in the Obelisk library, which can be used for other WWVB-based applications. wwvbtool also makes use of my Hazer (GPS sentence generation and parsing) and Diminuto (Linux systems programming) libraries. wwvbtool, Obelisk, Hazer, and Diminuto are all written in C. All of the code is open source - licensed under either the GPL or the LGPL - and can be found on GitHub.
https://github.com/coverclock/com-diag-obelisk 
https://github.com/coverclock/com-diag-hazer  
https://github.com/coverclock/com-diag-diminuto 
Although wwvbtool can be used from the command line, Obelisk runs it as a background daemon where it communicates with the GPS daemon gpsd, which in turn communicates with the NTP daemon ntpd. Obelisk also uses the same hourglass.py Python script I wrote for Hourglass and Astrolabe to maintain the LCD time display.

When Mrs. Overclock (a.k.a. Dr. Overclock, Medicine Woman) and I first moved to Colorado in 1989, I didn't appreciate that I was going to be living at a nexus of super precise time keeping, what with NIST's F2 atomic clock in Boulder, the WWVB NIST transmitter near Fort Collins, and the GPS ground station near Colorado Springs, all a relatively short drive west, north, and south respectively.

But precision timing and frequency sources keep cropping up again and again in my professional life, from the supercomputers and their associated peripherals and networks that I worked with at the National Center for Atmospheric Research in Boulder, to the traffic shaping and traffic policing firmware I wrote for ATM OC-3 optical networking products at Bell Labs north of Denver, to the geolocation and navigation systems in products I helped develop for the business aviation marketplace for a client near Denver.

The other day I was talking about Hourglass, Astrolabe, and Obelisk with a friend and colleague of mine. He repeated that aphorism "A man with more than one clock never knows what time it is." I had to remind him that my three clocks all display exactly the same time.