Tuesday, January 28, 2020

Frames of Reference III

Light travels about a foot in a nanosecond. That's a very rough approximation. But it makes it easy for users of the old Imperial system of measurements to grasp this idea.

I look across the table at my lovely wife of nearly thirty-six years, Mrs. Overclock. I think I'm seeing her in the present. But I'm not. She's about three feet away from me, so I'm really seeing her as she was three nanoseconds in the past.

Behind Mrs. Overclock, I see the far wall. It's about four feet behind her, or about seven feet from me. I'm seeing the far wall as it existed seven nanoseconds in the past.

Through the window in the far wall I see the Rocky Mountains in the distance. The closest mountain I can see, Table Mountain near Golden, Colorado, is about three miles west of us, or in the neighborhood of 16,000 feet away. So I'm seeing a version of Table Mountain that's 16,000 nanoseconds, or sixteen microseconds, in the past.

In the night sky, I can see the Moon above the mountains. The Moon is an average of about 240,000 miles away. That's more than 1.2 billion feet. So I'm not seeing the Moon as it is in the present - which by now you've figured out is a highly abstract concept. I'm seeing the Moon as it was more than 1.2 seconds ago. It's even possible it no longer exists in this present, and the shock wave and debris field just haven't arrived yet. Unlikely. But possible.
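
For anyone who wants to check the arithmetic, here is a minimal sketch in shell. It uses the defined speed of light (299,792,458 meters per second) and the international foot (0.3048 meters) instead of the foot-per-nanosecond rule of thumb. The function name light_delay_ns is made up for illustration.

```shell
# light_delay_ns FEET: print the one-way light travel time in nanoseconds.
# Hypothetical helper, not part of any existing tool.
light_delay_ns() {
    awk -v feet="$1" 'BEGIN {
        c = 299792458          # speed of light in meters per second (exact)
        m = feet * 0.3048      # feet to meters (exact by definition)
        printf "%.3f\n", m / c * 1e9
    }'
}

light_delay_ns 3                      # Mrs. Overclock, three feet away
light_delay_ns 7                      # the far wall
light_delay_ns 16000                  # Table Mountain
light_delay_ns $((240000 * 5280))     # the Moon: 240,000 miles, in feet
```

A foot turns out to be slightly more than a nanosecond (about 1.017), so the rule of thumb understates each delay by a percent or two. For the Moon the result is a bit under 1.3 seconds.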

Astronomers have recently speculated that the distant star Betelgeuse is about to end its life by going supernova. Betelgeuse is seven hundred light years away. So when we observe Betelgeuse, we are not seeing it as it is today, but instead are looking seven hundred years into the past. Whatever happened to Betelgeuse, it happened a long long time ago, and the light from that event is just arriving here. This kind of time shift is true for all of the stars we see in the nighttime sky to varying degrees depending on their distance from us.

So the present that I have fooled myself into believing I see is not only not the present, it's not even the same time in the past. It's a mural of perceptions taken from a broad ensemble of times in the past. This is why Albert Einstein asserted that there was no such thing as absolute simultaneity in his Theory of Special Relativity. And none of this takes into account the latency of our nervous system, or the processing time in our brains.

Real-time systems work like this too, perceiving at best an approximation of the reality around them.

See Also

Chip Overclock, "Frames of Reference", 2018-03-14

Chip Overclock, "Frames of Reference II", 2019-04-19

Tuesday, January 14, 2020

Backing Up Raspberry Pis

It happened gradually. I built a GPS-disciplined clock/NTP server using a Raspberry Pi, a two-line LCD display, and a GPS receiver. Then I built another one incorporating a chip-scale cesium atomic clock. Then a WWVB clock. Then another GPS clock using a simpler USB-attached GPS dongle. Then an NTP/GPS monitoring station. Then a differential GPS base station. Then a web server.

Somehow, over the past few years, I have ended up with ten Raspberry Pi single board computers scattered around the Palatial Overclock Estate, all running 24x7, most on uninterruptible power supplies, almost all headless. And that doesn't count the ones in the basement that I get out from time to time for ongoing projects, like my IPv6 testbed or my differential GPS rover.

Having lost one of them to a failed micro-SD card, the Pi's boot media, it occurred to me that maybe I should start thinking about a way to back them up. You would think this was a solved problem. But not so much.

The Wrong Solution

The most common mechanism - and the one I had been using - was simply to copy the entire raw disk image on the micro-SD card to a handy backup disk as a flat file using the Linux dd command. Then, in the unlikely event that the need arises, restore it by doing a reverse image copy to an unused micro-SD card using a similar dd command.

But as common as this approach seems to be on the Salmon of Knowledge, it turns out to be uglier than it sounds. Even though those micro-SD cards from SanDisk and Samsung both claim to contain sixteen gigabytes, they aren't actually identical, differing very slightly in terms of the actual number of logical blocks they contain. You can dd a smaller image to a larger card, but obviously not vice versa. This is all beside the fact that a full raw disk image copy takes a long time.
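
One way to find out ahead of time whether an image will fit is to compare byte counts before running dd. The sketch below is only the comparison. The helper name card_shortfall, the image file name, and the byte counts in the example are all made up for illustration.

```shell
# card_shortfall IMAGE_BYTES CARD_BYTES: print how many bytes the card is
# short of holding the image (0 means the image fits).
# Hypothetical helper for illustration.
card_shortfall() {
    if [ "$1" -gt "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo 0
    fi
}

# Made-up sizes for two nominally identical "sixteen gigabyte" cards.
card_shortfall 15931539456 15923150848

# On a real system the two numbers could come from, for example:
#   stat -c %s framistat.img              # image size (GNU coreutils)
#   sudo blockdev --getsize64 /dev/sdx    # card size (util-linux)
```

A non-zero result means a dd restore to that card would run out of room before the end of the image.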

I addressed this in the past in two ways.

First: restoring, for example, a sixteen gigabyte image to a thirty-two gigabyte card. Sounds simple, but it's not scalable: when I make changes to the system, I have to make another image copy. This one will be thirty-two gigabytes. So if I have to restore it, it will be to a sixty-four gigabyte card, even though I was wasting most of that thirty-two gigabyte card; the image backup includes all of the blocks on the card I was backing up whether they were used or not.

Second: I went on Amazon.com and ordered some micro-SD cards identical in brand, model, and capacity to the one I was backing up. Also a simple solution, as long as I can find those cards, but it ignores the growing collection I have of unused micro-SD cards.

The Right Solution

Thanks (as usual) to the web site Stack Exchange, and specifically the user goldilocks from whom I drew inspiration, I have written a set of shell scripts that use the rsync utility to make a file-by-file incremental backup of a Raspberry Pi. And to format an unused micro-SD card of any suitable capacity and restore those files to it, recreating the boot media. Which I've tested.

The first backup takes a long time because it is a full backup. Subsequent backups - done after I make a change - take almost no time at all. The approach I took was to run the backup on the Pi itself a.k.a. online (rsync can also be run remotely, but I elected not to do that). I bought a one terabyte solid state disk (SSD) with a USB interface. I hook it up to the Pi while it is running, mount the SSD, and run the backup script. Formatting and restoring to an unused micro-SD card is done on one of my desktop Linux servers a.k.a. offline.
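
The heart of a backup like this is a single rsync invocation. The following is a minimal sketch of the idea, not the pilocalbackup script itself. The function name backup_tree is hypothetical, and the options are just a reasonable starting set.

```shell
# backup_tree SRC DST: mirror the directory tree SRC into DST.
# Hypothetical helper; NOT the pilocalbackup script from the repo.
backup_tree() {
    bt_src="$1"
    bt_dst="$2"
    mkdir -p "$bt_dst" || return 1
    # -a        archive mode: recurse, keep permissions, times, owners, links
    # -H        preserve hard links
    # -x        stay on one file system (skips /proc, /sys, other mounts)
    # --delete  remove files from DST that no longer exist in SRC
    # The trailing slashes mean "the contents of", not "the directory itself".
    rsync -aHx --delete "${bt_src%/}/" "${bt_dst%/}/"
}

# On a Pi this would be run as root, for example:
#   backup_tree / /mnt/pi/framistat
```

The second and later runs copy only files that have changed, which is why they take almost no time. Because of the -x option, a separately mounted /boot partition would need its own call.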

The Gory Details

These scripts can be found here: https://github.com/coverclock/com-diag-bin . They are licensed under the GPL. When I install these scripts on a system, I typically clone the repository, then create soft links from a bin directory to the scripts I need in the repo directory. During that process I drop the .sh suffix from the script name. So in the repo the script is called pilocalbackup.sh but in the bin directory it is just pilocalbackup.
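
The clone-and-link step can be sketched as follows. This is an illustration under assumptions: the helper name link_scripts and the directory names in the example are made up, and the first argument should be whichever directory in the clone actually holds the .sh files.

```shell
# link_scripts REPO_DIR BIN_DIR NAME...: for each NAME, create the soft link
# BIN_DIR/NAME pointing at REPO_DIR/NAME.sh. Hypothetical helper.
# REPO_DIR should be an absolute path so the links resolve from anywhere.
link_scripts() {
    ls_repo="$1"
    ls_bin="$2"
    shift 2
    mkdir -p "$ls_bin" || return 1
    for ls_name in "$@"; do
        if [ -f "$ls_repo/$ls_name.sh" ]; then
            ln -sf "$ls_repo/$ls_name.sh" "$ls_bin/$ls_name"
        else
            echo "link_scripts: no $ls_name.sh in $ls_repo" >&2
        fi
    done
}

# Example (the directory names are assumptions):
#   git clone https://github.com/coverclock/com-diag-bin "$HOME/src/com-diag-bin"
#   link_scripts "$HOME/src/com-diag-bin" "$HOME/bin" pilocalbackup pilocalrestore
```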

In the examples below, /dev/sdx is a stand-in for the device name of the micro-SD card from the Raspberry Pi, and /dev/sdy is a stand-in for the device name of the backup SSD. Your mileage may vary. The standard Raspbian micro-SD card will have two partitions, the boot partition in /dev/sdx1 and the root partition in /dev/sdx2. There are files in both partitions that have hard coded references to these partition numbers.

I always assume I'm restoring to a two partition Raspbian card, but I can back up from a Noobs card for which Raspbian was selected at install time; a Noobs card will have something like seven partitions, five of which are unused in the Raspbian instance. The conversion from Noobs to Raspbian requires some editing of files in both the boot and root partitions of the card to change the boot partition to 1 and the root partition to 2.

If you are restoring a micro-SD card to create a duplicate of an existing system, you might also want to change the hostname and IP address of the Raspberry Pi that you are restoring. This is shown below.

In the examples below, the name framistat is a stand-in for the host name of the Raspberry Pi which is used by default by the backup script. The name doodad is a stand-in for the release name - e.g. jessie for 8.x, stretch for 9.x, buster for 10.x - of the Raspbian version I am dealing with; the partitioning of the micro-SD card differs slightly from release to release. You can find the Raspbian (based on Debian) release version number in the file /etc/debian_version.

The mount point names I use below are purely my own personal convention. I use /mnt for the backup SSD, /mnt1 for the boot partition (1) on the micro-SD card, and /mnt2 for the root partition (2) on the micro-SD card. The backup script backs up both partitions, and the restore script restores both partitions.

All of the scripts necessarily have sudo commands embedded in them, so I encourage you to inspect them carefully; sudo is only used when necessary. 

The README.md for the repo has a bunch of examples, including dealing with full raw image files. Below I'll show just a few germane cases. The repo also has a lot of other unrelated but useful (to me, anyway) scripts.

Determine the Raspbian Release from Various Media

cat /etc/debian_version # This is online on the Pi itself.

or

sudo mount /dev/sdx2 /mnt2
cat /mnt2/etc/debian_version # This is an offline Pi micro-SD card.
sudo umount /mnt2

or

sudo mount /dev/sdy1 /mnt
cat /mnt/pi/framistat/etc/debian_version # This is on the backup.
sudo umount /mnt
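
The version number printed by any of these commands can be turned into the release name that piimageformat expects. Here is a small sketch that covers only the three releases named earlier (jessie for 8.x, stretch for 9.x, buster for 10.x). The function name raspbian_release_name is hypothetical.

```shell
# raspbian_release_name VERSION: print the release name for a Debian
# version number such as 10.2. Hypothetical helper; it knows only the
# releases mentioned in this article.
raspbian_release_name() {
    case "$1" in
        8|8.*)   echo jessie ;;
        9|9.*)   echo stretch ;;
        10|10.*) echo buster ;;
        *)       echo unknown; return 1 ;;
    esac
}

# Example, with an offline card's root partition mounted on /mnt2:
#   raspbian_release_name "$(cat /mnt2/etc/debian_version)"
```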

Backup Local Files Online On A Pi Using rsync

sudo mount /dev/sdy1 /mnt
pilocalbackup /mnt/pi/framistat # This is the default.
sudo umount /mnt

or

sudo mount /dev/sdy1 /mnt
pilocalbackup # This uses the default.
sudo umount /mnt

Restore Files Offline To An Unused Micro-SD Card Using rsync

piimageformat /dev/sdx doodad
sudo mount /dev/sdx1 /mnt1 # This is the boot partition.
sudo mount /dev/sdx2 /mnt2 # This is the root partition.
sudo mount /dev/sdy1 /mnt # This is the backup drive.
pilocalrestore /mnt/pi/framistat /mnt1 /mnt2
sudo umount /mnt /mnt1 /mnt2

Check, Verify, And Repair a Raspberry Pi Image Offline

piimagecheck /dev/sdx /dev/sdx1 /dev/sdx2 # These are the defaults.

or

piimagecheck /dev/sdx # This uses the defaults.

Customize a Raspberry Pi Image Offline After Restoring

sudo mount /dev/sdx1 /mnt1
sudo vi /mnt1/cmdline.txt # Change the boot partition.
sudo umount /mnt1

sudo mount /dev/sdx2 /mnt2
sudo vi /mnt2/etc/fstab # Change the / root and /boot partitions.
sudo vi /mnt2/etc/dhcpcd.conf # Change the static IP address.
sudo vi /mnt2/etc/hostname # Change the host name.
sudo vi /mnt2/etc/hosts # Change the host name resolution.
sudo umount /mnt2
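
The host name edits can also be scripted rather than done by hand in vi. The following is a rough sketch, assuming the old name appears as a whole word in etc/hosts. The function name rename_host and the new name gizmo are made up for illustration, and it would have to run as root against a real card.

```shell
# rename_host ROOT_DIR OLD NEW: rewrite ROOT_DIR/etc/hostname and replace
# OLD with NEW wherever it appears as a whole field in ROOT_DIR/etc/hosts.
# Hypothetical helper. Edited lines are rejoined with single spaces.
rename_host() {
    rh_root="$1"
    rh_old="$2"
    rh_new="$3"
    printf '%s\n' "$rh_new" > "$rh_root/etc/hostname" || return 1
    awk -v oldname="$rh_old" -v newname="$rh_new" '
        { for (i = 2; i <= NF; i++) if ($i == oldname) $i = newname; print }
    ' "$rh_root/etc/hosts" > "$rh_root/etc/hosts.new" || return 1
    # Copy the contents back so the original file keeps its mode and owner.
    cat "$rh_root/etc/hosts.new" > "$rh_root/etc/hosts" &&
        rm -f "$rh_root/etc/hosts.new"
}

# Example, with the root partition mounted on /mnt2:
#   rename_host /mnt2 framistat gizmo
```

This does not touch the static IP address in etc/dhcpcd.conf, which still needs editing separately.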


Wednesday, November 27, 2019

32,768

In yet another example of the class of bugs my old Bell Labs colleagues refer to as counter rollover - in this instance apparently an int16_t (signed sixteen-bit integer) variable used to count hours - Hewlett-Packard warns that some of their solid state drives will fail at 32,768 hours of use.

HP Warns That Some SSD Drives Will Fail at 32,768 Hours of Use

Bulletin: HPE SAS Solid State Drives - Critical Firmware Upgrade Required for Certain HPE SAS Solid State Drive Models to Prevent Drive Failure at 32,768 Hours of Operation

That's an MTBF of 32,767 hours, or 0x7FFF in hexadecimal, the largest signed integer you can fit into sixteen bits. That works out to a little short of four years, far less than the five year warranty offered on such drives.
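
The rollover itself is easy to reproduce. The sketch below mimics a signed sixteen-bit counter using ordinary shell arithmetic, which is wider than sixteen bits, by wrapping the result by hand. It illustrates the failure mode only and is not based on HPE's actual firmware. The function names are made up.

```shell
# int16_wrap N: print N as it would appear if stored in a signed
# sixteen-bit variable (two's complement wraparound). Hypothetical helper.
int16_wrap() {
    echo $(( (($1 + 32768) % 65536 + 65536) % 65536 - 32768 ))
}

# hours_to_years HOURS: convert hours of operation to years,
# using 365.25 days per year.
hours_to_years() {
    awk -v h="$1" 'BEGIN { printf "%.2f\n", h / (24 * 365.25) }'
}

int16_wrap 32767             # prints 32767, the last good hour
int16_wrap $((32767 + 1))    # prints -32768, the hour count goes negative
hours_to_years 32768         # how long the drive ran before that happened
```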

Some users report several drives - presumably installed at the same time - all failed within a fifteen minute window. Bet we can guess how long it took the sysadmin to install those drives in a RAID (and so much for redundancy).

HP is providing a firmware update. (I've never updated the firmware on a solid state drive. I'm a little surprised it's even possible.)

The fact that this catastrophic rollover event only occurs between the third and fourth years of operation makes you appreciate the difficulty in testing such firmware. You can't run the devices for four years before you ship them. You have to find another way to ferret out bugs of this nature, such as code inspections, white box unit testing, and simulation: effectively a kind of accelerated wear testing of the firmware algorithms.

In the words of my former office mate at the Labs:

They missed a cardinal rule: when implementing a counter or timestamp, ensure its rollover happens only after your anticipated career EOL [1].

To which I replied:

The advent of an efficient uint64_t data type on embedded processors was a huge boon to my apparent career success!

Footnotes

[1] End Of Life

See Also

C. Overclock, "Time Flies", 2015-05-09

C. Overclock, "Time Flies Again", 2019-07-27

Updates

2019-11-28: minor edits, corrections, and reformatting.