Thursday, February 16, 2017

Better Never Than Late

These days it's difficult to impossible to work on a mobile device of any kind that doesn't have a Global Positioning System (GPS) function integrated into it. But although I've worked as a product developer on many such devices over the years, I've never actually had the excuse to dig into the details of GPS, what information it provides, its format, and how you interpret it.

Since I'm a hands-on learner, there was only one thing to do. Along the way I was delighted to come up with a simple, real-life example of why sometimes losing data is a good thing. But it will take some time to get to that lesson.

The Road Test

From just parts I had on hand, I threw together a little portable battery-powered remote unit that reads data from a GPS receiver and transmits it to a base station in my office. The base station feeds the data to the Google Earth application. The Google Earth app produces a moving map display showing the path of the portable unit. All this occurs in real-time, the map updating on the display in my office as I drive around the neighborhood with the remote unit.

This is a nine minute screen recording of the moving map display, plus the terminal window of the base station as my software parses the GPS data and displays the results. The report includes position, altitude, heading, speed, and information about all of the satellites in the GPS constellation that the receiver can see. The Google Earth app and the terminal window are all running on the desktop Mac sitting in my office. The bad news is this may be the most boring video ever. The good news is you can use the YouTube player controls to fast forward through the portions of the drive where I was stopped at traffic lights.

The Hardware

All of this was surprisingly easy to do.

Hazer Testbed

Here is a high altitude view of my test setup. The GPS device feeds data over a serial connection to the remote unit "Tin", which passes it via User Datagram Protocol (UDP) over the Long Term Evolution (LTE) modem to the IPv6 internet. The GPS data ends up on my IPv6 test network (the same one I wrote about in Buried Treasure) where it feeds into the base station "Lead". Because Google Earth expects to receive data over a serial port from an actual GPS device, that's exactly how the base station feeds it, using a USB-to-serial convertor, to my Mac desktop running the Google Earth application. Google Earth thinks it's talking directly to a GPS device; it is unaware that Tin and Lead (and a pile of other network gateways) are acting as proxies between the GPS device and Google Earth.

Hazer sender "tin"

This is the remote unit, Tin. It consists of
USGlobalSat BU-353S4

The USGlobalSat BU-353S4 GPS receiver is a remarkably small, inexpensive, and easy to use device. It enumerates as a serial interface, e.g. /dev/ttyUSB0, from which you read ASCII data. Although my application was written in C, you could probably use almost any language, like Python. If you want to experiment with GPS, the BU-353S4, which is based on the SiRF Star IV chipset, makes it easy. However, there are several equally capable USB-based GPS devices on the market (some of which use the ublox 7 chipset) and I tested my software with several of them.

2016 Subaru WRX Limited

Perhaps the most important component of this project was the automobile I used to drive the remote unit around the neighborhood: a 2016 Subaru WRX Limited with a two liter turbocharged engine and all wheel drive. A little overkill for this project, but it's important to have good tools. (Mrs. Overclock refers to the WRX as The Batmobile.)

Road Testing

This is my test fixture inside the WRX. You can see my Mac laptop and the remote unit on the passenger seat, and the GPS receiver up on the dashboard by the windshield.

Hazer receiver "lead"

This is the base station, Lead, another Raspberry Pi 3, in my office that receives the GPS data from the remote unit over the IPv6 internet and feeds it via a serial cable to Google Earth running on my Mac desktop.
There are commercial devices that do this kind of thing. They are typically much simpler, cheaper, and lower power: a micro controller, a GPS chipset and a patch antenna, a 3G or even 2G cellular radio, and data is transmitted using Short Message Service (SMS) text messaging. But I threw this together with parts I already had on hand.
The Technology

The Global Positioning System is a Global Navigation Satellite System (GNSS) owned by the U. S. Government and operated by the U. S. Air Force. It is not the only GNSS; Russia (GLONASS), China (BeiDou-2), Europe (Galileo) have equivalent systems. GPS is a constellation of a couple of dozen satellites plus spares distributed in six different orbital planes. The first GPS satellite was launched in 1973.

Surprisingly, GPS satellites actually know nothing about position. What they know about is time. Each satellite broadcasts its unique identification number plus a time stamp. Every GPS receiver has embedded in it a detailed and precise map of the orbit of every GPS satellite. When the receiver sees the ID and time stamp from the satellite, it knows the expected position of the satellite, and can compute the propagation delay of the message by the time difference, based on the speed of light and taking relativity into account. The receiver can then calculate the distance it is from the satellite. Now the receiver knows it lies somewhere on a sphere whose radius is that distance. The receiver makes this same calculation based on several satellites, and it can then compute its position based on the intersection of all those spheres.
This isn't as complex as it may seem. It's solving four simultaneous equations with four unknowns: x, y, z, and t, where t is the difference between the receiver's clock and GPS time. Since there are four variables to solve, the receiver must be able to see at least four satellites to compute a solution.
The more precise the time stamp from the satellite, the more precise the position calculation. This is why each GPS satellite carries multiple redundant atomic clocks. And why by commanding the satellites to introduce variation in the time stamps - jitter - the U. S. military can reduce the usefulness of the GPS broadcasts to civilian receivers while their own units continue to receive encrypted messages with the more precise time.
This makes GPS a remarkably useful technology for getting the current date and time. Even if you're not interested in where you are, you might be interested in when you are. Telecommunications and other systems now routinely synchronize their own sense of time to the highly accurate satellite time using GPS receivers.
All of this - the map of the orbits of the satellites, the precise internal clock synchronized to GPS time, the code to do the complex position calculations - is embedded inside the tiny commercially available GPS chip sets. All the heavy lifting is done for you.

That it works at all may seem like a little miracle. But this basic idea has been around for a long long time. Ships in coastal waters used to measure when they heard a fog horn sounded at regular known intervals, emitted from a station at a well known map location, against their own ship's clock to compute their distance from the shore.

The Software

Most GPS devices support the National Marine Electronics Association (NMEA) 0183 specification. (Version 4.10 is the one I used). They may also support their own proprietary data formats, and there are sometimes good reasons to use those. But the NMEA 0183 spec provides a data format that is simple to use and meets many needs.

NMEA 0183 defines sentences, records consisting of ASCII data formatted into comma-separated fields. Each sentence begins with a dollar sign ($) and is terminated by an asterisk (*), a checksum encoded as two hexadecimal digits (e.g. 77), a carriage return (\r), and a line feed (\n). These sequences defines the framing of every GPS sentence. (This will be important later).
The first field in every sentence contains a string identifying the specific talker (e.g. GP means a GPS device) and what kind of record it is. In the example sentences above, the records RMC, GGA, GSA, and GSV contain, respectively, the recommended minimum (basic position, heading, and speed), the position fix (latitude, longitude, altitude, and time), the active satellites (those that where used for the position fix), and the satellites in view (those that the receiver can see, whether it uses them or not). There are lots of other possible sentences, but these four seem to be the ones that most GPS chipsets emit by default.
Screen Shot 2017-02-15 at 8.55.15 AM
When my software parses and interprets the NMEA sentence stream, it displays a report on the terminal in a form that is a little more comprehensible than the raw NMEA sentences. This is a screen snapshot from my laptop inside the car as I was running the test. (The terminal display on the base station in my office is identical, as you can see in the video.)

The software displays the NMEA sentence it received from the GPS device, and the sentence it forwarded to the base station. Next is a summary of the data in a form I can understand: the date and time in Universal Coordinated Time (UTC); the latitude and longitude in degrees, minutes and seconds; the altitude in feet; the compass heading; and the speed in miles per hour. Following that is the same data in a more technical form: the latitude and longitude in decimal degrees, the altitude in meters, the heading in degrees, the speed in knots (nautical miles per hour), and some internal information about the how the data in the NMEA sentence was encoded. Next is a list of the specific satellites used in the position fix, and some information about how accurate the fix is based on the positions of the satellites. Finally is a list of all the satellites in view, their identification number, elevation, azimuth, and signal strength.

Oh, No, Not Another Learning Experience

This project, which I code-named Hazer (a rodeo term), is divided into two parts: a library of C functions that handle the parsing of NMEA sentences, and a command line utility, gpstool, whose original purpose was to provide a functional test of the library. Like most complex UNIX command line tools, using gpstool looks like a case of ASCII throw-up.
gpstool -D /dev/ttyUSB0 -b 4800 -8 -n -1 -6 -A lead -P 5555 -E
This is the command on the remote unit that reads from the GPS device on serial port /dev/ttyUSB0 at 4800 bits per second, forwards the NMEA sentences to the base station lead (whose IPv6 address is in the remote unit's /etc/host file) on port 5555, and displays a report.
gpstool -6 -P 5555 -D /dev/ttyS0 -b 4800 -8 -n -1 -O -E
This is the command on the base station that receives NMEA sentences from the remote unit on port 5555, emits the sentences on serial port /dev/ttyS0 at 4800 bits per second for Google Earth to read, and displays a report.

I wrote gpstool to use the User Datagram Protocol (UDP) to forward the NMEA sentences to the base station. UDP is a protocol implemented on top of the Internet Protocol (IP), both versions 4 and 6. It differs from its peer Transmission Control Protocol (TCP) in some important ways.

TCP guarantees delivery, and order. If the protocol stack on the receiver detects that that a TCP packet was lost, it notifies the sender to resend the packet. The entire packet stream from the sender is delayed: the pending packets behind the lost packet are held in buffers on the sender, until the lost packet makes it through to the receiver.

This sounds like maybe a good thing, and sometimes it is. But for time-sensitive real-time data, this is a real problem, particularly on mobile radio links where the communication channel is routinely disrupted. GPS location data effectively has an expiration date; it goes stale quickly. The GPS device is constantly emitting location information in the form of NMEA sentences, a new position fix about once per second (or as often as five times a second on some of the GPS devices I tested). If an NMEA sentence is delayed due to a transmission error, there is another sentence just a second behind it containing a more current fix. And another even more current fix a second behind that one. It is better to toss a delayed sentence and wait for the next one, instead of processing old news.

My IPv6 testing with the LTE modem (see Buried Treasure) demonstrated that there is occasional packet loss in the mobile network. Using TCP would lead to filling up kernel buffers on the sender dealing with the retransmissions, to congestion and perhaps even more packet loss on the network as bandwidth is used to transmit the same packet over again, sometimes more than once, sending data that is ultimately no longer useful.

UDP, on the other hand, is best effort. UDP does not guarantee delivery. UDP doesn't even guarantee that the order of arriving packets is maintained, since successive UDP packets can take different routes through the network with different bandwidths, latencies, and propagation delays. gpstool doesn't get upset with lost NMEA sentences; it just picks up with the next position fix it receives. And because each position fix comes with a UTC timestamp, it can detect time running backwards and reject sentences that arrive out of order.

  • Sometime it is better to lose data than to receive it with absolute reliability.

This was not a new lesson for me. I first encountered this twenty years ago when I worked at Bell Labs on a telecommunications system that used Asynchronous Transfer Mode (ATM) to carry voice calls between distributed portions of the system on optical communication channels, sometimes even across international boundaries. We leveraged features of ATM like traffic management, class of service, and quality of service, so that frames of voice samples that arrived late were tossed in favor of the next frame that was sure to arrive shortly. And I encountered it again a few years later, when ATM was replaced with Voice over IP (VoIP), where frames of voice samples were carried on the Internet using Real Time Protocol (RTP).

For people that have never worked in telecommunications, the complexities of reconstructing, in real-time, analog sound from digitized voice samples always involved a lot of explanation. Or a lot of hand waving. For sure a lot of white board diagrams. But the real-time nature of a position fix of a car driving around the neighborhood is something I think is pretty easy to grasp.

But Wait, There's More

TCP implements a byte stream with no implicit framing. That is, what the receiver reads from its TCP socket in a single operation isn't necessarily the same bytes that the sender wrote to its TCP socket in a single operation, although the complete byte stream is guaranteed to be the same from beginning to end across all reads and writes.

UDP implements datagrams, where every read of the UDP socket by the receiver delivers a block of bytes with the same beginning and end that was written by the sender in a single operation, if that datagram successfully made it from sender to receiver at all.

When I first wrote gpstool, I intended it to just be a handy way to functionally test the Hazer library. When I got the idea of forwarding NMEA sentences to Google Earth, I initially tried using socat, the handy utility available for Linux/GNU and MacOS. My socat command on the remote unit looked something like
socat OPEN:/dev/ttyUSB0,b115200 UDP6-SENDTO:lead:5555
with a similar socat running on the base station.

But socat doesn't know anything about the framing of NMEA sentences from the GPS device. It just reads as many bytes as are available on the serial port, packages them up into a datagram, and sends them off.

When UDP packets were occasionally lost, it was a disaster on the base station, since what was lost wasn't typically a complete NMEA sentence, but the middle of one, or the end of one and the beginning of the next one. The state machine in Hazer responsible for reconstructing NMEA sentences could recover from this. But the stream of NMEA sentences was badly corrupted, and a lot of time was spent dealing with it, with a lot of lost sentences.

So instead, I added the UDP forwarding feature to gpstool, which sent UDP datagrams containing only single whole NMEA sentences. This made the reconstruction of the NMEA sentence stream, and dealing with the occasional lost sentence, almost trivial.

  • Sometimes losing data isn't so bad, if you have some control over what data you lose.

As a result, I have run gpstool on the remote unit and on the base station, the former sending NMEA sentences from the GPS device to the latter, for hours with no problems.

You Can Do This Too

The Hazer library, along with gpstool, is available as an open source project on GitHub.
Although the Hazer library only relies on the standard C and Posix libraries, gpstool makes use of Diminuto, my library of C systems programming functions, which is also available as an open source project on GitHub.
Roundhouse, my IPv6 test bed, is yet another open source project on GitHub.
The availability of inexpensive and reliable GPS receivers with easy to use USB-to-serial interfaces, and the simple ASCII format of NMEA 0183 sentences, makes using the Global Positioning System straightforward and rewarding. It's your tax dollars at work. Isn't it time you added some geolocation capability to your project?


SiRF, NMEA Reference Manual, 1050-0042, Revision 2.2, SiRF Technology, Inc., 2008-11

NMEA, Standard for Interfacing Marine Electronic Devices, NMEA 0183, version 4.10, National Marine Electronics Association, 2000

E. Kaplan, ed., Understanding GPS Principles and Applications, Artech House, 1996

Wikipedia, "Global Positioning System",, 2017-02-12

No comments: