Thursday, May 21, 2020

Practical Differential Geolocation

Today, I have seven Global Navigation Satellite System (GNSS) antennas in the window of my home office at the rear of our house. I have an eighth one in the window of our living room in the front of the house. A ninth one in the skylight near the peak of the roof above our kitchen. And a tenth one sitting on my lab bench. All of these are hooked up to active GNSS receivers.

This isn't as crazy as it sounds. Five of those are used, not for geolocation, but for precision timing, disciplining the clocks in Network Time Protocol (NTP) servers. Two are used to continuously monitor the various GNSS constellations. Two are for functional and regression testing my own software. And one is connected to the base station for my Differential GNSS project. Differential GNSS (or, formerly, Differential GPS) is another approach for squeezing more accuracy and precision out of the GNSS constellations for Positioning, Navigation, and Timing (PNT) applications.

Differential GNSS

In Pseudorange Multilateration and Dilution of Precision I talked about just some of the ways errors are introduced into the GNSS position fix solution, and some of the mechanisms through which these errors can be addressed. In Improvisational Engineering I touched on another technique that uses corrections transmitted from a fixed base station to a mobile rover.
Differential GPS - or, again more generally, Differential GNSS - takes advantage of the fact that the sources of error (typically jitter) in the received satellite signals are uncorrelated with one another. Which is to say: random, or at least not systematic. So if a GNSS receiver is stationary, it can take advantage of its fixed location to compare each successive solution with either its own predetermined location, or with its own long term weighted average positioning solution. In the latter technique, called Real-Time Kinematic (RTK) positioning, the receiver's position solution can become more and more accurate over time. Depending on a number of factors like antenna placement and how many satellites it can see overhead at a time, a stationary receiver, or the Base in DGNSS parlance, can achieve an accuracy down to centimeters or better.
This sounds great, but by itself it doesn't help the mobile receiver, or Rover. However, if the Rover is close enough to the Base - in the same neighborhood - the two receivers will see the same satellites overhead, and therefore suffer similar degradation in the signals. The Base can transmit its corrections to one or more Rovers, and each mobile Rover can then potentially achieve a similar level of accuracy.
DGNSS products have been around for decades, costing on the order of thousands of dollars. Late in 2018, u-blox, the well known Swiss-based manufacturer of excellent (in my professional experience) PNT solutions, introduced their ninth generation of GNSS receiver chips, the ZED-F9. This family of high-precision GNSS chips supports DGNSS RTK directly, in the form of messaging conforming to the Radio Technical Commission for Maritime Services (RTCM) Special Committee (SC) 104 standard. This brings a DGNSS capability down to a price range of hundreds of dollars, potentially affordable to hobbyists, enthusiasts, experimenters, and consumers of applications like non-military drones and remotely piloted or autonomous vehicles.
I have been using u-blox PNT solutions for years professionally. And decades ago, when I worked at the National Center for Atmospheric Research (NCAR) in Boulder, Colorado, an R&D group there was using DGNSS that was good enough to measure the swinging of instrument packages tethered to high altitude weather balloons. So when I first read about the ZED-F9 on the Time Nuts mailing list, my ears perked up.
The corrections generated by the Base are made on the assumption that the Base is stationary, so that any differences between its latest computed solution and its known location must be due to timing variations in the received signals. Because the Rover is moving, it cannot make any such assumptions. But as long as the GNSS receivers in both the Base and the Rover are seeing the same signals, the Rover can take advantage of the Base's corrections.
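
To make the idea concrete, here is a deliberately simplified sketch in Python. It treats the correction as a single position offset: the Base subtracts its computed fix from its known location, and the Rover adds that offset to its own fix. This is only an illustration of the principle. The ZED-F9P does not work this way internally; its RTCM corrections carry per-satellite observations and the firmware does the real work. The Position type, the function names, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """A position in a local east/north/up frame, in meters (illustrative only)."""
    east_m: float
    north_m: float
    up_m: float


def base_correction(known: Position, computed: Position) -> Position:
    """Offset that moves the Base's computed fix back onto its known location."""
    return Position(
        known.east_m - computed.east_m,
        known.north_m - computed.north_m,
        known.up_m - computed.up_m,
    )


def apply_correction(rover_fix: Position, correction: Position) -> Position:
    """Apply the Base's offset to a Rover fix that saw the same satellites."""
    return Position(
        rover_fix.east_m + correction.east_m,
        rover_fix.north_m + correction.north_m,
        rover_fix.up_m + correction.up_m,
    )


# Made-up numbers: the Base knows it sits at the origin, but its latest
# solution is off by about a meter or two in each axis.
base_known = Position(0.0, 0.0, 0.0)
base_computed = Position(1.2, -0.8, 2.5)

# The nearby Rover is assumed to suffer the same error in its own solution.
rover_raw = Position(151.2, 299.2, 12.5)

correction = base_correction(base_known, base_computed)
rover_corrected = apply_correction(rover_raw, correction)
print(rover_corrected)
```

The sketch only works because of the assumption stated above: both receivers see the same signals, so the same offset applies to both.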

Differential techniques like this have been in use for a long time. Early versions required that the position of the Base be established through meticulous manual surveying.

Then consumer GPS and GNSS receivers began to support a survey-in mode in which the Base established its own position over time (hours or even days) to a required level of precision. But this still required a substantial amount of software development to implement the RTCM stack or equivalent messaging.
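
A survey-in can be pictured as a running average that stops once it has run long enough and is precise enough. The Python sketch below is a rough stand-in under stated assumptions: it keeps a running mean of the fixes and uses the standard error of that mean as its accuracy estimate. I do not know how the receiver firmware actually computes its accuracy figure, and real GNSS errors are correlated over time, so treat the class, its parameter names, and the numbers as hypothetical.

```python
import math
from typing import List, Sequence


class SurveyIn:
    """Running mean of 3D fixes with a crude accuracy estimate (assumed model)."""

    def __init__(self, min_duration_s: float, accuracy_limit_m: float) -> None:
        self.min_duration_s = min_duration_s
        self.accuracy_limit_m = accuracy_limit_m
        self.count = 0
        self.mean: List[float] = [0.0, 0.0, 0.0]
        self._m2: List[float] = [0.0, 0.0, 0.0]  # per-axis sums of squared deviations

    def add_fix(self, xyz: Sequence[float]) -> None:
        """Fold one position fix into the running mean (Welford's method)."""
        self.count += 1
        for axis, value in enumerate(xyz):
            delta = value - self.mean[axis]
            self.mean[axis] += delta / self.count
            self._m2[axis] += delta * (value - self.mean[axis])

    def accuracy_m(self) -> float:
        """Standard error of the 3D mean; infinite until two fixes are seen."""
        if self.count < 2:
            return math.inf
        variance = sum(self._m2) / (self.count - 1)
        return math.sqrt(variance / self.count)

    def done(self, elapsed_s: float) -> bool:
        """True once both the minimum duration and the accuracy limit are met."""
        return (elapsed_s >= self.min_duration_s
                and self.accuracy_m() <= self.accuracy_limit_m)


# Made-up example: one fix per second, jittering 10 cm east and west.
survey = SurveyIn(min_duration_s=60.0, accuracy_limit_m=0.025)
for second in range(100):
    jitter = 0.1 if second % 2 == 0 else -0.1
    survey.add_fix((10.0 + jitter, 20.0, 30.0))
```

Requiring both a minimum duration and an accuracy limit keeps a lucky early run of fixes from ending the survey prematurely.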

Today, there are a variety of established augmentation systems, ranging from fixed reference stations to even geosynchronous satellites, that provide differential corrections. Most of them are specific to the GPS constellation. Some of these are already automatically used by GPS receivers. Some of them are commercial services that you pay to subscribe to. Some of them aren't that useful.

The big breakthrough for folks like me came when affordable GNSS modules like the ZED-F9P came on the market, supporting RTCM SC 104 messaging natively. Such a module can either provide corrections via RTCM after completing a survey-in, or accept those corrections, even from an RTCM source other than another ZED-F9P module, and apply them in real-time to its own positioning solution. The ZED-F9P was also highly configurable in both hardware and firmware, able to be adapted to use a variety of communication ports to provide and accept RTCM messaging.

To be clear: virtually everything I have done on this project is merely glue code. It is the firmware in the ZED-F9P that does all the heavy lifting.



USGlobalSat BU-353S4

I have been experimenting with affordable consumer GPS receivers that provide a computer interface (typically USB) for a few years. This resulted in Hazer, my open source Linux/GNU/C-based software stack to parse and interpret the National Marine Electronics Association (NMEA) 0183 standard messages generated by most GPS receivers. Hazer provides a libhazer containing functions to handle NMEA, and a gpstool utility that can be used to functionally test it. Hazer can be found in the repository com-diag-hazer on GitHub.
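
For readers who have not seen NMEA 0183 on the wire: each sentence starts with a dollar sign, carries comma-separated fields, and ends with an asterisk and a two-digit hexadecimal checksum that is the exclusive-or of every character between the two. The Python sketch below shows just that framing. It is not Hazer's API (Hazer is written in C), and the function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def nmea_checksum(body: str) -> int:
    """XOR of every character between the leading '$' and the '*'."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def nmea_frame(body: str) -> str:
    """Wrap a comma-separated body in '$', '*', and the checksum."""
    return f"${body}*{nmea_checksum(body):02X}"


def nmea_parse(sentence: str) -> Optional[List[str]]:
    """Return the fields of a valid sentence, or None if it fails validation."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    body, _, given = sentence[1:].rpartition("*")
    try:
        if int(given, 16) != nmea_checksum(body):
            return None
    except ValueError:
        return None
    return body.split(",")


# Round trip with a made-up sentence; the first field combines the talker
# ("GP") with the sentence type ("GGA").
sentence = nmea_frame("GPGGA,123519,4807.038,N,01131.000,E")
fields = nmea_parse(sentence)
```

A receiver emits a steady stream of these sentences, so a parser that silently rejects damaged ones is usually all that is needed.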

GlobalSAT BU-353W10

I had encountered u-blox GPS receiver modules professionally on various embedded product development projects a decade before I began working on Hazer. When I started trying receivers with later generations of u-blox modules with Hazer, I expanded my support for those devices by adding Yodel, a parallel software stack in the Hazer repository, that handles messages in the proprietary UBX protocol, used by u-blox to provide more complex capabilities. gpstool evolved to not only handle the real-time configuration of these devices using UBX, but to incorporate features that supported experiments like integration with Google Earth for real-time vehicle tracking.
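
UBX is a binary protocol, so its framing looks quite different from NMEA. A UBX frame has two sync bytes (0xB5 0x62), a class byte, an ID byte, a little-endian sixteen-bit payload length, the payload, and a two-byte Fletcher checksum computed over everything from the class byte through the payload. The Python sketch below builds and validates such a frame. It is an illustration, not Yodel's API, and the function names are hypothetical.

```python
import struct
from typing import Optional, Tuple

UBX_SYNC = b"\xb5\x62"


def ubx_checksum(data: bytes) -> bytes:
    """Eight-bit Fletcher checksum over class, ID, length, and payload."""
    ck_a = ck_b = 0
    for byte in data:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes((ck_a, ck_b))


def ubx_frame(msg_class: int, msg_id: int, payload: bytes = b"") -> bytes:
    """Build a complete UBX frame around the given payload."""
    body = struct.pack("<BBH", msg_class, msg_id, len(payload)) + payload
    return UBX_SYNC + body + ubx_checksum(body)


def ubx_parse(frame: bytes) -> Optional[Tuple[int, int, bytes]]:
    """Return (class, id, payload) for a valid frame, or None otherwise."""
    if len(frame) < 8 or frame[:2] != UBX_SYNC:
        return None
    msg_class, msg_id, length = struct.unpack_from("<BBH", frame, 2)
    if len(frame) != 8 + length:
        return None
    if ubx_checksum(frame[2:6 + length]) != frame[6 + length:]:
        return None
    return msg_class, msg_id, frame[6:6 + length]


# An empty-payload poll of class 0x0A, ID 0x04 (UBX-MON-VER, which asks
# the receiver to report its version information).
poll = ubx_frame(0x0A, 0x04)
print(poll.hex())
```

Sending a frame with an empty payload is how UBX polls the receiver for the corresponding message.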

SparkFun GPS-RTK2

When I started playing with the ZED-F9P module a year ago, I added Tumbleweed, yet another parallel software stack in the Hazer repository, that provides basic support for RTCM SC 104 messaging. I also added the rtktool utility that handles the routing of RTCM corrections from a Base to any number of Rovers across the internet.
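
RTCM SC 104 version 3 messages have their own framing again: a 0xD3 preamble, six reserved bits, a ten-bit payload length, the payload, and a three-byte CRC-24Q. The first twelve bits of the payload are the message number. The Python sketch below frames and validates messages, including a zero-length one. It illustrates the wire format only; it is not the Tumbleweed API, and the function names are hypothetical.

```python
from typing import Optional, Tuple

RTCM_PREAMBLE = 0xD3
CRC24Q_POLY = 0x1864CFB


def crc24q(data: bytes) -> int:
    """Bitwise CRC-24Q with an initial value of zero."""
    crc = 0
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= CRC24Q_POLY
    return crc & 0xFFFFFF


def rtcm_frame(payload: bytes = b"") -> bytes:
    """Wrap a payload (possibly empty) in preamble, length, and CRC."""
    if len(payload) > 1023:
        raise ValueError("payload exceeds the ten-bit length field")
    header = bytes((RTCM_PREAMBLE, (len(payload) >> 8) & 0x03, len(payload) & 0xFF))
    body = header + payload
    return body + crc24q(body).to_bytes(3, "big")


def rtcm_parse(frame: bytes) -> Optional[Tuple[Optional[int], bytes]]:
    """Return (message number, payload), or None if the frame is invalid."""
    if len(frame) < 6 or frame[0] != RTCM_PREAMBLE:
        return None
    length = ((frame[1] & 0x03) << 8) | frame[2]
    if len(frame) != 6 + length:
        return None
    if crc24q(frame) != 0:  # the CRC over a frame plus its own CRC is zero
        return None
    payload = frame[3:3 + length]
    number = ((payload[0] << 4) | (payload[1] >> 4)) if length >= 2 else None
    return number, payload


# A frame with no payload at all is just six bytes long.
empty = rtcm_frame()

# A made-up payload whose first twelve bits encode message number 1005.
sample = rtcm_frame(bytes((0x3E, 0xD0, 0x00, 0x00)))
```

The bitwise CRC is slow but short; a production implementation would more likely use a lookup table.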

Theory of Operation

This is the architecture into which Tumbleweed has evolved.


The Base is a Linux/GNU host that runs gpstool, a process which communicates with a USB-attached ZED-F9P module. Following a successful period of surveying in, the ZED-F9P establishes its fixed location to a configured level of accuracy. It then begins to transmit RTCM corrections to the host via USB. gpstool forwards the RTCM corrections via User Datagram Protocol (UDP) to the RTK Router.

The Router runs rtktool, a process which listens on a well known UDP port for incoming RTCM messages. These messages can either be corrections from the Base, or empty messages that serve as keep alive messages from one or more Rovers.

When the Router receives its first RTCM correction from a Base, it registers the IP address and UDP port number for that Base as the sole source for RTCM corrections. Any subsequent corrections from other Bases are ignored until the original Base quits transmitting corrections and its registration times out. Then the next Base whose corrections are received has its IP address and UDP port number registered as the source of RTCM corrections.

The Rover periodically sends an RTCM message with no payload to the Router via UDP as a keep alive. These keep alives are sent frequently enough to prevent the Router from aging out the Rover's IP address and UDP port number from its cache. The Rover receives RTCM corrections via UDP, addressed by the Router to the IP address and port number in its cache. gpstool forwards the RTCM corrections to the ZED-F9P via USB.

When the Router receives an empty RTCM message via UDP from a Rover whose IP address and UDP port number it does not already have in its cache, it adds that information to the cache. When the Router receives an RTCM correction from the one and only Base, it forwards that correction to any and all Rovers in its cache via UDP. If the Router does not receive a keep alive from a Rover within a time out period, the information for that Rover is removed from the cache.
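
The Router's bookkeeping, as described in the last few paragraphs, can be modeled in a few lines. The Python sketch below leaves out the UDP sockets entirely and keeps only the decision logic: register the first Base, ignore other Bases until that registration times out, cache Rovers that send keep alives, and age out silent Rovers. It is not rtktool. The class name, the has_payload flag, the single shared timeout, and the addresses are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Address = Tuple[str, int]  # (IP address, UDP port)


class Router:
    """Decision logic only: who is the Base, and which Rovers get corrections."""

    def __init__(self, timeout_s: float = 30.0) -> None:
        self.timeout_s = timeout_s
        self.base: Optional[Address] = None
        self.base_seen = 0.0
        self.rovers: Dict[Address, float] = {}  # address -> time of last keep alive

    def expire(self, now: float) -> None:
        """Drop a silent Base registration and any silent Rovers."""
        if self.base is not None and now - self.base_seen > self.timeout_s:
            self.base = None
        self.rovers = {address: seen for address, seen in self.rovers.items()
                       if now - seen <= self.timeout_s}

    def receive(self, sender: Address, has_payload: bool, now: float) -> List[Address]:
        """Handle one datagram; return the addresses it should be forwarded to."""
        self.expire(now)
        if not has_payload:
            self.rovers[sender] = now  # keep alive: add or refresh the Rover
            return []
        if self.base is None:
            self.base = sender  # first correction seen registers this Base
        if sender != self.base:
            return []  # some other Base: ignore it
        self.base_seen = now
        return list(self.rovers)


# A made-up sequence of events, with time in seconds.
router = Router(timeout_s=30.0)
base_a, base_b = ("192.0.2.10", 5000), ("192.0.2.11", 5000)
rover = ("203.0.113.7", 40000)

events = [
    router.receive(rover, False, 0.0),    # Rover keep alive is cached
    router.receive(base_a, True, 1.0),    # first Base registers and is forwarded
    router.receive(base_b, True, 2.0),    # second Base is ignored
    router.receive(rover, False, 20.0),   # keep alive refreshes the Rover
    router.receive(base_a, True, 25.0),   # still forwarded to the Rover
    router.receive(base_b, True, 60.0),   # both timed out; second Base registers
]
```

Because a Rover is keyed by its address and port, one that reappears from a new address is simply cached as a new Rover.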

In my implementation at the Palatial Overclock Estate, the Base sits on my home WiFi network, communicating wirelessly with the Router. The Router has a direct wired connection to my home WiFi access point/router, and forwards RTCM corrections to Rovers via our home internet service provider. Rovers in the field use a USB-attached cellular modem, and send keep alives and receive corrections via my LTE mobile provider.

The architecture is agnostic as to the communication mechanism, as long as it supports UDP/IP.

Useful variations in the architecture are possible.
This simpler scheme combines the Base and the Router by simply running gpstool and rtktool on the same Linux/GNU host. I have used this approach, but it didn't turn out to be convenient for my physical setup, which will be shown below.


There are a few details of which anyone trying to duplicate this setup should be aware.

The use of UDP instead of the guaranteed reliable Transmission Control Protocol (TCP) is important. As I discussed in the article Better Never Than Late, real-time applications like this become dysfunctional when messages arrive too late. RTCM corrections that arrive too late are useless. Not only do they have to be discarded by the receiver if they arrive after their "expiration date", the presence of useless messages in the pipeline delays the more applicable messages that are waiting behind the clog, often making them too late to be useful.

I routinely observe dropped messages, and occasionally out-of-order messages, particularly on the relatively complex Router to Rover link. When I initially did testing using TCP with Hazer prior to Tumbleweed, clogging of the data path and delayed messages were also routinely observed.

The Base prepends an unsigned thirty-two bit sequence number to its outgoing RTCM messages, as does the Rover. This allows the Router and the Rover to discard out of order messages, and to detect when messages have been dropped.
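
Here is one way such a sequence number check could work, sketched in Python. The network byte order of the prefix and the handling of wraparound at 2^32 are my assumptions for the example, not a description of what gpstool and rtktool actually do. The names are hypothetical.

```python
import struct
from typing import Optional, Tuple

MODULUS = 1 << 32


def wrap_datagram(sequence: int, rtcm: bytes) -> bytes:
    """Prefix an RTCM message with a 32-bit sequence number (byte order assumed)."""
    return struct.pack(">I", sequence % MODULUS) + rtcm


def unwrap_datagram(datagram: bytes) -> Tuple[int, bytes]:
    """Split a datagram into its sequence number and the RTCM message."""
    (sequence,) = struct.unpack_from(">I", datagram)
    return sequence, datagram[4:]


class SequenceChecker:
    """Rejects old datagrams and counts the ones that never arrived."""

    def __init__(self) -> None:
        self.expected: Optional[int] = None
        self.dropped = 0

    def accept(self, sequence: int) -> bool:
        """Return True if the datagram is new enough to be used."""
        if self.expected is None:
            self.expected = (sequence + 1) % MODULUS  # first datagram sets the baseline
            return True
        gap = (sequence - self.expected) % MODULUS
        if gap >= MODULUS // 2:
            return False  # behind what we expected: out of order or duplicate
        self.dropped += gap  # anything skipped over counts as dropped
        self.expected = (sequence + 1) % MODULUS
        return True


# Made-up arrival order: 0, 1, 3, then a late 2.
checker = SequenceChecker()
results = [checker.accept(number) for number in (0, 1, 3, 2)]
```

Treating the difference modulo 2^32 means the check keeps working when the counter wraps from 4294967295 back to zero.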

Nevertheless, the use of UDP is problematic. There is no authentication between the Base and Router, or between the Router and Rover. This means Denial of Service (DoS) attacks, spoofing, or other mischief are trivially possible. There is no encryption either, which could reveal sensitive location information.

However, usable encrypted UDP is an unsolved problem in my opinion. The obvious solution, Datagram Transport Layer Security (DTLS), works in part by implementing packet ordering and retransmission, effectively providing TCP-like services on top of UDP. This defeats the purpose of using UDP in the first place, bringing with it all the real-time problems of TCP.

Authentication and encryption are issues I am still pondering. I am frankly hoping someone else eventually solves this problem for the general UDP case.

The Router is necessary because none of the Linux/GNU hosts in Tumbleweed - Base, Router, or Rover - have static IP addresses. As is typical, my internet service provider dynamically assigns an IP address via Dynamic Host Configuration Protocol (DHCP) to my home WiFi access point/router. IP packets to and from hosts on my home network, like the Base and the Router, are routed through a firewall that does Network Address Translation (NAT).

Similarly, the Rover is dynamically assigned an IP address via DHCP by my LTE mobile provider. To further complicate matters, testing in the field has shown that the Rover's IP address (or sometimes just its UDP port number), once established, can be dropped and a new, different, one assigned, by my mobile provider, perhaps as a result of cell site handover; this happens even when the Rover is stationary.

I subscribe to a Dynamic DNS (DDNS) service. My home access point/router supports DDNS directly. When it boots and is assigned an IP address by my ISP via DHCP, the device automatically notifies the DDNS service of this fact, and the DDNS service then updates the global DNS database so that a fixed internet domain name points to this address. This, plus a firewall rule configured into my access point/router, allows me to use a fixed domain name to point to the RTK Router when I start up a Rover. (The Base simply uses a local static IP address for the Router that is only valid behind the firewall.)

If the Rover's IP address and/or UDP port changes while it is being used in the field and receiving corrections, the next time it sends a keep alive to the Router, its IP address and UDP port number will be cached by rtktool as a new Rover, and it will receive the next RTCM correction using this new information. The entry with the old IP address and UDP port number will eventually become stale and be removed from the cache by rtktool. In the meantime, some RTCM corrections may be sent to the old address, but will presumably be dropped by the network.

(Another problem with the lack of authentication and encryption is that those orphaned RTCM corrections can actually be delivered to some unrelated host that has an application that just happens to be listening to the same UDP port. Other distributed systems I've helped develop do not immediately reuse dynamically assigned IP addresses for exactly this reason.)

Practical Differential Geolocation

This is the Tumbleweed setup I'm using today. It took me a long time to arrive at this specific setup, as I related in Improvisational Engineering. Some of this will convince you that Mrs. Overclock is a remarkably understanding woman.


This is the fixed Base station running gpstool. It is hidden in a drawer of a narrow dresser tucked inconspicuously into the corner of our living room. The host is a Raspberry Pi 3B+ Single Board Computer (SBC) running Raspbian, a Linux/GNU distro derived from Debian. The SBC is inside a plastic enclosure with a fan. A SparkFun GPS-RTK2 board with a u-blox ZED-F9P module is connected to the host via USB. The Base is plugged into an Uninterruptible Power Supply (UPS) hidden in another drawer below it.


This is the survey-grade multi band GNSS active antenna. It is mounted on a wall mount intended for a security camera in a skylight above our kitchen. The skylight is very near the peak of that section of the roof, so the antenna has a good view of the sky. A small coaxial cable snakes down from the antenna and crosses a low wall from the kitchen into the living room to enter the dresser adjacent to the wall from the rear.

Screen Shot 2020-04-03 at 10.31.13 AM

The Base runs headless - that is, without a monitor, keyboard, or mouse - but the output from gpstool can be viewed on demand when using ssh to log into the host across our home network. Here, gpstool shows that the ZED-F9P self-reported a horizontal position accuracy of 0.0204 meters (a little less than an inch) and a vertical position accuracy of 0.0144 meters (a little more than half an inch). It used twenty-five different satellites from all four GNSS constellations (GPS a.k.a. NAVSTAR, GLONASS, Galileo, and Beidou a.k.a. COMPASS) for its position solution. On at least one occasion since gpstool was last started, the ZED-F9P was able to use a maximum of thirty-two satellites for a solution.


This is the Router running rtktool. It is a similar Raspberry Pi 3B+ SBC running Raspbian. It has no special hardware. You can see the plastic enclosure with the fan running, but not much else. It is tucked into a shelf of our A/V cabinet in our family room (right where the cable from our ISP enters the house) where it is directly connected via a CAT5 cable (visible as the yellow RJ45 connector) to our home WiFi access point/IP router in the same cabinet. Like much of the stuff in our A/V cabinet, the Router is powered via a UPS.


This is a Rover running gpstool. It is a CEED pi-top [3] laptop shell. The pi-top [3] looks like a laptop, with a display, keyboard, and trackpad, but it has no internal processor, memory, or storage. Attached to the Rover is another survey-grade multi band GNSS antenna. The antenna is mounted on a selfie-stick intended for small cameras and mobile phones.


Sliding the keyboard and trackpad bezel down reveals the guts of the pi-top [3]. It has a glue board, heat sink, and system connector that accommodates a Raspberry Pi 3B+ SBC. (The SBC is hidden under the silver heat sink/system connector.) On the right, connected via a USB port on the glue board, is another SparkFun GPS-RTK2 board with the ZED-F9P module. The board is attached magnetically to an accessory rail. All of this fits, hidden, underneath the keyboard and trackpad bezel when it is closed.


On the rear of the pi-top [3] to the left is an SMA connector to which the GNSS antenna is attached. (I drilled a hole for that.) On the right on an external USB port is a NovaTel USB730L global LTE modem. This device enumerates to its host as an Ethernet dongle and self-configures by acting as a DHCP server to the host; all of the mobile network functions are hidden.


This is the Rover in the field, peeking out from underneath a laptop sunshade, geolocating over U.S. National Geodetic Survey (NGS) marker KK1446 near my neighborhood.

Screen Shot 2020-05-20 at 2.08.55 PM

During this same field test, gpstool indicates that the ZED-F9P self-reported a horizontal accuracy of 0.0141 meters (a little over half an inch) and a vertical accuracy of 0.0100 meters (less than half an inch). It was using twenty-eight different satellites from four different GNSS constellations for its position solution, and had used twenty-nine satellites at some time since gpstool had started.


Although the Raspberry Pi is my go-to SBC for projects like this, the software isn't married to it. Here is a second Rover that I stood up. It is an ancient HP Mini 110 netbook running Linux Mint, another Linux/GNU distro based on Debian. This tiny laptop has a 32-bit Intel Atom N270 i686-class processor. Plugged into a USB port on the left, you can see another USB730L LTE modem. Plugged into a USB port on the right, inside a 3D printed enclosure, is an Ardusimple SimpleRTK2B board which also has a ZED-F9P module. Attached to the GNSS receiver is a helical GNSS antenna. This laptop, which a friend of mine accurately described as "dirt cheap and dead slow", is running gpstool and differentially geolocating like a boss, the u-blox module doing all of the heavy lifting.


One of the challenges I had when I built my own NTP server with a cesium atomic clock was: how do I test a device that may be far more accurate/precise than any tool I have with which to test it? I feel like I more or less solved that problem for the clock, at least to my own satisfaction. What kind of real-world results can I expect from Tumbleweed? How will I know that it even works?

That is a topic for another article.

Updated (2020-09-21)

I recently re-ran the long term survey-in of my Base station. The survey-in took a little over eight days to arrive at my desired mean accuracy of 2.5 centimeters or a little less than an inch. Here is a graph of the mean accuracy over the survey-in duration in seconds.

Screen Shot 2020-08-25 at 5.07.22 PM

This is pretty much what I expected: the mean accuracy converges with a long tail.
