Thursday, November 19, 2020

Is the Virtuous Cycle really that virtuous?

In October 2020 I wrote a longish article titled Layers about the layered implementation that emerged organically from the fifteen years (so far) of work I put into my Diminuto systems programming library that is the underlying platform for much of my C-based work. After thinking more about the "virtuous cycle" I described in that article - where feedback from actual use of Diminuto lead to its evolution and expansion and then subsequent use in even more projects (including some shipping products) - I added an afterword. Upon even more reflection, the afterword got longer and longer. Eventually I decided it merited its own short blog article. This is it.

I wonder from time to time whether I'm the only person still working in C. Then I see the TIOBE Index where C is the #1 most used programming language. I find this remarkable, since my years of PDP-11 experience eons ago lead me to still think of C as a kind of structured PDP-11 assembler language.

C places around fifth in other surveys, which seems about right to me. Although I find Rust challenging to work in, I couldn't criticize anyone for using it for systems-level work. And maybe using Go instead of Java for application work. And I like Python too for non-performance-sensitive high level work.

One of the things I find really mysterious is that organizations who work in C, GNU, and Linux don't seem to put any effort into making C easier, simpler, more reliable, and more productive to use. That's exactly what using Diminuto does for me. I'm not trying to talk C shops into using Diminuto. But but I am arguing that they should do something like what I do with Diminuto. And they could do a lot worse than looking carefully at the layered implementation approach I talk about in this article.

I believe this is another case of the cognitive bias towards prioritizing the short term over the long term. Putting code that is likely to be commonly useful in other projects into a reusable library, along with unit tests, is a pattern that has worked very well for me. But in organizations larger than my tiny one-person company, it requires that the folks managing the project and holding the purse strings see the long term value in that. For most project managers there is no long term. And certainly nothing worth spending resources - schedule and budget - outside of their immediate project and silo.

It also requires enough experience to know what is likely to be more generally useful in the long term; I haven't always gotten that right myself. And once you have a library that is used in multiple projects, changes and maintenance on that library requires more diligence and incurs a higher price, since it affects multiple code bases. An API change that breaks legacy code requires careful consideration indeed to balance the cost of breakage against the cost of technical debt.

Line managers and developers are typically not incentivized to think long term. Agile processes do not encourage this either. I find it hard to argue against this. It's easy to get side-tracked and take your eye off of the immediate goal. Long term thinking is the job of senior architects and upper management. And as I like to say: if you don't survive in the short term, there is no long term. It's just that in my tiny corporation, I'm both the line developer and the senior architect. I have to pay attention to the tiny details and the big picture.

Balancing short term and long term needs is a hard problem.

Tuesday, October 27, 2020

Layers

 "Either you design a layered architecture, or you'll wish you had."
-- Ken Howard, Ph.D., Bell Labs

When Colorado got hit recently with an early winter storm bringing several inches of snow and 10┬║F temperatures before Halloween had even arrived, I found myself wearing four layers of clothing while shoveling my driveway. It gave me time to think about the software layers that developed organically in Diminuto,  my open source C library and framework that has been the basis of almost all of my personal projects, and some or all of which has found its way into several shipping products. It will become obvious I'm using the term layers here a little differently than you might have seen before when referring to software architecture or network protocol stacks.

History

My work on Diminuto ("tiny") started in earnest in early 2008, with portions of it borrowed from earlier work I did in C++ in 2005, and based on even older C code I wrote in 1994. I began with a physically large and relatively expensive Atmel development board with an ARMv4 processor, a chip that seems laughably underpowered compared to the current ARMv7 and ARMv8 offerings that are in your mobile phone. This board predated the Raspberry Pi (2012), and even the BeagleBoard (2008). Having spent years writing C and C++ code for one real-time operating system (RTOS) or another like VxWorks, I was experimenting with building a minimalist Linux-based system that could run comfortably on the (then) memory-limited ARM processor. My testbed consisted of:

  • a custom Linux kernel configured with only those features I really needed;
  • uClibc, a stripped down C library;
  • BusyBox, that combines a bunch of commonly used Linux/GNU utilities into a single binary executable; and
  • my C-based application.

(You can click on any image to see a larger version.)

Board Setup

Even though I was shooting for a tidy KonMari-style system that would make Marie Kondo proud, I still found it useful to organize some of my code into a library and set of header files. Even then it had occurred to me that I might reuse some of this stuff in another context one day.

Diminuto, and my work that used it, eventually migrated to more capable platforms, like the BeagleBoard.

Cascada

And then to the first Raspberry Pi.

image

Eventually I found myself using later, even more powerful Raspberry Pis, and that Single Board Computer (SBC), with Diminuto, running on its Debian-based Linux/GNU distro, became my standard deployment platform for a lot of later projects, like Roundhouse, my IPv6 testbed.

Untitled

I occasionally found myself porting Diminuto to more powerful platforms, like the Nvidia Jetson TK1 development board.

Nvidia Jetson TK1

Diminuto played a key role in even more exotic projects, like building Astrolabe, my own stratum-0 NTP server using a chip-scale cesium atomic clock (CSAC).

Astrolabe (O-2)

Over the span of twelve years, Diminuto continued to grow organically in a kind of positive feedback loop, as the C library and framework became more useful than my quest for a tiny embedded Linux system. I would work a gig to develop some commercial product, and encounter a software feature in either the Linux kernel or the GNU library that I thought was useful - like the run-time loadable user-space modules that are used by the Asterisk open source PBX - and I would fold support for that feature, along with unit and functional tests and sometimes command line tools, into Diminuto. Or I would use some or all of Diminuto in some commercial development, and find some things in my code that didn't work so well, so I would fix it. And I would continue to leverage that growing base of reusable code in my own personal work.

Today Diminuto amounts to over 56,000 lines of C and bash code distributed amongst almost 350 files, implementing 84 different features and 38 different command line utilities, with 97 unit test programs and 28 functional test programs. It forms the underpinnings for almost all of my Linux/GNU-based personal projects in C or C++, like Hazer, my project that explores technologies like Differential Global Navigation Satellite Systems (DGNSS) and Inertial Measurement Units (IMU).

(To give you another way to think about this evolution, decades ago I started out using the Source Code Control System (SCCS), and then the  Revision Control System (RCS), to manage my source code changes to the precursors to Diminuto. Then I migrated to Concurrent Versions System (CVS). Then to Subversion (SVN). And finally git, with the repository hosted on GitHub. There are pieces of Diminuto that came from that earliest work that have been managed by five different code management systems.)

Different people learn in different ways. I envy my friends and colleagues who can pick up a book on some new technology, read it, and claim to have learned something. (Liars.) I can only learn by doing. I'm a very hands-on student. (Which is why I also have git repositories of PythonGo, and Rust code.)

Working on Diminuto has been fundamental to my ability to learn not just new (to me) and evolving C, Linux, GNU, and POSIX features, but also to experiment with Application Program Interface (API) design. I've learned a lot about porting Diminuto between processor architectures (I use it almost daily on ARMv7, ARMv8, and Intel x86_64 machines), and between different versions of Linux and GNU, and even different operating systems and C libraries; in the past I've ported all or parts of it to uClibc, Cygwin, Android/Bionic, and MacOS/Darwin. It has given me a way to write useful code once, test it throughly, and then reuse it, either in my own projects or in products I've helped my clients develop. As much as I like being paid by the hour, there are more fulfilling things to do than to write the same underlying code over and over again.

Even though I have at times ported Diminuto to other platforms, it is not portable, nor is it intended to be. It is specific to Linux and GNU, and to the kinds of work that I am routinely called upon to do. I regression test changes only against relatively recent versions of the kernel and the C library.

As Diminuto evolved incrementally in its virtuous cycle, without really much of a long-term plan on my part, I none the less noticed that it had developed a layered architecture, driven by the needs of whatever I was using it for at the time. Maybe this was an accident, or maybe it's my forty-plus years of experience talking. But in any case, as my old friend and colleague at Bell Labs, Dr. Ken Howard, would surely remark, a layered architecture is a good thing.

Layers

Layer 0: Documentation

Like most software developers, I'm more interested in writing code than I am in writing documentation. But like most software developers of a certain level of experience, I have come to realize that I have to document my work, if not for others, then at least for myself.  Plus, documentation is its own reward: while writing comments I sometimes think: "Hey, wait a minute. That doesn't make any sense at all." It's like a one-person code review. Also, like other developers of a certain age, I've looked at code I wrote months or years ago and wondered: "What the heck does this even do? And WHY?"

Documentation for Diminuto (and for most of my other projects) comes in three flavors:

  • blog articles like the one you are reading now;
  • Wiki-style documentation like that in the Diminuto README; and
  • Doxygen comments embedded directly in the code.

Doxygen is a software tool that takes your source code containing comments written in Doxygen's markup language, and produces (with a little help from other tools) documentation in the form of HTML pages, traditional UNIX manual pages, and PDF files.

Doxygen supports comments for files, functions, types, structures, variables, constants, preprocessor symbols, pretty much anything you can do in C. (Doxygen supports other languages as well.) The PDF reference manual for Diminuto currently runs to 784 pages! Of course, the resulting documentation is only as good as the Doxygen comments from which it is generated.

My approach is to document each Diminuto feature with Doxygen comments in its .h header file - the part of Diminuto that is publicly accessible even in a binary installation - which defines the API for that feature. Doxygen comments may also appear in .c translation units to describe functions that are not part of the public API.

Screen Shot 2020-10-27 at 11.11.29 AM

Even if I never refer to the generated documentation, Doxygen provides a canonical comment format for documenting my code. I routinely use the Doxygen comments to remind myself how my own code works.

Layer 1: Types

As much as possible I use the standard pre-defined data types provided by GNU and Linux, like the following:

  • size_t and ssize_t defined in <stddef.h>;
  • fixed-width integer types like int32_t and uint64_t defined in <stdint.h>;
  • bool defined in <stdbool.h>; and
  • pid_t in <sys/types.h>.

But it quickly became clear that it would be a good thing to have some standard Diminuto types that could be used broadly across all feature implementations. The header file

"com/diag/diminuto/diminuto_types.h"

not only automatically includes the standard header files cited above, but also defines some types that are used in Diminuto, like:

  • diminuto_ticks_t that holds the common Diminuto unit of time;
  • diminuto_ipv4_t that holds an IP version 4 address;
  • diminuto_ipv6_t that holds an IP version 6 address; and
  • diminuto_port_t that holds an IP port number.

All the higher layers are based on these types.

(Why the long #include path above for the Diminuto header file? It's to make integration into other code bases easier. See Some Stuff That Has Worked For Me In C and C++.)

Layer 2: Logging

I wanted to write reusable C code that could be part of a command line tool, part of a system daemon, or even part of a loadable module that runs in kernel space. To do that, I had to abstract away the specific mechanism used to display error messages. The Diminuto logging API defined in the header file

"com/diag/diminuto/diminuto_log.h"

does that. It borrows its design from

  • the standard syslog(3) interface;
  • from the Linux kernel's printk interface;
  • from Android/Bionic's logging mechanism; and
  • even from non-C logging facilities like those I've used in Java.

Diminuto defines eight log levels, from highest to lowest severity:

  • emergency;
  • alert;
  • critical;
  • error;
  • warning;
  • notice;
  • information; and
  • debug.

Whether log messages are displayed or not is determined by an eight-bit log mask. The default value of the log mask is 0xfc, which enables all log messages except information and debug. The log mask can be set programmatically or from an environmental variable. As a special case, if the environmental variable, which is called COM_DIAG_DIMINUTO_LOG_MASK, has the value "~0", all log messages are enabled.

So far, this sounds pretty mundane. Here's the interesting part.

  • If the log function call is compiled as part of a kernel module, the Diminuto log functions automatically revert to using the kernel's printk logging mechanism; each severity levels is translated to the appropriate printk level; and their emission is controlled as usual by the value of /proc/sys/kernel/printk.
  • If the calling process is one whose parent is the init process a.k.a. process id 1 (this typically means the child process is a daemon that has been inherited by init), or if that process is the session leader (which typically means it has no controlling terminal on which to display a log message), the log function routes the formatted message to the standard syslog(3) API; its severity level is translated to the appropriate syslog(3) severity; and its emission can be controlled via the standard setlogmask(3). For most systems, this means the message will be passed to the syslog daemon (however that may be implemented) and written to a file such as /var/log/syslog.
  • In all other cases,  the formatted message is written to standard error.

This vastly simplified the handling of log messages. It made Diminuto code much more easily reusable in a broad variety of contexts. The same function, using Diminuto logging, can be used in a command line utility or in a daemon; either way, output messages (in particular, error messages) will be handled appropriately. Some functions can even be used in the context of a kernel-space device driver. But more importantly, it helps insure that error messages - and, as we will see below, a lot of useful context for debugging - are not lost. Furthermore, the emission of each Diminuto log message is serialized under the hood by a POSIX mutex, insuring that messages from different threads in the same process don't step on each other in the error stream.

Layer 3: Errors

The standard function perror(3), defined in <stdio.h>, prints an caller-defined error message on standard error along with a system-defined error message associated with  the global variable errno, which is typically set by failing system calls and C library functions. For example, the code snippet

errno = EINVAL;

perror("framistat");

where EINVAL is an integer constant defined in <errno.h>, displays the message

framistat: Invalid argument

on standard error.

The code snippet

errno = EINVAL;

diminuto_perror("framistat");

on the other hand displays

2020-10-27T19:28:02.889470Z "nickel" <EROR> [25416] {7f981766e740} foo.c@3: framistat: "Invalid argument" (22)

on standard error, or logs it to the system log, as appropriate. (It is displayed as one line; the rendering here folds it up.)

Because diminuto_perror goes through the normal Diminuto logging mechanism, the message includes the date and time in UTC with microsecond resolution, the host name of the computer, the severity level, the process identifier of the caller, the thread identifier of the caller, the name of the translation unit, the source line within the translation unit, and even the numeric value of the error. 

The diminuto_perror API is defined in the header file

"com/diag/diminuto/diminuto_log.h"

and is part of the Diminuto log feature.

Layer 4: Assertions

The standard assert(3) function, defined in <assert.h>, evaluates a caller-specified scaler expression, and if the result of that expression is false, it displays an error message on standard error and ends the execution of the caller using abort(3) defined in <stdlib.h>.

foo: foo.c:2: main: Assertion `1 == 2' failed.

Aborted (core dumped)

The diminuto_assert API that is defined in the header file

"com/diag/diminuto/diminuto_assert.h"

does all that too, but it emits the error message using the Diminuto log feature, so all that logging goodness applies here too.

2020-10-27T19:48:15.562081Z "nickel" <EROR> [25564] {7f5ed5040740} foo.c@2: diminuto_assert(1 == 2) FAILED! 0=""

Aborted (core dumped)

(It also prints the value of errno, just in case it's useful to the developer debugging this failure.)

This is a lot more useful than it might first appear. Truth is, most serious errors - like error returns from systems calls - hardly ever happen, and if they do, they aren't typically realistically recoverable. A bunch of "error recovery" code often just serves to clutter up the source code, complicate testing and debugging, and add to the cognitive load of the developer. Most of the time, the best thing to do is to dump core and exit, and leave the recovery to higher minds - like the user sitting at the terminal, or maybe systemd. Using diminuto_assert can really clean up your code by reducing both error handling and error messaging to one function call.

Here's a tiny code snippet from the gpstool utility in my Hazer GPS project, which is built on top of Diminuto.

Screen Shot 2020-10-27 at 1.55.00 PM

If the standard malloc(3) function fails to allocate memory, the entire program just goes toes up. Not a lot of drama, just an error message and a core dump, without a lot of other error handling code cluttering up the translation unit.

Layer 5: Unit Tests

I've used a lot of different unit test frameworks over the years for C, C++, Java, and Python development. The Diminuto unit tests didn't need a lot of complex support for mocking and what not. I wanted something dirt simple but functional. Most of all, I didn't want users of Diminuto to have to install an entirely different package to support the unit tests. So an almost trivial unit test frame became part of Diminuto itself. (And before you ask: yes, there is a unit test in Diminuto for the Diminuto unit test framework.)

Here is a code snippet that is one simple unit test from the suite that exercises the Diminuto InterProcess Communication (IPC) feature.

Screen Shot 2020-10-27 at 2.23.07 PM

The unit test framework is nothing more than a series of C preprocessor macros. Here you can see that a new test is declared, a mixture of assertions (where failures are fatal) and expectations (where failures are logged and counted) are made, and the outcome of the test is displayed. If the unit test succeeds, the following is displayed.

2020-10-27T20:27:08.354153Z "nickel" <NOTE> [25741] {7fbabe66a000} tst/unittest-ipc.c@237: TEST: test=8

2020-10-27T20:27:08.354193Z "nickel" <NOTE> [25741] {7fbabe66a000} tst/unittest-ipc.c@246: STATUS: test=8 errors=0 total=0 SUCCESS.

The unit test framework API is defined in the header file

"com/diag/diminuto/diminuto_unittest.h"

and makes use of the Diminuto log feature.

Layer 6: Wrappers

Most of the features in Diminuto have fairly complex implementations. Some don't. Some are not much more than wrappers around system calls and C library calls. But for something that is used often, a little wrapping can lead to fewer surprises (contrary to the usual case with wrapping).

Consider the code snippet below, from the Hazer gpstool utility. 

Screen Shot 2020-10-27 at 4.14.09 PM

You can see if the standard fopen(3) call fails, the error leg calls both diminuto_perror, which prints the name of the file that failed to open and also the error string corresponding to the errno set by fopen(3), and diminuto_assert, which core dumps the application.

Compare that to this code snippet, also from the Hazer gpstool utility,

Screen Shot 2020-10-27 at 4.20.16 PM

in which only diminuto_assert is called.

Why the difference? Because the failing function in the second example is itself part of Diminuto. It already called diminuto_perror before it returned to the application. Even just having simple wrappers around frequently used system calls and library calls that automatically print an error message - using the smart Diminuto log-based features - reduces the amount of clutter in the source code of the application.

Of course, there maybe circumstances which the application should to print its own error message with more specific contextual information. But most of the time, the error emitted by the Diminuto code is sufficient.

Layer 7: Models

Why do some Diminuto features have complex implementations? Because they implement a specific model or design pattern of using a particular system call or C library facility. That model may have complex behavior. Typically the model is based on forty-plus years of hard won experience on my part on doing exactly this kind of thing. Or sometimes just on my opinion of the right way to do things. Or on horrifically tragic mistakes I've made in the past.

Here's an example that I already talked about in Timer Threads. Diminuto has a timer feature whose API is defined in

"com/diag/diminuto/diminuto_timers.h"

and which is based on POSIX timers. There is a potential race condition in POSIX timers between the application that starts (arms in POSIX-speak) and stops (disarms) a timer and the callback function that is invoked by the timer. POSIX timer callbacks run in a thread-like context - which is just a man page way of saying it is in fact a POSIX thread, separate from the application thread. When the application stops the timer, the callback may already be running, especially with a periodic timer, and on a multi-core (which these days is virtually any) processor. If the application releases resources - for example,  frees memory, or closes files - that it shares with the timer callback function, that function can have those resources pulled out from under it. Wackiness ensues.

Diminuto incorporates a POSIX mutex and condition in every Diminuto timer. When the application calls the Diminuto function to stop the timer, the function calls the POSIX function to stop the timer (not shown below), enters a critical section, indicates to the timer callback that it is placing it in a disarmed state, then waits for the timer callback to acknowledge it.

Screen Shot 2020-10-19 at 07.43.00

When the Diminuto timer callback (a private proxy function that invokes the user-specified callback) has finished its current business, it enters a critical section, checks to see if it has been disarmed, and if so, acknowledges that fact and signals the waiting application.

Screen Shot 2020-10-21 at 11.24.35 AM

This prevents the race condition. This is some fairly complicated stuff, and it's all transparent to the application.

What if you don't want to use that model? Fine. Don't use Diminuto in that particular instance. There's nothing keeping you from coding to the native Linux or GNU API. Hey, knock yourself out. You can still make use of the Diminuto log facility. You can make your own horrifically tragic mistakes. In these here parts, we call that a learning experience.

Layer 8: Functional Tests

Besides unit tests, Diminuto also includes a host of functional tests. These form the last line of defense against some boneheaded blunder on my part. They also provide, along with Diminuto's unit tests and command line tools, some really useful examples (I use them myself) of how to use Diminuto in the real world.

Some of the functional tests use special test fixtures I fabricated just for this purpose, to test hardware-centric features like General Purpose Input/Output (GPIO) pin input and output, serial port reading and writing, and pulse width modulation (PWM).

Untitled

When I make a change to any of this hardware-related code, or its dependencies, or just when I'm feeling lonely, I get the text fixture out, hook everything up, and run some functional tests that do things like light up LEDs in response to button presses, or adjust the illumination of an LED to achieve some specific value of lux on a light sensor.

The functional tests and command line tools form the outermost layer of the Diminuto framework.

End Matter

Diminuto has been enormously useful to me (and, sometimes, to my clients) for a wide variety of reasons, not the least of which are the lessons I carried away from the layered architecture which emerged organically from its development.

If you want to see non-trivial examples of Diminuto in action, checkout the gpstool and rtktool applications that are part of my Hazer GPS/GNSS project. They each make good use of a variety of Diminuto features. gpstoolrtktool, and their Diminuto underpinnings, are in constant use, 24x7, as part of my Differential GNSS work.

Supporting Material

Chip Overclock, "Being Evidence-Based Using the sizeof Operator", 2015-03, <https://coverclock.blogspot.com/2015/03/being-evidence-based-using-sizeof.html>

Chip Overclock, "Buried Treasure", 2017-01, <https://coverclock.blogspot.com/2017/01/buried-treasure.html>

Chip Overclock, "Some Stuff That Has Worked For Me In C", 2017-04, <https://coverclock.blogspot.com/2017/04/some-stuff-that-has-worked-for-me-in-c.html>

Chip Overclock, "renameat2(2)", 2018-04, <https://coverclock.blogspot.com/2018/04/renameat22.html>

Chip Overclock, "When The Silicon Meets The Road", 2018-07, <https://coverclock.blogspot.com/2018/07/when-silicon-meets-road.html>

Chip Overclock, "When Learning By Doing Goes To Eleven", 2020-03, <https://coverclock.blogspot.com/2020/03/when-learning-by-doing-goes-to-eleven.html>

Chip Overclock, "Headless", 2020-06, <https://coverclock.blogspot.com/2020/06/headless.html>

Chip Overclock, "Clock Time", 2020-10, <https://coverclock.blogspot.com/2020/10/clock-time.html>

Chip Overclock, "Timer Threads", 2020-10, <https://coverclock.blogspot.com/2020/10/timer-threads.html>

Afterword (updated 2020-11-19)

This afterword got so long I promoted it to its own blog article: Is the Virtuous Cycle really that virtuous?

Wednesday, October 21, 2020

Timer Threads

Way back in 2009, I wrote some code under Linux/GNU to use timers. Back then, it was the setitimer(2) feature, originally from BSD. But that approach has a flaw: real-time clock adjustments, for example via Network Time Protocol (NTP), can jitter the timer period. The newer timer_create(2) et al. POSIX timers can be configured to use a monotonic clock that makes no such adjustment.

POSIX timers can also be configured to either send a kill(2)-type signal, like SIGALRM, just like the BSD timers, or to invoke a callback functionWhile refactoring my code to use POSIX timers while maintaining the original signal-based API, I added the ability to also use timers with callback functions. I functionally tested the function callback feature by using it to do Pulse Width Modulation (PWM) on a General Purpose Input/Output (GPIO) pin. The functional test ran fine for hours... until it segfaulted! Pointing the GNU Debugger (GDB) at the core dump showed the callback had used a pointer that it shared with the application, and that pointer was NULL.

It only took me a few moments to realize it was a race condition: the callback runs in the context of its own POSIX thread. When the main thread in the application disarms (stops) the timer, then releases resources associated with it, the timer callback can be running, especially on a multicore processor. The timer callback had the resources pulled out from under it. I added a POSIX mutex and condition so that the main thread waits until the callback thread acknowledges that it's being stopped.

Here's the code snippet from the timer stop function in Diminuto, a framework of C code useful for the kinds of work I am typically called upon to do. (You can click on an image to see a larger version.) The stop timer function informs the timer callback that it is being disarmed, and waits for the callback to acknowledge that fact, which it will do by transitioning to the idle state and condition signaling the application. The POSIX mutex and condition objects are embedded inside the Diminuto timer object.

Screen Shot 2020-10-19 at 07.43.00

Here's a code snippet from the callback function. Note that non-periodic (a.k.a. one-shot) timers place themselves in the idle state as soon as their callback completes.

Screen Shot 2020-10-21 at 11.24.35 AM

As soon as I saw the backtrace in GDB, I knew I was going to learn something useful. That made for a good day.