Monday, January 05, 2009

House M.D. and Evidence-Based Troubleshooting

Mrs. Overclock (a.k.a. Dr. Overclock, Medicine Woman) and I got sucked into the television series House M.D. over the holidays. The USA cable network was running a marathon of House reruns. We haven't watched the first-run episodes (which run on the Fox cable network) except mostly by accident, and the last thing we need is to watch more television. But Mrs. Overclock is understandably interested in medical dramas (and, unfettered by the need to fill an hour of air time, will often beat House and his diagnostic team to the solution). I was a little surprised to find that House tickles certain centers of my brain, too.

If you're not familiar with the show, Gregory House (played by British comic actor Hugh Laurie in an impressive dramatic turn) leads a diagnostic group of physicians at a hospital in New Jersey. Each week they to try to cure some critically ill patient with a mysterious illness. They don't always succeed. It's a tribute to evidence based medicine. They have to choose tests that won't kill the patient, without really knowing what's wrong with the patient. They keep eliminating possibilities, and racking their brains to think up new ones. Frequently there is more than one problem, and they interact in strange ways. And always, the clock is ticking.

It's exactly like troubleshooting large, complex, distributed, real-time, high-availability, production systems, like a PBX.

I've done my share of field support of such systems. I've lived in a small room with several other developers, clustered around a workstation looking at a remote customer system. And I've gotten on a plane with a laptop and a protocol analyzer in my checked luggage. It's just like House. You keep brainstorming ideas of what could be going wrong. You try to think of tests to isolate the problem, to indict a particular hardware or software component, without crashing the production system. Your pour through log files, frequently inventing filtering tools on the fly, to make sense of the fire hose of information, to eliminate possibilities. You have to always keep track of what you know you know, of what you know you don't know, always being prepared to discard a much loved hypothesis in the face of new evidence. And always, the clock is ticking.

Frequently when I do this kind of work, and particularly when I am doing development for these kinds of systems, I have to keep reminding myself that not only are careers at stake, but lives. Customers may depend on the my software to dial 911 when someone has chest pain. Even in a non-safety critical situation, rebooting to see if it fixes the problem isn't an option if it means dropping hundreds of in progress calls. House's team can't afford to take a cavalier attitude, and neither could I.

I found a lot to relate to, watching House M.D. If you want to know what developing for high-availability systems is like, you would be well advised to check it out.

Saturday, January 03, 2009

Abstraction in C using Sources and Sinks

The problem with telling people that I do embedded software development is that no one really knows what that means. Including me. I've worked on embedded systems that were so resource constrained that they were written completely in assembly code, had no operating system, and in order to fix a bug we had to mine the code base to find instructions we could eliminate in order to make room. I've worked on embedded systems that had a full blown multi-user Linux system as their operating system, and the code base was hundreds of thousands of lines of bleeding-edge C++ that looked like the poster child for the use of templates and dynamic memory allocation.

Somewhere in between there is a sweet spot in which the application is written in C, but could still benefit from an object oriented design and implementation. OO scares a lot of the old school embedded developers (you know who you are). Yet my introduction to the use of those classics of OO - abstraction, encapsulation, and inheritance - was in 1974 when I was programming in assembly and using IBM's OS/360 Input/Ouput Subsystem, which was written in the late 1960s. Many years later when I was studying how C++ worked under the hood, the idea of the virtual pointer and the virtual table seemed quite familiar. I'd seen it all before in the OS/360 IOS. You get as old as I am, you begin to wonder if you're ever going to see anything that's truly innovative.

One of the most useful applications of OO in embedded systems, in my not-so-humble opinion, is that of abstracting out I/O: the ability to write applications that are agnostic as to from where their input comes and to where their output goes. This is exactly what the IOS and its myriad of pointer-filled control blocks (what C programmers would call structures) accomplished.

I used this idea for one of my clients when I was hired to implement a wide assortment of bitmap-graphic processing functions for an existing C-based product whose digital hardware could fit in your shirt pocket. I quickly realized that a lot of the code I had to write - like a deflate-based decompression algorithm, or a decoder for Portable Network Graphics (PNG) files - would be used in many places in the system. I also realized that the data for these algorithms could be buffered in memory, could come streaming in from a serial, Ethernet, or USB ports, or could be stored in a flash-based file system. I badly needed some abstraction for sources and sinks that I could implement on this tiny, non-POSIX platform.

As I implemented it, a Source is an abstract interface that requires the implementation to provide a read byte and a push byte method. The read produces a byte if it is available, End Of File (EOF) if the Source is empty and no byte would ever be available, and End of Data (EOD) if the byte isn't available now but might be in the future. The push pushes a single byte back into the Source where it is guaranteed to be the next byte produced by a subsequent read. (This vastly simplifies any algorithm that can be described by a grammar with one-character look-ahead, for example LL(1) or LALR(1). Such grammars are extremely useful ways of describing any algorithm that does parsing or data transformation or indeed any state machine. Both the deflate and PNG applications mentioned above can be described as grammars, and from there the implementation in code is almost a mechanical process. But that is an article for another day.)

int readSource(Source * that);
int pushSource(Source * that, char data);

A Sink is an abstract interface that requires the implementation to provide a write byte method. The write consumes the byte provided by the caller if it can, returns EOF if the Sink is full and no byte will ever be consumed, and EOD if the byte cannot be consumed now but might be in the future.

int writeSink(Sink * that, char data);

There are also a close method for both Sources and Sinks, which may or may not do anything, depending on the underlying implementation. No open method? Nope, because as we shall soon see the open is part of the implementation, not the abstraction.

int closeSource(Source * that);
int closeSink(Sink * that);

As you can see, these simple byte operations expect a pointer to a Source or a Sink structure (what C++ programmers would call an object). What these pointers actually point to depends on the implementation. But the application doesn't need to know this. It just takes whatever Source or Sink pointer you give it and implements the application (what my younger colleagues would call the business logic) in an I/O agnostic manner.

What might an application look like that uses Sources and Sinks?

Here is a trivial one that just copies all of the data from a Source to a Sink, until either the Source is empty or the Sink is full, without having any idea what the Source and Sink really is (and not doing much in the way of error checking).

size_t copy(Source * source, Sink * Sink) {
size_t total = 0;
int data;

while (!0) {
if ((data = readSource(source)) < 0) {
if (writeSink(sink, data) < 0) {
pushSource(source, data);

return total;

Here's a code snippet of a calling sequence that uses the function above to copy data from a file to an existing socket represented by an open file descriptor. It does so by providing the function with a concrete implementation of its required source and sink.

FileSource file;
DescriptorSink descriptor;
Source * source;
Sink * sink;
size_t total;

source = openFileSource(&file, "/home/coverclock/data");
sink = openDescriptorSink(&descriptor, sock);
total = copy(source, sink);

What kind of Sources and Sinks might one implement?

A BufferSource implements a Source that produces its data from a fixed size memory buffer.

Source * openBufferSource(BufferSource * that, void * buffer, size_t size);

A BufferSink implements a Sink that consumes its data into a similar memory buffer.

Sink * openBufferSink(BufferSink * that, void * buffer, size_t size);

A FileSource produces data from a file in the file system (in Linux this could be just about anything, thanks to the /proc and /sys file systems).

Source * openFileSource(FileSource * that, const char * path);

A DescriptorSink consumes data to any file descriptor (like a TCP/IP socket).

Sink * openDescriptorSink(DescriptorSink * that, int fd);

A NullSink consumes all data without out any complaint and tosses it into the bit bucket.

Sink * openNullSink(NullSink * that);

A CompositeSource concatenates two Sources into a single Source.

Source * openCompositeSource(CompositeSource * that, Source * primary, Source * secondary);

An ExpanderSink writes each data byte to two different Sinks. (Yes, of course you can use another ExpanderSink as one or even both of the Sinks.)

Sink * openExpanderSink(ExpanderSink * that, Sink * primary, Sink * secondary);

A Fletcher8Sink computes a Fletcher 8-bit checksum as the Sink is written and automatically appends the checksum to the Sink when it is closed. (And before you ask, of course there is a Fletcher8Source that automatically verifies the data as it is read from the Source.)

Sink * openFletcher8Sink(Fletcher8Sink * that, Sink * primary);

A RingBuffer implements a circular buffer that exposes both a Source and a Sink interface.

RingBuffer * openRingBuffer(RingBuffer * that, void * buffer, size_t size);
Source * sourceRingBuffer(RingBuffer * that);
Sink * sinkRingBuffer(RingBuffer * that);

How are these Sources and Sinks implemented?

Easily. Most are a few lines of code, a few maybe a page. I'll be writing more about the implementation. But you can find the source code for these and other Sources and Sinks (and their unit tests) now in the
Digital Aggregates Concha distribution. Concha is a clean-room open-source implementation of the Source and Sink design pattern (which my younger colleagues will recognize as a form of dependency injection). It is licensed under the Desperado modified LGPL which allows static linking without any viral licensing implications.

Update (2011-02-21)

I improved the example slightly since the original publication of this article. Apologies as usual for the poor code formatting. (The Blogger editor doesn't respect leading spaces, even those explicitly coded as HTML special sequences, nor does it respect the pre HTML tag.)