Thursday, August 16, 2012

Big Things In Small Packages

In "All the Interesting Problems Are Scalability Problems" I remarked that the Mac Mini on which I am writing this article runs at 150 times the processor speed of the AVR microcontroller for which I was developing, but has 16,000 times the memory. This radical disparity led to some interesting design decisions and tradeoffs. That observation was reinforced -- in spades -- recently on a gig for which I was writing firmware in C for a PIC (for Peripheral Interface Controller) microcontroller.

Like the Atmel AVR ATmega2560 I was originally writing about, and typical of many other microcontrollers, the Microchip Technology Inc. PIC16F1823 is a Harvard architecture: the executable code and the data reside in two completely different memories. Instructions live in on-chip flash from which they are executed directly. Data lives in on-chip RAM. This particular PIC has an 8MHz instruction clock, so it executes an instruction every 125 nanoseconds. Long ago but within my memory (although I am so old I am basically a brain in a jar), that would have been considered impressively fast, considerably faster in fact than the original IBM PC. The PIC16F1823 has a scant two kilowords of flash, a word being equivalent to a single machine instruction. And ninety-six bytes of RAM.

Let me repeat that: ninety-six bytes; there is no K there. It has a 128-byte address space by virtue of its whopping seven-bit addressing. But thirty-two of those bytes are dedicated to registers used to configure and control its several I/O controllers, the PIC being a system on a chip (SoC). Just naming the controllers that I have written drivers for, the PIC provides eight- and sixteen-bit timers, a pulse-width modulation (PWM) generator, analog-to-digital converters (ADC), an asynchronous serial port, an I2C serial bus interface, and the usual general-purpose I/O (GPIO) pins. The package I am using has sixteen pins, of which it uses fourteen; the remaining two are unconnected and serve merely to hold the chip down to the printed circuit board.

How complex an application can you write in ninety-six bytes? Quite complex, as it turns out. One of the boards for which I've written code, all in C, has six interrupt service routines and can exhibit quite sophisticated behavior. Not bad for a processor with a unit price of about a buck fifty U.S., and quantity pricing under a dollar. But every single line of code I write requires some agonizing. Do I really need this variable? Does it really need to be two bytes? Can it be one byte? Can I do without it completely?

I say I'm writing in C, but it's a dialect of C specific to this device, one that provides capabilities beyond those of ANSI C or even the GNU enhancements. For example, there is a bit data type that, you guessed it, takes up a single bit. As you might expect under the circumstances, I use those whenever I can. When I recently wrote a function that really, really needed a four-byte integer, I nearly had a stroke. Just one of those variables takes up more than four percent of the entire available RAM.
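To make the arithmetic concrete, here is a minimal sketch in standard C that can be compiled on a desktop host. The compiler's bit type is non-standard, so the sketch substitutes bit masks packed into one byte as a portable stand-in; the flag names and helper functions are hypothetical and are not taken from the actual firmware.

```c
#include <stdint.h>

/* Usable data RAM on the part, per the figure quoted in the text. */
#define RAM_BYTES 96u

/* Up to eight one-bit flags share a single byte of RAM.
 * The flag names are hypothetical, for illustration only. */
typedef uint8_t flags_t;

enum {
    FLAG_TIMER_EXPIRED = 1u << 0,
    FLAG_RX_READY      = 1u << 1,
    FLAG_TX_BUSY       = 1u << 2
};

static inline void flag_set(flags_t *f, uint8_t mask)
{
    *f |= mask;
}

static inline void flag_clear(flags_t *f, uint8_t mask)
{
    *f &= (uint8_t)~mask;
}

static inline int flag_test(flags_t f, uint8_t mask)
{
    return (f & mask) != 0;
}

/* Share of RAM consumed by an object of the given size, in hundredths
 * of a percent. Integer math only, truncating, no floating point. */
static inline unsigned ram_share_x100(unsigned bytes)
{
    return (bytes * 10000u) / RAM_BYTES;
}
```

Calling ram_share_x100(sizeof(uint32_t)) yields 416, that is, about 4.16 percent of the ninety-six bytes, while eight flags packed this way cost a single byte between them.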

In "Hitting a Moving Target While Flying Solo" I talked about the challenges of using compilers and other tool chain elements for embedded targets where those elements were not nearly as widely used as those of more mainstream processors like the Intel Pentium. Those words came back to haunt me, as I was using the Hi-Tech PICC C compiler provided to me by the customer.

I was debugging my I2C state machine, which sure seemed to be doing impossible things. I was fortunate enough to have a debugger with which I could single step through the state machine as it was stimulated, and simultaneously watch its state variable change value. I was -- stunned is the only word for it -- to see the state machine, which was implemented as a C switch statement, entering the wrong case statements based on the value of the state variable. I assumed the debugger was lying to me. So I used a few precious bytes to instrument my code. Nope, the debugger was right: the C switch statement did not work correctly.

I came to find out that this is a known problem in the 9.81 compiler I was using, documented in the release notes for the 9.82 version. WTF? When's the last time you used a C compiler in which the switch statement didn't work? Ever? This is what I'm talking about.
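For readers who have not seen the idiom, a switch-based state machine with one byte of instrumentation might look something like the sketch below. The states, events, and the last_case trace variable are all hypothetical; this is not the actual I2C driver, only an illustration of the shape of the code and of the kind of cheap cross-check described above.

```c
#include <stdint.h>

/* Hypothetical states and events, for illustration only. */
typedef enum { ST_IDLE, ST_WAIT_ADDR, ST_RECEIVING, ST_DONE } state_t;
typedef enum { EV_START, EV_ADDR_MATCH, EV_DATA, EV_STOP } event_t;

/* One byte of instrumentation: records which case was actually
 * entered, so it can be compared against the state variable. */
static uint8_t last_case = 0xFF;

/* Advance the state machine by one event and return the new state. */
static state_t step(state_t state, event_t ev)
{
    switch (state) {
    case ST_IDLE:
        last_case = ST_IDLE;
        return (ev == EV_START) ? ST_WAIT_ADDR : ST_IDLE;
    case ST_WAIT_ADDR:
        last_case = ST_WAIT_ADDR;
        if (ev == EV_ADDR_MATCH) return ST_RECEIVING;
        if (ev == EV_STOP)       return ST_IDLE;
        return ST_WAIT_ADDR;
    case ST_RECEIVING:
        last_case = ST_RECEIVING;
        return (ev == EV_STOP) ? ST_DONE : ST_RECEIVING;
    case ST_DONE:
        last_case = ST_DONE;
        return (ev == EV_START) ? ST_WAIT_ADDR : ST_DONE;
    }
    return ST_IDLE; /* not reached for valid states */
}
```

With a correct compiler, last_case always equals the state that was passed in. A mismatch between the two is the symptom described above.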

However, the Hi-Tech PICC compiler is quite clever in other respects. Other processors I've used have a stack that languages like C and C++ use to push return addresses during function calls, to create a stack frame containing function parameters, and to allocate memory for automatic variables within the scope of a function.

The PIC16F1823 has a stack implemented in a memory that is separate from either program or data memory. It is used solely to push return addresses for functions and interrupt service routines. It has a fixed depth of sixteen words. The microcontroller has no data stack.

The Hi-Tech PICC compiler deals with this by performing a static call-tree analysis at link time, determining the maximum possible depth of function call and interrupt service routine nesting, and warning you if you exceed it. It also uses this information to allocate space for function parameters and automatic variables at fixed memory locations, just as if they were static variables. It uses the call-tree information to overlap these allocations such that there is no call path for which there is a memory use conflict. It is for this reason that C functions for this target cannot be reentrant, and cannot be called recursively.
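The overlapping scheme is easier to see with a small model. The sketch below runs on a desktop host and is emphatically not the Hi-Tech linker's actual algorithm; it only illustrates the idea. The five functions, their byte counts, and the call graph are made up, the graph must contain no cycles (which is to say no recursion), and interrupt service routines, which would need a separate call tree of their own, are ignored.

```c
#define NFUNCS 5
#define HW_STACK_DEPTH 16u /* return-address stack depth from the text */

/* Hypothetical program, for illustration only. */
enum { F_MAIN, F_READ_ADC, F_SEND, F_FORMAT, F_PUT_BYTE };

/* Bytes of parameters plus automatic variables for each function. */
static const unsigned locals[NFUNCS] = { 2, 3, 1, 4, 1 };

/* calls[i][j] is nonzero when function i calls function j. */
static const unsigned char calls[NFUNCS][NFUNCS] = {
    [F_MAIN]   = { [F_READ_ADC] = 1, [F_SEND] = 1 },
    [F_SEND]   = { [F_FORMAT] = 1, [F_PUT_BYTE] = 1 },
    [F_FORMAT] = { [F_PUT_BYTE] = 1 },
};

/* Lowest safe offset for f's fixed frame: it must start above the
 * frame of every function that can be active when f is called.
 * The recursion terminates because the call graph is acyclic. */
static unsigned frame_base(int f)
{
    unsigned base = 0;
    for (int caller = 0; caller < NFUNCS; caller++) {
        if (calls[caller][f]) {
            unsigned end = frame_base(caller) + locals[caller];
            if (end > base) base = end;
        }
    }
    return base;
}

/* Total RAM needed once frames on different call paths overlap. */
static unsigned overlay_bytes(void)
{
    unsigned total = 0;
    for (int f = 0; f < NFUNCS; f++) {
        unsigned end = frame_base(f) + locals[f];
        if (end > total) total = end;
    }
    return total;
}

/* Number of return addresses pushed on the longest chain ending at f. */
static unsigned nesting(int f)
{
    unsigned depth = 0;
    for (int caller = 0; caller < NFUNCS; caller++) {
        if (calls[caller][f]) {
            unsigned d = nesting(caller) + 1;
            if (d > depth) depth = d;
        }
    }
    return depth;
}

/* Deepest nesting anywhere in the program. */
static unsigned max_nesting(void)
{
    unsigned deepest = 0;
    for (int f = 0; f < NFUNCS; f++) {
        unsigned d = nesting(f);
        if (d > deepest) deepest = d;
    }
    return deepest;
}
```

In this made-up program the five frames total eleven bytes if laid end to end, but overlay_bytes() returns 8, because F_READ_ADC and F_SEND are never active at the same time and can share addresses. max_nesting() returns 3, comfortably under HW_STACK_DEPTH. Adding one more call edge can change both numbers, which is the trade-off in a nutshell.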

So whether you exceed your generous allotment of ninety-six bytes depends not only on how many static variables you have, but also on the exact pattern of function call nesting you implement. This creates a conflict in trade-offs: you are highly motivated to reduce all duplicated code to a separate function to save program space, as long as the code generated in doing so is shorter than the duplicated code. But you constantly run the risk of creating a function call path that cannot be supported by the compiler. This definitely results in a simpler-is-better approach to developing code for this target.

In "Welcome to the Major Leagues" I mentioned how hard it was to define embedded development exactly, and noted the surprising (to some, anyway) overlap between developing for the very small and the very large. This latest effort has been a great learning opportunity for me to add a few new tricks to my toolbox.