Monday, September 29, 2014

What You Don't Know Can Hurt You

Below is a little snippet of C code from [Krebbers 2014]. Peruse it and see if you can predict what two values it will print. It's only a handful of lines long. Go ahead, take your time. I'll wait.

#include <stdio.h>
void main() {
    int x;
    int y;
    y = (x = 3) + (x = 4); 
    printf("%d %d\n", x, y);
}

So let's compile it on my build server that's sitting a few feet away from me. It's a Dell x86_64 system with four 2.4GHz Intel cores running Ubuntu 14.04 with the 3.13 Linux kernel and the GNU 4.8.2 C compiler. It's old but still mighty.

coverclock@ubuntu:~/src/misc$ gcc -o foo foo.c

Good; no warnings, no errors.

coverclock@ubuntu:~/src/misc$ ./foo
4 8


This code isn't multi-threaded. It's barely single-threaded. In fact, the code snippet is so simple, it hardly qualifies as anything beyond the classic "Hello World!" program.

Here's the thing: you may have gotten completely different results, if you used a different compiler. Or a different version of the same compiler. Or maybe even different compiler options for optimization or debugging levels. As Mister Krebbers points out in [Krebbers 2014]:
By considering all possible execution orders, one would naively expect this program to print 4 7 or 3 7, depending on whether the assignment x = 3 or x = 4 is executed first. However, the sequence point restriction does not allow an object to be modified more than once (or being read after being modified) between two sequence points [ISO C, 6.5 p. 2]. A sequence point occurs for example at the end ; of a full expression, before a function call, and after the first operand of the conditional ? : operator [ISO C, Annex C]. Hence, both execution orders lead to a sequence point violation, and are thus illegal. As a result, the execution of this program exhibits undefined behavior, meaning it may do literally anything.
Okay, so maybe not a huge surprise to folks who have memorized the ISO C standard. Or who are tasked with debugging problematic code by occasionally resorting to looking at the assembler code. Using a symbolic JTAG debugger that monitors the program at the hardware level, I've seen the program counter single step backwards in a sequential piece of C code, as the debugger traced the execution path the processor took through the optimized machine code and then tried to correlate it to the original source.

This is why you don't write tricky C code, playing games like trying to smash as much stuff into a single statement as you can. Because it can defy any kind of rational analysis. Because it becomes a debugging nightmare for the developer tasked with current engineering who comes after you. Because its behavior may change with your next compiler update. Or when it's ported to a project using a different compiler suite altogether.

Because it can bite you in the ass.


R. Krebbers, "An Operational and Axiomatic Semantics for Non-determinism and Sequence Points in C", 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2014), January 2014

International Organization for Standardization, ISO/IEC 9899:2011: Programming Languages - C, ISO Working Group 14, 2012

Lambda The Ultimate, "An Operational and Axiomatic Semantics for Non-determinism and Sequence Points in C", September 2014

Saturday, September 20, 2014

The Very Big and the Very Small

In K. Asanovic et al., The Landscape of Parallel Computing Research: A View from Berkeley, EECS Department, U. C. Berkeley, UCB/EECS-2006-183, December 2006 (a paper I've cited here before), the authors, who include David Patterson (as in Patterson and Hennessy), remark
Note that there is a tension between embedded and high performance computing, which surfaced in many of our discussions. We argue that these two ends of the computing spectrum have more in common looking forward than they did in the past. [Page 4]
That's been my experience too, although perhaps for different reasons than the authors cite. I’ve made an excellent living flipping back and forth between the high performance and embedded domains. It turns out the skill sets are mostly the same. In particular, developers in both domains are constantly concerned about very low level details in the realm where software runs close to bare metal, and are always dealing with issues of real-time, asynchronicity, parallelism, and concurrency. These are relatively rare skills that are hard to come by for both the employer and the employee.

I was reminded of this as my tiny one-man company, Digital Aggregates Corporation, buys its fourth Android tablet to use as a development system. These tablets contain powerful multi-core ARM-based processors as well as other embedded microcontrollers and devices. And increasingly I am seeing the embedded and mobile device domains adopt technologies originally developed for large-scale systems, like Security Enhanced Linux (SELinux) and OS-level containerization like Linux Containers (LXC).

I’ve seen large development organizations axe their firmware developers as the company decided to get out of the hardware business to focus on large multi-core server-side software applications. What a remarkable lack of insight into the nature of the technologies on which their businesses depend.

Thursday, September 11, 2014

I've C++11ed and I can't get up!

(Updated 2014-09-14)

C++11 is the latest iteration of the standard for the C++ programming language. This is the 2011 version of the standard that was known as C++0x in its draft form. (C++14 is forthcoming.) There were some new features of C++11 that I thought I’d play around with since I have a little bit of time between gigs. I'm a big believer in using C++ for embedded and even real-time applications whenever possible. But it's not a slam dunk. The language is complex, and growing more complex with every standards iteration.

Using C++ effectively has many benefits, even in the embedded/real-time domain. But it can place a burden on the development team; I have found it relatively easy to write C++ code that is nearly incomprehensible to anyone except the original author. Try debugging a complex problem in code that you did not write and that uses the Standard Template Library or the Boost Library to see what I mean.

My little test program that I've been futzing around with can be found here

which is useful since Blogger seems to enjoy hosing up the angle brackets in my examples below that use templates.

I like the decltype but I wish they had used typeof to be consistent with sizeof. I cheated.

#define typeof decltype

    long long int num1;
    typedef typeof(num1) MyNumType;
    MyNumType num2;

    printf("sizeof(num1)=%zu sizeof(num2)=%zu\n", sizeof(num1), sizeof(num2));

I really like the ability for one constructor to delegate to another constructor (something Java has always had). I also like the instance variable initialization (ditto).

    class Thing {
        int x;
        int y = 2;
    public: // class members are private by default; the constructors must be public to be usable below
        Thing(int xx) : x(xx) {}
        Thing() : Thing(0) {}
        operator int() { return x * y; }
    };

    Thing thing1, thing2(1);

    printf("thing1=%d thing2=%d\n", (int)thing1, (int)thing2);

An explicit nullptr is nice although a 0 still works.

    void * null1 = nullptr;
    void * null2 = 0;

    printf("null1=%p null2=%p equal=%d\n", null1, null2, null1 == null2);

The new alignas and alignof operators solve a problem every embedded developer and systems programmer has run into and has had to resort to proprietary, compiler-specific directives to solve.

    struct Framistat {
        char a;
        alignas(int) char b;
        char c;
    };

    printf("alignof(int)=%zu sizeof(Framistat)=%zu alignof(Framistat)=%zu\n", alignof(int), sizeof(Framistat), alignof(Framistat));

    Framistat fram1[2];

    printf("a=%td[%zu] b=%td[%zu] c=%td[%zu]\n"
        , &fram1[1].a - &fram1[1].a, sizeof(fram1[1].a)
        , &fram1[1].b - &fram1[1].a, sizeof(fram1[1].b)
        , &fram1[1].c - &fram1[1].a, sizeof(fram1[1].c)
    );

I like the auto keyword (which has been repurposed from its original definition). You can declare a variable whose type is inferred from its context.

    auto foo1 = 0;
    auto bar1 = 'a';

    printf("sizeof(foo1)=%zu sizeof(bar1)=%zu\n", sizeof(foo1), sizeof(bar1));

You can use {} for initialization in many contexts, pretty much anywhere you can initialize a variable. (Yes, the missing = below is correct.)

    int foo3 { 0 };
    char bar3 { 'a' };

    printf("sizeof(foo3)=%zu sizeof(bar3)=%zu\n", sizeof(foo3), sizeof(bar3));

Here’s where my head explodes.

    auto foo2 { 0 };
    auto bar2 { 'a' };

    printf("sizeof(foo2)=%zu sizeof(bar2)=%zu\n", sizeof(foo2), sizeof(bar2)); // WTF?

The sizeof(foo2) is 16. 16? 16? What type is foo2 inferred to be? I haven’t figured that one out yet.

I like the extended for statement where it can automatically iterate over a container or an initialization list. The statements

    enum class Stuff : uint8_t {
        THIS,
        THAT,
        OTHER,
    };

    for (const auto ii : { 1, 2, 4, 8, 16, 32 }) {
        printf("ii=%d\n", ii);
    }

    for (const Stuff ss : { Stuff::THIS, Stuff::THAT, Stuff::OTHER }) {
        printf("ss=%d\n", static_cast<int>(ss)); /* cast: %d expects an int, not a scoped enumeration */
    }

    std::list<int> mylist = { 1, 2, 3, 5, 7, 11, 13, 17 };

    for (const auto ll : mylist) {
        printf("ll=%d\n", ll);
    }
do exactly what you would expect. Also, notice I can now set the base integer type of an enumeration, something embedded developers have needed forever. And I can use a conventional initialization list to initialize the STL list container. But if there's a way to iterate across all of the values in an enumeration, I haven't found it.

I’m kind of amazed that I figured out the lambda expression stuff so easily (although I have a background in functional languages going all the way back to graduate school), and even more amazed that it worked flawlessly, using GNU g++ 4.8. Lambda expressions are a way to, in effect, insert a portion of control of the calling function into a called function. This is much more powerful than just function pointers or function objects, since the inserted lambda can refer to local variables inside the calling function when it is being executed by the called function.

const char * finder(std::list<std::pair<int, std::string>> & list, const std::function<bool (std::pair<int, std::string>)> & selector) {
    const char * result = nullptr;

    for (auto & ll : list) { /* by reference: c_str() on a copy would return a dangling pointer */
        if (selector(ll)) {
            result = ll.second.c_str();
        }
    }

    return result;
}

    std::list<std::pair<int, std::string>> list;

    list.push_back(std::pair<int, std::string>(0, std::string("zero")));
    list.push_back(std::pair<int, std::string>(1, std::string("one")));
    list.push_back(std::pair<int, std::string>(2, std::string("two")));
    list.push_back(std::pair<int, std::string>(3, std::string("three")));

    for (auto ll : list) {
        printf("ll[%d]=\"%s\"\n", ll.first, ll.second.c_str());
    }

    int selection;
    selection = 0;
    printf("list[%d]=\"%s\"\n", selection, finder(list, [&selection] (std::pair<int, std::string> entry) -> bool { return entry.first == selection; }));
    selection = 1;
    printf("list[%d]=\"%s\"\n", selection, finder(list, [&selection] (std::pair<int, std::string> entry) -> bool { return entry.first == selection; }));
    selection = 2;
    printf("list[%d]=\"%s\"\n", selection, finder(list, [&selection] (std::pair<int, std::string> entry) -> bool { return entry.first == selection; }));
    selection = 3;
    printf("list[%d]=\"%s\"\n", selection, finder(list, [&selection] (std::pair<int, std::string> entry) -> bool { return entry.first == selection; }));
    selection = 4;

    /* No entry matches 4: finder returns nullptr, and passing nullptr to %s is itself undefined behavior (glibc happens to print "(null)"). */
    printf("list[%d]=\"%s\"\n", selection, finder(list, [&selection] (std::pair<int, std::string> entry) -> bool { return entry.first == selection; }));

Lambda expressions appeal to me from a computer science viewpoint (there's that graduate school thing again), but I do wonder whether they actually provide anything more than syntactic sugar over alternatives like function objects whose type inherits from a predefined interface class. Lambdas remind me of the call-by-name and call-by-need argument evaluation strategies, both forms of lazy evaluation.

Where C is a portable structured assembler language, C++ is a big, complicated, high-level programming language that can be used for applications programming or for systems programming. It has a lot more knobs to turn than C, and some of those knobs are best left alone unless you really know what you are doing. In my opinion it is much easier to write poor and/or incomprehensible code in C++ than it is in C. And this is coming from someone who has written hundreds of thousands of lines of production C and C++ code for products that have shipped, and who was mentored by colleagues at Bell Labs, which had a long history of using C++ in embedded and real-time applications. One of my old office mates at the Labs had worked directly with Bjarne Stroustrup; I sucked as much knowledge from his brain as I could.

C++, and especially C++11, is not for the faint-hearted. But C++ is an immensely powerful language that can actually produce code that has a smaller resource footprint than the equivalent code in C... if such code could be written at all. C++ is worth considering even if you end up using a limited subset of it; although having said that, I find even widely used subsets like MISRA C++ too restrictive.