Thursday, March 20, 2014

Python, Bash, and Embedded Systems

Lately I've been working on a little development project, Hackamore, as an excuse to learn the programming language Python. Hackamore is a multi-threaded framework that connects to one or more Asterisk PBXes via their Asterisk Manager Interface (AMI) ports with the goal of dynamically modeling their channel and call states, including calls that cross PBXes via SIP trunks.

candidates = [ candidate for candidate in self.sources.values() if candidate.fileno() >= 0 ]
effective = 0.0
while candidates:
    # Service all pending I/O on every open Socket. Our goal here
    # is to consume data in the platform buffers as quickly as possible.
    for source in select.select(candidates, [], [], effective)[0]:
        source.service()
    active = False
    # Process queued events on every open socket. Should we process all
    # Events on each Source before moving onto the next one, or should
    # we round robin? There's probably no answer that will be right
    # every time. The code below does the former. Mostly we want to
    # stimulate the Model with each Event as quickly as possible
    # regardless of the Source.
    for source in candidates:
        while True:
            event = source.get(self)
            if event is None:
                break
            active = True
            message = Event(event, source.logger)
            yield message
    effective = 0.0 if active else timeout
    if effective > self.effective:
        self.logger.debug("Multiplex.multiplex: WAITING. %s", str(self))
    self.effective = effective
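
For readers who want to run something, here is a minimal, self-contained sketch of the same pattern: drain every readable socket first, then hand queued events to the consumer, and block in select() only when nothing was pending. It is not Hackamore's code. The Source class, its line-based framing, and the choice to rebuild the candidate list on every pass are all assumptions made for illustration.

```python
import select
import socket
from collections import deque


class Source:
    """Hypothetical stand-in for a Source: a socket plus a queue of lines."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b""
        self.events = deque()

    def fileno(self):
        # A closed socket reports -1, which drops it from the candidates.
        return self.sock.fileno()

    def service(self):
        # Pull whatever the platform has buffered and split it into lines.
        data = self.sock.recv(4096)
        if not data:
            self.sock.close()  # far end closed the connection
            return
        self.buffer += data
        *lines, self.buffer = self.buffer.split(b"\n")
        self.events.extend(lines)

    def get(self):
        return self.events.popleft() if self.events else None


def multiplex(sources, timeout=0.1):
    """Yield events from all open sources until every source has closed."""
    effective = 0.0
    while True:
        candidates = [s for s in sources if s.fileno() >= 0]
        if not candidates:
            break
        # Service all pending I/O before processing any events.
        for source in select.select(candidates, [], [], effective)[0]:
            source.service()
        active = False
        # Drain each source completely before moving on to the next.
        for source in candidates:
            while True:
                event = source.get()
                if event is None:
                    break
                active = True
                yield event
        # Poll again immediately if there was work, otherwise wait.
        effective = 0.0 if active else timeout


if __name__ == "__main__":
    near, far = socket.socketpair()
    near.sendall(b"one\ntwo\n")
    near.close()
    for event in multiplex([Source(far)]):
        print(event)
```

The zero timeout after an active pass is the important part: select() only blocks when the previous pass found nothing to do.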

It's definitely a work in progress. Hackamore is more of a developer's and tester's tool than it is any real means for an administrator to monitor their PBX activity. Its output is just a straightforward ASCII report with some minimal ANSI terminal control that works with an X terminal. And mostly it was a way to figure out how to meet all my usual needs -- multithreading, synchronization, sockets, and general source code organization -- with a programming language I had never used before.


I like Python. It reminds me somewhat of the experimental languages like FP, BADJR, and BAFL that I played with in the early 1980s when I was in graduate school and working in a research group that was investigating programming language and operating system architectures for hypothetical hardware inspired by the Japanese Fifth Generation project. Python's evolution started in 1989, so I sometimes wonder if its creator, Guido van Rossum, had similar inspiration.

But there's a more pragmatic reason to like Python. Being the Type A sort of personality, I start every morning with a list of things I want to accomplish. The other day I settled in to my office around 0800 with my list in mind to work on Hackamore. By 1000 I was done.

"Huh," I thought, "guess I better start making a longer list."

That's happened a lot on this project. And it's why embedded developers need to start thinking beyond assembler, C, and even C++, and start considering the use of interpreted languages, byte code languages, and shell scripts whenever possible. While my colleagues who work on tiny eight-bit micro-controllers may remain firmly in the assembler and C camps, and those who work on resource-constrained platforms with an RTOS may never get past C++ (although I would encourage them to at least consider going that far), the rest of us need to think beyond int main(int argc, char ** argv).

With the growing number of embedded systems that run Linux, it is becoming increasingly possible that a significant portion of our applications can be written in languages that feature an enormous productivity improvement. This improvement comes from a vastly shortened duration of the compile-test-debug iteration, from better development tools, from a capability to develop off target, from the availability of a large number of open source frameworks, and from an ability to work at a significantly higher level of abstraction where an application takes a few dozen lines of code instead of a few thousand. Python may not be your embedded tool of choice. But a shell script, even the relatively simple ash shell that is implemented in the ubiquitous BusyBox embedded tool, might be sufficient.

Years ago I remember talking to a colleague about some code we needed for a commercial Linux-based embedded telecommunications product. This little piece of code was something that would only be run very occasionally, and only on demand by someone logged into the system. It became clear in the conversation that my colleague wanted to go off and start writing C code so we could have something to use in a few days. "Or," I said, "we could write a twenty line bash script and be done in an hour." And that's what we did. It might have taken a wee bit more than an hour.

That happens a lot too. When all you have is a hammer, everything looks like a nail. And many an embedded developer's first instinct is to go straight to Kernighan & Ritchie. It doesn't help that most project managers are too clueless technically to know that this is a really expensive decision. I've had developers argue with me about the performance of scripting languages, but when you're talking about a program that will only be run occasionally and has no real-time requirements, the difference in cumulative execution time between a script and a compiled C program may amount to only minutes over the entire lifetime of the commercial product in which it is used.
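
To put a rough number on that, here is a back-of-the-envelope calculation. Every figure in it is made up for illustration: a script that costs an extra 50 milliseconds per run compared to a C program, run once a day, in a product that stays in service for ten years.

```python
def lifetime_overhead_seconds(extra_seconds_per_run, runs_per_day, years):
    """Total extra execution time over the life of the product."""
    return extra_seconds_per_run * runs_per_day * 365 * years


# Assumed figures, not measurements: 50 ms slower, once a day, ten years.
overhead = lifetime_overhead_seconds(0.05, 1, 10)
print(f"{overhead / 60:.1f} minutes")  # prints "3.0 minutes"
```

Under those assumptions the script costs about three minutes of CPU time over a decade, against the days of developer time saved by not writing it in C.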

Even applications that talk to hardware can be scripted. That's why I wrote memtool, a utility written in C that makes it easy to read, write, and modify memory-mapped hardware registers from the command line. For sure, memtool is useful interactively. But where it really pays off is in shell scripts where you can do stuff like manipulate an FPGA or interrogate a status register without having to write a single line of new C code. (The shell output below was scraped right off a BeagleBoard of mine running Android.)

bash-3.2# memtool -?
usage: memtool [ -d ] [ -o ] [ -a ADDDRESS ] [ -l BYTES ] [ -[1|2|4|8] ADDRESS ] [ -r | -[s|S|c|C|w] NUMBER ] [ -u USECONDS ] [ -t | -f ] [ ... ]
-1 ADDRESS    Use byte at ADDRESS
-2 ADDRESS    Use halfword at ADDRESS
-4 ADDRESS    Use word at ADDRESS
-8 ADDRESS    Use doubleword at ADDRESS
-C NUMBER     Clear 1<<NUMBER mask at ADDRESS
-S NUMBER     Set 1<<NUMBER mask at ADDRESS
-a ADDRESS    Optionally map region at ADDRESS
-c NUMBER     Clear NUMBER mask at ADDRESS
-d            Enable debug mode
-f            Proceed if the last result was 0
-l BYTES      Optionally map BYTES in length
-o            Enable core dumps
-r            Read ADDRESS
-s NUMBER     Set NUMBER mask at ADDRESS
-t            Proceed if the last result was !0
-u USECONDS   Sleep for USECONDS microseconds
-w NUMBER     Write NUMBER to ADDRESS
-?            Print menu
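
The same trick works from Python as well as from a shell script. The sketch below only builds memtool command lines from the options listed in the usage text above, and leaves running them as a separate step. The register address and mask are invented, and the way the options are combined (width and address first, then the operation) and the use of hexadecimal arguments are assumptions read off the usage line, not documented behavior.

```python
import subprocess

# Access widths in bytes mapped to the -1/-2/-4/-8 options in the usage text.
WIDTH_FLAGS = {1: "-1", 2: "-2", 4: "-4", 8: "-8"}


def memtool_read(address, width=4):
    """Build a command line that reads one register (assumed option order)."""
    return ["memtool", WIDTH_FLAGS[width], hex(address), "-r"]


def memtool_set_mask(address, mask, width=4):
    """Build a command line that sets the bits in mask at a register."""
    return ["memtool", WIDTH_FLAGS[width], hex(address), "-s", hex(mask)]


def run(argv):
    """Run a command on the target and return whatever it printed."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    STATUS_REGISTER = 0x48002000  # hypothetical address, for illustration only
    print(" ".join(memtool_read(STATUS_REGISTER)))
    print(" ".join(memtool_set_mask(STATUS_REGISTER, 0x1)))
```

Keeping the command construction separate from run() means the script can be checked on a development host that has no memtool and no hardware to poke.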

Even if your embedded target is too small to host even a simple shell interpreter, learning programming languages that are not natively compiled to machine code will prove valuable. This is true of Python in particular. Python is so easily interfaced with C-based libraries that hardware vendors are starting to provide Python bindings for libraries that interface with their chips so that developers can trivially write code to monitor and manipulate their product. My friend and occasional colleague Doug Gibbons was just telling me the other day that he was using Python to monitor the performance of his signal processing code on DSPs. Python and other similar languages offer such an enormous productivity boost that I expect this trend to continue upwards. I'm also seeing Quality Assurance testers using Python more and more to automate functional testing of the embedded systems on which I work. Knowing a little Python helps me relate to them.
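
As a small taste of how little glue that interfacing takes, here is a sketch that calls a C function directly from Python using the ctypes module in the standard library. It uses strlen() from the C library only because that library is already present on any POSIX system; a vendor's shared library would be loaded the same way. The wrapper name c_strlen is made up for this example.

```python
import ctypes
import ctypes.util

# Locate and load the C library. On POSIX systems, CDLL(None) falls back to
# the symbols already linked into the interpreter, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C prototype: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a byte string as computed by C's strlen()."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"embedded"))  # prints 8
```

Declaring argtypes and restype is optional for simple cases, but it lets ctypes catch a wrong argument type in Python rather than in a core dump.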

If you're an embedded developer, you are quickly running out of excuses for not learning some of the new programming languages, even if you never expect to run those languages directly on the embedded target for which you're developing.

I'm old as the hills. If I can do this, so can you.

3 comments:

Fernando Mondello said...

I started programming in Python a year ago, and I was instantly amazed by how easily you can do almost anything, from automating a test to processing data or generating C code.

I have to admit that even knowing the benefits of a high-level programming language for these simple tasks, C was still my go-to language for everything. I refused to write code with no variable types. My brain was refusing to make the transition after many years of embedded C programming.

Nowadays I use Python a lot and don't feel any regret about the transition. I think that it's all about being efficient, and programming in C or assembler is not always the efficient choice.

It's not a coincidence that there are more and more blog posts or articles about the use of Python in the embedded field.

Fazal Majid said...

I love Python and have been using it since 1994. For embedded systems, other languages worth exploring are JavaScript, as implemented in Node.js (the non-Android Beaglebone Linux distro is built around it), and Lua, another scripting language with a very efficient runtime (especially if you use LuaJIT). Forth was invented specifically for embedded applications and is also very efficient, but has lost popularity.

Extending Python in C is fairly easy, and you can call directly into shlibs/DLLs using ctypes. The best of both worlds is when you write the outer shell of your program in Python for productivity, and optimize only the core loops in C for speed.

Chip Overclock said...

I agree with both Fernando and Fazal: as embedded systems (which some may argue aren't really embedded in the traditional sense) gain in horsepower, developers have a responsibility to control costs by working at as high a level of abstraction as possible when performance and/or resource footprint aren't a concern.

I recently spent a year working on a telecommunications product which was physically about the size of a hardbound book, but which contained four ARM processors each running Linux. And as Fazal suggests, we used node.js (an efficient JavaScript interpreter) on the non-real-time portions to implement our OAM (operations, administration, management) layer to an external web browser which would be run from the maintenance technician's laptop. It was suggested by some of the younger guys on the project, and it worked great. (I can't claim to know JavaScript, but even I did a little hacking on the code from time to time.)

And Fazal, I'm an old FORTH hacker since the 1980s, having even had a paper at one of the FORTH conferences on using that language on LSI-11s (!!) for real-time control. Some years ago I integrated a FORTH interpreter written in C into a C++ application to use as a kind of embedded shell. It's a great idea.

As always, thanks for the insightful comments.