Thursday, July 19, 2012

Good Day Sunshine On My Arduino

Here's an update on my article Sunshine On My Arduino Makes Me Happy. This is part of my Amigo project, which in part experiments with alternative power sources for eight-bit AVR microcontrollers.

I went from a 1.5W to a larger 5W solar panel, then to a much larger 15W solar panel. We'll see if that's enough to charge the 12V battery during the day and let the system run all night. I also had to go to Xbee Series 2 radios with external antennas on both the Arduino Uno with the Xbee shield in the "instrument pod", and on the Xbee Explorer that is USB connected to my desktop, in order to get the range I needed. The chip antennas couldn't cover the few tens of yards from the south-western edge of my back yard through a wall to my home office on the south side of my house.

Here's the instrument pod with its external antenna visible at the lower left. The pod looks fluorescent green in the photograph. It was originally a translucent white that I spray painted fluorescent yellow. The pod looks a lot better in the photograph than it does in real life; it definitely has a cobbled-together look about it.

Instrument Pod: Cover Off

Crammed into this little box is the Arduino Uno with the Xbee shield, a battery meter activated by a tiny pushbutton switch, the solar charge controller, and a 12V sealed battery. The software on the Arduino merely pings my desktop every second. (In the past I've written about various environmental sensors I've already tried on this platform.)

Here is the instrument pod at the edge of my back yard connected to the largish (about 42" x 15" or 105cm x 38cm) 15W solar panel.

Instrument Pod and 15W Solar Panel

Here is the tiny Xbee Explorer, dwarfed by its own external antenna, USB attached to my desktop Mac.

Xbee Coordinator on Xbee Explorer

I'll monitor this for the next few days and see what happens.

Update (2012-07-23)

The instrument pod has been up continuously for over four days now. It pings my desktop once a second. The state of charge meter I have hooked up to the 12V battery, activated by a little pushbutton inside the pod, shows the battery to be completely charged. So far so good.

Update (2012-07-28)

The solar-recharged instrument pod has been continuously wirelessly pinging my desktop via its Zigbee radio for just short of nine days now, having survived several rain storms. Here's a snippet from the log file below. The ISO 8601-style time stamp is generated by the logging script on my desktop Mac; the duration timestamp, showing eight days and twenty-three hours, is generated by the remote Arduino in the instrument pod, and indicates how long the Arduino has been running since it last powered up.

2012-07-28T10:00:03 8:23:36:30
2012-07-28T10:00:04 8:23:36:31
2012-07-28T10:00:05 8:23:36:32
2012-07-28T10:00:06 8:23:36:33
2012-07-28T10:00:07 8:23:36:34
2012-07-28T10:00:08 8:23:36:35 
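
A log like this is easy to check mechanically. Here is a minimal sketch in POSIX shell, assuming each line holds the desktop time stamp followed by the Arduino's uptime as days:hours:minutes:seconds, as in the snippet above. The function names are my own inventions for illustration, not part of the actual logging script.

```sh
# Strip leading zeros so a field such as "08" is not read as bad octal.
strip_zeros() {
    n=$1
    while [ "${n#0}" != "$n" ] && [ -n "${n#0}" ]; do
        n=${n#0}
    done
    printf '%s\n' "$n"
}

# Convert a D:H:M:S uptime field into a count of seconds.
uptime_to_seconds() {
    IFS=: read -r d h m s <<EOF
$1
EOF
    d=$(strip_zeros "$d"); h=$(strip_zeros "$h")
    m=$(strip_zeros "$m"); s=$(strip_zeros "$s")
    echo $(( $d * 86400 + $h * 3600 + $m * 60 + $s ))
}

# Read "timestamp uptime" lines on stdin and report every place where
# the uptime did not advance by exactly one second (a lost ping or a
# restart of the Arduino).
check_log() {
    prev=
    while read -r stamp uptime; do
        now=$(uptime_to_seconds "$uptime")
        if [ -n "$prev" ] && [ "$now" -ne $(( $prev + 1 )) ]; then
            echo "discontinuity before $stamp"
        fi
        prev=$now
    done
}
```

Feeding the log file to check_log on standard input prints nothing when every ping arrived in sequence.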

Update (2012-08-09)

The instrument pod finally lost power after running continuously for more than ten days. I could tell by periodically checking the battery's state of charge that the solar panel wasn't keeping up during the day with the loss of charge during the night. But while I would have preferred it stay up, this gives me some hope that by just mounting the solar panel in a better, sunnier location, it might work continuously during the summer. Whether it would do so during the winter is another matter.

Tuesday, July 17, 2012

The Death of Hard Power Off

You've already noticed this: when you hit the power button, it takes several seconds for the device to go dark. If it has a display, your device might show a little spinning or blinking icon to indicate it is doing something. This is true for your hand-held mobile devices: your tablet, your mobile phone. It is also true for your laptop and your desktop, and for the rack-mounted servers at your data center. It is also the case for consumer devices you perhaps don't give that much thought to: your digital video recorder, your MP3 player, or, perhaps, even your automobile.

Increasingly, digital devices implement a soft power off. Which is to say, when you press the power off button, you are not turning the power off. You are informing a piece of software of your desire for it to turn the power off. A million lines of executed code later, the power turns off. Usually.

Compare this to a hard power off, which is more like a simple light switch: the instant the purely mechanical switch separates a set of metallic contacts, the electrical circuit is interrupted and power is immediately removed from the device.

Soft power off has permeated our digital devices, more or less without us users thinking about it, for one reason: the need to maintain a consistent state.

This state could be your device remembering your web page history. Or its position in your music play-list. Or where you paused your movie. Or the last number you dialed. But state can also be something a lot more abstract, data the device has to save as part of some function or service it is doing on your behalf, or even something in the realm of routine maintenance, the details of which might make your eyes glaze over if you actually had to know about it. For example, devices with global positioning system capabilities - which is nearly everything now - like to save information about the GPS satellites used during the last position fix because this can vastly speed up acquisition of the same satellites the next time you turn the device on, providing you haven't moved very far or it hasn't been turned off for very long. You appreciate this capability even if you don't know about it.

This state could be saved on a remote server for network attached devices, whether they are wireless or wireful. But more often than not these days, state is saved on a persistent read-write storage drive embedded directly in your device. The use of read-write storage in embedded devices has exploded in recent years. Very early mobile digital devices actually had tiny surface-mount spinning disk drives. But less expensive flash memory, read-write persistent semiconductor memory with no mechanical parts, now dominates the mobile device market, and is beginning to dominate even the less mobile laptop market.

Sometimes this flash memory is used directly by the device; the operating system uses a file system implementation like Journalling Flash File System 2 (JFFS2) or Yet Another Flash File System (YAFFS) that makes the flash behave less like memory and more like a disk drive, and which provides the usual functional capabilities like directories and files and permission bits and the like.

Some read-write persistent storage devices, like a USB memory stick, or a microSD memory card, offer a slightly more disk-like hardware interface on top of the underlying flash, and the operating system conspires to make the storage device behave like a disk to the application software. My little shirt-pocket-sized ODROID-A4, a battery-powered and WiFi-connected platform reference device produced by Hardkernel for developers writing low level code for Samsung's Galaxy Android smartphones and tablets, uses a microSD card for its persistent storage. But the A4 layers on top of it disk partitioning and multiple EXT4 file systems, something you would have in the past expected to find on a server at the data center.

Solid state disks (SSDs) are storage devices which emulate a full-blown disk hardware interface on top of the underlying flash memory. Not even the operating system may be able to tell the difference between the SSD and a spinning disk. I've built embedded products using SSDs that used the stock disk drivers in Linux. On these systems I had no choice but to use a file system implementation tailored for disk drives, like EXT3, because that's the hardware interface I had to work with.

The introduction of read-write persistent disk-like semantics to mobile devices brings with it not just all the convenience and capabilities of having spinning disks, but all the issues that plague our mobile devices' bigger cousins that traditionally use those spinning disks. I've already written about issues of data remanence and solid-state storage devices. But here, I'm talking about basic reliability.

Perhaps you have learned the hard way to put your desktop system on an uninterruptible power supply. You may not appreciate the fact that your laptop has its own built-in UPS, but you depend on that fact just the same. And pulling the power cord out of a running server at your data center is a good way to get escorted to the door by your organization's security apparatus. There is a reason why all of these devices now implement soft power off. And why Google added a twelve volt battery to each individual server.

The reason is that as application software has become more and more complex, its demands on its underlying storage system have become more and more like those of a database transaction, either in fact (because it uses an actual database) or in function (because it requires atomically consistent behavior to be reliable). It is for this reason that, no matter what the nature of the underlying storage device, file system implementations like EXT3 and EXT4 have borrowed from the database world and are journalled file systems: a single atomic write to a sequential file or journal on the storage device is first done to record the intent of the following more complex multiple write operations which may be spread across the storage device. If a failure occurs during the multiple writes, the journal is consulted during the restart to repair the file system. (Log-structured file systems do away with the subsequent multiple write operations completely and merely reconstruct the vision of the file system as seen by the application software from the sequential log file as a kind of dynamic in-memory hallucination, with some performance penalty.)

Update 2016-03-09: Something I failed to make clear here is that in the case of journalled file systems, only the meta-data -- that is, only the writes done to modify the structure of the file system itself -- are saved in the journal, not the writes of the data blocks. This allows the file system to be repaired following a power cycle, such that the file system structure is intact and consistent. But the data writes in progress at the time of the power cycle are lost. One of the symptoms I've seen of this is zero-length files. The file entry in the directory was completed from its record in the journal, but the actual data payload was not.
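
One common application-level defense against the zero-length file symptom is to never overwrite a file in place: write the new contents to a scratch file, flush it, and only then rename it over the old one. Here is a minimal sketch of that pattern in POSIX shell. The function name save_atomically and the scratch file naming are assumptions of mine for illustration, and the sketch relies on the scratch file living on the same file system as the target so that the mv is a simple rename.

```sh
# Replace a file so that after a power failure a reader finds either the
# complete old contents or the complete new contents. The new contents
# arrive on standard input.
save_atomically() {
    target=$1
    tmp="$target.tmp.$$"                # scratch name on the same file system
    cat > "$tmp" || return 1            # 1. write the new data to the scratch file
    sync                                # 2. flush the data blocks to storage
    mv -f "$tmp" "$target" || return 1  # 3. rename the scratch file over the old one
    sync                                # 4. flush the directory update too
}
```

For example, printf 'volume=7\n' | save_atomically settings replaces a (hypothetical) settings file. The sync command flushes everything the system has pending, which is heavy handed; a program written in C would call fsync on just the one file instead.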

The need for consistent file system semantics has led to a lot of research in file system architectures, because techniques like journalling are not perfect, and sometimes not adequate for application software that has more complex consistency requirements than just knowing whether a particular disk block has been committed reliably to the storage device. But more practically, it has led to the end of hard power off as a hardware design. Soft power off gives the software stack time to commit pending writes to storage to ensure a consistent state on restart. (And for network connected devices which may depend on consistent state on remote servers, it allows for a more orderly notification of the far-end and shutdown of communication channels.)
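
From the software side, a soft power off usually shows up as a request to stop, which the program must honor by committing its state before it exits. Here is a minimal sketch in POSIX shell, assuming the system delivers the request as a TERM signal. The state file name, the position variable, and the function names are all hypothetical, chosen only for illustration.

```sh
# Hypothetical state: a play-list position kept in memory while running.
STATE_FILE=${STATE_FILE:-./position.state}
position=0
shutting_down=0

# Commit the in-memory state to persistent storage.
commit_state() {
    printf '%s\n' "$position" > "$STATE_FILE.new" || return 1
    sync
    mv -f "$STATE_FILE.new" "$STATE_FILE"
}

# The signal handler only records the request; the main loop acts on it.
request_shutdown() {
    shutting_down=1
}
trap request_shutdown TERM INT

# Main loop: do the work until asked to stop, then commit before returning.
run() {
    while [ "$shutting_down" -eq 0 ]; do
        position=$(( $position + 1 ))
        sleep 1
    done
    commit_state
}
```

A script would end with a call to run followed by exit. With a hard power off there is no signal and no chance to reach commit_state at all, which is the whole problem.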

The web is full of woeful tales of users who bricked their devices by cutting the power to them at an inopportune moment. And I have my own horror stories of products on which I've worked with read-write persistent storage but architected with only hard power off.

Hard power off is such an issue in maintaining the integrity of SSDs that the more reliable ones (by which I mean, the only ones you should ever use) implement their own soft power off, in the form of a capacitor-based energy storage system, to keep the device running long enough to reach a consistent internal state. There are a lot of SSDs that don't do this. Those SSDs are crap. As you will learn the hard way once you've cycled power on them just a few times. (If you are using SSDs in any context, adding a UPS to the system in which they are used is not sufficient. As long as power is applied, the tiny controller inside the SSD is doing all sorts of stuff, all the time, asynchronously, whether or not your system is using it, even if your system has been shut down. Like garbage collecting flash sectors for erasure as a background task. Only the controller inside the SSD knows when it's reached a consistent state; neither the operating system nor even the BIOS has any visibility into that activity.)

This is just going to get worse. The decreasing cost of solid-state read-write persistent storage makes it more likely that it will be used in less and less expensive (and hence a greater and greater number of) small digital devices. Increasing memory sizes on digital devices allows more complex software, which places greater demands on the storage system. Larger memory also increases the amount of data cached there, typically for reasons of performance, which stretches the latency in committing the modified data to storage, and increases the likelihood that an inconsistency will happen should a failure occur. (One of the principal differences between the EXT3 and the EXT4 file systems is the latter caches data more aggressively.) We should have expected this just by looking at the disparate technology growth curves.

As you consider expanding your product line to include more digital control in your embedded products, it will occur to you that adding some solid-state read-write persistent storage would be a really good thing: to store user settings, to allow firmware to be updated, to implement more and smarter features. Once you take that step, remember that you now face the issue of soft power off where perhaps you didn't before. Because with today's digital devices, hard power off is dead.

Update 2016-03-09: Nearly four years after having written this article, I am still trying to convince my clients that they cannot design their embedded products with a hard power off switch, even as the complexity of these products, in both software and hardware, evolves to look more and more like data center servers. And failing that, helping them try to figure out how to make their products more recoverable in the field when their file systems -- much much larger than they were four years ago -- are hopelessly scrogged.

Thursday, July 12, 2012

STEM, RPN, and Market Forces

There was a time when giants walked the Earth. Giants who were into science, technology, engineering, and mathematics, or what today is known as STEM. Giants who landed men on the moon and brought them back. And when those giants needed to solve some problems, problems so hard they couldn't just solve them in their heads (which were mighty problems indeed), they pulled out their trusty Hewlett Packard calculators and made mysterious incantations in what is known as Reverse Polish Notation or RPN.

HP made an entire line of domain-specific calculators just as today we have domain-specific programming languages, calculators designed for specialized job functions. Here are three of them that I own and still use on a regular basis.

Hewlett Packard 11C Scientific Calculator

The 11C is a scientific and engineering calculator. It features all the usual logarithmic, trigonometric, and exponential functions. It's programmable: you can save sequences of steps, including iteration and conditional expressions, to calculate long equations automatically. If you took science and engineering courses in high school or college, this calculator would have been your faithful servant. You can even use it to balance your check book.

Hewlett Packard 12C Financial Calculator

The 12C is a financial calculator. It computes depreciation, loan amortization, and net present value (NPV), stuff I actually have to do now and then on those occasions when I have to put on my management hat. It also does a bunch of stuff that I know little or nothing about, but the MBAs who read my blog (remarkably, there are one or two) will recognize. You can tell this calculator was meant for guys on Wall Street because its metal trim is gold instead of silver. No, I'm not kidding.

Hewlett Packard 16C Programmer's Calculator

The 16C is a programmer's calculator. It handles calculations in decimal (base ten), hexadecimal (base sixteen), and octal (base eight). The ability to do octal arithmetic will be appreciated by my old comrades from my PDP-11 days, since that minicomputer insisted on organizing everything in three-bit units to match the fact that it had a whopping big set of eight registers. The 16C handles logical operations like and, or, and exclusive-or, bit shifts and rotations, and both one's and two's complement. These are the kinds of calculations you routinely do when you do the kind of work I do. I use my 16C on nearly a daily basis.
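
For anyone who hasn't done this kind of arithmetic, here is a rough sketch of the same sorts of operations using nothing but POSIX shell arithmetic. It assumes a sixteen-bit word, and the function names are my own for illustration. This is only a way to see what the operations do, not a model of how the calculator itself works.

```sh
WORD_BITS=16                              # assumed word size for this sketch
MASK=$(( (1 << $WORD_BITS) - 1 ))         # 0xFFFF for a sixteen-bit word

# Display a value in hexadecimal or octal, truncated to the word size.
to_hex() { printf '%X\n' "$(( ($1) & $MASK ))"; }
to_oct() { printf '%o\n' "$(( ($1) & $MASK ))"; }

# One's complement: invert every bit in the word.
ones_complement() { echo $(( ~($1) & $MASK )); }

# Two's complement: invert every bit, then add one.
twos_complement() { echo $(( (~($1) + 1) & $MASK )); }

# Rotate left by $2 bits (0 < $2 < WORD_BITS); the bits shifted off the
# top come back in at the bottom.
rotate_left() {
    echo $(( ((($1) << $2) | ((($1) & $MASK) >> ($WORD_BITS - $2))) & $MASK ))
}

# The logical operations are built into shell arithmetic directly.
to_hex $(( 0x1234 & 0x00FF ))             # and
to_hex $(( 0x1234 | 0x00FF ))             # or
to_hex $(( 0x1234 ^ 0x00FF ))             # exclusive-or
to_hex "$(ones_complement 0x1234)"
to_hex "$(twos_complement 0x1234)"
to_hex "$(rotate_left 0x8001 1)"
to_oct 0x1234
```

The mask is what makes the complements come out as sixteen-bit values; without it the shell's wider signed integers would produce negative numbers.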

People will tell you that the software calculator they have on their laptop or tablet is just as good as these old HP calculators. Those people are wrong. What they really mean is that once in a while they have to add numbers bigger than they can do in their head, and the little calculator that came on their Windows or Mac laptop suffices. I use those calculators too, and will even admit that the Calculator program found in the Applications directory on Mac OS X (not the dumb as a brick calculator widget) is actually pretty good. I once had an excellent calculator utility on my Palm Pilot for which I paid real money; it emulated an HP calculator right down to the colors, shape, and placement of the buttons. But for the most part, calculator software applications seem more like someone's freshman computer science project. When the going gets tough, the tough scientists and engineers get out their HP calculators and start slinging RPN.

Or at least, they used to. I use my 16C so often that it occurred to me that maybe for not too much money I could own two of them, one to keep in my office at home and one to keep in the briefcase I take to client sites. After some web perusal what I discovered is guys like me (both in age and profession) paying ridiculous prices for used HP 16C calculators on eBay. The cheapest one I saw today was US$71, the most expensive US$389. Three hundred and eighty-nine dollars! They've become fraking collectors' items!

I immediately turned around in my office chair and put all three of my HP calculators in my floor safe.

But I don't mean to imply that you can't buy any of these calculators new. HP still manufactures one of them, and it looks exactly like mine that is pictured above. Can you guess which one?

If you guessed the 12C financial calculator, then you probably have some insight into what all the fuss is about when people express concern that not enough students are majoring in STEM-related fields. Seriously. The fact that HP, once the premier manufacturer of scientific and engineering calculators, now just makes one of these little shirt pocket sized marvels for those folks with a business school background tells you something about how they see the market for their products.

To be fair, HP does make some pretty nice looking larger models; I find the 35s scientific calculator to be kind of sexy, and it still supports RPN. But I don't see anything even remotely like my beloved 16C. I'm guessing those Java developers with their newfangled multi-core servers and their fancy sixty-four bit addressing just don't need to do hexadecimal arithmetic anymore.

I hate how I sound when I get this way. Hey, you kids! Get off my lawn! I blame Congress.

Wednesday, July 11, 2012

Embassykernel: Overlaying GNU on Android on the ODROID-A4

In The System & The System: Overlaying GNU on Android on the Beagle Board I described my Contraption project where I ran both the Android Frozen Yogurt (FroYo) software stack and a small GNU software stack side by side on the BeagleBoard-xM. Both stacks ran under a common Linux 2.6.32 kernel on a TI OMAP system-on-a-chip with an ARM Cortex-A8 core. It was a hack, no doubt about it, but for me it was a useful one. I'll go into why in a bit.

For my Conestoga project, I have worked the same hack: this time running both the Android Ice Cream Sandwich (ICS) software stack and a small GNU software stack side by side on the ODROID-A4. The A4 is a Samsung platform reference device made by Hardkernel with the same form factor as the Samsung Galaxy Android smartphones. Both stacks run under a common Linux 3.0.15 kernel on a Samsung Exynos system-on-a-chip with dual ARM Cortex-A9 cores.

You can see the A4 below displaying the time (as it happens, while I am SSHed into it from my desktop Mac), with a debug board attached to its TTA20 debug port. The debug port has a USB cable attached at its bottom providing access for the Android Debug Bridge (adb), and a cable attached to its console serial port at its left. Also visible to the upper right of the A4 is the all important reset tool (a bent paperclip).

ODROID-A4 with Reset Tool and Debug Board

Below is a screen snapshot of a terminal window on my desktop Mac through which I SSHed into the A4, logging in with a guest account with a password just as if the A4 were a server. My first ssh attempt was before I started the dropbear SSH server on the A4.

Mac Terminal: ssh to ODROID_A4 as guest

Below is a screen snapshot of the Android serial console in which a ps command shows the dropbear SSH server running along with the login bash shell used by a guest account. (You can click on the image to see a larger version.)

Android Debug Console: ps output

In China Miéville's Hugo-nominated novel Embassytown, he weaves a tale of a small human colony on a distant planet attempting to live among a race whose entire approach to language is so alien that neither individual humans nor their machines can be understood by the Ariekei even when it sounds as if we speak their language perfectly. The cognitive barrier between the humans and the Ariekei reminded me of the separation between the Android and GNU software stacks, living side by side on the same Linux kernel but for the most part unable to communicate.

This hack on Conestoga differed substantially from the same one on Contraption, not just because the former was running the ICS Android release instead of FroYo, but because the underlying A4 platform was quite different from that of the BeagleBoard.

The BeagleBoard uses a two-stage boot process, starting with X-Loader loaded from flash on-board the OMAP chip itself, then to U-Boot. U-Boot booted up and ran Linux and the Android stack (and eventually the GNU stack) directly off its microSD card containing an EXT3 file system.

The A4 boots U-Boot from its microSD card, then loads the Linux kernel and a RAM-resident disk image from the microSD card. This ramdisk has most of the files needed by the platform layer that sits below Android and is mounted read-only. The A4 separates Android into two separate EXT4 file systems, system and userdata, and mounts the former read-only. There is also an EXT4 cache file system, and a FAT32 (a.k.a. VFAT) file system for user media like music, videos, and ringtones.

Because of the way I plumbed Conestoga into Android, I had to reverse-engineer, modify, and repackage the ramdisk and system filesystems, and push (via adb) an overlay (really: a tarball) onto the userdata file system which is created by the target itself when necessary (for example, on a new microSD card on which it does not already exist). There was a lot of web searching, lengthy sessions of source code pondering, and a few hex dumps. In the end, most of the changes to ramdisk and system were merely adding soft-links; the bulk of Conestoga resides in the read-write userdata file system under the directory /data/conestoga.

One piece of useful collateral was a tool I wrote to unpack the compressed EXT4 file system format generated by the A4 software build process. Files in this format are what is transmitted to the A4 via the fastboot mechanism, a way to download and install new file system images to the microSD card on the A4 via USB using only U-Boot. Below is a snippet of commands illustrating how decompressext4 might be used on the system.img produced by the A4 build.

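# Unpack the compressed image from the A4 build into a plain EXT4 image.
decompressext4 -v < system.img > system.ext4
# Check the unpacked file system, then loop-mount it for inspection.
fsck.ext4 system.ext4
mkdir mnt
sudo mount -o loop system.ext4 mnt
ls -lR mnt
# Repackage the mounted tree as a new image (-l is its length in bytes: 500 MiB).
sudo make_ext4fs -s -l 524288000 system-new.img mnt
# Clean up the mount point.
sudo umount mnt
rm -rf mnt
# Install the new image in the system partition of the A4 over USB.
fastboot flash system system-new.img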

The C source code for decompressext4 is open source and is included in the Conestoga distribution (which otherwise is not much more than a big Makefile), a link to which can be found on the project web page.

Why Android?

For reasons of cost and schedule you must work as high on the abstraction ladder as possible, even when you are working in embedded systems. This means you don't write in assembler when you can use C. Don't use C when you can use more advanced languages like C++. Don't use compiled languages at all when you can use managed languages like Java or scripting languages like bash, perl or even python.

The fact that a high-capability open-source software stack like Android, mostly written in Java, can run on a pocket-sized smartphone makes it a potential framework for other embedded applications, mobile or not. It's also why I build the bash shell as part of both Contraption and Conestoga: not so much for its use as an interactive shell, but to use as a scripting language.

Why GNU?

There is a wealth of software written for the GNU environment. GNU is in fact the API against which developers are writing when they say they are writing for Linux. Maybe porting C and C++ GNU code to the Android Native Development Kit (NDK) is what you want to do in the long run. But the NDK environment, with its bionic C library, is quite different from GNU. The ability to quickly sling some existing legacy GNU code onto an Android-based project for purposes of prototyping is a big win in my book. Once you prove it can work, you can look at the cost of the NDK port.

Example: I got my GNU-based memtool application built and running on the A4 in just a few minutes using Conestoga. memtool is a command line tool that makes it easy to read and modify the memory mapped registers of hardware devices. I've written about its use in my own Linux/GNU-based embedded systems and on Android on the BeagleBoard. It's found its way into a number of products I've worked on for paying clients.

Tools like memtool are a huge win for debugging, integrating, and even for production use in system control and maintenance scripts. Using utilities like memtool and leveraging the tighter iteration cycle afforded by scripting using bash, I've slung together scripts in an afternoon that would have taken days to develop, test, and debug in C, and produced utilities that I was later able to modify and adapt to changing circumstances while testing in the field. If you've never done this kind of work, you might be surprised how many times you want to read or set a bit in a register in an FPGA if only you had access to it somehow from the command line.

Why the ODROID-A4?

The A4 has much of the same hardware as the Samsung Galaxy Android smartphones; it's the platform reference device recommended by Samsung on their web site. To say that the Galaxy smartphones are popular is putting it mildly; let's just say that when another manufacturer of a popular smartphone tries to suppress your device, you just can't buy that kind of advertising. Samsung might as well have ads saying "Endorsed by Apple, manufacturer of the iPhone".

The A4 has a dual-core ARM-based processor. I've written before how important I think it is that all developers, including embedded developers, get on the multi-core train as quickly as possible. We all need to get experience with developing multi-threaded (concurrent) software on true multi-processing (parallel) hardware. Writing multi-threaded software for uni-processor systems opens the doors for race conditions that may never be detected. Multi-core is clearly the future, even in the embedded realm, as processor manufacturers try to continue climbing up the legacy performance curve by any means necessary.

The A4 runs Ice Cream Sandwich and the Linux 3.0.15 kernel. This was my first experience with both the ICS Android release and with the Linux 3.0 kernel. It was a useful learning experience to build, modify, and run both of them, as well as the A4's unique flavor of U-Boot, from scratch.

Oh, and the ODROID-A4 is a pretty usable Android mini-tablet all by itself.

Why me?

I make a good living mostly down in the platform layer. Application developers routinely depend on me to help them port and debug their code. I am often called upon to reverse engineer systems about which I formerly knew nothing, and to play a key role in hardware-firmware-software integration and testing.

This is the kind of stuff I get paid to do, and undertaking projects like Contraption and Conestoga keeps my skills up to date and gives me an opportunity to test my hypothesis about using Android as an embedded framework.