Monday, January 31, 2011

Dynamically Linked Executables on the Beagle Board with Android

I've had success running dynamically linked executables built with the standard Code Sourcery ARM tool chain (as opposed to Android's modified version) on the Beagle Board under Android. I've added the details as an update to my article Cohabiting with Android on the Beagle Board. My article Memory Mapped Devices on the Beagle Board with Android describes building and running device drivers on that platform using the same standard Code Sourcery ARM tool chain.

Monday, January 24, 2011

Memory Mapped Devices on the Beagle Board with Android

If you're an embedded software developer, you already know that memory is not always as it appears. Sometimes it is really an interface into a hardware device merely pretending to be memory but which may exhibit behavior that is very non-memory-like. (And thanks to memory models crippled by compiler optimizations and high-performance memory subsystems, this is unfortunately sometimes true of real memory as well.)

This is not that different conceptually from the Linux /proc and /sys directories, which contain objects that pretend to be files in a file system, but are in fact interfaces into various subsystems and device drivers in kernel-space. While some hardware architectures have specialized machine instructions to interface with I/O devices, most of the ones I use (going all the way back to the PDP-11 in the 1970s) expect I/O devices to expose control and data registers that can be read and written by software in a memory-like way. The mapping of hardware interfaces into the physical memory space is called memory mapped I/O. This is typically how the Linux kernel and device drivers interface with I/O devices. It might surprise you to learn that Linux provides system calls such that user-space applications can do this too.

I recently got around to porting Diminuto, my C-based systems programming library and toolkit, to Contraption, my project to experiment with the ARM-based Beagle Board running the Android software stack. This included the mmdriver device driver and the utilities memtool (no relation to any other memtool) and mmdrivertool.

memtool is a user-space application that uses the Linux system calls mmap and munmap to map an arbitrary region of physical memory space specified in a command line argument into virtual memory space. It then allows you to manipulate that physical memory space with operations like read, write, set, and clear. If that physical memory space happens to be an I/O device, you find yourself controlling actual hardware. Since such activities are fraught with peril (you can crash your system, as I have done many times, or even damage it) you have to be root to run memtool.
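The mmap approach can be sketched roughly as follows. This is not memtool's actual source: it assumes the physical address space is exposed through /dev/mem, and the names page_split and phys_read32 are made up for illustration. Because mmap works in whole pages, the sketch first splits the physical address into a page-aligned base and an offset within that page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: split a physical address into a page-aligned
 * base and the offset of the address within that page. */
void page_split(uint64_t phys, uint64_t pagesize, uint64_t *base, uint64_t *offset)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0); /* power of two */
    *base = phys & ~(pagesize - 1);
    *offset = phys - *base;
}

/* Hypothetical helper: read one aligned 32-bit word at a physical address.
 * Assumes /dev/mem exists and the caller is root.
 * Returns 0 on success, -1 on failure. */
int phys_read32(uint64_t phys, uint32_t *value)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    uint64_t base, offset;
    int fd;
    void *map;

    if (pagesize <= 0 || (phys & 3) != 0) return -1;
    page_split(phys, (uint64_t)pagesize, &base, &offset);

    fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0) return -1;
    map = mmap(NULL, (size_t)pagesize, PROT_READ, MAP_SHARED, fd, (off_t)base);
    close(fd); /* the mapping survives closing the descriptor */
    if (map == MAP_FAILED) return -1;

    /* volatile forces a real load from the mapped location */
    *value = *(volatile uint32_t *)((uint8_t *)map + offset);
    munmap(map, (size_t)pagesize);
    return 0;
}
```

With a 4096-byte page, an address such as 0x48310038 splits into base 0x48310000 and offset 0x38. The actual read only succeeds as root on a system that permits mapping that range.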

Here is the help menu that memtool displays. (This was scraped off the console of my Beagle Board running the FroYo release of Android Rowboat. Apologies in advance for any violence done to it by the Blogger editor that I miss.)

bash-3.2# memtool -?
usage: memtool [ -d ] [ -o ] [ -a ADDDRESS ] [ -l BYTES ] [ -[1|2|4|8] ADDRESS ]
[ -r | -[s|S|c|C|w] NUMBER ] [ -u USECONDS ] [ -t | -f ] [ ... ]
-1 ADDRESS Use byte at ADDRESS
-2 ADDRESS Use halfword at ADDRESS
-4 ADDRESS Use word at ADDRESS
-8 ADDRESS Use doubleword at ADDRESS
-a ADDRESS Optionally map region at ADDRESS
-d Enable debug mode
-f Proceed if the last result was 0
-l BYTES Optionally map BYTES in length
-o Enable core dumps
-t Proceed if the last result was !0
-u USECONDS Sleep for USECONDS microseconds
-? Print menu

The Beagle Board has a couple of tiny intermittent-contact buttons on it. One of them is the RESET button. Play with the Beagle Board enough like I do and you'll learn where the RESET button is. Right next to it is a button labelled USER which is connected to pad AE21 on the Beagle Board's Texas Instruments OMAP3530DCBB72 processor. You can read the state of this button by inspecting bit seven of the GPIO_DATAIN register for the GPIO1 module. GPIO1 is one of the many I/O controllers built into the OMAP System On a Chip (SoC).

(In this photograph you can see the USER button, and just above it, the RESET button, down at the lower left.)


It will probably not surprise you to know that the GPIO_DATAIN for GPIO1 is mapped into physical memory space, at location 0x48310038 on this OMAP processor, and that I can use memtool to read it. Bit seven would be represented by the mask (1<<7) or 0x80. When the USER button is pressed, this bit reads as a one. (If you're not used to this type of thing, it's kind of magical.)
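In C, testing that bit is a one-line mask operation. Here is a minimal sketch; the macro and function names are invented for this example, and the value passed in is assumed to be whatever was read from GPIO_DATAIN.

```c
#include <assert.h>
#include <stdint.h>

#define USER_BUTTON_BIT  7u
#define USER_BUTTON_MASK (UINT32_C(1) << USER_BUTTON_BIT)

/* Bit seven as a mask is 0x80, as noted in the text. */
static_assert(USER_BUTTON_MASK == 0x80, "bit seven should be mask 0x80");

/* Returns 1 if the USER button bit is set in a GPIO_DATAIN value, else 0. */
int user_button_pressed(uint32_t datain)
{
    return (datain & USER_BUTTON_MASK) != 0;
}
```

All the other bits in the register are ignored, so the function gives the same answer no matter what the neighboring GPIO pins are doing.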

Here I am using memtool to read location 0x48310038 while pressing and releasing the USER button. memtool prints the value of the memory location to standard output in decimal so it can be more easily used in shell scripts. The little Diminuto utility hex prints it in hexadecimal to make our lives easier.

bash-3.2# memtool -4 0x48310038 -r | hex
bash-3.2# memtool -4 0x48310038 -r | hex
bash-3.2# memtool -4 0x48310038 -r | hex

Field Programmable Gate Arrays (FPGAs) are programmable digital logic devices. They have a broad range of capabilities and applications, depending on the brand and model, from glue logic to connect discrete hardware devices together to implementing entire microprocessors and I/O cores. FPGAs have been an intrinsic part of every embedded system I've worked on in the past fifteen years.

A common approach to interfacing an FPGA with a processor is, you guessed it, memory mapped I/O. The FPGA exposes a set of registers, anywhere from a few to dozens, into the physical memory space of the processor. You can use memtool to communicate with an FPGA in such an architecture. I have routinely done this during board bring up, testing, and troubleshooting.

Another approach is to use the Diminuto generic memory mapped device driver mmdriver and its associated utility mmdrivertool. The mmdriver device driver is a loadable module that lives and works in kernel-space. It uses the kernel functions ioremap and iounmap to map a physical memory space into virtual memory space just once when it is installed. It then takes commands via the ioctl Linux system call to read, write, set, and clear portions of that physical memory space. The mmdrivertool is a user-space utility that allows you to send commands to mmdriver via command line arguments in a fashion similar to how I used memtool.
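The four operations (read, write, set, clear) are simple to model on a single 32-bit register. The sketch below is a user-space illustration only: the enum values and the function reg_apply are hypothetical and are not mmdriver's ioctl interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation codes for this sketch; not mmdriver's ioctl numbers. */
enum reg_op { REG_READ, REG_WRITE, REG_SET, REG_CLEAR };

/* Apply one operation to a 32-bit register and return the value read or
 * written. The pointer is volatile so the compiler performs every access
 * exactly as written instead of caching or eliding it. */
uint32_t reg_apply(volatile uint32_t *reg, enum reg_op op, uint32_t mask)
{
    uint32_t value;
    assert(reg != NULL);
    switch (op) {
    case REG_WRITE: value = mask;         *reg = value; break;
    case REG_SET:   value = *reg | mask;  *reg = value; break; /* read-modify-write */
    case REG_CLEAR: value = *reg & ~mask; *reg = value; break; /* read-modify-write */
    case REG_READ:
    default:        value = *reg;         break;
    }
    return value;
}
```

Note that set and clear each read the register before writing it, which matters for hardware where reads have side effects or are not implemented at all.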

Here is the help menu that mmdrivertool displays.

bash-3.2# mmdrivertool -?
usage: mmdrivertool [ -d ] [ -U DEVICE ] [ -[1|2|4|8] OFFSET ] [ -r | -[s|S|c|C|w] NUMBER ] [ -u USECONDS ] [ -t | -f ] [ ... ]
-1 OFFSET Use byte at OFFSET
-2 OFFSET Use halfword at OFFSET
-4 OFFSET Use word at OFFSET
-8 OFFSET Use doubleword at OFFSET
-C NUMBER Clear 1<<NUMBER mask at OFFSET
-U DEVICE Use DEVICE instead of /dev/mmdriver
-d Enable debug mode
-f Proceed if the last result was 0
-r Read OFFSET
-t Proceed if the last result was !0
-u USECONDS Sleep for USECONDS microseconds
-? Print menu

Because mmdriver maps a fixed region of physical memory space when it is installed, areas within that space are specified on the mmdrivertool command line as offsets from the beginning of that space, instead of as physical memory addresses as with memtool. Otherwise, the two commands look very similar.
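The relationship between the two addressing schemes is plain subtraction plus a bounds check. This sketch assumes the mapped region is the half-open range from begin up to but not including end, which is consistent with the install example but is an assumption; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Translate a physical address into an offset within [begin, end).
 * Returns 0 and stores the offset on success, or -1 if an access of
 * `width` bytes at `phys` would fall outside the mapped region. */
int phys_to_offset(uint32_t begin, uint32_t end, uint32_t phys,
                   uint32_t width, uint32_t *offset)
{
    assert(begin <= end);
    if (phys < begin || phys > end || width > end - phys) return -1;
    *offset = phys - begin;
    return 0;
}
```

For a region beginning at 0x48310038 and ending at 0x4831003c, the word at physical address 0x48310038 is at offset 0, which is why the mmdrivertool equivalent of the earlier memtool command uses -4 0.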

Here I am installing the mmdriver device driver module and specifying what region of physical memory space it is to map. In this example, I just map the GPIO_DATAIN register of the GPIO1 module. But you could just as easily map an entire FPGA register set into virtual memory. Because mmdriver is built on top of two other Diminuto kernel modules, I install those first.

bash-3.2# lsmod
omaplfb 8882 0 - Live 0xbf032000
pvrsrvkm 137146 31 omaplfb, Live 0xbf000000
bash-3.2# insmod diminuto_kernel_map.ko
bash-3.2# insmod diminuto_kernel_datum.ko
bash-3.2# insmod diminuto_mmdriver.ko "begin=0x48310038 end=0x4831003c"
bash-3.2# lsmod
diminuto_mmdriver 3633 0 - Live 0xbf03d000
diminuto_kernel_datum 883 0 - Live 0xbf037000
diminuto_kernel_map 1057 1 diminuto_mmdriver, Live 0xbf02d000
omaplfb 8882 0 - Live 0xbf032000
pvrsrvkm 137146 31 omaplfb, Live 0xbf000000

By default, mmdriver registers itself as a miscellaneous device and allows the Linux miscellaneous device driver misc to dynamically assign its minor device number. I can use the /proc file system to see what the major (10) and minor (51) device numbers for mmdriver are. (You can also specify your own major device number for mmdriver at install time.)

bash-3.2# cat /proc/devices
Character devices:
1 mem
4 /dev/vc/0
4 tty
4 ttyS
5 /dev/tty
5 /dev/console
5 /dev/ptmx
7 vcs
10 misc
13 input
29 fb
81 video4linux
89 i2c
90 mtd
116 alsa
128 ptm
136 pts
153 spi
180 usb
189 usb_device
250 pvrsrvkm
251 omap-resizer
252 omap-previewer
253 usbmon
254 rtc

bash-3.2# cat /proc/misc
51 mmdriver
52 network_throughput
53 network_latency
54 cpu_dma_latency
55 log_system
56 log_radio
57 log_events
58 log_main
59 binder
130 watchdog
60 alarm
223 uinput
1 psaux
61 android_adb_enable
62 android_adb
63 ashmem

Now I create a character device node in the file system for mmdriver using its major and minor device number. Because Android doesn't provide a mknod command, I'll cheat and use busybox, which I described in a prior article.

bash-3.2# busybox mknod /dev/mmdriver c 10 51
bash-3.2# ls -l /dev/mmdriver
crw-rw-rw- root root 10, 51 2011-01-24 22:17 mmdriver
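
If you would rather not depend on busybox, the same two steps can be done from a small C program. This is a sketch, not part of Diminuto: misc_minor and make_misc_node are hypothetical names, the text passed to misc_minor is assumed to have the "minor name" layout shown by /proc/misc above, and major number 10 is taken from the /proc/devices listing.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Find `name` in text laid out like /proc/misc ("<minor> <name>" per line).
 * Returns the minor number, or -1 if the name is not listed. */
int misc_minor(const char *text, const char *name)
{
    assert(text != NULL && name != NULL);
    while (*text != '\0') {
        int minor;
        char entry[64];
        if (sscanf(text, "%d %63s", &minor, entry) == 2 && strcmp(entry, name) == 0)
            return minor;
        text = strchr(text, '\n'); /* advance to the next line */
        if (text == NULL) break;
        text++;
    }
    return -1;
}

/* Create a character device node with major 10 (misc) and the given minor.
 * Needs root; returns whatever mknod(2) returns. */
int make_misc_node(const char *path, int minor)
{
    return mknod(path, S_IFCHR | 0666, makedev(10, (unsigned int)minor));
}
```

A boot-time helper could read /proc/misc into a buffer, call misc_minor(buffer, "mmdriver"), and pass the result to make_misc_node("/dev/mmdriver", minor). The permissions actually applied are still subject to the process umask.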

This may seem like a lot of work, but the point is you only have to install the modules and make the device node once at boot time, and it can be automated in a script.

Now I can use mmdrivertool to do exactly the same thing as I did with memtool while pressing and releasing the USER button on the Beagle Board.

bash-3.2# mmdrivertool -4 0 -r | hex
bash-3.2# mmdrivertool -4 0 -r | hex
bash-3.2# mmdrivertool -4 0 -r | hex

All of the source code for what I have described here is available in the latest Diminuto release. It is all licensed under the GPL or the LGPL.

Not to make things more complicated than they need to be, but I've used the term memory mapping in two different senses in this article. I/O devices map their control and data registers into a physical memory address space such that they appear to be at a physical memory location. But in most Linux systems, real physical random-access memory is itself mapped. Both I/O registers and physical memory are mapped into a virtual memory space using a hardware memory management unit (MMU) that is part of the OMAP processor. Software, whether it be running in kernel-space or user-space, accesses I/O registers and real memory using virtual memory addresses, not the actual physical memory addresses.

The value 0x48310038 cited in the examples above is a physical address. The register it points to is mapped by both memtool (using user-space system calls) and mmdriver (using kernel-space functions) into a virtual address space, and it is the virtual address in this space that these tools actually use, merely accessing it as a pointer in C. The tools themselves handle this and hide the details from the user, but you can see it in the source code.

Remember earlier when I said memory mapped devices may exhibit non-memory-like behavior? This is a lot less common than it used to be, but three-thousand-page processor reference manuals still bear close scrutiny lest wackiness ensue.

Inter-Integrated Circuit (I2C, pronounced I-squared-C) is a two-wire serial bus standard commonly used to interrogate simple off-chip devices like temperature or power sensors. An I2C implementation in a sensor is not much more complicated than a simple state machine and a shift register. Here is a sentence from Freescale Semiconductor's MPC8349E PowerQUICC II Pro Integrated Host Processor Family Reference Manual on a memory mapped hardware data register for its I2C controller.

In master receive mode, reading the data register allows the read to occur, but also allows the I2C module to receive the next byte of data on the I2C interface. [Rev. 1, page 17-9]

What this is saying is that reading this memory mapped register on the PowerPC not only returns the most recent value read from the I2C bus, but clocks in the next data byte from the bus. By reading a location that looks like memory, and which is accessed in C just as if you were reading a variable, not only is the value of that location changed by your read, but it changes the state of a slave device that is completely outside of the PowerPC chip. If you don't have a lot of experience in embedded development, this could be a really surprising behavior.
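
To make the side effect concrete, here is a toy software model of such a register. It is emphatically not the MPC8349E's I2C controller, just a stand-in with the same kind of behavior; the structure and function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A toy model of a receive data register whose reads have side effects. */
struct toy_i2c {
    const uint8_t *wire; /* bytes the slave device will send, in order */
    size_t remaining;    /* bytes not yet clocked in from the "bus" */
    uint8_t data;        /* what the data register currently holds */
};

/* Reading the register returns the current byte and, as a side effect,
 * clocks the next byte in from the bus if there is one. */
uint8_t toy_i2c_read_data(struct toy_i2c *dev)
{
    uint8_t value;
    assert(dev != NULL);
    value = dev->data;
    if (dev->remaining > 0) {
        dev->data = *dev->wire++;
        dev->remaining--;
    }
    return value;
}
```

Two back-to-back reads return two different bytes, so an extra read is never harmless here. With real hardware the register would be accessed through a volatile pointer, and an accidental second read (from a debugger, say, or a careless read-modify-write) would silently consume a byte.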

I have written much already in this blog about memory models on modern microprocessor architectures (for example, here and here). I'll write more in the future about how to tell when memory, or something that looks like memory, is actually being accessed in your C or C++ program. When you're dealing with memory mapped hardware devices, it pays to know these things.

Update (2011-01-27)

Just days after writing this article I was trying to set some bits in an FPGA register and my software was getting a bus error, which on this platform meant I was trying to access a physical address that didn't exist. It turns out the firmware developer had made that memory mapped FPGA register write-only by virtue of not implementing a read interface. (And they had documented it. I was just too lazy to look closely at the comments in the header file while writing my software. My bad.) That means I could not set a bit by doing

fpga->field |= (1<<21);

but instead had to

fpga->field = (1<<21);

The difference is subtle enough you may have to be an embedded wonk to know the difference: the former does a read-modify-write memory operation, while the latter just does a write.
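
One common way to cope with write-only registers when more than one bit matters is to keep a shadow copy in software. This is a general technique and not what I did above; the names are hypothetical, and it assumes the software is the only writer of the register and knows its initial value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A write-only register paired with a software copy of the last value written. */
struct wo_reg {
    volatile uint32_t *reg; /* the hardware register; never read */
    uint32_t shadow;        /* what was last written to it */
};

/* Set bits without reading the hardware: update the shadow, then write it out. */
void wo_reg_set(struct wo_reg *r, uint32_t mask)
{
    assert(r != NULL && r->reg != NULL);
    r->shadow |= mask;
    *r->reg = r->shadow;
}

/* Clear bits the same way. */
void wo_reg_clear(struct wo_reg *r, uint32_t mask)
{
    assert(r != NULL && r->reg != NULL);
    r->shadow &= ~mask;
    *r->reg = r->shadow;
}
```

Every access to the hardware is a plain write, so there is no read cycle to trigger a bus error, yet bits set earlier are preserved.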

Business as usual in the field of embedded development.

Wednesday, January 19, 2011

Visualization: World Population, Wealth, Longevity

In this five minute BBC video, physician and statistician Hans Rosling uses a clever graphical visualization to show how the world has evolved in terms of population, wealth, and lifespan in the past two centuries. It's a great example of how to condense a huge amount of information into something that can be grasped immediately.

This is part of an hour long BBC show: The Joy of Stats. A big thank you to my homie Brian for passing this along.

Saturday, January 15, 2011

Bob Dixon on the Founding of Wright State University

If you're a doddering old alumnus of Wright State University near Dayton, Ohio, like I am, you may remember Bob Dixon, who among many other things was instrumental in the development of the mathematics, computer science, and computer engineering departments at WSU. He was also my mentor and thesis advisor, and along with Joe Kohler developed the infamous Real-Time Software Design course. Also known as CS/CEG 431, that one course has informed my professional career for the past thirty-five years. The Wright State University Library and the WSU Retirees Association are sponsoring an oral history project. Here's a link to an interview with Bob Dixon by Lewis Shupe about the start of Wright State when it was nothing but a corn field and a farm house.

Sunday, January 09, 2011

What Web Sites Know About You

Many who are more knowledgeable about web services than I am have already written about how much information web sites can glean about you when you simply click on a link. They get information about your operating system, your browser, your display, and your IP address. Because many web services are designed and implemented using an architecture known as Representational State Transfer (REST), in which the software on the server is stateless, much information may be encoded in the URL that forms the link on which you click. This URL is passed to the web service, but it may also be visible to other software along the way.

What kind of software? I subscribe to a service called Site Meter that, for a few bucks a month, collects statistics about the people who read my blog. How is this done? I just embed some magic HTML code in the template that is used for every blog article that I write. This HTML code is executed every time you read one of my articles. Site Meter collects and summarizes this data and provides me with a way to access it. Other sites, including Google Analytics (which I also use), do similar things. Companies use this information to fine tune their web sites (Search Engine Optimization or SEO), improve their marketing and advertising, and ultimately to improve their bottom line.

I'm not nearly that ambitious. But I have used results from Site Meter to improve my blog. For example, I wrote an article on Chip's Instant Managed Beans, a way to easily instrument Java code so you could interrogate and communicate with it using the GUI-based jconsole tool that is part of the Java JDK. It's not unlike using the Simple Network Management Protocol (SNMP) with a good Management Information Base (MIB) browser, except it's a whole lot simpler. I changed the original title of that article because I had unwittingly used the brand name of a snack food company in India. (No, I'm not kidding.) Most of the traffic reaching that article through search engines came from folks in India looking for a way to order junk food over the web. I couldn't see any reason for my article on a fairly obscure technology like Java managed beans to be the number one search result for Indian snack chips.

The other day while reviewing my Site Meter data, I stumbled across the following result about a visit someone made to my blog. Here is a screen shot right from my Site Meter control panel. The amount of data gleaned from this search is really interesting.


What can we tell about this visitor?
  • They used Google to search for the strings "john l sloan" and "asshole" appearing on the same web page.
  • They found a blog article from July 2007. That's because "John L. Sloan" appears in the copyright of all of my blog articles, and that month I wrote an article in which I mentioned the book The No Asshole Rule by Stanford engineering professor Robert Sutton. (Sutton is one of my favorite business authors and bloggers.) You can trivially duplicate this search yourself right now. Go ahead. I'll wait. The quotes are important, otherwise you'll get a lot of unrelated results. In fact, when I do this search, my July 2007 blog page is the only result I get.
  • Just from their IP address we can see they use Road Runner, an internet service provider on Time Warner Cable.
  • Geolocation based on their IP address suggests they are located in my old stomping grounds, Fairborn Ohio USA.
  • Their preferred language, for web browsing anyway, is English.
  • They use Microsoft Windows NT, or more likely some Windows variant that self-identifies as Windows NT.
  • They were using Internet Explorer 8.0, along with an alphabet soup of options and add-ons (this alphabet soup will be important later).
  • They were using JavaScript 1.3.
  • They have a display with a resolution of 1344 by 840 pixels and 32-bit color.
  • They performed the search at around 8PM local time on January 6th.
  • The zero second visit length is an artifact of the fact that they didn't leave my blog by clicking on a link in my blog, which is the only way the Site Meter software can tell when they leave. Had they clicked on a link, I would not only know how long they remained on my blog, but where they went when they left.

The fact that I, someone you may not even know, or at least only know from this blog, have access to all of this information about you every time you visit my blog might be enough to worry you. But the fact is, all of this information is recorded about you when you visit any web site. That's right: any web site can, and probably does, collect this information about you. This does not entail using tracking cookies, viruses, spyware, botnets, or any other possibly malignant technology. It's built into the basic software architecture that is the web.

Now you might think that all this information isn't enough to actually identify you as a particular user. That's where the alphabet soup of software versions reported by your browser comes in. It turns out that, as research studies by Peter Eckersley of the Electronic Frontier Foundation and others have shown, the collection of software installed in, around, and with your browser is almost as unique as your fingerprint. In fact, this technique is known as browser fingerprinting. Using this information, it is very possible that sufficiently clever and motivated individuals can identify your specific machine, using it to correlate your movements across all the web sites you touch. Or even, warrant in hand, identify your specific desktop or laptop as having been the one to visit a particular web site.

Welcome to the Internet.

Saturday, January 08, 2011

Cohabiting with Android on the Beagle Board

I recently migrated Contraption, my Android platform, from version 1 of the TI Android development kit for the Beagle Board, which was based on the Eclair Android release, to version 2 of the kit, based on the Froyo release. Both development kits run on top of the 2.6.32 Linux kernel. As before, I configured and built a custom version of the Android Linux kernel to support the additional hardware of the Zippy2 expansion board. The Zippy2 gives you a second SD card slot and an RJ45 Ethernet port.

I used the pre-built root file system, although I've also had good luck building the Android port to the Beagle Board from the Rowboat distribution sources. Note that the Rowboat distribution requires the older Java 1.5 JDK, so you might not want to delete it quite yet from your system.

I also got busybox, bash, dropbear, and strace running with Android on the Beagle Board, which simplified a lot of things. busybox is a very commonly used (one might say: indispensable) utility for embedded Linux-based systems that includes a boat load of commonly used commands and utilities all in one executable binary. It's your one stop shopping source for embedded Linux goodness. Although busybox implements a quite usable shell itself, porting bash gives me a more complete scripting capability. dropbear is a secure shell server that provides SSH and related services. strace (different from the Android strace) is a Linux system call tracing utility that makes it much easier to figure out why things aren't working.

Below you can see my set up. On the big display on the left, connected to my desktop system, you can see a small white window to the left where I have SSHed into the Beagle Board, and a small black window on the right where I am connected to the Beagle Board through its serial port and am running bash. On the smaller display on the right, connected to the Beagle Board, I am web cruising using the Android web browser. The tiny red Beagle Board sits barely visible in between.

Configuring and building tools based on Linux and GNU in a cross-compilation environment can be a little challenging. I was able to leverage a lot of prior work porting tools to the Atmel evaluation board for my Diminuto and Arroyo projects. Because Android and its C/C++ tool chain don't include the normal GNU libraries, I had to use the excellent CodeSourcery GCC tool chain and statically link the resulting executable binaries. (In the near future, I'll look at dropping the GNU libraries on the Beagle Board and going back to dynamic linking.) Because Android lacks all the usual Linux and GNU password and user identification infrastructure, dropbear required patches developed by Jakob Blomer at CERN. (A big Thank You to Jakob.)

I probably could have used the native Android utility adb to push binaries to the Beagle Board, but that would have been too easy. Instead, I built busybox, then dropped it into the Android root file system that I built on an SD card. Once I had Android running, I used the tftp function in busybox to copy over the other binaries. Once that was done, it was easy to get a dropbear SSH server running, and from there I had ssh and scp.

Was all this really necessary? Probably not. But I earn a good living in product development as a systems programmer, and that means I am interested in how systems work under the hood.

Most of this effort is captured in a single Makefile that can be found in the Contraption release. If you want to duplicate it, you'll also need to download a ton of other software distributions off the web, the URLs for which are in the Makefile comments.

The Froyo release on the Beagle Board seems to take a little longer to come up than Eclair. USB devices sometimes come and go. And once in a while the Java stack seems to get stuck coming up. But if it were easy, anyone could do it, and it wouldn't be any fun. Once I get over these minor hurdles, I'll be back into the Android software development kit and trying to remember how to write in Java.

Update (2011-01-24)

I've successfully built Android from the Rowboat sources, instead of using the pre-built root file system from the developer's kit. It installed and came up. There are a few elements of weirdness, but not nearly as many as I expected. I'm impressed.

Update (2011-01-31)

I've had some luck running dynamically linked executable binaries built with the standard Code Sourcery tool chain (as opposed to Android's modified version of it) under Android on the Beagle Board. It was fairly straightforward. I haven't tried it yet with the utilities I described here, but I have done so with my Diminuto library and toolkit, the porting of which to Android I have described in another article.

I had to copy all the requisite shared objects to the Beagle Board. For me, that meant /lib/,, and from the arm-none-linux-gnueabi/libc/lib directory of the Code Sourcery tool chain, plus my own For all but the first one I was able to put the shared objects in a new directory and point the LD_LIBRARY_PATH environment variable to it from my bash session. For the first one, which had an absolute path name embedded in the resulting binary executables, I had to create a /lib directory in the Android file system and place it there. (I'm trying not to unduly contaminate the Android file system, keeping all my own stuff under my /Contraption directory; sometimes that's not possible.) I also had to make all the shared objects executable, which is required by the dynamic loader.