Tuesday, February 03, 2015

The More Things Change, The More They Change

Stick a fork in RadioShack because it's done, reports Bloomberg. 

How a chain founded on selling parts from the then-newfangled electronics industry can up and die in the midst of the biggest DIY high-tech "maker" movement in history is almost beyond me. But besides competition from the Internet and the difficulty of stocking inventory for the "long tail", I have a sneaking suspicion this is yet another example of short term optimization -- "Let's sell phones, it'll be a lot cheaper and easier, and everybody loves phones!" -- over long term optimization -- "Let's sell complex electronic components and kits that have low margins and require knowledgeable sales people."

I used to routinely buy electronic parts at the Shack at our local outlet mall -- "I need some 120 ohm resistors to terminate a CAN bus." -- where I dealt with an old guy (older than me!) who was clearly retired from one of the Denver area's high-tech manufacturers and who was always interested in what I was working on. "Bring it in, I'd like to see it!" he'd tell me. Wish I knew where he moved on to. But I'm part of the problem, not the solution. Me, I'm the guy that bought an oscilloscope off Amazon.com.

On the plus side, at the coffee shop yesterday morning a college-age guy sitting at the same communal table out of the blue asked me if I was familiar with the programming language Haskell. "Is that the one that's purely functional?" I asked, which was all that was necessary for us to nerd bond. In his defense, I was reading an IEEE book written by a capital theorist that was an economic analysis of the software development process. So he might have had a clue that we were of the same tribe.

It is both the end, and the beginning, of an era. Just like always.

Monday, January 19, 2015

Exceeding the Speed of Obsolescence

I drive a 1998 Subaru Legacy GT sedan. I bought it way back in December 1997. It's been a great car, but as much as I love it, I've been looking into replacing it. Along the way, I stumbled into an unexpected connection with my professional life.

Back when I worked for the Scientific Computing Division at the National Center for Atmospheric Research in Boulder, Colorado, I spent some time looking at the rate at which supercomputer horsepower increased over the years. When you get a supercomputer, everything else has to be super too -- networking, storage, I/O -- otherwise you are just wasting your money heating the data center. The same is true for computing at all scales, but with supercomputers (or, today, cloud data centers), you're talking real money. I came up with this logarithmic chart that may seem familiar to long time readers. It shows the rate of growth of a number of computing technologies over time, normalized to unity. For example, if you bought a mass storage device of size n this year, in about eleven years for the same money you will be able to buy a mass storage device of size 10n.
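The arithmetic behind curves like these is easy to sketch. If a technology improves tenfold every eleven years, the implied annual growth factor and doubling time fall out of a couple of logarithms. This is just a back-of-the-envelope illustration of the chart's math, not the chart's actual data:

```python
import math

# If a technology improves by some fold-increase over a span of
# years, what is the implied annual growth factor, and how long
# does it take to double? (Illustrative numbers only.)

def annual_factor(fold: float, years: float) -> float:
    """Annual multiplier implied by a fold-increase over a span of years."""
    return fold ** (1.0 / years)

def doubling_time(fold: float, years: float) -> float:
    """Years to double, given the same growth rate."""
    return years * math.log(2) / math.log(fold)

# Mass storage example from the text: 10x in about eleven years.
print(f"annual factor: {annual_factor(10, 11):.2f}")   # about 1.23x per year
print(f"doubling time: {doubling_time(10, 11):.1f} years")
```

A technology that merely doubles every eleven years instead of growing tenfold looks almost flat next to this one on a logarithmic chart, which is the whole point about disparate growth rates.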

Power Curves

The data on which this chart is based is probably obsolete, but that's not the point. The point is that different technologies grow at different rates. Achieving a balanced architecture is a moving target. As you upgrade one component or subsystem -- perhaps because the newer technology is more cost effective, or offers better performance, or because the manufacturer has discontinued the old one -- the newer component or subsystem is so much more capable that it stresses the rest of the architecture. The memory bus can't keep up with the new CPU cores. The disk farm can't keep up with the higher rate of data generation. You end up wasting a lot of money, unable to take advantage of the full capability of the new hardware.

The folks at NCAR rightfully worried about this. And it's one of the reasons, I believe, that laptops have replaced desktop computers. It used to seem that desktops, like the old component stereo systems, offered the capability of incremental upgrade. But in the long run, it made a lot more sense to replace the entire system, under the assumption that the new system -- a laptop -- would have a balanced set of components chosen by designers and architects that knew a lot more about it than you did.

This insight came back to haunt me years later when I left NCAR to work at a Bell Labs facility near Denver, Colorado. The Labs had a long history of producing large distributed telecommunications systems, either for the public switched telephone network or for large enterprises, as well as lots of other pretty useful things like C, C++, and UNIX.

NCAR was an organization that never saw a high performance computer it didn't like, and seemed to have one of everything. I had become accustomed to equipment becoming obsolete in just a few short years. Sometimes it seemed like the doors into the NCAR data center should have been revolving, with new computers coming and going all the time. I routinely walked out onto the floor of the main computer room at NCAR's Mesa Laboratory to find some new computer system I didn't recognize being installed.

But organizations that bought large telecommunications systems thought differently about obsolescence. They expected to amortize the expense of their new telecom equipment over a much longer period of time, typically a decade. That placed interesting design constraints on the hardware and software that we developed. We all knew stuff would come and go, because that's the nature of high technology. So the entire system had to be built around the assumption that individual components and subsystems were easily replaceable. Making it more complicated was the assumption -- and sometimes the regulatory requirement -- that these systems have five nines reliability: that is, the system was up and available 99.999% of the time. This was the first place I ever worked that built products that had to have the capability of patching the software on a running system, not to mention almost all of the hardware being hot-swappable.
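It's worth working out just how little downtime five nines allows. The sketch below does the arithmetic for three, four, and five nines:

```python
# What "N nines" availability means in allowed downtime per year.
# Five nines (99.999% uptime) leaves only 0.001% of the year for
# outages, planned or otherwise.

MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes(nines: int) -> float:
    """Allowed downtime per year for a given number of nines."""
    availability = 1.0 - 10.0 ** (-nines)
    return MINUTES_PER_YEAR * (1.0 - availability)

for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes(n):8.2f} minutes/year")
```

Five nines works out to a little over five minutes of downtime a year -- which is why patching software on a running system and hot-swapping hardware weren't luxuries but necessities.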

Just as at NCAR, the disparate rates of growth of different high technologies drove a lot of the design and architecting done by the folks at Bell Labs, but in a completely different way.

The other day I read an article that advised folks thinking of buying a new automobile not to purchase the in-dash navigation system. This made a lot of sense to me. Whether I use a navigation application on my iPhone 5, or the Garmin nüvi GPS unit that Mrs. Overclock and I refer to as our "robotic overlord" when we take a road trip, such devices are on a faster track to obsolescence than most other technology in my car.

That's when it struck me that the future of vehicle infotainment systems isn't to put more and more capability into the automobile dashboard. It's to make your automobile a peripheral of your mobile device. Because while I may still drive the Subaru I bought seventeen years ago, Americans replace their mobile phones every couple of years. Although it has been argued that this rate of replacement is environmentally unsustainable, it still means that my new vehicle purchase has to be considered in the context of the high technology growth curves that so affected my work at both NCAR and Bell Labs.

While many automobile manufacturers provide a mechanism to upgrade the software in their vehicle telematic systems, replacing all that obsolete hardware is a big ticket item. It's bad enough that my old Subaru doesn't have traction control, or a continuously variable transmission, or LED headlights; replacing its ancient head unit, the in-dash component that not only controls the FM radio and the CD changer but is so old it actually has a cassette deck, would cost more than a thousand bucks. That's a chunk of change for a car as old as mine.

What I really want is a set of peripherals -- display, microphone, amplifier and speakers, maybe some buttons on the steering wheel -- that can be slaved to my iPhone while it is plugged into a USB port to power it. And I want it all to work with my new iPhone or Android when I replace my mobile device. The less the car implements itself, the better. Investing in a high-zoot in-dash infotainment system just doesn't make sense, no matter what value proposition the auto manufacturers believe it has.

The broader lesson here: beware of coupling technologies in your product that have very different growth rates. If you must, make sure you can replace components incrementally. If that's infeasible, be prepared for a forklift upgrade. Even so, few devices these days operate standalone; what seems like an independent device is probably just a component in a larger ecosystem.

Coincidentally, I generated this particular version of my chart of technology growth curves the same year that I bought my Subaru. Both continue to serve me well.

2015 Subaru WRX Sedan

But I could totally see myself in a new Subaru WRX sedan.

Sunday, December 21, 2014

Is the General-Purpose Processor a Myth?

In a recent article in ACM Queue, the Association for Computing Machinery's magazine for practicing software engineers, David Chisnall argues that "There's No Such Thing as a General Purpose Processor" [12.10, 2014-11-06]. And he has some interesting stuff to say along the way.
It's therefore not enough for a processor to be Turing complete in order to be classified as general purpose; it must be able to run all programs efficiently. The existence of accelerators (including GPUs) indicates that all attempts thus far at building a general-purpose processor have failed. If they had succeeded, then they would be efficient at running the algorithms delegated to accelerators, and there would be no market for accelerators.
His argument is that modern processors implement a specific memory and processing model suited for the execution of C-like languages, which makes them unsuited for the kinds of applications for which we use various specialized hardware like graphical processing units (GPUs) and digital signal processors (DSPs). If modern CPUs were indeed general purpose, they would be able to run GPU- or DSP-style applications efficiently.

I don't agree with him -- I would say being general purpose means that modern processors can run graphical or digital signal processing applications at all, not that they are necessarily optimal for doing so -- but I get his point. Modern processors are as specialized as GPUs and DSPs in the sense that they are designed around a particular application model.
The ability to run an operating system is fundamental to the accepted definition. If you remove the ability to run an operating system from a processor that is considered general purpose, then the result is usually described as a microcontroller. Some devices that are now regarded as microcontrollers were considered general-purpose CPUs before the ability to run a multitasking, protected-mode operating system became a core requirement.
I kinda sorta like this too as a definition of a microcontroller. True, I am among many who have run FreeRTOS on an eight-bit microcontroller, but FreeRTOS isn't a protected-mode operating system. And, remarkably, Dmitry Grinberg actually booted Linux on an eight-bit microcontroller, although it wasn't pretty.
Parallelism in software comes in a variety of forms and granularity. The most important form for most CPUs is ILP (instruction-level parallelism). Superscalar architectures are specifically designed to take advantage of ILP. They translate the architectural instruction encodings into something more akin to a static single assignment form (ironically, the compiler spends a lot of effort translating from such a form into a finite-register encoding) so that they can identify independent instructions and dispatch them in parallel.
Any mention of single assignment gets my attention since it was the basis of my master's thesis a few decades ago. In a single assignment form, a variable can be assigned a value once and only once. Single assignment languages (like the one I implemented) sound impossible to write software in, but in fact it is simply a different coding style. For example, iteration can be done by recursion, where different values of a variable in each iteration are in fact held in different locations in the stack frame. I was surprised, years later, to discover that compiler writers depended upon translating conventional programming languages into a single assignment form for purposes of parallelization (which was exactly why I was using it in the early 1980s).
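A tiny sketch of what that coding style looks like (my illustration, not the language from my thesis): the loop below reassigns its accumulator on every pass, while the recursive version assigns each variable exactly once per call, the per-iteration values living in separate stack frames, just as described above.

```python
# Conventional iteration: total and i are reassigned on every pass.
def sum_loop(n: int) -> int:
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

# Single-assignment style: no variable is ever reassigned. Each
# "iteration" is a fresh recursive call, so each iteration's value
# of n and acc lives in its own stack frame.
def sum_sa(n: int, acc: int = 0) -> int:
    if n == 0:
        return acc
    return sum_sa(n - 1, acc + n)

print(sum_loop(10), sum_sa(10))  # both 55
```

Because each binding is immutable, a compiler (or a parallelizing runtime) can see exactly which values depend on which others, which is why single assignment form is so useful for identifying independent work.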
It's worth noting that, in spite of occupying four times the die area and consuming four times the power, clock-for-clock the ARM Cortex A15 (three-issue, superscalar, out-of-order) achieves only 75-100 percent more performance than the (two-issue, in-order) A7, in spite of being able (theoretically) to exploit a lot more ILP.
Chisnall is applying Amdahl's Law here: no matter how and at what level in the hardware architecture parallelism is implemented, an application has to be specifically designed to take advantage of it, and only a portion of it is likely to be able to do so. My former supercomputer colleagues would recognize his argument immediately, and would understand that algorithms designed to run efficiently on GPUs and DSPs are just as specialized as those that run well on supercomputers, in the latter case often by virtue of being embarrassingly parallel.
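Amdahl's Law makes the point quantitatively: if only a fraction p of a program benefits from n parallel units, the overall speedup is 1 / ((1 - p) + p/n), which can never exceed 1/(1 - p) no matter how large n gets. A quick sketch:

```python
# Amdahl's Law: overall speedup when a fraction p of the work is
# parallelized across n processing units and the rest stays serial.
def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# A program that is 90% parallel tops out below 10x no matter how
# much hardware you throw at it.
for n in (2, 16, 1024):
    print(f"n={n:5d}: speedup {amdahl_speedup(0.9, n):.2f}x")
```

This is why a wider superscalar core, like the A15 in Chisnall's example, can burn four times the die area and power for less than double the performance: the serial fraction of the instruction stream doesn't care how many issue slots you have.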

Chisnall's article is worth a read. He has made me ponder how we may be missing out on radical improvements in efficiency because of the blinders we have on as we continue to design around special purpose processors like Pentium and ARM.