Sunday, April 27, 2008

Globalization and Amazon.com

I had a The World Is Flat moment the other day. Some random synapse in my brain decided that I should read the old Alistair MacLean cold-war thriller Ice Station Zebra. I've always been a fan of the movie (me and Howard Hughes) and the soundtrack by Michel Legrand, but had never read anything by MacLean. So I hied me over to Amazon.com to order the book.

I was a little surprised to find it was out of print. But thanks to Amazon's Marketplace service, several used and even new copies were available from any number of sources. I'm no stranger to this service, in which independent booksellers use Amazon as a storefront. In Outsourcing Technical Services for Small Businesses, I wrote about how this service was a great way to get barely used copies of expensive or out of print technical books (the only thing that depreciates as quickly as high tech is fresh fruit). Nearly instant gratification was only a few clicks away.

It was only after I got the verification e-mail that I realized I had ordered the book from a bookstore in the U.K. Well, okay, I'm no stranger to that either. It seems that many of my favorite authors these days are Scottish: Iain Banks, Ian McDonald, Ken MacLeod, Charles Stross. I discovered that I can order one of their books a year in advance of it appearing in the U.S. by getting it through Amazon.co.uk, which is no more click-distant for me than its U.S. counterpart. (Note to U.S. publishers: the cover art on the U.K. editions is far superior too.)

It wasn't until the MacLean book arrived at the Palatial Overclock Estate (a.k.a. the Heavily Armed Overclock Compound) in an envelope covered in Deutsche Post stickers that I discovered that it came from a warehouse in Germany.

Man, you just gotta love the web. It isn't just that these things are possible. It's that they're so easy that you don't even realize that they're happening.

But it isn't just the web. It's the huge number of independent vendors who use the infrastructures provided by Amazon and eBay to reach out and e-touch someone. It's the amazing international shipping services that somehow transport tangible goods almost as fast as the Internet transports intangible bits. And, in the case of my book, they can do it all for under ten bucks.

I'm amazed. But apparently, some bean counters are disturbed.

The April 28th issue of BusinessWeek had an article on "The World's 50 Most Innovative Companies". Jeff Bezos was on the cover; Amazon came in at #11. Web services developers are already well aware of the infrastructure Amazon can provide, quite cost-effectively, to e-commerce web sites.

A week later, in the feedback section of the May 5th issue, a reader identified by the screen name "Beejat" writes:

If e-retailers like Amazon are offering computing on tap, doesn't it indicate that they have overinvested in technology? ... I am convinced Amazon has too much firepower - more than it needs to run its business. Its emphasis on its corporate computing market looks like a positive spin on a rather poor capital investment strategy.

Beejat, I know where you're coming from. You're coming from a bottom line, quarterly results, improving shareholder value, reducing costs point of view. The point of view that makes investors money in the short term, while eliminating any vision of the long term.

But Beejat, here's where you're confused. Selling books isn't Amazon's business. Creating the future of e-commerce is their business. Selling me stuff is just the way they unit test it.

Sunday, April 20, 2008

The Future Has a Way of Happening

In his blog Coding Horror, Jeff Atwood answers the question: Should All Developers Have Manycore CPUs? Probably to no one's surprise, the answer is "yeah, duh". Jeff concentrates mostly on the performance and productivity issues. For me, it's all about economics too, but not necessarily the economics Jeff is talking about.

Let's suppose a company has a product that generates, directly or indirectly (say, through service contracts), the bulk of its revenue. I'm talking hundreds of millions of dollars a year. It's the kind of revenue stream most of us poor sods working for tiny start-ups just dream about. This single product is without question the crown jewel of their product line. The installed user base is huge, global, and growing. It is considered by its customers to be a critical and strategic piece of their enterprise infrastructure, and has been for decades.

But, see, here's the thing: this product evolved from code written as far back as the late 1970s. By some estimates, the code base is eight million lines of code, mostly C. Much of it was designed and implemented long before multi-processor servers were cost effective, existing only here or there in various corporate and government labs. Although the application is multi-threaded, most of it was written before there was any way to run it on a platform where the race conditions that might occur in a truly concurrent environment could be exercised.
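To make that concrete, here is a minimal sketch in C (the language the post says dominates that code base) of the kind of latent data race I mean, assuming POSIX threads. The names are hypothetical; they illustrate the class of bug, not any actual code from that product. On a single-core machine the two threads are merely time-sliced and the lost updates may go unnoticed for years; on a multi-core machine the cores interleave their read-modify-write cycles and the final count comes up short on almost every run.

/* race.c: a hypothetical sketch of a latent data race using POSIX threads.
   Build (assuming gcc or similar): cc -o race race.c -lpthread */
#include <pthread.h>
#include <stdio.h>

#define ITERATIONS 1000000

/* Shared by both threads, deliberately unprotected by any lock.
   volatile only keeps the compiler from caching it in a register;
   it does NOT make the increment atomic. */
static volatile long counter = 0;

static void * worker(void * arg)
{
    int ii;
    for (ii = 0; ii < ITERATIONS; ++ii) {
        counter = counter + 1; /* read, modify, write: three steps, not one */
    }
    return (void *)0;
}

int main(void)
{
    pthread_t one;
    pthread_t two;

    pthread_create(&one, (pthread_attr_t *)0, worker, (void *)0);
    pthread_create(&two, (pthread_attr_t *)0, worker, (void *)0);
    pthread_join(one, (void **)0);
    pthread_join(two, (void **)0);

    /* Expect 2000000. On a truly concurrent platform the total is
       almost always less, because increments are silently lost. */
    printf("counter=%ld expected=%d\n", counter, 2 * ITERATIONS);
    return 0;
}

That volatile qualifier is exactly the sort of misunderstanding that lurks in old multi-threaded code: it keeps the compiler honest about memory accesses, but it does nothing to make the read-modify-write cycle atomic.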

It wasn't (necessarily) that the developers -- and there would be hundreds of them over the years -- knowingly wrote code that wouldn't work on a multi-core platform. It's just that, well, it wasn't an issue. These developers were more worried about being attacked by a saber-toothed tiger or trampled by a herd of mastodons on the long commute home across the ice pack than about some hypothetical science-fictional multi-core future.

And, of course, even if the occasional forward-thinking developer was worried, there was no way to test it.

Fast forward thirty years. Multi-core servers are now cheap. So cheap that server manufacturers will soon see no point in making a server that isn't multi-core. Even if they wanted to, the day is coming when the microprocessor chip manufacturers will no longer make single-core microprocessors except perhaps for specialized, embedded applications. Chip manufacturers have been pushed into the multi-core future by having hit the wall for CPU clock rates, due mainly to issues in power dissipation. And they're dragging the rest of us into the multi-core future, some of us kicking and screaming.

This hypothetical company's product, the thing that pays all the bills, won't run reliably on the current generation of servers. And they can't make it work. Finding all the race conditions that show up in a thousand lines of code is a challenge. Doing so in an eight-million-line code base is for all practical purposes impossible. They've also hit the CPU clock speed wall, and are reduced to limiting their application to run on a single core, while the other cores run (occasionally) an Apache web server for the system administration interface, or maybe blink some LEDs.
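For what it's worth, that single-core workaround is real and unglamorous. On Linux it amounts to pinning the process to one CPU, either from the outside with the taskset command or programmatically via the processor affinity interface. Here's a rough sketch of the latter, assuming glibc; pinning hides the races by guaranteeing the threads never truly run in parallel, it doesn't fix anything.

/* pin.c: a sketch of the single-core mitigation on Linux, assuming the
   glibc CPU affinity interface. It confines the calling process (and
   any threads it subsequently spawns) to CPU 0. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t allowed;

    CPU_ZERO(&allowed);   /* start with an empty set of CPUs */
    CPU_SET(0, &allowed); /* permit only CPU 0 */

    if (sched_setaffinity(0 /* this process */, sizeof(allowed), &allowed) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    /* ... the legacy multi-threaded application would run from here ... */
    printf("confined to CPU 0\n");
    return 0;
}

The command-line equivalent, something like taskset -c 0 ./application, does the same thing without touching the code, and is about as close as you can get to admitting defeat at a shell prompt.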

In other words, they are screwed.

I don't want to make it sound like anyone is at fault here. For sure, my crystal ball hasn't been any better than theirs was over the past thirty years. At one time, I actually worked at a national lab which had a bunch of those multi-processor, shared-memory systems. We called them supercomputers. But it never occurred to me that I'd have one of those systems in my home. In my basement. Under a table. And that it would have cost only about a grand, U.S. That's just crazy talk.

But for this company, the future caught up with them. When it came to designing for future multi-processor architectures, they effectively, and correctly, practiced an agile development methodology. Unfortunately, the future has a way of happening, and the agile principle of "you ain't gonna need it" eventually turned into "you need it right now, and there ain't no way to get it".

(There is some irony here. This same hypothetical company let go -- or otherwise drove away or simply ignored -- nearly all of their firmware developers. These were developers who were quite experienced in designing code for multi-processor shared-memory architectures, albeit for embedded applications. These guys and gals dreamed concurrently in their sleep, and could have been a valuable development resource for the server-side application architecture. But that amazing short-sightedness is a topic for another article.)

So with all due respect to Jeff Atwood, here's why I think all of your developers and testers should have a multi-core platform on which to write and test their applications: if they don't, and if your product has any success at all in the marketplace, sooner or later you will be screwed too. It's simply a matter of developing and testing on the platform you expect to deploy to the field. Or, as the rocket scientists say, "test what you ship, ship what you test".

I've written about some of the issues in the multi-core future in Small Is Beautiful, But Many Is Scary and its follow-up, and my alter-ego has given a talk on problems software developers are running into in writing multi-threaded code for multi-core (and, in the case of hyper-threading, sometimes even single-core) platforms. The company that issues me my paycheck, Digital Aggregates, put its money where my mouth is: we just added a Dell 530 with an Intel 6600 2.4 GHz quad-core processor to the server farm.