Saturday, November 21, 2009

Post-Modern Deck Construction

The Palatial Overclock Estate (or what the liberal media insists on calling The Heavily-Armed Overclock Compound) is getting on in years. As a result we are faced with gradually replacing many of its three decades old features, including its rear wooden deck. This has led me to ponder the latest in deck construction techniques.

See, back in the day, maybe fifty years ago, it was traditional to build a deck using a single carpenter. Back then there were very expensive master carpenters who were affordable only to a very large pool of homeowners who needed decks. But even so, there were a lot of advantages to having even a small share of a single master carpenter. They had the benefit of coming with a complete set of detailed deck plans, and were very experienced at following those plans. They had a lot of apprentices who made sure the carpenter never ran out of nails or lumber. The carpenter and his apprentices were a tightly knit group who were used to working together and were pretty efficient at doing so. The deck plans were pretty straight forward to follow because everyone working on the deck knew their place and their role. The carpenter did all the actual deck building, and the apprentices were relegated to fairly minor roles. Guys like Thomas J. Watson built entire companies just on the training of master uni-carpenters, and did pretty well for a long time.

Not everyone could afford a master carpenter, so a single master carpenter was shared among many folks who needed decks. Since there was always some dead time while building a particular deck, due to weather or lack of materials or what not, the master carpenter could go next door and work on that deck for a while. The carpenter could slice his time up amongst several deck projects. Even so, it took longer to get your deck built because even when the weather cleared the carpenter wouldn't be immediately available to continue on your project. And of course the priorities of the other homeowners with whom you shared the master carpenter weren't always your priorities.

In time, maybe forty years ago, journeyman carpenters became available. They were cheaper but still fairly expensive, and not nearly as fast as the master carpenter. Their apprentices weren't as smart either. But a journeyman carpenter, sometimes referred to as a mini-carpenter, was cheap enough that a few neighbors could band together and hire one to share to get some of their rear decks built faster than if they had been sharing a single master carpenter among a larger pool of homeowners, even though the journeyman carpenter wasn't nearly as fast as the master carpenter. This worked pretty well for a long time, and guys like Ken Olsen made a lot of money training journeyman carpenters.

But about thirty years ago, high school shop classes started turning out wood working students. At first, they were pretty bad. They were very slow, they weren't very smart, they bent more nails than they hammered in, they didn't make very efficient use of lumber. But they were dirt cheap. So some individual homeowners started hiring these shop students to do small jobs that the master and journeyman carpenters never seemed to have time to do or care about. That worked pretty well. Shop students, or micro-carpenters, could build a fence, maybe build a small deck, maybe a set of stairs for an existing deck built by a master carpenter.

Around twenty years ago, someone got the bright idea of banding a bunch of shop students together into a large team, in a variation of the "nine woman can have a baby in a month" strategy. This is not as crazy as it sounds. Owners of very large homes, mansions and estates and resorts and such, had been doing just that for some time with master carpenters. They would band as many as eight master carpenters together, with a whole bunch of apprentices, great lots of lumber and nails, to tackle really big carpentry projects. You could think of it as a sort of super-carpenter. Guys like Seymour Cray were very good at creating these kinds of master carpenter teams.

It wasn't easy: in order to make effective use of eight very expensive master carpenters, the usual deck plans wouldn't do. You had to completely rethink the plans and the instructions so that each master carpenter could keep busy without getting in the way of another master carpenter, that each wouldn't run out of materials and have to sit idle (a very expensive proposition since you had to pay them anyway), and make sure that all the individual parts each was working on all fit together.

Designing the deck plans so that all eight master carpenters kept busy was no mean feat, and a lot of very smart people spent a lot of time coming up with those plans. Even so, there was a limit to how much parallelization could be found even in the biggest construction projects. Gene Amdahl, who started out training carpenters for Thomas Watson and went on to found his own carpenter training company, once rather famously observed that the amount by which a carpentry project could be sped up depended not just on the number of carpenters and the speed of each carpenter, but the degree to which the deck building instructions could be written to keep all of the carpenters busy at the same time. It was one of those observations that was both obvious and profound.

Some of those same smart people also designed some very specialized tools for the master carpenters to make better use of their very expensive time. My favorite was the vector hammer. The vector hammer had a single handle, so that it could be wielded by a single master carpenter, but had many heads, perhaps even dozens. A single master carpenter could, if he were careful, and everything lined up just right, drive dozens of nails at a time. So the deck architects spent a lot of time designing deck plans in which the nails all just happened to line up just right so that the vector hammer could be used.

Applying this same team idea to the high school shop students seemed like a winning proposition. But to get the same level of performance out of a group of high school shop students as you would from eight master carpenters, you would need a lot of high school shop students. How many? A lot. A guy named Danny Hillis made an ambitious attempt to harness sixty-four thousand high school shop students to build a single deck as fast as eight master carpenters.

Yeah, you see the problem now don't you? How do you rewrite your deck plans to keep sixty-four thousand high school shop students busy building a single deck? How do you even keep them supplied with lumber and nails? Keep them from stumbling over one another? It's almost impossible.

Oh, for sure, there were some successes. Some carpentry projects, like building a long picket fence, were embarrassingly parallel. If the fence were long enough, you could space the students out along the length of the fence line and have them all start building in parallel towards the next student, making sure to join the end of their fence section to the beginning of the adjacent student's. For a lot of projects, this actually worked pretty well. And you can be sure for the guy trying to get you to hire his sixty-four thousand shop students, those were the projects he used as examples in his sales pitch. But in general, no one knew how to design blueprints and write instructions to make effective use of that many high school shop students. Most of the students sat around idle, drinking energy drinks, while the few that could be kept busy slowly plodded along.

But a funny thing happened. Over time, as they got older and more experienced, the high school shop students individually got better. They started learning from their mistakes, started copying some of the techniques of the master carpenters, starting being more efficient at building decks. But remarkably there was one thing they didn't do: they didn't get more expensive.

Suddenly it became practical to use a single high school shop student to build a deck. Maybe not as fast as those eight master carpenters. But maybe almost as fast (or sometimes faster) than one of them. And the instructions for a single student to build a deck were so simple, even I could write them. As time went by, each high school shop student continued to get better and better. A guy by the name of Gordon Moore recognized that high school shop students would likely continue to get better and faster at a predictable rate for the foreseeable future.

Alas, it turns out that there were still limits to how fast a single high school shop student could build a deck. Students could only move so fast, could only drink so much energy drink (and had to take more frequent bathroom breaks). And no matter what, they didn't have the experience and knowledge of the master carpenters, so that in their haste to build a deck quickly they occasionally made a mistake, but because they were still pretty good at what they did, the mistake was sometimes subtle and hard for even the deck architects to find.

It was inevitable that someone would suggest that, just as the owners of mansions and estates did decades ago with master carpenters, a group of high school shop students be banded together as a team. First it was just a couple of students. Then four. Then eight were banded together in a team in what we might call a multi-carpenter approach. Some folks, like a guy named David Patterson and his friends who think about building decks a lot, have suggested creating teams of dozens or even hundreds of high school shop students, in what they call a many-carpenter approach.

Oh. Yeah, see, now we're back to the problem of decades ago when no one knew how to write instructions and deck plans for lots of carpenters. It's like we forgot what Amdahl said decades ago.

But the problem is even worse than it was way back then. See, using the one cheap but experienced high school shop student worked for so long and so well, that there are lots and lots of instruction books and architectural plans for how to build all sorts of cool things using a single carpenter. How detailed are these plans? One place where I once worked had an instruction book for its carpentry projects that was eight million lines long. That's a book of around 120,000 pages of instructions, all based on the assumption that a single carpenter would be working on them. This instruction book, portions of which dated back twenty years, generated this company as much as a billion dollars in revenue annually.

Where would you start breaking down eight million lines of instructions so that two carpenters, much less eight, or dozens, could all be kept busy building a deck? Oh, for sure, they've tried, but without much luck. After much effort, the carpenters still keep getting in each others way, tripping over each others lumber, accidentally hammering another carpenter's thumb.

And that's just the legacy deck plans. In fact, even if you're designing a brand new deck from scratch, no one really knows how to write deck building instructions to keep a lot of carpenters busy in parallel. Humans don't think that way. There are lots of ideas about abandoning English altogether and writing the deck building manuals in a new language better suited for expressing parallelism in carpentry. But there are an awful lot of carpenters and deck designers that already know English. And that doesn't do anything for all the myriad of English-language plans, blueprints, and instruction books that already exist.

As a homeowner, I think it is an interesting problem.

Acknowledgements

Thanks to Ken Mulvihill, Robert Gallegos, and Ken Howard for the inspiration.

Tags

IBM mainframe supercomputer DEC minicomputer microprocessor multiprocessor Amdahl's Law Moore's Law Thinking Machines massively parallel processor MPP multi-core many-core threads multithreaded C C++ Java

Saturday, October 17, 2009

Bananas

If you live in Europe or the Americas, every banana you have ever eaten was probably a variety known as the Cavendish. This are other varieties of bananas, but they look so different (for one thing, they may have large, hard seeds, something bred out of the Cavendish) that you may not even recognize them as the same fruit you slice up on your breakfast cereal. And that's a problem. There is so little genetic diversity in the Cavendish that a single disease could wipe the long, yellow, conveniently self-packaged fruit we all know right off the planet.

Alarmist rhetoric? Not so much. That's exactly what happened to its similarly monogenetic predecessor, the Gros Michel banana.

Michael Osinski argues that a similar lack of diversity nearly led to the collapse of Western Civilization. Conspiracy theory? Again, not so much. Osinski, in an article in New York Magazine, admits that the latest global economic meltdown might have been at least partly his fault.

Osinski wrote the software that was used by nearly every investment bank on the planet to bundle mortgages into a kind of bond known as a collateralized debt obligation or CDO, and to compute an asking price for the result. A user of Osinski's software package would pour thousands of individual mortgages into one end, puree them into a smooth consistency, then parcel the results out into many different investment vehicles, each vehicle containing a tiny portion of each of those mortgages. Then many individual investors bought a portion of those investment vehicles. Investors like your 401(k), your mutual fund, and your pension plan. His software computed the value of the CDO based on many factors including the risk of the mortgages that went into it.

Because each mortgage was blended together with thousands of other mortgages, the risk of default of any one individual mortgage was spread out. Because the resulting smoothie of mortgages was distributed into many different investment vehicles, even if many mortgage holders were to default, those bad mortgages would be a tiny portion of each investment vehicle. And because the ownership of each investment vehicle was spread out among many investors, the tiny risk was shared among many. In the case of your 401(k), your mutual fund, and your pension plan, the investor itself was in fact made up of money from many different individuals.

The only way it could possibly go wrong would be if the entire housing market suddenly collapsed. And what were the chances of that happening?

How do you determine the worth of something? As someone who has bought and sold a used motorcycle or two, I can tell you that this isn't the easiest question to answer. My Honda CB700SC that was a basket case (that is, it had been separated into its component parts, and would take some work to get it running again), I sold it for a fraction of its book value, which is, after all, just someone's opinion based on what others have actually paid for a particular model of vehicle. My Harley-Davidson FLHS, on the other hand, I sold years later for more than I paid for it. See, in a free market, the worth of something isn't what some piece of paper or blue book says it is. Its worth is what someone else is actually willing to pay for it. Once you grasp that, it becomes clear why touchy-feely factors like investor confidence suddenly become really important, because the idea of value has a significant psychological component.

When the housing industry collapsed and house prices were suddenly severely deflated, people discovered that they owed far more on their mortgages than their homes were worth, because their homes were worth only what someone else was willing to pay for them.

That may not seem like such an issue if you've never owned a home. When you take out a loan to buy a car (or a motorcycle) for example, the vehicle is probably worth less than you owe on it as soon as you drive it off the lot. But houses are different. When your job transfers you to another state, or you graduate from college, or you lose your job and have to move to where your employment prospects are better, you can take your vehicle with you. But typically you'll have to sell your house, if for no other reason than to take the money you make on the sale to use as a down payment on a house in your new town.

That is, if you make any money on your house. But if you find you owe more on your home than someone else is willing to pay for it, you are seriously underwater. You can no longer afford to sell your home, because you can't afford to pay off the balance of the mortgage. You can't afford a new home, because you don't have the down payment from the sale of your previous home. You can't afford to move, even though your job depends on it.

The solution for many people was to just walk away from their old homes, to default on their mortgages, and leave the mortgage holder holding the bag. And by mortgage holder, I mean, ultimately, your 401(k).

The risk of default of each of those pureed mortgages was among the parameters supplied by the users of Osinski's software package. It took these risks into account when computing the value of the CDOs. The sudden default of so many mortgages more or less simultaneously was not a risk factor that the users of Osinski's software had anticipated. They had seriously underestimated the risk, and hence the software package had seriously overvalued the CDOs. Investor confidence in mortgage-backed securities eroded; taking all of that risk (which had in fact always been there) into account devalued the CDOs, which now found themselves owning toxic assets in the form of defaulted mortgages. And because of the incredibly broad distribution of the ownership of the CDOs, we all ended up owning parts of defaulted mortgages that were worth virtually nothing.

In a talk he gave at a recent conference I attended in Santa Fe, computer scientist and financial software guru Arthur Whitney (who, by the way, decades ago worked with Kenneth Iverson on the programming language APL) estimates that in a matter of days the valuation of these CDOs fell from around US$33 trillion to US$8 trillion.

Let me make this clear: almost overnight, twenty-five trillion dollars in wealth evaporated. It didn't get spent. It wasn't stolen. It simply ceased to exist because thousands of investors suddenly decided not to buy into the consensual hallucination that was the housing bubble.

So what, if anything, does this have to do with the kinds of topics I usually write about? And what the heck does it have to do with bananas?

Here's the thing: just about every investment bank on the planet was using Osinski's software package. For sure, they were all putting in the wrong parameters, and using it for far more complex investment vehicles than Osinski ever intended. But still, it was all the same software package, with the same input formats, the same calculations, the same menus, the same bugs, the same output formats, the same pop-ups, the same errors and warnings. It is possible, just possible, if there had been other software packages being used for this same purpose, that some investment houses might have come to a different conclusion about the risk inherent in mortgage backed securities? Even if the answer is no, the fact that everyone was using the same software package means they were all exposed to the same software bugs, the same design flaws, the same systemic errors.

They were all eating the same variety of banana. Except in this case, instead of losing a much beloved fruit, they ran the risk of a systemic error leading to the collapse of the world economy. Each of us is, in a very real sense, a victim of the success of Osinski's software package.

I recently helped develop an product in which we worked very hard to make the system recoverable in the field by the end-user. If the disk-based system failed, the user could boot up a flash-based system and recover the disk. If the flash-based system failed, the user could boot up the boot loader and recover the flash-based system. If the boot loader failed, the box had to be shipped back.

All three systems ran on the same processor. Much of the very low level code, that which would be typically described as part of the board support package, could have been common between the disk system, the flash system, and the boot loader. It was just a matter of time until someone suggested just that. I argued (vehemently I'm told) against it. Even if the code were exactly the same, I argued against sharing the same source files in the source code control system. Otherwise, a single error in a single source file could lead to a systemic flaw that caused the same defect to occur in all three tiers of the product, rendering the product not just inoperable but unrecoverable in the field.

I understand -- better than most -- the economics of software reuse. And I understand how free markets drive companies to adopt what they perceive to be the best practices, including the software tools, of their competitors (whether they are really the best or not). Economies of scale drive this too. Do all of your employees use the same brand and model of desktop or laptop? Running the same operating system? All purchased about the same time because of a volume discount? And then there's people. Do you hire people that look like you? That share your opinions? That think the same as you so that you feel comfortable around them?

Whether you are talking about software, hardware, bananas, or people, a lack of diversity can lead to dire and unforeseen consequences.

Sunday, June 14, 2009

Cache Coherency and Memory Models

Sometime around 2002 I was working on an object-oriented multi-threading framework in C++ for an embedded project and the most remarkable thing happened. I had an object that represented a thread. My application created the object via the usual new operator and started the thread. The first thing the thread did was access its own thread object. It core dumped. Core dump analysis using gdb revealed that the virtual pointer in the object was null.

I know what you're thinking.

That's impossible.

That's what I thought too. That event led to a personal research project that consumed many of hours of my free time and led to my putting together a presentation on the Implications of Memory Models (or Lack of Them) for Software Developers. In it, I describe the research of others that reveals that modern processor memory subsystems have broken the memory models implicit in the software.

This includes software written in higher level languages like C, C++, and Java (prior to Version 6 anyway) and as well as that written in assembler. It includes multi-processors, multi-core uniprocessors, and even those single core uniprocessors that are hyper-threaded. The upshot is that, thanks to various processor, memory, and compiler optimizations, memory may not be read or written when you think it is from looking at your code. Or even the compiled code.

Java addressed this issue in Version 6, providing you use volatile or the language's built-in synchronization mechanisms. But C and C++ developers dealing with variables shared among threads must still resort to the appropriate use of the volatile and explicit memory barrier instructions, which may be provided by your threading library's synchronization functions, providing you use them.

Fortunately, at least gcc offers the generic __sync_synchronize() built-in that performs a platform-specific memory barrier with full fence semantics. Unfortunately, in their paper Volatiles are Miscompiled, and What to Do about It, Eric Eide and John Regehr at University at Utah reveal that code containing the volatile keyword is frequently miscompiled. The life of a systems programmer is seldom simple.

Recently I discovered Ulrich Drepper's lengthy white paper on What Every Programmer Should Know About Memory. He gives a detailed explanation of how the hardware architecture of cache and memory subsystems can affect performance and correctness of software. I recommend it to every one that does embedded or systems development.

(Actually, I recommend it to all developers. But my Java buddies have pointed out on more than one occasion that they don't really care to know how things work under the hood. That's an attitude that I can't relate to. But I was one of those kids who took clocks apart to see what made them tick.)

It took me a couple of weeks to read Drepper's 114 page paper cover to cover. The entire time I was reading his description of the MESI cache coherency protocol, I was trying, without much success, to reconcile it with my prior reading on processor memory models.

Then finally I got to section 6.4.2, "Atomicity Operations", page 68:

If multiple threads modify the same memory location concurrently, processors do not guarantee any specific result. This is a deliberate decision made to avoid costs which are unnecessary in 99.999% of all cases. For instance, if a memory location is in the ‘S’ state and two threads concurrently have to increment its value, the execution pipeline does not have to wait for the cache line to be available in the ‘E’ state before reading the old value from the cache to perform the addition. Instead it reads the value currently in the cache and, once the cache line is available in state ‘E’, the new value is written back. The result is not as expected if the two cache reads in the two threads happen simultaneously; one addition will be lost.

If this doesn't run chills down your spine, you're not thinking clearly. Drepper goes on to discuss some of the same atomicity operations I talk about in my presentation. His paper also details the write reordering which I suspect was the cause of my null C++ virtual pointer.

Why is all this stuff important?

In the early 1980s I was a computer science graduate student working on my thesis under the direction of my mentor Bob Dixon, in a research group that was looking at programming language and operating system architectures for massively parallel processors. This was when Japan's Fifth Generation project, dataflow architectures, and the massively parallel Connection Machine were big. At the time, it all seemed bleeding edge and, given our budget, largely hypothetical.

But just a few years later I found myself working at a national lab, the National Center for Atmospheric Research, which not only had a Connection Machine, but other exotic highly parallel systems including several Cray Research supercomputers with multiple processors and vector hardware.

A few years later still and I am more than a little surprised to find that I have a four-core Dell server running Linux in my basement. The future has a way of happening.

In The Landscape of Parallel Computing Research: A View from Berkeley, researchers David Patterson et al. describe the future that they see: not just multi-core, but many-core, the equivalent of a massively parallel processor running on your desktop, with the dozens or hundreds of cores sharing memory. Suddenly, understanding concurrency, synchronization, and memory models (or the eventual higher level languages that implement it all without the developer having to worry about it) seems pretty necessary.

For me, it's 1983 all over again.