Humanity continues to be plagued by genetic diseases like sickle cell anemia and type 1 diabetes. In his book Survival of the Sickest (William Morrow, 2007), Dr. Sharon Moalem asks "Why hasn't natural selection eliminated these diseases?"
For some, like familial tendencies toward hypertension or breast cancer, it's obvious: they don't kill us until after we've reproduced, so there is little or no selective pressure against them. But for others, like the two I mentioned above and others that can take their victims in the prime of life, this is clearly not the case. So what's the deal?
The deal is this: as bad as these genetic diseases are, they provide some immunity to something that is even worse. They have survival value.
Sickle cell anemia occurs predominantly in populations of African descent. It offers resistance to malaria, because the malformed blood cells are not good hosts for the malaria parasite. Type 1 diabetes is statistically more common in populations of Nordic descent. It offers resistance to frostbite by virtue of the anti-freeze properties of high blood sugar. Hemochromatosis is a genetic disease in which the iron levels in the blood are so high that it can lead to organ failure. A huge percentage of the population of Western European descent has at least a mild form of this malady, which in turn gives them resistance to the bubonic plague. (It is also why treating folks with leeches and otherwise bleeding them may actually have made them better.)
Moalem (with co-author Jonathan Prince) offers many other counter-intuitive insights into how dreaded genetic diseases may have kept your ancestors alive, in this medical version of Freakonomics, another book I much recommend. The chapter on epigenetics would have made my hair stand on end, if I had any. The fact that your maternal grandmother smoked while she was pregnant with your mother had a direct effect on the expression of the genes in your body today.
And so it is with large software systems. How does a software application get to be eight million lines of code?
One line at a time, baby, one line at a time.
No one sets out to write an eight-million-line program, just as natural selection doesn't set out to create genetic diseases. Everything was done for a reason, to provide some necessary capability or to solve some problem. Successful software systems evolve and grow organically, just like biological systems. And as David Parnas asserts in his classic paper "Software Aging" (Proceedings of the 16th International Conference on Software Engineering, IEEE, May 1994), software is no more immune to the effects of entropy than any other artifact in the physical universe.
Continuous software maintenance inevitably occurs in any successful software product. The only software products that don't change are the ones that failed in the marketplace. The fact that this frequently results in an awkward architecture, a compromised design, and spaghetti code just means that the software was a victim of its own success, and of the fact that it is rarely economically feasible to scrap it and start over. In evolution we call scrapping something and starting over extinction.
In his book Object-Oriented and Classical Software Engineering (McGraw-Hill, 2002), Stephen Schach cites studies that show that fully two-thirds of the total cost of the entire software development lifecycle is maintenance (and I have read studies that cite higher numbers). Software reliability pundit Les Hatton cites studies that show that only 20% of code is written as part of initial development; 40% is corrective maintenance or bug fixing, 16% is perfective maintenance or improvements, and fully 24% is adaptive maintenance or simply making changes because something in the external environment changed.
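The arithmetic on Hatton's numbers is easy to check. Here is a throwaway Python sketch that does nothing more than add up the four percentages quoted above; the dictionary keys are just my own labels for his categories, not anything from his data.

```python
# Hatton's breakdown as quoted above, in percent of code.
# The key names are illustrative labels, not Hatton's own terms.
shares = {
    "initial development": 20,
    "corrective maintenance": 40,
    "perfective maintenance": 16,
    "adaptive maintenance": 24,
}

# Sanity check: the four categories should account for everything.
total = sum(shares.values())

# Everything that is not initial development counts as maintenance.
maintenance = sum(
    pct for name, pct in shares.items() if name.endswith("maintenance")
)

print(f"total: {total}%")
print(f"maintenance: {maintenance}%")
```

By this accounting, maintenance adds up to 80% of the code, which is in the same neighborhood as Schach's two-thirds of cost, keeping in mind that one figure measures code and the other measures cost.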
We apparently spend a lot of time just adapting to the latest Linux kernel or Java Virtual Machine release, porting to a new processor platform, wondering if the newest GCC will subtly alter the behavior of our code base, or dealing with the effects of the IT department upgrading from 10 Mb/s to gigabit Ethernet on the servers. This is why I think way too little attention is paid to designing systems that are easy to modify. If you are not addressing the largest single piece of the software development pie, then you are not controlling your costs. Adapt or die.
This is one of the reasons why innovation seems so much easier in start-ups. They are too young, inexperienced, and immature to realize that what they're doing "can't be done". They don't have to shoehorn their new product into an existing successful portfolio. They don't have to worry that they may disrupt a revenue stream that supports the livelihood of many folks in their organization. And they don't have to carry around the legacy of their past successes, because they haven't had any. The risk of failure is large, but the cost of failure is small. It's also why large established organizations may find it easier to innovate by spinning off a tiny subsidiary and shielding it from the rest of the bureaucracy.
In evolution, the vast majority of mutations are neutral or lethal. But some become a benefit to survival. Likewise, we should expect a lot of new product development to fail. A few products will create a new billion-dollar industry. Two-time Nobel Prize winner Linus Pauling once said "The best way to have a good idea is to have a lot of ideas." Innovation is not for the risk-averse, in either the software world or the biological world.
Evolution has (so far) chosen not to scrap human DNA and start over, even though doing so might lead to a more efficient design without so much questionable baggage. Likewise, we are understandably reluctant to do the same with our large successful software systems.