Thursday, January 25, 2007

If Java is the new COBOL, is C++ the new assembly?

The phrase “Java is the new COBOL” is so widely used now that I can’t pin down its origins. When you do a web search on it, it is interesting to see the different interpretations.

Clearly some folks mean “Java is obsolete, just like COBOL”, and maybe in some problem domains it is. Web frameworks like Ruby on Rails are making inroads in a lot of roles that were once nearly exclusively Java-based, like web commerce.

But many seem to mean “Java is very broadly deployed and very stable, and is widely used and trusted in enterprise applications, just like COBOL”. Of course, the subtext here is that Java is no longer cool because it’s mainstream.

Being compared to COBOL doesn’t really seem that bad of a thing. I am a little embarrassed to admit that very early in my career I was paid to write a little COBOL. I even got to meet Grace Murray Hopper. But I hadn’t really given COBOL much serious thought for years until in the span of a few days I read some pretty startling (and somewhat contradictory) statistics about COBOL.

In his column “Bar Bets” in a recent edition of Dr. Dobb’s Journal, Jonathan Erickson reports that there are “250 billion lines of COBOL source code in use”, and that “1.5 billion new lines of COBOL are written every year”.

In “How to Interpret COBOL Statistics” in Bob Lewis’ Advice Line column, a reader quotes “various Gartner reports”: “Five billion lines of new COBOL are being written every year” and “Thirty-four percent of all coding activities are in COBOL”.

No matter which set of statistics is correct, this is not bad for a forty-year-old programming language. If Java reached the level of acceptance and use within the enterprise that COBOL has obviously achieved, I would count that as an incredible success. Considering that most information technologies have a half-life of about five years, technologies like FORTRAN, COBOL, C, and TCP/IP, which have done substantially better than that, have to be admired.

Bob Lewis’ reader also cites “There are 310 billion lines of legacy code operating in the world (65% of all software)”. This jibes with my visualization of the total amount of source code in the world as an expanding sphere. The labor spent maintaining legacy code is represented by the volume of the interior of the sphere, whereas the labor spent writing new code is represented by the surface of the sphere. They are both expanding, but while the surface of the sphere expands in proportion to the square of the diameter, the volume expands in proportion to the cube. Once it ships or goes into production, all code becomes legacy code. If you drilled a core down through the sphere, you would find some Ruby scattered around on the surface, a thick crust of Java, a churning molten layer of C++ and C, and a solid core of FORTRAN, COBOL and assembly.
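To make the geometry of that picture concrete, here is a rough sketch of the arithmetic. It assumes nothing more than a perfect sphere of diameter d in arbitrary units, with S standing in for new code and V for legacy code; the symbols are mine, not anything from the cited statistics.

```latex
S = \pi d^{2}, \qquad
V = \frac{\pi}{6}\, d^{3}, \qquad
\frac{V}{S} = \frac{d}{6}
```

So every time the diameter doubles, the surface grows by a factor of four, the volume by a factor of eight, and the ratio of legacy code to new code doubles along with it.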

So there will always be a lot of legacy code, and in greater amounts than there is freshly minted code. As long as it is in use, all of that code has to be maintained. If you asked the COBOL developers of the 1960s what the life span of their creations was likely to be, could they possibly have predicted that their work would still be running in the twenty-first century? Yet Y2K showed us that this is indeed the case. I suspect Java developers will have a lot of work ahead of them maintaining the ever-growing legacy Java base.

One thing I find particularly interesting about the success Java has had in the enterprise is the relative lack of success it has had in the embedded arena, at least in the United States. Someone remarked recently that in Europe, Java is considered an embedded programming language, while in the United States it is seen as an enterprise programming language. Witness the fact that the new Apple iPhone does not include a Java Virtual Machine (JVM), and apparently, if Steve Jobs has any say (and one must assume he does), it never will. Yet dozens of other mobile phones and wireless devices have a JVM nestled inside to run Java applications. This is a little ironic since Java was first conceived at Sun Microsystems as an embedded language for applications like set-top boxes, where it is in fact used. I suspect most of Java's success came from its application in the enterprise domain, with the embedded domain, as usual, getting no respect.

I have a long background in developing real-time applications. This has understandably manifested in my working on a lot of embedded products, which tend to be heavy on concurrency, synchronization, and message passing, stuff I have seared in my brain if not tattooed on my body. I have watched, and participated, as embedded development moved from assembly language to C and then to C++. Way back in 1999, I worked on a project that developed an embedded product in Java, with the low level device code in C. I thought that Java was a great embedded language back then.

I still think so today, tens of thousands of lines of both C++ and Java later. It has all the stuff you want in an embedded language, mostly stuff that C++ lacks, like built-in concurrency and synchronization, not to mention a simpler syntax than C++. Yes, it lacks some of the features you need to easily do C-ish things like access memory mapped registers as variables. The JVM and its bytecode can be troublesome (although I believe they solve far more problems than they cause). The way memory deallocation is hidden from view means that garbage collection may have to be managed. And the fact that you can’t swing a cat in Java without allocating and deallocating a bunch of objects, if not in your code then in the dozens of classes that constitute the Java utility classes, or in the hundreds of frameworks that have served to complicate Java development, causes some headaches.
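As a minimal sketch of what I mean by built-in concurrency and synchronization, here is the kind of message passing that needs nothing beyond the language itself: `synchronized` methods plus `wait()` and `notifyAll()`. The `Mailbox` class and its `demo()` method are hypothetical names made up for this illustration, not code from any of the projects mentioned here.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical example: a tiny mailbox built only from Java's
// built-in monitors (synchronized, wait, notifyAll).
class Mailbox {
    private final Queue<Integer> messages = new ArrayDeque<>();

    // Enqueue a message and wake any thread blocked in receive().
    public synchronized void send(int message) {
        messages.add(message);
        notifyAll();
    }

    // Block until a message is available, then dequeue it.
    public synchronized int receive() throws InterruptedException {
        while (messages.isEmpty()) {
            wait(); // releases the monitor while waiting
        }
        return messages.remove();
    }

    // One producer thread sends 1, 2, 3; the caller receives and sums them.
    static int demo() {
        Mailbox mailbox = new Mailbox();
        Thread producer = new Thread(() -> {
            for (int i = 1; i <= 3; i++) {
                mailbox.send(i);
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < 3; i++) {
                sum += mailbox.receive();
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 6
    }
}
```

Note that the sketch also illustrates the allocation complaint above: every `int` placed in the queue is boxed into an `Integer` object behind the scenes, which is exactly the kind of hidden allocation an embedded developer would have to keep an eye on.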

But this shouldn’t disqualify Java as an embedded language, any more than any of the quirks in C++ disqualified it. It took discipline and a thorough understanding of how C++ works to use it effectively in embedded applications, but doing so yielded economic benefits like greater productivity, simplified maintenance, and ultimately reduced time-to-market. I was lucky enough to have mentors like Tamarra Noirot, Randy Billinger, and Doug Gibbons who carefully tutored me in the design patterns I needed in my transition from C to C++ for embedded development. Yet even as effective as C++ is for embedded applications, there are still always portions of the application that have to be written in C, or even assembly, to accommodate the hardware and the tougher real-time portions of the system, just as there were in that embedded Java project back in 1999.

It will take a similar discipline and understanding of Java to use it for embedded development, but I believe that that must happen. I recently spent more than a year working on yet another Java-based project, this one a non-embedded product to expose feature-rich multi-modal communication capabilities to business process automation via web services. I was reminded how much more productive I am in Java than in C++ or C. It's not just that I'm more productive writing code in Java (although I am); it is also the amazing tool chain that supports Java: tools like Eclipse (an IDE which works for C++ too but is much better with Java), Ant (build), JUnit (unit testing), Cobertura (code coverage), and JProfiler (memory usage), not to mention the rich set of utility classes and frameworks available for Java. Programmer productivity is one of the keys to time-to-market, and time-to-market is one of the keys to success in today’s competitive environment. True, maybe you can’t write 100% of your embedded application in Java. But writing 100% of it in C++ has much the same economics as writing all of it in assembly did twenty-five years ago, when I remember embedded developers arguing against C, or as writing all of it in C did ten years ago, when they argued against C++.

So C++ is the new assembly. And that’s not a bad thing. It will always have its niche in embedded development. But for purely economic reasons, in the coming years you must minimize the amount of embedded code you write in C++, just as you must minimize the amount you write in C, or in assembly. You will do it because you cannot be competitive otherwise.

When I was discussing this with my friend and occasional colleague Demian Neidetcher, he asked “If Java is the new COBOL, and C++ is the new assembly, what does that make Visual Basic .NET?”

You already know the answer, Demian. It’s the new Basic.


Jonathan Erickson, “Bar Bets”, Dr. Dobb’s Journal, #392, January 2007

Bob Lewis, “How to Interpret COBOL Statistics”, Advice Line, January 17, 2007

Bill Venners, "The Art of Computer Programming: Conversation with James Gosling", Artima Developer, March 25, 2002


Paul Moorman said...

Wow! Nice article. However, it felt like trying to solve a multidimensional puzzle without knowing the number of dimensions.

Perhaps it would be simpler (I love simple, being I'm too stupid to understand all you smart people), if you bring it down to some basic drivers that might help us understand, and perhaps predict, where Java, C++, C, assembly, etc. have been and where they might be going next. Drivers such as: speed to market, cost, programmer productivity, availability of skills, breadth of tools, etc.

Java in my world was driven more by cost factors that were improved with the introduction of its underlying made-up instruction set, which added a layer of virtualization to an overly proprietary world. Just a little "your price is too high, I can easily boot you out" and poof, I spend less. It was nice to have a really cool language chock full of things to make programming life better. Also nice to have enabled an awesome amount of innovation with Eclipse and all that. Those might have been better for you than me. The success of Java seems to me to be related to the large number of people that it was able to satisfy at the same time.

One of your points seems to be that all things tend to drift up as time goes by. Assembler, up to C, and so forth. Yep, been that way and no change in sight. Thank our lucky stars or things would get awfully boring and irritating at the same time.

Chip Overclock said...

"speed to market, cost, programmer productivity, availability of skills, breadth of tools", yeah, all that stuff. I guess for me it all comes down to time-to-market, and the other metrics are simply ways of achieving that.

One of the dangers of writing a blog with a diverse audience is trying to figure out who you're writing for. Just a year or so ago I heard a senior technologist in a CTO organization remark that real-time code could not be written in C++. He said this in the presence of a couple of us who had spent the past decade developing products for this same company by writing real-time code in C++. We just looked at each other and shrugged. For sure, the acceptance of "new" technology in a particular problem domain is a continuum, or spectrum, or maybe a manifold, or some other cool word that means it's not a simple yes/no kind of thing. I think I'm aiming in my article for folks that have gotten used to the idea of using C++ in embedded development, and I am suggesting that it is time for C++ to be pushed down in the stack in the same way C, and assembly, had been pushed down, and for purely economic reasons. I'm too old to worry about whether something is cool or not, but I do understand economic benefit. But what seems obvious to some at one end of the spectrum (like those folks that worked with me on an embedded Java project in 1999) seems like crazy talk to others.

Even today there are embedded development pundits who are very skeptical about the application of Java in embedded systems. The cell phone in their pocket may well have a JVM running in it. My mind has gotten too slow with advanced age (they generally shoot engineers over forty) to understand this disconnect. So I thought I'd write about it and see what comments I got. I'm pleased to see I got a good one! Thanks!

Chip Overclock said...

When I wrote this article I had lost the reference regarding the use of Java in the enterprise versus embedded spaces. I just now came across it again: Bill Venners' interview with James Gosling. Gosling remarks on the different applications of and perspectives on Java in Europe (embedded) versus the U.S. (enterprise). I've added a reference to the end of the article.