Tuesday, August 14, 2007

Tool Economics

"If all you have is a hammer, everything looks like a nail." - Benard Baruch

I was teaching a class in embedded development at a client's site the other week. We were discussing approaches to fixing an actual bug in their firmware code base. This client uses the IAR Embedded Workbench, an IDE that provides all the usual tools: an editor, a build environment supporting the ARM-based tool chain, a JTAG interface, a downloader, a debugger, etc. When I've done product development for this client, I've used this tool, and while its debugger is the best thing since sliced bread, I found its editor to be a little weak. I preferred instead to do my editing using Eclipse with the CDT and Subversion plug-ins. The IAR EWB and Eclipse played well together, in my experience, allowing me to leverage the best of both worlds.

We discussed for a few minutes the merits of using the debugger to test our hypothesis of what the bug was, where to put in breakpoints, what variables we needed to look at. "Or," someone said, "we could just change the code and see what happens."

I had one of those brief moments of cognitive dissonance where I wondered "Who the heck said that?" and then realized "Oh, it was me!" We were all victims of tool economics.

Back in the 1970s, when I got paid to write code in FORTRAN, COBOL and assembly language, we didn't so much have a tool chain as a tool string or maybe a tool thread. We typed our code onto punch cards using a keypunch, which we frequently had to stand in line for. We submitted them to the operator at the I/O window of the computer room. Our stack of punch cards was queued up with a lot of other similar stacks by the card reader. The operator fed them in, where our job was queued on disk by the mainframe operating system. (This was OS/MFT running HASP on an IBM 360/65, for those who remember when glaciers covered the land and we communicated mostly with grunts.) Eventually the OS would run our jobs and, if all went well, our printout would come banging out on the line printer. The operator would separate our printouts and place them with our original card deck back in the I/O window. On a really good day, you might get two or three runs in, provided you got to the computer center early and left late.

The latency involved in a single iteration of the edit-compile-test-debug cycle bred a very strong offline approach to code development: you looked at your code until your eyes bled, trying as hard as you could to fix as many bugs as possible, running what-if simulations in your head because that was a lot faster than actually running your program. You saved old punch cards, just in case you needed to put those lines back in (maybe even in a different location), and your printouts were heavily annotated with quill pen marks (okay, I exaggerate, they were felt-tip pens). Changes were meticulously planned in advance and discussed with as many other developers as you could lay your hands on. You think you multi-task now? The only way to be productive in those days was to work on many completely different software applications simultaneously, so you could pipeline all of those edit-compile-test-debug cycles. Every new printout arriving at the I/O window was an opportunity to context switch in an effort to remember what the heck you were thinking several hours ago.

Sitting here, writing this while running Eclipse on my laptop and accessing my code base wirelessly from a remote Subversion server, I have zero desire to return to those bad old days. But it does illustrate how our tool chain affects how we approach a problem. Because the developers at my client were focused on using the EWB, their problem solving approach necessarily revolved around debugging, because the debugger was the EWB's greatest strength. My problem solving approach revolved around editing and refactoring with Eclipse, trying an experiment to see what happened, then maybe using Subversion to back my changes out. I resorted to using the debugger only when I had a unit test really go sideways and trap to the RTOS, or when I just couldn't see what was going on without single stepping through the code and watching variables change. Using Eclipse as part of the tool chain made it economical to try new things and verify proposed solutions quickly.

Tool economics affects all phases of software development. It limits us in the solutions we are willing to consider, and forces us to reject some potential solutions out of hand. It colors the processes we use, and the quality we may be able to achieve within the constraints of schedule and budget.

Some of the developers at my client preferred to use the debugger to unit test their code. It was easy to love the debugger in this role, but it was a very labor-intensive process. This was okay economically as long as you only had to do it once, when you originally wrote the code.

But no successful code base is static. At least two-thirds of the cost of software development is code maintenance, changing the code base after its initial release. Refactoring, something tools like Eclipse vastly simplify, becomes a way of life. Going through the debugger every time you change a particular function is expensive. I chose instead to write unit tests embedded in the production code base, typically conditionally compiled in so that the test code was present in the development environment but not shipped as part of the product. If I changed a particular function, I just reran the unit test. If the test failed and I was absolutely clueless as to why, then I turned to the debugger. I paid the price of writing the unit test once, and then used it for free henceforth. To be sure, there were occasionally some Heisenbergian issues with the unit test affecting the functionality of the code base, typically due to memory utilization or real-time constraints, but those were rare.
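Here's a minimal sketch in C of the pattern I mean. (The function, the test, and the UNIT_TEST macro are contrived for illustration; they aren't from the client's code base.)

    #include <assert.h>

    /* Production code: clamp a value into the range [lo, hi]. */
    int clamp(int value, int lo, int hi)
    {
        if (value < lo) return lo;
        if (value > hi) return hi;
        return value;
    }

    #if defined(UNIT_TEST)
    /* Compiled into development builds (e.g. cc -DUNIT_TEST ...)
     * but absent from the shipping image. Call it again whenever
     * clamp() changes. */
    void clamp_unit_test(void)
    {
        assert(clamp(  5, 0, 10) ==  5); /* within range */
        assert(clamp( -1, 0, 10) ==  0); /* clamped below */
        assert(clamp(999, 0, 10) == 10); /* clamped above */
    }
    #endif

The preprocessor guard is what makes the economics work: the test rides along with the function it exercises, costs nothing in the shipped product, and is one call away every time the function changes.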

Embedding unit tests in the code base is just a more economical approach to software development from a developer utilization viewpoint. But the real reason I tend to use that approach is that it's a rare embedded development project I work on in which I have access to a debugger as good as the one provided by the IAR EWB. I'm generally forced into a more formal unit testing approach because I really don't have a choice. The fact that it's really cheaper in the long run is just gravy. This is yet another way in which tool availability forces a particular approach.

Tool economics affects how we design our applications too. I remember a job in which I found myself working in an eight million line code base whose oldest elements dated back at least to the early 1980s. Unit testing was extremely expensive. There were no formal unit tests. You tested your change by reserving a lab machine for some time in the future, loading the application, and spending anywhere from five minutes to hours testing on real hardware and real devices, sometimes in concert with logic and protocol analyzers. I admit that many times I chose a particular design, or even an architectural direction, because it minimized the number of subsystems I would have to touch, and hence re-test. Choosing the "best" solution had nothing to do with it. Choosing a solution that met requirements and at the same time could be completed within the limitations of the schedule had everything to do with it.

You think I was wrong? The managers to whom I reported would disagree with you. Admittedly this was sometimes a case of time to market (TTM) taking precedence over cost of goods sold (COGS) (see The Irresistible Force and the Unmovable Object), but that was the trade-off sanctioned by that particular development organization.

The expense of testing also affected the quality of the code base in perhaps unexpected ways. Many times I recall discussing with my fellow developers over beer or lattes that, in the adventure game that was navigating this huge code base, we occasionally encountered code that we knew deep in our gut simply could not work. It could never have worked. What to do? We could have fixed it, but then we would have had to test it, even if it were a trivial change, placing us behind schedule on our actual assignments, and making it more likely we would be in the next wave of layoffs. We could have written and filed a bug report (and, to our credit, we typically did just that), but bug reports have a habit of ending up back on the plate of the original submitter. So when we encountered such code, and if time was short, we sometimes just backed away slowly, and pretended never to have seen it. The high cost of testing drove that dysfunctional behavior.

If you are a developer, you deserve the best tools your organization can afford. No matter whether you are in the open source world, the Microsoft camp, or the embedded domain, there are very good tools to be had. The availability of good tools, and your familiarity with them, will affect not only how you approach your job, but the quality of the product that you produce.

If you are a manager, you need to make it a priority to get your developers the best tools you can afford. Your schedule and the quality of the resulting product depend on it. Just meeting requirements is not sufficient, because a requirement must be testable. Having a requirement that says "this software must be maintainable in the face of even unanticipated changes over its entire lifespan" is not testable. Whether you appreciate it or not, your developers are making economic decisions minute by minute that affect both the TTM and the COGS of your product. And, hence, the financials of your company.

2 comments:

Demian L. Neidetcher said...

Man, I often wonder what career I would've ended up in if I were born 10 years earlier. I'm sure it wouldn't be software. I had the bad fortune to begin my career slinging COBOL. I complained and moaned until my managers let me write Python code for intranet development. I remember walking across the parking lot to another building to pick up green-bar to see how my COBOL tests ran.

Today I look at a different green bar to make sure my tests ran correctly. It's the status bar in Eclipse's JUnit integration, and I don't write more than 5-10 lines of code before I run my unit tests.

I can't imagine having the patience for that slow turnaround time.

But yeah, indeed tools are incredibly important. I often throw around the idea of blogging about the 'micro-loop': that overlooked edit-compile-test cycle developers run over and over, which is incredibly important to the productivity of a company.

Chip Overclock said...

I've said before that the entire software development cycle is fractally iterative. You loop in a very tight edit-compile-test-debug loop. (And if you're using tools like Eclipse and JUnit, that loop is very tight indeed, to its merit.) You loop in a design-develop loop. You loop in an architect-design-evaluate loop. You even loop in a requirements-everything-else loop. It's looping all the way down. The faster you can iterate through any of those loops, the faster you converge to a suitable solution. Tools that help you do that probably pay for themselves.

I've often thought of development projects like those "magic eye" pictures that were all the rage a few years ago. You stare and stare but everything seems out of focus, then suddenly: there it is. Project managers have their Gantt and PERT charts, and they keep track of metrics like outstanding bug reports. So that point of view may give you a more logical perspective on progress. But as a developer, with my hands deep in the implementation, every project has seemed like a marvelous unfocused chaos until suddenly: there it is.