Monday, July 10, 2006

In Defense of Misbehavior

Some years ago I read Robert Austin’s 1996 book Measuring and Managing Performance in Organizations. It’s not the kind of book I would have normally chosen to read at that point in my life. But I was on a tear reading books on software engineering methodology and people management, and I kept stumbling across references to it. Reading it changed my whole perspective on Life, the Universe, and Everything. The recent death of Enron chairman Kenneth Lay inspired me to try to organize my thoughts on Austin’s topic of measurement dysfunction.

Austin’s thesis can be summed up as follows:

  • To be effective, incentive plans must be tied to objective measurements of employees’ performance against stated goals.
  • To incent employees to produce the optimal mix of results and avoid unintended negative consequences, every aspect of their performance must be objectively measured.
  • For any but the most trivial tasks, such complete measurement is at best too expensive to be practical and at worst impossible.


The implications of this should be terrifying to managers trying to steer the ship of industry, because it says that the helm is at best only loosely coupled to the rudder. Steering inputs may have little effect, the opposite effect, or no effect at all. The linkage may be so complicated as to appear non-deterministic. Or, perhaps worst of all, the lack of complete objective measures may lull the person at the helm into thinking everything is fine when in fact they are steering blindly in a field of icebergs.

In my experience, this seems unbelievable to some, obvious to others. What Austin did, though, was apply agency theory to demonstrate it mathematically. Agency theory is an application of game theory, the very same branch of mathematics that made Nobel laureates of folks like Robert Aumann, John Nash, and Thomas Schelling. It is the basis of much of modern contract law and employment practice; common arrangements, such as overtime pay for hourly employees, are built on the math behind agency theory.

What is so compelling about Austin’s work is that while you can disagree with his conclusion when it is stated as a bald assertion, it is a lot harder to argue with the math. Okay, so you don’t like his results; what part of the math don’t you agree with? That the less free time employees have, the more valuable their remaining free time becomes? That employees’ production is a mixture of results measurable by different metrics (for example, quality, functionality, time to market)? That there is some specific mix of results that is optimal? That some metrics are expensive or impossible to measure?
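To make those premises concrete, here is a toy numerical sketch of my own, far simpler than Austin’s actual model: an agent splits one unit of effort between a measured dimension (say, time to market) and an unmeasured one (say, quality). The organization needs both, but the bonus can only see the measured part. The function names and numbers are purely illustrative.

```python
# Toy sketch of measurement dysfunction (illustrative only, not Austin's model).

def org_value(measured, unmeasured):
    # What the organization actually gets: both dimensions matter,
    # each with diminishing returns, so a balanced split is best.
    return measured ** 0.5 + unmeasured ** 0.5

def bonus(measured, unmeasured):
    # What the incentive plan pays: it can only see the measured dimension.
    return measured

def best_split(payoff, steps=100):
    # Brute-force the (measured, unmeasured) effort split that maximizes payoff.
    splits = [(i / steps, 1 - i / steps) for i in range(steps + 1)]
    return max(splits, key=lambda s: payoff(*s))

honest = best_split(org_value)   # the split the organization wants: (0.5, 0.5)
gamed = best_split(bonus)        # the split the bonus rewards: (1.0, 0.0)
print("balanced split:", honest, "-> org value", round(org_value(*honest), 3))
print("incented split:", gamed, "-> org value", round(org_value(*gamed), 3))
```

The self-interested agent pours everything into the measured dimension, and the value actually delivered to the organization falls from about 1.41 to 1.0: measurement dysfunction in miniature. The only remedies are to measure the second dimension too (often impossible or too expensive, per the third premise above) or to weaken the incentive.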

We have all heard examples of measurement dysfunction, possibly under different terminology in other contexts. Incentive distortion, unintended consequences, perverse incentives, and moral hazard are just a few terms I have come across in reading articles on economics, law, management, and ethics. Measurement dysfunction is so fundamental that once you grasp the basic idea, you start seeing it everywhere in the news (or experiencing it firsthand).

Amazon.com’s call center agents were measured by the number of calls they processed per hour, inciting them to hang up on customers in the middle of conversations.

Gateway’s technical support agents shipped whole new computers to customers with relatively minor (but time-consuming) problems in order to make their monthly bonuses.

The incentives for corporate executives like Kenneth Lay and Jeffrey Skilling to lie, cheat, and steal were so great that they overcame both the intrinsic motivators towards honesty and good behavior and the extrinsic disincentives like heavy fines and jail time. In fact, if you can make tens of millions of dollars deceiving your employees, your shareholders, and your government, mightn’t some jail time seem worth the risk? When you face the prospect of forty million dollars in the bank and a few years in jail, paying off the Aryan Brotherhood for protection and bribing a few prison guards suddenly seems doable (whether it really is or not). Not until such corporate officers face the prospect of getting letters from their families, written on toilet paper from the community soup kitchen, describing how they are sleeping in cardboard boxes because federal agencies froze all the assets, will the disincentives towards crime appear adequate. The incentives for finding loopholes even in the 2002 Sarbanes-Oxley Act are great.

Software developers and the organizations that employ them seem particularly prone to measurement dysfunction. As Joel Spolsky has pointed out (and as has been my personal experience), if you incent developers to write bug-free code, they will go out of their way to cover up the bugs they can’t fix, which will then ship with the product. If you incent developers to fix bugs, they will inflate their metrics by introducing more bugs for them to find. If you measure programmer productivity by lines of code delivered, don’t expect any effort at code reuse or code optimization to succeed; you’re not rewarding them for the lines of code they didn’t write. If you reward developers for customer support, subtle bugs will increasingly appear in production code so that developers can take heroic action. Nelson Repenning has written much on how rewarding firefighting in organizations leads to more fires to fight.

I recall talking to the head of a large software development organization who was asking for a magic quality filter through which code could be run in order to add that objective measure to the incentive program. Folks, software developers solve problems and reverse engineer complex systems for a living. If they’re good at it, it is as if their brains are hardwired to the task. And they love a challenge. I am completely confident that the developers I work with on a daily basis are perfectly capable of gaming any incentive system that their employer puts in place, without necessarily achieving any stated goal of the program. Plus, some aspects of software quality, such as whether an algorithm can ever get stuck in an infinite loop, are provably impossible to determine in all cases (the so-called "halting problem" from my graduate school days).
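For anyone who doubts that last claim, here is the classic halting-problem argument sketched in Python. It is a sketch of the contradiction, not working code: the `halts` oracle below is hypothetical, and the whole point is that no correct implementation of it (and hence no all-seeing quality filter built on it) can exist.

```python
# Sketch of the halting-problem argument (illustrative; `halts` cannot exist).

def halts(program, data):
    # Hypothetical oracle: returns True if program(data) eventually stops.
    # No real implementation can be correct for every possible input.
    raise NotImplementedError("no such oracle can exist")

def paradox(program):
    # If the oracle claims `program`, fed itself as input, would halt,
    # then loop forever; otherwise halt immediately.
    if halts(program, program):
        while True:
            pass
    return "halted"

# Now consider halts(paradox, paradox):
#   * if it returns True, paradox(paradox) loops forever -- the oracle was wrong;
#   * if it returns False, paradox(paradox) halts at once -- the oracle was wrong.
# Either way the oracle fails, so no general infinite-loop detector exists.
```

The same style of argument (Rice’s theorem, in the textbooks) rules out automatically deciding essentially any interesting behavioral property of arbitrary code, which is why the magic quality filter stays magic.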

Which, finally, brings me to the real points of this article.

First: any incentive program is bound to drive dysfunction into an organization. If you must have an incentive program, expect to spend large sums of money and much time tuning it to minimize the dysfunction. Don’t expect to even recognize that dysfunction is occurring. When I worked in an organization that employed forced-distribution ranking of employees and brought up the topic of measurement dysfunction with managers, every single one of them said “Yes, I understand that this is a risk, but so far it isn’t happening here.” Folks, of course it’s happening here. You have merely provided incentives for your employees to hide it from you. Or maybe you’re in denial. Either way, I see the dysfunction every single day as suboptimal results are delivered in order to improve objective, but partial, metrics.

Second: don’t blame your employees for responding to your incentive program. It is the incentive program that is at fault, not the employee. Upper management says all sorts of things that come under the heading of “motherhood and apple pie”. The only way an employee really knows what upper management truly values is via the incentive program. You may say “quality is job one”. But if you can’t measure quality (and you cannot measure all dimensions of it), yet you continue to terminate employees who don’t make their dates, then it is clear to everyone that “quality is job seven, or maybe nine, and besides anything past job four isn’t really important”.

At the time of his dissertation, Austin, now a faculty member at the Harvard Business School, was an executive with the Ford Motor Company in Europe. The insight his work gave me motivated me to go so far as to order the Ph.D. dissertation on which his very readable book is based. One wonders what insight from his personal experience managing both technology and people Austin was able to bring to his research.

Sources

Douglas Adams, Life, the Universe, and Everything, Del Rey, 2005

Robert Austin, Measuring and Managing Performance in Organizations, Dorset House, 1996

Robert Daniel Austin, Theories of measurement and dysfunction in organizations, (dissertation), Carnegie Mellon University, University Microfilm, #9522945, 1995

Robert Cenek, "Forced Ranking Forces Fear", Cenek Report, 2006

Robert Cenek, "Forced Ranking Forces Fear: An Update", Cenek Report, 2006

Tom DeMarco and Timothy Lister, Peopleware: Productive Projects and Teams, 2nd edition, Dorset House, 1999

W. Edwards Deming, Out of the Crisis, MIT Press, 2000, p. 23

Jena McGregor, "The Struggle to Measure Performance", BusinessWeek, January 9, 2006

Jena McGregor, "Forget Going With Your Gut", BusinessWeek, March 20, 2006

Jeffrey Pfeffer and Robert I. Sutton, Hard Facts, Dangerous Half-Truths, & Total Nonsense, Harvard Business School Press, 2006

Nelson Repenning et al., "Past the Tipping Point: The Persistence of Firefighting in Product Development", California Management Review, 43, 4:44-63, 2001

Nelson Repenning et al., "Nobody Ever Gets Credit for Fixing Problems that Never Happened: Creating and Sustaining Process Improvement", California Management Review, 43, 4:64-88, 2001

Joel Spolsky, "Measurement", Joel On Software, July 15, 2002