Thursday, June 28, 2007

Reverse Engineering gnireenignE esreveR

Reverse engineering is the process of figuring out how something works. It is a process that comes naturally to all engineers, maybe to all humans, as a result of hundreds of thousands of years of evolution. You started doing reverse engineering the first time you took something apart just for the pure joy of destruction, put it back together, and wondered "Hey, how come I have some stuff left over?" Here is what I've learned from decades of reverse engineering the products of others.

Their lawyers are at least as good as mine.

There are three principal reasons that I personally have reverse engineered someone else's product: the joy of figuring out how something works, the desire to build an interoperable product, and the desire to build a competing product. If you are reverse engineering for the first reason, probably only you care, as long as you don't tell anyone you did it. The second is probably okay too, as long as you are not competing with the producer of the product with which you will interoperate. If your goal is to build a competing product, I advise you to tread carefully.

End user license agreements (EULAs) are full of language forbidding you to use any of the information provided with the product for the purposes of reverse engineering it. I am not a lawyer. I suggest you consult with one before reverse engineering a commercial product with the goal of competing with it.

We all know there are all kinds of clever tricks you can do with disassemblers, byte code analyzers, and hardware debuggers to figure out how something works. I advise you to steer clear of them. Treat the product as if it were a black box, and base your own implementation only on the information that would be publicly available to a normal consumer of the product. Even then you may be on shaky ground. If possible, base your implementation purely on publicly available specifications. Depending on the domain, however, a good protocol analyzer is not only fair, but may be a necessity. Logic analyzers and oscilloscopes are a tougher call, depending on whether you are examining normal user outputs of the device or its internal implementation.

The open source advocates will tell you that this is exactly the problem with closed, proprietary systems and the current intellectual property climate that surrounds high tech. And they are right. But the fact that they are right still does not give you the right to violate the copyright, the trade secrets, or the patents of your competitor. Consider a Golden Rule approach: treat your competitor with the same respect you would hope to receive. As we shall soon see, poking into the innards of your competitor's product doesn't really buy you anything anyway.

Their documentation isn't any more accurate than mine.

The publicly available documentation for a commercial product is at best some poor technical writer's best guess on how the product works and is to be used. Because of the lead time necessary for producing user documentation, even in electronic form, the user documentation is frequently generated while the product is still under development. It is often based on early, incomplete, and ultimately erroneous, specifications and requirements. No product design ever survives its implementation. And it is no more likely that your competitor's documentation has kept up to date with changes in their product than yours is.

For this reason, it is important to get a working example of the product you are reverse engineering as soon as possible. On Day One of the reverse engineering project you should have the original product sitting on your desk or in your lab. Your goal is to develop such a level of expertise in the use of that product that it will occur to you that consulting on its use could become a handy secondary source of revenue.

Do not open the product up. Do not peek inside. Hook it up as appropriate and start using it just as a normal user would. Before you write a single line of code or bread board a single circuit, try the command or function in question on the original product and see what happens. Under no circumstances trust the manual to accurately describe what the product does. Otherwise you will end up implementing, at best, somebody's preliminary idea of how some hypothetical product might work which bears only a superficial resemblance to the actual product under study.

Their customers are just as smart as mine.

My whole career has been an exercise in quickly becoming an expert in some technology or problem domain with which I have never worked before. One thing I have learned over and over again: customers use the products I build in ways in which I never anticipated, in fact never could have anticipated. This is one of the reasons developers should get out more to visit customer sites. It is also the mechanism through which temporary flaws in products are transformed virtually overnight into necessary features.

The customers that use the product you are reverse engineering are just as smart as the customers of your own products. Those customers will find new and clever ways to use the product to meet their own business needs. Their use will drive the evolution of the product, what features are added, and how they work.

For this reason, whether you realize it or not, you desperately need a heaping helping of domain knowledge about the market into which the product you are reverse engineering is being sold (and there may be more than one). A common pattern for me over the past thirty years is to be thrown head-long into a new domain where I spend at least a while as the resident ignoramus. (You get used to it.) I've been lucky enough to have people with extensive domain knowledge working next to me that I could ask "Okay, why the heck does it work this way?" Almost inevitably, the answer will make perfect sense in the context of some specific use case or application scenario.

Be prepared however for the occasional "it's always been that way". This is perfectly legitimate. Some developer in 1963 may have implemented a new feature in a certain way, maybe at the request of a single customer, and a user culture, or even an industry, grew up around it. It's like the pattern of numbers on a touch-tone telephone pad. Sure, other patterns are possible. And from a technical point of view, they may even be desirable. But if you deviate from that pattern, the few people that buy your phone are going to end up cursing your name.

Their software isn't any more reliable than mine.

As you become the resident expert in the use of the product you are reverse engineering, you will find bugs. If you are lucky, you will cause the product to reboot, lock up, or spontaneously combust. Lucky, because you probably won't be expected to reproduce that behavior. (Well, as in the example below, maybe the rebooting.)

If you are unlucky, you will try something that causes the product to exhibit unexpected, meaning undocumented, behavior. Now you are faced with the question: do I implement the equivalent feature in my product so that it works in a sane manner, or do I make my product bug for bug compatible with the original? For sure, this is a judgement call. However, the expectation of many (if not most) of the customers of the original product being reverse engineered is that your product will misbehave in the same way. One customer's misbehavior is another customer's critical feature upon which all joy depends. "It's handy that the device reboots when we send it this command because that's the only way we've discovered to return it to the factory defaults."
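One way to keep that judgement call open is to make the quirk an explicit, switchable part of the emulation. The following is only a minimal sketch: the `SET_BAUD` command, the `EmulatedDevice` class, the settings, and the reboot-on-bad-argument quirk are all hypothetical, invented here for illustration and not taken from any real product.

```python
class EmulatedDevice:
    """Hypothetical emulation of a device with one undocumented behavior."""

    # Assumed factory settings, made up for this example.
    FACTORY_DEFAULTS = {"baud": 9600, "echo": True}

    def __init__(self, bug_compatible: bool = True):
        # True: misbehave exactly as the (imaginary) original was observed to.
        # False: respond in the "sane" way instead.
        self.bug_compatible = bug_compatible
        self.settings = dict(self.FACTORY_DEFAULTS)

    def handle(self, command: str) -> str:
        name, _, argument = command.partition(" ")
        if name == "SET_BAUD":
            if not argument.isdigit():
                if self.bug_compatible:
                    # Mimic the quirk: a malformed argument "reboots" the
                    # device, which restores the factory defaults.
                    self.settings = dict(self.FACTORY_DEFAULTS)
                    return "REBOOT"
                return "ERR bad argument"
            self.settings["baud"] = int(argument)
            return "OK"
        return "ERR unknown command"


if __name__ == "__main__":
    device = EmulatedDevice(bug_compatible=True)
    print(device.handle("SET_BAUD 115200"))
    print(device.handle("SET_BAUD fast"))
    print(device.settings)
```

Defaulting the flag to the bug-compatible behavior reflects the expectation described above: customers who depend on the misbehavior get it unless someone deliberately turns it off.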

This is even more difficult than it sounds, because the behavior of the product in question is a moving target. The product being reverse engineered will undergo revisions and updates even as you are studying it. I have found it wise to develop a set of regression test cases to test not only my product, but the product under study as I update its firmware or software.
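Such a regression suite can be as simple as sending the same commands to both devices and recording where the replies differ. Here is a rough sketch, assuming each device can be wrapped in a function that takes a command string and returns a reply string. The names `find_divergences`, `reference_device`, and `my_device`, and the commands and replies, are stand-ins made up for illustration; a real harness would talk to actual hardware or software.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: a device is anything that maps one command to one reply.
Device = Callable[[str], str]


def find_divergences(reference: Device, candidate: Device,
                     commands: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Send each command to both devices and collect
    (command, reference reply, candidate reply) for every mismatch."""
    divergences = []
    for command in commands:
        expected = reference(command)
        actual = candidate(command)
        if expected != actual:
            divergences.append((command, expected, actual))
    return divergences


# Stand-in for the product under study (replies invented for this example).
def reference_device(command: str) -> str:
    replies = {"STATUS": "OK", "VERSION": "2.1", "RESET": "REBOOTING"}
    return replies.get(command, "ERR")


# Stand-in for my own implementation, which handles RESET differently.
def my_device(command: str) -> str:
    replies = {"STATUS": "OK", "VERSION": "2.1", "RESET": "OK"}
    return replies.get(command, "ERR")


if __name__ == "__main__":
    commands = ["STATUS", "VERSION", "RESET", "BOGUS"]
    for command, expected, actual in find_divergences(
            reference_device, my_device, commands):
        print(f"{command}: reference said {expected!r}, mine said {actual!r}")
```

Rerunning the same command list after each firmware or software update of the original, with replies saved from the previous version standing in as the reference, would also show when the target itself has moved.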

If you are especially unfortunate, it will never occur to you to even try that particular command sequence, protocol variation, or pattern of function invocation that causes such misbehavior, because, well gosh, it just doesn't make sense. In which case the lack of this bug in your product constitutes a bug in your product.

Hey, if it were easy, anyone could do it.

Their software is just as susceptible to entropy as mine.

As features are added over time to any product, the initial design, which likely did not foresee these features, becomes more and more problematic. Sometimes this is an issue that can be resolved with just major refactoring. "If I had known we were going to have to support more than one network protocol, I would have created a network abstraction layer." Sometimes this is a show stopper. "If I had known we were going to have to run multi-threaded on shared-memory multi-core processors, I would have... crap, we're hosed."

However, you have an advantage over the developers of the product you are reverse engineering. Where their product evolved organically over the span of several years, you at least know all of the requirements up front, and what the current environment is in which it will be used. Whereas their implementation may be full of warts and bolted-on features (not that you'll ever know; see above regarding "lawyers"), you are starting with a clean slate. Plus, you can take advantage of faster processors and larger memory models, and several years of open source code development that you may be able to leverage (depending on the licensing).

Once you ship, of course, you're in the same trap as they are.

Their engineers aren't any smarter than me.

I've seen features in the product I was reverse engineering that looked like the work of a student intern, and not one of those smart interns that I knew would eventually replace me either. Sometimes it is really obvious that different features were specified by different architects, or designed by different engineers, because of the lack of consistency or a radical departure from the typical interface. Of course, it is your goal to build an equally inconsistent emulation no matter how bad it may smell.

But I've also seen features whose design seemed a complete mystery, yet smelled okay. I knew deep down that I just wasn't seeing the big picture. But I also knew that the engineers of the product I was reverse engineering weren't any smarter than me, at least, not by much. I was just missing some critical piece of domain knowledge (see above, "documentation"), or was not seeing the design pattern at work. If the former, a quick trip to the office of my domain expert was usually enough to clear things up. If the latter, sometimes just starting on the design and implementation was enough to clear things up, because it forced me to face the same design tradeoffs as the original developer. Your mileage may vary, but don't hesitate to do a little designing and coding under the auspices of "prototyping". It is surprising what clarity this can bring to the problem at hand.

My product will be at least as fun to develop as theirs.

Reverse engineering is often its own reward, teaching me all sorts of things, because there is nothing quite like learning from a working example. The forensic detective work that goes into reverse engineering has a real CSI quality to it. And when I produce a product that interoperates with or emulates another product, I feel like there is a weird kind of bond between me and the developers of that product. I feel like shouting "You magnificent bastard, I read your book!"

Saturday, June 16, 2007

Accumulated Wisdom

It's funny what sticks with you. From time to time someone will say something that strikes me as uncommonly wise, and applicable in a much broader context. For sure, they may have been quoting someone else, but I heard it from them first. Here are some quotes that keep coming back to haunt me, subject to my faulty memory. (I continually add to this list as I come across new bits of wisdom.)

"Any performance claims a vendor may make should be considered guarantees that their product cannot exceed them." -- Bill Buzbee

"You need to take at least three weeks vacation to realize how trivial everything is at work." -- Bob Dixon

"Negative results are still results." -- Bob Dixon

"90% of producing a product is just turning the crank." -- Glenn Freundlich

"If you and I always agree, one of us is redundant." -- Ken Howard

"Either you design a layered architecture, or you'll wish you had." -- Ken Howard

"To us, it's just another supercomputer." -- Basil Irwin

"Shut up and take the money." -- Bob Kalisch

"Any new high technology has a half-life of about five years." -- Barry Karafin

"I choose not to live in fear." -- Kate Kligman

"Everyone has their own story to tell." -- Marla Meehl

"Obviousness implies understanding." -- Paul Moorman

"I like to write pretty code." -- Tam Noirot

"How do you find the smart kids? You don't have to. You just have to find one smart kid. That kid will know who all the other smart kids are." -- Bernie O'Lear

"Office politics are so bad here because the stakes are so small." -- Bill Patterson

"If you're going to carry a concealed weapon, you have to be Mr. Cool, you can't let anything upset you." -- Mark Passamaneck

"Every time I pick up a telephone handset and hear dial-tone, I know a little miracle has occurred." -- Ron Phillips

"There are a lot of real worlds." -- Al Sanders

"This could make us famous in the organization... or very well known." -- Mike Manuel

"This is just temporary... unless it works!" -- Red Green

"The 'S' in 'IoT' stands for Security." -- Anne Trotter

"You want to know what makes airplanes fly? It's not the Bernoulli effect. It's not the angle of attack. It's money. Money makes airplanes fly. Money, and a lot of it." -- Steve Niessner

"Nothing happens in contradiction to nature, only in contradiction to what we know of it." -- Dana Scully

"Natural selection believes in you, even if you don't believe in natural selection." -- Gary Longsine

"Adventure is adversity recounted at leisure." -- David Braun (who may have been quoting someone else)

And finally, a few things I've said myself, over and over again.

"It isn’t until you see a globe that you realize every world map you’ve ever seen has been lying to you." -- Me

"I've learned over the years that when a pretty girl pays attention to me, it's probably only because she wants to copy my homework." -- Me

"The only thing worse than management not paying attention to you is management paying attention to you." -- Me

"If you waste your people's time you are teaching them that it is okay to be wasteful." -- Me

"No design ever survives its implementation." -- Me

"Forward references are a sign of weakness." -- Me

"My entire business model depends on my working outside my comfort zone." -- Me

"When you find yourself saying 'we have to do this', append the phrase 'at any price' and then see how you feel." -- Me

"I'm a product developer. For me to be successful, I have to ship a product." -- Me

"No one has ever shipped a perfect product, and we aren't going to be the first." -- Me

"Terrible jobs are all alike, but great jobs are each great in their own way." -- Me (paraphrasing Tolstoy)

"A friend of mine once asked me what faith or religion or moral code or philosophy of life I might claim to have, and I said in all seriousness that I was an economist." -- Me

"Sure, my boss is an asshole. But at least I'm sleeping with his wife." -- Me (on being self-employed)

"When all you have is a pushdown automaton, everything looks like an LL(1) grammar." -- Me

"There's no kill like overkill." -- Me

"After more than forty years in my profession, I have come to understand that the very best engineers are fundamentally economists with some technical skills." -- Me

"One of the problems of being a real-time/embedded developer is that we think of 38 microseconds as being a long time. But it isn't a long time. It's only 38 millionths of a second. In the non-semiconductor, non-electromagnetic, non-quantum, non-relativistic world, I'm told almost nothing gets done in 38 millionths of a second." -- Me (pondering relativistic effects on the atomic clocks in GPS satellites)

"Trains, container ships, oil tankers, and large bureaucratic organizations all have two things in common: [1] they won't make any sudden turns, and [2] you really don't want to get in their way." -- Me

"We strive for immortality. Some immortality is genetic. Some is memetic. Glory is a form of the latter. But may lead to the former." -- Me

"'Lifelong learning' is just another term for 'adulthood'." -- Me

"Getting me to talk isn't the problem. Getting me to shut up is sometimes the problem." -- Me (on being invited to give an impromptu briefing)

"Any sufficiently advanced VR is indistinguishable from reality." -- Me

"Oh no! Not another learning experience!" -- Me

"Rule of thumb: unless the fault is with us, we can't be the only ones seeing this problem." -- Me

"One crisis at a time." -- Me