I had to demonstrate my software to the professor to pass the course. So I signed up for a morning slot in his schedule. My software worked. I passed.
After the term was over, I had reason to go back into that lab during the break, and I ran into the hardware support guys taking the system apart.
"What's the deal?"
"We think there's a loose solder joint or something in the disk controller. It quits working when it gets warm."
I smiled and nodded and went on my way.
(I would go on to teach this class as a graduate student, the original professor would be my thesis advisor, and what I learned in that class formed the basis for my entire career since then. It would also form the basis for Mrs. Overclock's Rule: "If Mr. Overclock spends too much time debugging his software, he should start looking at the hardware.")
* * *
Decades ago I ran a systems administration and lab support group at a state university. It was the end of the academic term and I was deleting the course accounts to clean up the disk on a VAX/750 running Berkeley Unix 4.2 in one of the labs I was responsible for. This is something my student assistants normally did, but I thought I would get started on it.
The cleanup actually took a long time to execute, so I was going to run it as a background process so I could do other stuff on the system console as it ran. I logged in and executed the following commands.
rm -rf home/cs123 &
I noticed that I didn't get a shell prompt back as I expected to. I waited for a moment or two more, began to get concerned, then started looking more closely at exactly what I had typed.
Have you ever noticed that the * character and the & character are right next to each other on the QWERTY keyboard?
I tried to cancel the command but it was too late. I had just started a process that would delete the entire root file system — including the operating system — from the disk.
One of my student assistants walked by and noticed me staring at the console. She asked "How is it going?"
I sighed and said "Could you please go fetch me the backup tapes?"
(I would go on to automate the process of creating and deleting student directories with shell scripts so that this would be unlikely to ever occur again.)
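A minimal sketch of the kind of guard such a script might apply is shown here. It is not the original script: the function name `remove_course_account`, the rule that account names contain only letters and digits, and the choice to skip symbolic links are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not the original script): delete one course account
# directory under a fixed base directory, and refuse anything that looks odd.

remove_course_account() {
    base=$1
    name=$2

    # Require an absolute base path with something after the leading slash,
    # so a bare "/" or a relative path is rejected.
    case $base in
        /?*) ;;
        *) echo "refusing base: '$base'" >&2; return 1 ;;
    esac

    # Assumed naming rule: letters and digits only. This rejects empty names,
    # slashes, dots, spaces, and glob characters such as "*".
    case $name in
        ''|*[!A-Za-z0-9]*) echo "refusing name: '$name'" >&2; return 1 ;;
    esac

    target=$base/$name

    # Only remove a real directory, never a symbolic link or a missing path.
    if [ ! -d "$target" ] || [ -L "$target" ]; then
        echo "not a real directory: $target" >&2
        return 1
    fi

    # Quoted path, with "--" so the argument is never parsed as an option.
    rm -rf -- "$target"
}
```

A call such as `remove_course_account /home cs123` removes only that one directory. The function reads just its first two arguments, so anything extra on the command line, including whatever a stray `*` expands to, is ignored rather than deleted.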
* * *
Back in my Bell Labs days I was in a lab struggling to troubleshoot some complex real-time traffic shaping firmware I had written for an ATM network interface card that had an OC-3 fiber optic physical interface. Using fiber optics meant the test equipment was all horrendously expensive.
I was working late one night, and truth be told a little peeved at myself for taking this long to debug my code, when it suddenly dawned on me that between the ATM broadband analyzer, the ATM protocol analyzer, the multi-cabinet telecom equipment under test, the network traffic generators, and all the fiber optic cable I had strewn all over the place, I was probably using a million dollars' worth of equipment just to debug my code. It was a major insight: I could never have done that kind of work in a smaller organization.
With all the emphasis these days on cheap computers and free open source software (which much of my current work certainly takes advantage of), that's something I think is often unappreciated: there are some problems you just can't tackle without a million dollars' worth of equipment.
* * *
A longtime client asked me to come in for an afternoon to one of their labs to help debug some cellular telecom equipment that had been returned from the field and for which I was one of the principal platform developers. We sat at the lab bench watching log messages scroll by on a laptop connected to the unit while a technician got the unit to go into its failure mode.
"Okay", I began, "this is likely to be a hardware problem. There is a failure with the connection between the ARM processor and the PowerPC processor. It sure looks like an intermittent solder joint failure."
"Oh, no", said the technician, "we think this is a software problem. We were thinking you could..."
As he spoke I slammed my hand against the side of the cabinet and the problem went away.
"... oh... Okay, I'll mark that one down as a hardware problem."
Of course, I had no idea what was going to happen when I hit the side of the cabinet. I was just doing that as a diagnostic step.
But it did make me look like a fraking genius.