Monday, September 16, 2024

When The Minimum Viable Product Is Too Minimal And Not Viable

All technology product development is fractally iterative, whether you want it to be or not. The agile development process at least recognizes this. But agile, and its idea of a Minimum Viable Product (MVP), replaces the waterfall process' up-front requirements phase - which consumes a lot of thought, research, and consensus ahead of time - with a competent product manager and close proximity to the customer. My long professional experience working in both waterfall and agile processes suggests that this can work. Except when it doesn't.

2024 BMW R1250GS Adventure: I-25 and CO-60 near Johnstown Colorado

This past spring I was the victim of a Minimum Viable Product strategy when I bought BMW Motorrad's latest GPS device, the Connected Ride Navigator (CRN-1), for my 2024 BMW R1250GS Adventure, my fourth BMW motorcycle. I spent about US$800 on the CRN-1, and it was a disaster. Prior BMW Motorrad navigators were built by Garmin, and to be fair, had their own hardware issues. But this one was a BMW product, reportedly with TomTom maps. The hardware seemed pretty solid, but it was as if the software had been designed and written by someone who had never used a navigator (BMW's or otherwise), and had never ridden a motorcycle.

IMG_5990

Besides having lots of professional experience writing code to use GPS devices and to use Open Street Maps, I had used an old Garmin standalone unit on many car trips, and my Subaru WRX has an in-dash TomTom. My basic navigation needs are simple. I want to know what road I'm on. I want to know what the next cross street is. I want to know what direction I'm going. Basic stuff like that. The CRN-1 couldn't do any of that. On a recent trip through northern New Mexico, the screen was typically all gray with a single green line - presumably indicating the road - on it; no labels, no other information. And when there were labels, the font was so tiny as to be unreadable to my old eyes, even through my progressive spectacles.

Here's the MVP thing: since I bought the CRN-1, there have been two software updates, and with each one the device has gotten a little better. But after the New Mexico debacle I had already bought a Garmin Zūmo XT2 navigator, a motorcycle-specific model from BMW's now-competitor, for about US$500. Since I had to modify the navigator cradle on the motorcycle for the XT2, I am unlikely to ever go back.

Sure wish I hadn't spent the money on the CRN-1. You'd think I'd know better than to buy the first release of any tech product. After all, that's why I bought the R1250GS instead of its R1300GS replacement. I'm used to BMW's motorcycle products being well designed and overpriced; the new BMW navigator got one of those right. The MVP CRN-1 was too little and too late.

Thursday, September 12, 2024

Large Language Models and the Fermi Paradox

(I originally wrote this as a comment on LinkedIn, then turned the comment into a post on LinkedIn, then into a post for my techie and science fictional friends on Facebook, and finally turned it into a blog article here. I've said all this before, but it bears repeating.)

The destruction of the talent pipeline by the use of AI for work normally done by interns and entry-level employees not only threatens how humans fundamentally learn, but leads to AI "eating its own seed corn". As senior experts leave the work force, there will be no one left to generate the enormous amount - terabytes - of content necessary to train the Large Language Models.

Because human generated content will generally be perceived to be more valuable than machine generated content, humans using AI to generate content will be highly incentivized to not identify AI generated content as such. More and more AI generated content will be swept up along with the gradually diminishing pool of human content to use as training data, in a kind of feedback loop leading to "model collapse", in which the AI produces nonsense.

A former boss of mine, back in my own U.S. national lab days, once wisely remarked that this is why the U.S. Department of Energy maintains national labs: experienced Ph.D. physicists take a really long time to make. And when you need them, you need them right now. Not having them when you need them can result in an existential crisis. So you have to maintain a talent pipeline that keeps churning them out.

It takes generations in human-time to refill the talent pipeline and start making more senior experts, no matter what the domain of expertise. Once we go down this path, there is no quick and easy fix.

The lobe of my brain that goes active at science fiction conventions suggests that this anti-pattern is one possible explanation for the Fermi Paradox.

Tuesday, August 20, 2024

A Science Fictional Idea: Noise and the Fermi Paradox

Today, Sabine Hossenfelder, "my favorite quantum physicist", had an article about an academic paper in which the author proposed that aliens using really advanced technologies might be able to modulate the quantum properties of photons in order to carry information. (Note: this is not the science-fictional idea of using entangled particles to communicate faster than light, which - sorry - is not possible; there is no way to modulate the effect to carry information.) Such a modulation scheme could carry a lot more information than our current schemes that modulate properties like amplitude, frequency, phase, etc. A communication beam with quantum modulation would have to be extremely narrowly focused, lest it run into some other matter which would cause the quantum properties to decohere, losing all the information content.

The author wrote that a possible approach would be to use frequencies in the infrared, which would require antennas on each end about one hundred kilometers across - which is, actually, not a completely crazy idea. Dr. Hossenfelder also mentioned that because of the decoherence issue, the aliens would be careful NOT to aim the beam near any planet, like, you know, ours. But even if we did receive it, we would have no clue how to demodulate it. I got to thinking about this. (That's why I follow Dr. Hossenfelder, and even support her work on Patreon.)

I recently spent three days at the NIST Time & Frequency Seminar held at the National Institute of Standards and Technology (NIST) Boulder laboratories. A big part of that seminar was a series of demonstrations on how to measure and characterize noise in precision frequency sources - precision frequency sources like the NIST-F2 cesium fountain clock, shown below, which is the principal frequency reference for the definition of UTC(NIST), the United States' contribution to the international definition of Coordinated Universal Time or UTC. This noise measurement and characterization is not completely removed from measuring noise in communications systems, which, by the way, depend on precision frequency references to work. (A big Thank You to Dr. Jeffrey Sherman, below, for the tour!)

Untitled

Along the commuter train line from our neighborhood to downtown Denver there is an old AT&T Long Lines tower with the giant microwave horn antennas that used to be the backbone of the long distance telephone system. This was before fiber optic cables were run along every railroad track - because the railroads owned the rights of way (an effort which gave the telecommunications company SPRINT its name: "Southern Pacific Railroad Internal Networking Telephony").

The Spousal Unit is so very tired of me telling the story - which I do virtually every time we ride the train (sorry) - of the two Bell Labs engineers, Arno Penzias and Robert Wilson, who were tasked with figuring out and eliminating the source of the noise in the early models of these microwave antennas (which were big enough you could easily stand up inside of them). Alas, they ultimately weren't able to eliminate it: they determined the noise was the Cosmic Microwave Background Radiation that was the result of the Big Bang. They were picking up the noise from the birth of the Universe. As a consolation, however, they did win a Nobel Prize in Physics. And helped launch the field of observational cosmology. (I eventually took a little motorcycle ride and found that Long Lines tower, shown below.)

Untitled

Noise exists in every communication link, whether it's radio, wire, optical, etc. You can't get rid of it completely. Eventually maybe you give up and just declare it's "cosmic background", or "thermal noise", or "electrical noise from other equipment in the room". But noise in communication systems is no small problem; given enough, it can jam your GPS, your WiFi, your mobile phone, etc., or just make your vintage vinyl albums sound bad.

I very dimly recall a result from Information Theory that says something like: the output of a theoretically optimal data compression algorithm is indistinguishable from noise. That is: there is no statistical test that can tell you whether the data stream you're looking at is just noise, or is optimally compressed data. (That's not quite the same as saying, however, that it is random.)

What if the aliens have a nearly optimal compression algorithm? (A perfectly optimal one is impossible.) Sending data from one star system to another is bound to be really expensive, not to mention take a long time. So they would be highly incentivized to use such an algorithm. What if part of the noise we see and hear and receive every day in our own radio communications systems is really alien data transmissions?

We could be awash in extraterrestrial data communications and not even know it.

Monday, April 08, 2024

Ancient History

I bought a four terabyte (4TB) SSD the other day at the local big box store. A Samsung T7 Shield (which I think just means it comes with a rubberlike case around it). It was substantially discounted, probably because the new T9 model is out. Easily fits in my shirt pocket. Hooked it up via the included USB cable to our network-attached storage box, and I'm now using it to automatically back up two Mac laptops and a Mac desktop at the Palatial Overclock Estate.

Mind blown.

Because I am ancient almost beyond belief - it's a miracle I'm still alive, especially considering my hobbies - I remember thirty years ago helping to write a proposal to DARPA to build a one terabyte (1TB) hierarchical storage system that would have included rotating disks and a robotic tape library. It would have taken up an entire room. Can't blame them for not funding it. Someone smarter than me (which could have been just about anyone) probably saw this all coming.

That same organization, the National Center for Atmospheric Research (NCAR) in Boulder Colorado, had the only CRAY-3 supercomputer outside of Seymour Cray's Cray Computer Corporation. Today, your Raspberry Pi Single Board Computer (SBC) - and not even the latest Pi model 5 - has more horsepower than that CRAY-3. And the SBC would fit in your shirt pocket as well.

Don't bet against Moore's Law.

Although as I am always quick to point out, what exactly Moore's Law implies has changed over the past few years. Which is why I was tickled when someone who is using my Diminuto C systems programming library passed along a command line that builds the software by running make across sixteen parallel threads of execution - taking advantage of the trend towards multicore processors, now that it's become difficult to make individual processors faster. Between much of the build process being I/O bound and the Raspberry Pi 5 having four processor cores, this approach really speeds up the build.

CRAY-3

That's me, about thirty years ago, leaning against the world's most expensive aquarium; the CRAY-3 logic modules were visible under the transparent top, fully immersed in Fluorinert.

Wednesday, March 20, 2024

Converting GPIO from the legacy sysfs ABI to the ioctl ABI in Diminuto and Hazer

It could be that no one but me is using my "Hazer" GNSS library and toolkit, or the "Diminuto" C-based systems programming library on which it depends. But just in case: I'm close to finishing testing of the develop branch of both repos - both of which have some major changes to how General Purpose Input/Output (GPIO), the generic term for controlling digital input and output pins in software, is handled - and to merging develop back into the master branch.

This was all motivated by my being one of the lucky few to get a backordered Raspberry Pi 5, and putting the latest version of Raspberry Pi OS, based on Debian "bookworm" and Linux 6.1, on it, only to find when unit and functional testing my code that the deprecated sysfs-based GPIO ABI no longer worked. This wasn't a big surprise - I had read at least two years ago that the old ABI was being phased out in favor of a new ioctl-based ABI. My code makes heavy use of GPIO for a lot of my stuff, e.g. interrupts from real-time sensors, the One Pulse Per Second (1PPS) signal from GNSS receivers, status LEDs, etc. So it was finally time to bite the bullet and replace all the places where I used the sysfs-based Diminuto Pin feature (diminuto_pin) with a new feature using the ioctl-based ABI. Hence, the Diminuto Line (borrowing a term from the new ABI) feature (diminuto_line).
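
For anyone facing the same conversion, here's a minimal sketch of the new character-device ABI - my own illustration, not Diminuto's actual implementation - that requests a single line as an output and drives it, using the version 2 structures and ioctl commands from linux/gpio.h. The device path and line offset you pass in are board-specific.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/gpio.h>

/* Request one GPIO line as an output via the v2 ioctl ABI and set
 * its value. Returns 0 on success, -1 on failure. This replaces the
 * old echo-into-/sys/class/gpio idiom. */
static int line_set(const char * device, unsigned int offset, int value)
{
    struct gpio_v2_line_request request;
    struct gpio_v2_line_values values;
    int chip;
    int rc = -1;

    chip = open(device, O_RDONLY);
    if (chip < 0) { return -1; }

    memset(&request, 0, sizeof(request));
    request.offsets[0] = offset;
    request.num_lines = 1;
    request.config.flags = GPIO_V2_LINE_FLAG_OUTPUT;
    strncpy(request.consumer, "line_set", sizeof(request.consumer) - 1);

    if (ioctl(chip, GPIO_V2_GET_LINE_IOCTL, &request) >= 0) {
        memset(&values, 0, sizeof(values));
        values.mask = 0x1; /* operate on the first (only) requested line */
        values.bits = (value != 0) ? 0x1 : 0x0;
        if (ioctl(request.fd, GPIO_V2_LINE_SET_VALUES_IOCTL, &values) >= 0) {
            rc = 0;
        }
        (void)close(request.fd); /* releases the line */
    }

    (void)close(chip);

    return rc;
}
```

Something like line_set("/dev/gpiochip0", 18, 1) drives the line high; which gpiochip device the header pins appear on varies by board and OS release, which is part of the portability problem I complain about below.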

Line is now used in place of Pin in all of the Diminuto functional tests that run on hardware test fixtures I wired up many years ago just for this purpose, and all the functional tests work. The Hazer gpstool utility has similarly been converted to using Line instead of Pin and has been tested with an Ardusimple board using a u-blox UBX-NEO-F10T GNSS receiver.

IMG_5717

(That's a Pi4 on the left connected to my test fixture, and a Pi5 on the right connected to the GNSS receiver.)

Two complaints.

[1] The new ABI is woefully under-documented. However, I found some code examples in the Linux kernel source repo under tools/gpio that were essential to my understanding. (I chose not to use the new-ish libgpiod for my work, for reasons, but that is a story for another time. I have no reason to believe that it's not perfectly fine to use.)

[2] The way the ioctl GPIO driver is implemented on older versus newer Raspberry Pi OS versions makes it difficult - I am tempted to say impossible, but maybe I'm just not that smart - to write code that easily runs on either platform using the new ABI. Specifically, the GPIO device drivers in the OSes use a different symbolic naming scheme, making it impossible for application code to select the correct GPIO device and line offset portably on the two platforms. But maybe I'm just missing something. (I hope so.)

I like the new ioctl ABI, and expect to use it exclusively moving forward, even though this will orphan Pis I have that might run older versions of the OS. (I think I have at least one example of every Pi version ever sitting around the Palatial Overclock Estate. Ten of them run 24x7 in roles such as a web server, an Open Street Map server, a Differential GNSS base station, and an NTP server with a cesium atomic clock for holdover.) I have tagged the last version of both repos that still use the sysfs ABI.

That's it.

Update (2024-03-23)

I merged the develop branch into the master branch this morning. Both the Diminuto build and the Hazer build passed sanity and functional testing (and I'm currently running the long-running "geologic" unit test suite against Diminuto). I had tagged the old master branch in both repos with the name sysfsgpio in case I needed to build them, but I don't anticipate any further development of the old code.

Thursday, March 07, 2024

AI on the Battlefield

The name, "Tactical Intelligence Targeting Access Node" (TITAN), is pretty clever. Peter Thiel's Denver-Based Palantir Technologies, a software-driven data analytics company in the defense and intelligence domain, just won a US$178M contract to build an AI-driven mobile battlefield sensor fusion platform. From Palantir's home page: "AI-Powered Operations, For Every Decision". In this context, TITAN consumes a huge amount of data from remote sensors and tells soldiers what to destroy.

Cool. And absolutely necessary. Soldiers on the battlefield are inundated with information, more than humans can assimilate in the time they have. And even if we didn't build it, our peer adversaries surely will (or more likely, are).

This is the kind of neural network-based AI that's going to mistake a commercial airliner for an enemy bomber and recommend that it be shot down, even if its own cyber-finger isn't on the trigger. Because time is short, and if no other information is forthcoming, someone will pull that trigger.

In the inevitable following Congressional investigation, military officers, company executives, and AI scientists and engineers will be forced to admit that they have no idea why the AI made that mistake, and in fact they can't know, because no one can. When you have an AI with over a trillion - not an exaggeration - variables in its learning model, no one can understand how Deep Learning really works.

Seriously, this is a real problem in the AI field right now. AIs do things their own developers did not anticipate, and cannot explain.

Accidental commercial airliner shoot downs are so common they have their own Wikipedia page. And it's just a matter of time before the cyber-finger is on the trigger, because it can respond so much more quickly than its overwhelmed human operators.

The worst thing that could happen is for TITAN to be an unqualified success. Someone will get the idea that maybe such a system should have its cyber-finger on the red button for strategic ICBMs.

Tuesday, February 20, 2024

Pig Butchering with Large Language Models

I have my Facebook default privacy settings locked down so that only my FB friends can see my posts on my timeline. And I only accept friend requests from folks I feel I know pretty well, and typically only those I know in meat space. But when I shared my post about selling a BMW motorcycle to my motorcycle club's group on FB, I had to change the privacy setting of that particular post from private to public so that members who weren't on my FB friends list could see it. The comments below are the result.

Pig Butchering With LLMs

Take a close look at them. All of course claim to be from attractive young women. The first two of them are just short comments trying to get me to engage. The fourth one is a long missive that is probably a standard form letter with no specific detail. But the third one has enough specificity that it had me looking up the commenter's profile: a young divorced Asian woman in the fashion industry who lives in San Francisco. Possible but not likely in the BMW motorcycle owner demographic.

It was almost certainly written by an AI using the current technology based on artificial neural networks, like the Large Language Models behind ChatGPT. It has all sorts of detail about my post, and at first seems legit, but is really nothing much more than a rewording of what I originally posted to the group.

This is where LLMs are taking the pig butchering or romance scam artists. As they are trained with more and more data, they are just going to get better and better.

Wednesday, February 14, 2024

Are AI Generated Works Intellectual Property?

The U.S. Patent and Trademark Office (USPTO) has once again stressed that only humans can be listed as inventors on patents. And the U.S. Copyright Office, part of the Library of Congress and typically a small bureaucracy with just a few people, is about to make big news as it evaluates whether AI generated works can be copyrighted.

If the USPTO declines to recognize AI "inventors", and the Library of Congress similarly disallows copyrighting of AI generated material, that's going to really put a crimp in the monetization of AI generated intellectual property, since it cannot be protected.

My current thinking is that, right now, it's the right thing to do.

The current technology of Generative Pre-trained Transformer (GPT) AIs is nothing more than gigantic text or image prediction engines based on huge artificial neural network-based statistical models trained with enormous amounts of human created and curated input - input for which the original authors and artists are not being compensated, despite the fact that their work may have been copyrighted. There's no cognition or creativity involved.

But the counter argument is worth thinking about.

We ourselves are nothing but gigantic text or image prediction engines based on huge natural neural network-based statistical models trained with enormous amounts of human created and curated input - material we have read or examined - for which the original authors and artists are not being compensated, despite the fact that their work may have been copyrighted.

The difference is that when we write or make art, we may be trying to use the trained neural network in our brain to create what others have not done before. That's creativity.

Update (2024-02-20)

Another counter argument is that there is creativity and cognition involved in the prompt engineering - the term used for the creation of the prompt, or series of prompts, the human operator gives the AI to produce its output. Perhaps, in this respect, using an AI is no different than using tools like Microsoft Word or Adobe Photoshop for your writing or art.

I'm still leaning towards not providing IP protection for AI generated output. But this is a complicated issue. As the subtitle of my blog reminds you, 90% of this opinion could be crap.

Sources

(Perhaps ironically, this article is based on the no doubt copyrighted work of several others that I would like to cite... if only I could remember them. As I do, I'll add the citations here.) 

Emilia David, "US patent office confirms AI can't hold patents", The Verge, 2024-02-13, https://www.theverge.com/2024/2/13/24072241/ai-patent-us-office-guidance

Cecilia Kang, "The Sleepy Copyright Office in the Middle of a High Stakes Clash over A.I.", The New York Times, 2024-01-25, https://www.nytimes.com/2024/01/25/technology/ai-copyright-office-law.html

Louis Menand, "Is A.I. the Death of I.P.?", The New Yorker, 2024-01-15, https://www.newyorker.com/magazine/2024/01/22/who-owns-this-sentence-a-history-of-copyrights-and-wrongs-david-bellos-alexandre-montagu-book-review

Shira Perlmutter, "Copyright Registration Guidance: Works Containing Material Generated by Artificial Intelligence", U.S. Copyright Office, Federal Register, 2023-03-10, https://copyright.gov/ai/ai_policy_guidance.pdf

Katherine Kelly Vidal, "Inventorship Guidance for AI-assisted Invention", U.S. Patent and Trademark Office, Federal Register, 2024-02-13, https://public-inspection.federalregister.gov/2024-02623.pdf

Thursday, January 25, 2024

Large Language Models/Generative Pre-trained Transformers

(I'm turning this stock comment into a blog article so that I can refer to it in the future.)

My concern is that by the time we figure out we need an enormous volume of high quality content created and curated by human experts to correctly train Large Language Models (LLMs) like ChatGPT, we will have eliminated all the entry-level career paths of those very same human experts by using those same LLMs. As the existing cohort of experts retire, die, move into management, or otherwise quit producing content, there will be no one to take their place.

We will have “eaten our own seed corn”.

Because human-created and -curated content will be more expensive to produce, organizations will be strongly incentivized to use LLM-created content to train other LLMs - or perhaps even the same LLM. This tends to cause errors in the training data to be amplified, leading to model collapse, where the LLM produces nonsense. (This is less likely to happen with human-created content because humans, unlike an algorithm, are unlikely to make exactly the same mistakes.)

Because human-created and -curated content will be deemed to be of higher quality, organizations will be strongly incentivized to not label LLM-created content as such. This will be problematic for LLM developers who are looking for the enormous amounts of high quality data necessary to train their models. 

We will have "salted our own fields".

The seeds of the destruction of LLMs lie in the economics of creating and using LLMs.

I believe that LLMs have a future in being used as tools by experienced users in the same way such users may use tools like Wikipedia, Google Search, and StackOverflow today, with much of the same risk.

Saturday, January 13, 2024

Military EMSO Versus Commercial Aircraft

Jeff Wise wrote this interesting article about how commercial aircraft are getting all crossways - figuratively and literally - as nation states and other actors are jamming and spoofing GPS/GNSS and using other ElectroMagnetic Spectrum Operations (the broader term that has replaced Electronic Warfare) generally targeted at military activity. Like a lot of embedded systems, the boxes inside commercial aircraft were never designed with malware and malicious signals in mind.

Jeff Wise, "Air Travel Is Not Ready For Electronic Warfare", New York Magazine, 2024-01-02

I belong to the Association of Old Crows, a professional society for EMSO folks, and I get their Journal of Electromagnetic Dominance. It's mostly about RF stuff far more low level than my area of expertise, being an embedded/real-time/telecom software/firmware guy, so I can't really appreciate most of it. But the volume of ads and articles in the journal makes it obvious this is a highly active area for both defense and offense.

Black Box AIs in Air Defense Systems

I've said many times - everyone is probably tired of hearing me say it - that I think the use of neural network AI - like that used in LLMs/GPTs - in air defense systems for target identification is inevitable. And putting the AI in control of firing to reduce response time will also happen. Accidental shootdowns of commercial aircraft due to human error are common enough that they have their own Wikipedia page, so the AI will actually probably be more accurate than humans. But it's just a matter of time until a commercial aircraft is misidentified by an AI as an enemy target. And when there's the resulting U.S. Congressional investigation into the loss of innocent civilian lives, many are going to be surprised when the defense contractors say that not only does no one know why the aircraft was misidentified, no one can know. That's how these massive neural network algorithms work; they're so far mostly black boxes.

Model Collapse In Air Defense System AIs

We need an enormous volume of high quality content created and curated by human experts to correctly train LLM/GPT-type AIs. Because such data sets are labor intensive, and therefore expensive, to create and to assemble, there will be enormous pressure to train AIs with AI-produced data. This might even happen unknowingly (as has already in fact happened) if the provenance of the original content isn't well documented (or the people building the AI just don't care). (There will be strong incentives not to reveal that content is AI generated, because human-created content will be so much more highly valued.) Training AIs with AI-generated data leads to model collapse, a kind of feedback loop in which errors and hallucinations in the training data are reinforced.

This is likely to occur with the air defense AIs I described above.

And there will be no quick way to fix this. We will likely have eliminated all the career paths of those very same human experts by our use of those same LLMs for their entry level jobs. As the existing cohort of experts retire, die, move into management, or otherwise quit producing content, there will be no one to take their place. See also: "eating your own seed corn".

Thursday, January 11, 2024

The Disastrous Cultural Evolution of Boeing

The news is full of the most recent Boeing debacle involving the 737 MAX 9 airliner and its door plug that bailed out during flight to land in someone's back yard, leading to sudden cabin depressurization and an emergency landing.

A colleague of mine (Thanks, Jeff!) passed along this interesting and short-ish article on some of the recent history of Boeing, published in The Atlantic about the time of the 737 MAX 8 crashes in 2019 involving the aircraft's Maneuvering Characteristics Augmentation System (MCAS).

Jerry Useem, "The Long Forgotten Flight That Sent Boeing Off Course", The Atlantic, 2019-11-20

The gist:

In 1997, Seattle-based Boeing merges with the much smaller McDonnell Douglas (MCD) in a stock swap. Analysts at the time described it as MCD buying Boeing with the larger company's own money. Surprisingly, the finance-centric (read: MBAs) management of MCD takes over the upper management tiers of Boeing that was previously manned by former engineers. Then, in 2001, the new MCD-based upper management gets tired, apparently, of being questioned by the engineers about cost-cutting and safety concerns, so the entire upper management team moves to new digs in Chicago, 1500 miles away from where the aircraft are built.

WTF?

My favorite jobs over the past four decades plus change have been those in which software, firmware, and hardware product development were closely associated - both culturally and geographically - not just with each other, but also with testing, management, production, and customer support.

My interest in this isn't just from a general product development perspective.

I've had the privilege of having worked on several embedded systems products for the business aviation market, products that could use the term cloud computing in a literal sense. None of those products were flight safety critical. For you aviation geeks, our processes conformed to AS9100, a quality standard, and with DO-178C DAL D, a safety standard, and were tested under DO-160. I even did some hacking with ARINC 429 (an aviation packet bus) and ARINC 717 (an aviation synchronous bus used to log to the aircraft flight data recorder). I got to make the Asterisk open source PBX work with the cockpit two-wire headsets, and with the Inmarsat and Iridium satellite constellations. That job had me crawling around in the equipment bay of a two-engine business jet, and taking short test flights.

(I took these photographs of our Bombardier Challenger 600 test aircraft at Centennial Airport (KAPA) near Denver Colorado.)

On Its Way

Interior Looking Aft

I even got to do some product integration at the Gulfstream Aerospace plant in Savannah Georgia, where I may have walked through Oprah Winfrey's new private jet on the assembly line.

It doesn't get much better than that.

Although our business aviation products were certified for DO-178C Design Assurance Level D - the least safety critical level for which the U.S. Federal Aviation Administration requires certification - we had to have our software certified by an FAA Designated Engineering Representative (DER), essentially an FAA-licensed contracted inspector. That turned out to be no small thing. From what I've read, the processes for DAL A - flight safety critical - aviation products are like software development dialed up past eleven. The amount of scrutiny and testing that every single line of code receives makes you wonder how the MCAS debacle ever happened. Although it's interesting to note that, like many large aviation companies, Boeing had its own DERs on its payroll.

The cautionary tale of the disastrous cultural evolution of Boeing is a remarkable one, from both a safety and a product development point of view.

Friday, January 05, 2024

Right to Repair, Polish Train Hackers, and the NSA's Ghidra

Google "Polish train hackers" and you'll find dozens of articles in the tech press about this story. Here is the link to the one I read, which was translated from the original Polish. It's terrific. Compelling reading if you're interested in the misuse of Digital Rights Management or the Right To Repair movement. Or if you're into embedded systems development and troubleshooting. Or just if you're into stories of heroic efforts by engineers.



Polish embedded systems hackers use (get this) the U.S. National Security Agency's open source Ghidra tool, originally intended to reverse engineer binaries of computer viruses and other malware, to figure out why high-tech passenger trains, like the one in the video above, quit working after undergoing routine maintenance by a third party - maintenance performed according to the train manufacturer's own two-thousand-page maintenance manual.

What did they discover in various versions of the train software/firmware?

  • Odometer checks that prevent a train from running after a million miles.
  • Year, month, and day checks that prevent a train from running after a certain date.
  • Geofencing checks (naturally the trains have GNSS receivers) that prevent a train from running if it is within the boundaries of a competitor's maintenance depot.

I've used Ghidra myself, and written about it in my blog. The tool includes not just a disassembler similar to objdump, but also a remarkable decompiler that can translate machine code using common C compiler idioms and patterns back into C code. Ghidra understands a wide variety of Instruction Set Architectures. Just recently I've been using it to study the binaries of my own code compiled for a RISC-V target.

DRM and Right To Repair are a big deal in the U.S. Manufacturers of agricultural equipment, like farm tractors costing six figures, have resorted to similar shenanigans to prevent even the farmers who own the equipment from repairing their own stuff. So much so that Right To Repair legislation is coming to the forefront in both state and federal legislatures.