Thursday, January 25, 2024

Large Language Models/Generative Pre-trained Transformers

(I'm turning this stock comment into a blog article so that I can refer to it in the future.)

My concern is that by the time we figure out we need an enormous volume of high quality content created and curated by human experts to correctly train Large Language Models (LLMs) like ChatGPT, we will have eliminated all the entry-level career paths of those very same human experts by using those same LLMs. As the existing cohort of experts retire, die, move into management, or otherwise quit producing content, there will be no one to take their place. We will have “eaten our own seed corn”.

Because human-created and -curated content will be more expensive to produce, organizations will be strongly incentivized to use LLM-created content to train other LLMs - or perhaps even the same LLM. This tends to cause errors in the training data to be amplified, leading to model collapse, where the LLM produces nonsense. (This is less likely to happen with human-created content because humans, unlike an algorithm, are unlikely to make exactly the same mistakes.)

Because human-created and -curated content will be deemed to be of higher quality, organizations will be strongly incentivized to not label LLM-created content as such. This will be problematic for LLM developers who are looking for the enormous amounts of high quality data necessary to train their models.

The seeds of the destruction of LLMs lies in the economics of creating and using LLMs.

I believe that LLMs have a future in being used as tools by experienced users in the same way such users may use tools like Wikipedia, Google Search, and StackOverflow today, with much of the same risk.

Saturday, January 13, 2024

Military EMSO Versus Commercial Aircraft

Jeff Wise wrote this interesting article about how commercial aircraft are getting all crossways - figuratively and literally - as nation states and other actors are jamming and spoofing GPS/GNSS and using other ElectroMagnetic Spectrum Operations (the broader term that has replaced Electronic Warfare) generally targeted at military activity. Like a lot of embedded systems, the boxes inside commercial aircraft were never designed with malware and malicious signals in mind.

Jeff Wise, "Air Travel Is Not Ready For Electronic Warfare", New York Magazine, 2024-01-02

I belong to the Association of Old Crows, a professional society for EMSO folks, and I get their Journal of Electromagnetic Dominance. It's mostly about RF stuff far more low level than my area of expertise, being an embedded/real-time/telecom software/firmware guy, so I can't really appreciate most of it. But the volume of ads and articles in the journal makes it obvious this is a highly active area for both defense and offense.

Black Box AIs in Air Defense Systems

I've said many times - everyone is probably tired of hearing me say it - that I think the use of neural network AI - like used in LLMs/GPTs - in air defense systems for target identification is inevitable. And putting the AI in control of firing to reduce response time will also happen. Accidental shootdowns of commercial aircraft due to human error is common enough that it has its own Wikipedia page, so the AI will actually probably be more accurate than humans. But it's just a matter of time until a commercial aircraft is misidentified by an AI as an enemy target. And when there's the resulting U.S. Congressional investigation about the loss of innocent civilian lives, many are going to be surprised when the defense contractors say that not only does no one know why the aircraft was misidentified, no one can know. That's how these massive neural network algorithms work; they're so far mostly black boxes.

Model Collapse In Air Defense System AIs

We need an enormous volume of high quality content created and curated by human experts to correctly train LLM/GPT-type AIs. Because such data sets are labor intensive, and therefore expensive, to create and to assemble, there will be enormous pressure to train AIs with AI-produced data. This might even happen unknowingly (as has already in fact happened) if the provenance of the original content isn't well documented (or the people building the AI just don't care). (There will be strong incentives not to reveal that content is AI generated, because human-created content will be so much more highly valued.) Training AIs with AI-generated data leads to model collapse, a kind of feedback loop in which errors and hallucinations in the training data are reinforced.

This is likely to occur with the air defense AIs I described above.

And there will be no quick way to fix this. We will likely have eliminated all the career paths of those very same human experts by our use of those same LLMs for their entry level jobs. As the existing cohort of experts retire, die, move into management, or otherwise quit producing content, there will be no one to take their place. See also: "eating your own seed corn".

Thursday, January 11, 2024

The Disastrous Cultural Evolution of Boeing

The news is full of the most recent Boeing debacle involving the 737 MAX 9 airliner and its door plug that bailed out during flight to land in someone's back yard, leading to sudden cabin depressurization and an emergency landing.

A colleague of mine (Thanks, Jeff!) passed along this interesting and short-ish article on some of the recent history of Boeing, published in The Atlantic about the time of the 737 MAX 8 crashes in 2019 involving the aircraft's Maneuvering Characteristics Augmentation System (MCAS).

Jerry Useem, "The Long Forgotten Flight That Sent Boeing Off Course", The Atlantic, 2019-11-20

The gist:

In 1997, Seattle-based Boeing merges with the much smaller McDonnell Douglas (MCD) in a stock swap. Analysts at the time described it as MCD buying Boeing with the larger company's own money. Surprisingly, the finance-centric (read: MBAs) management of MCD takes over the upper management tiers of Boeing that was previously manned by former engineers. Then, in 2001, the new MCD-based upper management gets tired, apparently, of being questioned by the engineers about cost-cutting and safety concerns, so the entire upper management team moves to new digs in Chicago, 1500 miles away from where the aircraft are built.

WTF?

My favorite jobs over the past four decades plus change have been those in which software, firmware, and hardware product development were closely associated - both culturally and geographically - not just with each other, but also with testing, management, production, and customer support.

My interest in this isn't just from a general product development perspective.

I've had the privilege of having worked on several embedded systems products for the business aviation market, products that could use the term cloud computing in a literal sense. None of those products were flight safety critical. For you aviation geeks, our processes conformed to AS9100, a quality standard, and with DO-178C DAL D, a safety standard, and were tested under DO-160. I even did some hacking with ARINC 429 (an aviation packet bus) and ARINC 717 (an aviation synchronous bus used to log to the aircraft flight data recorder). I got to make the Asterisk open source PBX work with the cockpit two-wire headsets, and with the Inmarsat and Iridium satellite constellations. That job had me crawling around in the equipment bay of a two-engine business jet, and taking short test flights.

(I took these photographs of our Bombardier Challenger 600 test aircraft at Centennial Airport (KAPA) near Denver Colorado.)

On Its Way

Interior Looking Aft

I even got to do some product integration at the Gulfstream Aerospace plant in Savannah Georgia, where I may have walked through Oprah Winfrey's new private jet on the assembly line.

It doesn't get much better than that.

Although our business aviation products were certified for DO-178C Design Assurance Level D - the least safety critical level for which the U.S. Federal Aviation Administration requires certification - we had to have our software certified by an FAA Designated Engineering Representative (DER), essentially an FAA licensed contracted inspector. That turned out to be no small thing. From what I've read, the processes for DAL A - flight safety critical - aviation products were like software development processes dialed up past eleven. The amount of scrutiny and testing that every single line of code receives makes you wonder how the MCAS debacle ever happened. Although it's interesting to note that, like many large aviation companies, Boeing had its own DERs on their payroll.

The cautionary tale of the disastrous cultural evolution of Boeing is a remarkable one, from both a safety and a product development point of view.

Friday, January 05, 2024

Right to Repair, Polish Train Hackers, and the NSA's Ghidra

Google "Polish train hackers" and you'll find dozens of articles in the tech press about this story. Here is the link to the one I read, which was translated from the original Polish. It's terrific. Compelling reading if you're interested in the misuse of Digital Rights Management or the Right To Repair movement. Or if you're into embedded systems development and troubleshooting. Or just if you're into stories of heroic efforts by engineers.



Polish embedded systems hackers use (get this) the U.S. National Security Agency's open source Ghidra tool, originally intended to reverse engineer binaries of computer viruses and other malware, to figure out why high-tech passenger trains, like in the video above, quit working after undergoing routine maintenance - following the train manufacturer's own two thousand page maintenance manual - by a third-party.

What did they discover in various versions of the train software/firmware?

  • Odometer checks that prevent a train from running after a million miles.
  • Year, month, and day checks that prevent a train from running after a certain date.
  • Geofencing checks (naturally the trains have GNSS receivers) that prevent a train from running if it is within the boundaries of a competitor's maintenance depot.

I've used Ghidra myself, and written about it in my blog. The tool includes not just a disassembler similar to objdump, but also a remarkable decompiler that can translate machine code using common C compiler idioms and patterns back into C code. Ghidra understands a wide variety of Instruction Set Architectures. Just recently I've been using it to study the binaries of my own code compiled for a RISC-V target.

DRM and Right To Repair is a big deal in the U.S., Manufacturers of agricultural equipment, like farm tractors costing six figures, have resorted to similar shenanigans to prevent even the farmers who own the equipment from repairing their own stuff. So much so that Right To Repair legislation is coming to the forefront of both state and federal legislators.