Monday, December 18, 2023

Bruce Schneier: AIs, Mass Spying, and Trust

If you do anything with high technology (I do), you don't have to be a cybersecurity expert (I'm not) to learn something from reading security guru and Harvard fellow Bruce Schneier. His recent weekly newsletter had two articles that gave me genuinely new and different points of view.

"Mass Spying" - Mr. Schneier makes a distinction between surveillance and spying. An example of the former is when a law enforcement agency puts a "mail cover" on your postal mail: they know everything you're receiving and its return address, but they don't know the contents. An example of the latter is when they open your mail and read what's in it. Technology made mass surveillance (of all kinds) cost effective, but spying remained labor intensive: humans still had to read and summarize the contents of our communications. GPTs/LLMs have made mass spying practical, since now AIs can summarize written and even verbal communication. Just as technology made mass surveillance scalable, AIs now open the era of scalable mass spying.

"AI and Trust" - Mr. Schneier explains the difference between interpersonal trust and social trust. The former is the kind of trust I have for my spousal unit. I've known her for decades, I know what kind of person she is, and I have a wealth of experience interacting with her. The latter is the kind of trust I have for the people that made the Popeye's chicken sandwich yesterday, or for the driver of the bus or of the commuter train I rode on Saturday: I don't know any of these people, but because of laws, regulations, and social conventions, I trust they won't poison me or crash the vehicle I'm in. Interpersonal trust is the result of experience. Social trust is, for the most part, the result of the actions of government, which establishes and enforces a rule of law. Here's the thing: AIs are getting so good - in essence, passing the Turing Test - that we subconsciously mistake them for being worthy of interpersonal trust. But they aren't. Current GPTs/LLMs are tools of corporations, in particular profit-maximizing corporations, and if unregulated (that is, the corporations, not the AIs) they aren't even worthy of our social trust.

Well worth my time to read, and I encourage you to do so too.

Update (2023-12-18)

Schneier points out that you can't regulate the AI itself; AIs aren't legal entities, so they can't be held responsible for anything they do. You must regulate the corporation that made the AI.

I'm no lawyer, but I feel pretty confident in saying that corporations that make AIs can and will be sued for their AIs' behavior and actions. AIs are merely software products. The legal entities behind them are ultimately responsible for their actions.

One of my concerns about using Large Language Model/Generative Pre-trained Transformer (LLM/GPT) types of AIs in safety critical applications - aviation, weapons systems, autonomous vehicles, etc. - is what happens when the LLM/GPT makes the inevitable mistake. Suppose it shoots down a commercial airliner near an area of conflict; this happens often enough due to human error that there is a Wikipedia page devoted to it. The people holding the post-mortem inquiry are going to be surprised to find that the engineers that built the AI don't know - in fact, can't know - how it arrived at its decision. The AI is a black box trained with a tremendous amount of data, and its inner workings based on that data are more or less opaque even to the folks that built it. Insurance companies are going to have to grapple with this issue as well.

Update (2024-02-21)

Even though you can't legally hold an AI itself responsible for crimes or mistakes the way you can a person or a corporation, that doesn't keep companies that are using AIs from trying to do just that. In a recent story, Air Canada tried to avoid responsibility for their experimental customer service AI giving a customer completely fictional advice about their bereavement policy. The Canadian small claims court wasn't having it, as well they shouldn't.

Saturday, December 09, 2023

Unusually Well Informed Delivery

The U.S. Postal Service scans the outside address-side of all your mail. Obviously. They have to have automated mechanisms that sort the mail by address, most especially zip code. So they have some pretty good character recognition technology, for both printed and handwritten addresses.

But did you know they keep a scanned image of your mail? U.S. law enforcement agencies - and sometimes intelligence agencies - can and do get a kind of search warrant, referred to as a mail cover, to see these images.

You can see these images too. The U.S.P.S. has a service called "Informed Delivery", part of their "Innovative Business Technology" program. You can sign up online to get an email every day, seven days a week (yes, even on Sunday) for the mail that has been scanned with your address on it. It's free. I've used this for some time.

Every morning I get an email with black and white digital images of my mail that had been scanned, probably the night before. Most of it is junk mail. It also contains color digital images of catalogs that I'll be receiving, that I'm sure the catalog merchandiser pays to have included. This is probably another revenue stream for the U.S.P.S. (and may be what pays for Informed Delivery).

The other day I had something extra in my Informed Delivery email. I had scanned images of the outsides of three other people's mail. These people weren't even on my street; two weren't even in my zip code.

Obviously some kind of glitch. But it wasn't a security hole I was expecting to find. That was naive on my part. If the FBI and the NSA find this information useful, someone who gets it by accident may as well.

Update 2023-12-13: This AM I got another ID email from the USPS with an image of someone else's mail in it, again not in my zip code. So this glitch isn't a one-off.

Update 2023-12-13: And for our friends in the Great White North: "Canada Post breaking law by gathering info from envelopes, parcels: watchdog".

Update 2023-12-13: Note that the images of mail, whether yours or someone else's, in your Informed Delivery email are remote content: downloaded from a remote server and displayed, in an HTML-like manner, when you view the email. This means they can be removed or altered without accessing your copy of the email on your personal device. If you need to save these images for any reason, you need to save them in a way that captures the remote images as well. Printing a hardcopy might be the best solution.

Update 2023-12-13: A friend, colleague, and former law enforcement officer asked me if the routing bar code printed by the USPS on the other people's mail, visible in the images in the ID email, was the same as that on my mail. It's been a few years since I've had to eyeball bar codes of any kind, but I'm going to say "no".

Update 2023-12-13: Maybe this is obvious, but I thought I'd better say it: not subscribing to Informed Delivery will not prevent the USPS from scanning your mail, keeping the digital images, and showing them (deliberately or not) to other folks. At least by subscribing, you can see what other people might see.

Monday, December 04, 2023

Lessons in Autonomous Weapons from the Iraq War

Read a good article this AM from the Brookings Institution from back in 2022, about issues in the use of automation in weapons systems: Understanding the errors introduced by military AI applications [Kelsey Atherton, Brookings Institution, 2022-05-06]. It's in part a case study of the shoot-down of an allied aircraft by a ground-to-air missile system operating autonomously during the Iraq War. That conflict predates the development of Generative Pre-trained Transformer (GPT) algorithms, but there's a lot here that is applicable to the current discussion about the application of GPTs to autonomous weapons systems. I found three things of special note.

First, it's an example of the "Swiss cheese" model of system failures, in that multiple mechanisms that could have prevented this friendly fire accident all failed or were not present.

Second, the article cites the lack of realistic and accurate training data, not in this case for a GPT, but for testing and verification of the missile system during its development.

Third, it cites a study that found that even when there is a human-in-the-loop, humans aren't very good at choosing to override an autonomous system.

I consider the use of full automation in weapons systems to be - unfortunately - inevitable. Part of that is a Game Theory argument: if your adversary uses autonomous weapons, you must do so as well, or you stand to be at a serious disadvantage on the battlefield. But in the specific case of incoming short-range ballistic missiles, the time intervals involved may be too short to permit humans to evaluate the data and make and execute a decision. Also, in the case in which the ballistic missile is targeted at the ground-to-air missile system itself, if the intercepting missile misses the incoming missile, the stakes of the failure are lower if the ground-to-air missile system is itself unmanned.

It was an interesting ten pages that were well worth my time.

Saturday, December 02, 2023

Time, Gravity, and the God Dial

Disclaimer: my knowledge of physics is at best at a dilettante level, even with more than a year of the topic in college, one elective course of which earned me one of the only two B grades across both of my degrees. (Statistics similarly defeated me.)

I've read that there is no variable for time in the equations used in quantum physics, no t, because (apparently) time doesn't play a role. That's why quantum effects that are visible at the macroscopic level - even something as simple as stuff that absorbs light to glow in the dark - are random processes time-wise.

Yet time t plays a crucial role at the macroscopic level, in classical, or "Newtonian", mechanics.

Not only that, time is malleable, in the sense that it is affected by velocity and acceleration (special relativity) and gravity (general relativity), effects that are not only measurable, but that stuff we depend on every day (like GPS) has to make adjustments for.

So suppose God has a dial that controls the scale of their point of view, all the way from the smallest sub-atomic scale we know of, the Planck length, to the largest cosmological scale we know of, the observable Universe. At some point as God turns this dial on their heavenly tele/micro/scope, zooming out, out, far out, time goes from not being a factor at all to being an intrinsic factor for whatever they’re looking at.

Does this transition happen all at once? Does it happen gradually - somehow - in some kind of jittery change? What the heck is going on in this transition? What other things similarly change at this transition point? Is this the point at which particle-wave duality breaks down? Where Schrödinger's Cat definitely becomes alive or dead? Where gravity starts to matter?

Gravity? Yeah, gravity. Because we currently have no theory of quantum gravity. Yet it seems necessary that at the quantum level gravity ought to play a role in a wave/particle. If a particle is in a superposition of states, what does that say about the gravitational attraction associated with the mass of that particle? At what point on the dial does gravity make a difference? There's a Nobel prize for sure for the first person to make significant progress on this question.

This is the kind of thing I think about while eating breakfast.

Will Optical Atomic Clocks Be Too Good?

Read a terrific popsci article this morning in Physics Today on time keeping: "Time Too Good To Be True" [Daniel Kleppner, Physics Today, 59.3, 2006-03-01]. (Disclaimer: it's from 2006, so it's likely to be out of date.)

The gist of the article is that as we make more and more precise atomic clocks by using higher and higher frequency resonators (like transitioning from cesium atomic clocks that resonate in the microwave range to elements that resonate in the optical range), in some ways they become less and less useful. Eventually we will create (or perhaps by now have created) clocks whose frequencies are so high that they are affected by extremely small perturbations in gravity, like tidal effects from the Sun and the Moon. Or perhaps, I wonder, as clocks get more sensitive, even smaller gravitational effects, like a black hole and a neutron star colliding 900 million light years away (which has in fact been detected).

Even today, the cesium and rubidium atomic clocks in GPS satellites have to be adjusted for special (due to the centripetal acceleration of their orbits) and general (orbital altitudes over the center of mass of the Earth) relativistic effects, where, in round numbers, an error of one nanosecond throws the ranging measurement for a single satellite off by about a foot.
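The round numbers work out like this (a back-of-the-envelope sketch of my own, not from any GPS specification):

```python
C = 299_792_458.0   # speed of light in m/s (exact, by definition)
M_PER_FOOT = 0.3048

clock_error_s = 1e-9                          # one nanosecond of clock error
range_error_m = C * clock_error_s             # pseudorange error in meters
range_error_ft = range_error_m / M_PER_FOOT

print(f"{range_error_ft:.2f} feet")           # prints "0.98 feet": about a foot
```

So a nanosecond of clock error per satellite really does cost you about a foot of ranging accuracy, which is why the relativistic corrections matter.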


(This is an NTP server I built for my home network that incorporates a chip-scale cesium atomic clock disciplined to GPS; everyone needs a stratum-0 clock of their own. Also shown: my lab assistant.)

With far more accurate and precise atomic clocks, we won't even be able to compare them. Note that relativistic effects aren't just jitter issues; they affect the fundamental nature of time itself, so it's not just a measurement or equipment issue.


(This is a photograph I took in 2018 of part of an experimental ytterbium lattice optical atomic clock at the NIST laboratories in Boulder Colorado.)

One of the problems with optical atomic clocks is that to compare two of them in two locations we have to account for differences in altitude as small as one centimeter; that's how precise these clocks are, and how sensitive they are to general relativistic effects. We simply don't have, and probably can't have, a way to measure altitudes from the center of mass of the Earth that accurately. One of the ways we measure the shape of the "geoid" of the Earth is to (you knew this was coming) compare synchronized/syntonized atomic clocks. So there's definitely a chicken and egg problem.
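For scale: near the Earth's surface the gravitational frequency shift between two clocks separated by a height h is approximately gh/c². A quick calculation of my own, using round numbers, shows why a single centimeter matters to a clock good to a part in 10^18:

```python
G_SURFACE = 9.81       # nominal surface gravity, m/s^2
C = 299_792_458.0      # speed of light, m/s

def fractional_shift(height_m):
    """Approximate gravitational redshift between two clocks separated
    by height_m in a uniform field: delta_f / f = g * h / c^2."""
    return G_SURFACE * height_m / C**2

shift = fractional_shift(0.01)  # one centimeter of altitude difference
print(f"{shift:.2e}")           # about 1.1e-18
```

That fractional shift is comparable to the stability of the best optical clocks, so a centimeter of unaccounted-for altitude is enough to spoil the comparison.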

Friday, September 01, 2023

A Swiss Cheese of Errors

In 2021, an F-35B fighter jet rolled off the front of the aircraft carrier HMS Queen Elizabeth during a failed takeoff. "Rolled" is probably the right term, as British carriers do not use a catapult like U.S. carriers. The pilot ejected and landed on the flight deck with only minor injuries.

It was discovered later that a protective cover - part of the "red gear", so called because of its color - over the left engine intake had mistakenly been left in place. It was sucked into the compressor inlet of the single center-mounted jet engine, reducing power to the point where it was insufficient for takeoff.

The U.S. and its allies recovered the carcass of the F-35B. Which is good, because if they hadn't, somebody else would have.

As you would expect after totaling a bleeding-edge US$80M aircraft, part of the enormously expensive and troubled U.S. F-35 program, there was a lengthy post-mortem report written. I read a short (about forty pages) summary and analysis of this report this morning by Aerossurance, a U.K.-based aviation consultancy.

There is a concept in the study of organizational and complex systems failures - which is a hobby of mine that I've written about here before - called the Swiss cheese model. This is where "holes" in redundant layers of safety systems and checks (because no such system is perfect) just happen to line up at exactly the wrong time to produce a catastrophic failure.
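A crude illustration of why layered defenses usually work, and why they occasionally don't (toy numbers of my own, not from the report): if each of several independent layers has some small probability of a "hole", the chance that all the holes line up is the product of those probabilities, which stays tiny until something like fatigue and short staffing widens every hole at once.

```python
def p_catastrophe(hole_probs):
    """Probability that independent safety layers all fail at once:
    the product of each layer's 'hole' probability."""
    p = 1.0
    for hp in hole_probs:
        p *= hp
    return p

healthy = p_catastrophe([0.01, 0.01, 0.01, 0.01])  # four sound layers
degraded = p_catastrophe([0.2, 0.2, 0.2, 0.2])     # all layers weakened
print(healthy, degraded)  # roughly 1e-08 versus 1.6e-03
```

The point of the model: no single weakened layer causes the accident, but degrading every layer at once raises the odds of alignment by orders of magnitude.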

This report was like that: maintenance crews were overworked, fatigued, and understaffed; procedures were insufficient or not followed; the red gear was poorly designed; there was no sharing of similar failures, four of which had occurred before in the U.S.; etc. (And after having previously read several long ProPublica articles about failures in the U.S. Navy that were in part due to similar issues with overwork and fatigue, I'm seeing a pattern.)

While reading this article, it occurred to me that we already have a term for Swiss cheese types of failures, and have had for eons; we just tend not to use it when the result is tragic.

We call them "a comedy of errors".

Wednesday, August 02, 2023

Boom Town

In 1969, a forty-kiloton nuclear bomb was detonated underground in Colorado, near the town of Parachute, between Glenwood Springs and Grand Junction, west of Denver and a little south of what is now Interstate 70. The test shaft was over 8,400 feet deep. It was a test to see if small nuclear devices could free up natural gas deep underground, part of Project Plowshare. Edward Teller, one of the developers of the H-bomb, was there during the test.

In 1973, three thirty-kiloton nuclear bombs were detonated underground in Colorado, in Rio Blanco County, in the northwestern part of the state. It was another Project Plowshare effort to find peaceful uses for nuclear bombs.

These efforts to extract natural gas were a partial success. Partial because the gas was so radioactive it could not be sold, and was instead burned off.

As a result, Colorado passed an amendment to its state constitution such that approval for the detonation of a nuclear device in the state has to pass a state-wide popular vote. It is the only state in the U.S. with such a requirement.

This all happened before the Spousal Unit and I moved to the Denver area in 1989. But that's not the only adventure with radioactivity you may find in Colorado.

Given the depth of the H-bomb detonation near Parachute, the site is probably not nearly as radioactive as the site of the old Rocky Flats Nuclear Weapons Plant, between Denver and Boulder Colorado, just a few minutes drive from my home, and which I drove right past when I commuted to Boulder every day to work in the early 1990s. Rocky Flats - somewhat euphemistically - manufactured triggers for nuclear bombs; the trigger for a nuclear (fusion) bomb is an atomic (fission) bomb. Rocky Flats was an EPA superfund site for a long long time after it closed in 1992, following an FBI raid.

Radioactivity is a natural phenomenon here in Colorado, thanks to decaying uranium ore underground that creates radioactive radon gas. Like a lot of folks in our neighborhood, we have radon mitigation in our home: a fan that runs 24x7 that pulls air (and presumably radon) out of our crawl space and exhausts it above the house.

Huge piles of mine tailings around Colorado mountain towns contain a lot of uranium ore, which was just a waste product when it was originally dug out during the gold and silver mining era. No one at the time had any idea they were creating an environmental hazard that would last, for all practical purposes, forever.

For reasons unrelated to any of this, my tiny little company owns a couple of geiger counters. One day, the Spousal Unit asked me if my watch with a tritium dial gave off enough radiation to be detectable. Good question. Tritium is a radioactive isotope of hydrogen. Its radioactive decay produces helium, an electron (a.k.a. beta particle), and an electron anti-neutrino. Tritium watch dials have hour markers and hands containing tiny vials of tritium gas and a phosphorescent compound. The beta particles from the decaying tritium excite the phosphor, making it glow. No external light source is needed, although in time all the tritium will have decayed and the vial will stop glowing. Gun sights for use at night may use this mechanism too.
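Tritium's half-life is about 12.3 years, so the glow fades by half roughly every dozen years. A quick sketch of the exponential decay, assuming (my assumption, not a horological fact) that brightness is proportional to the remaining tritium:

```python
HALF_LIFE_YEARS = 12.3  # tritium half-life, approximately

def fraction_remaining(years):
    """Fraction of the original tritium (and thus roughly the original
    glow) remaining after the given number of years."""
    return 0.5 ** (years / HALF_LIFE_YEARS)

for y in (0, 12.3, 24.6, 50):
    print(f"{y:5.1f} years: {fraction_remaining(y):.0%}")
# A fifty-year-old tritium dial retains only about 6% of its original glow.
```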

I swept one of my geiger counters close over the watch, and got a reading of normal background radiation. Beta particles from tritium decay are so feeble energetically that they typically can't penetrate more than about a quarter inch of air, much less human skin. Most probably can't make it past the wall of the vial.

Then, I happened to sweep the geiger counter across one of my other watches, where it went completely nuts. It turns out an old mechanical French Army surplus watch from the 1960s that I own has a radium dial. I had no idea; that wasn't in the original description when I purchased it. Radium is a decay product of naturally occurring uranium, and radon gas is in turn a decay product of radium. As it decays, radium produces ionizing radiation in the form of helium nuclei (a.k.a. alpha particles) and gamma radiation.

That watch dial was hot. The watch now resides in a lead-lined envelope on a shelf in the basement, along with some samples of uranium ore.

In these parts, it's no accident that we all have a healthy glow.


Westword, "The Boom Years", 2023-07-30

Wikipedia, "Rocky Flats Plant", 2023-06-19

Wednesday, July 26, 2023

Model Collapse

A few decades ago I was working at the National Center for Atmospheric Research, a national lab in Boulder Colorado sponsored by the National Science Foundation. Although our missions were completely different, we had a lot in common operationally with the Department of Energy labs, like Los Alamos and Lawrence Livermore, regarding supercomputers and large data storage systems, so we did a lot of collaboration with them.

I had a boss at NCAR that once remarked that the hidden agenda behind the DoE labs was that they were a work program for physicists. Sometimes, often without much warning, you need a bunch of physicists for a Manhattan Project kind of activity. And you can't just turn out experienced Ph.D. physicists at the drop of a hat; it takes years or even decades. So for reasons of national security and defense policy you have to maintain a pipeline of physicist production, and a means to keep them employed and busy so that they can get the experience they need. Then you've got them when you need them.

This always seemed very forward thinking to me. The kind of forward thinking you hope someone in the U.S. Government is doing.

It came to me today that this is the same issue in the screen writers' and actors' strike.

Machine learning (ML) algorithms, of which Large Language Models (LLMs) are just one example, need almost unbelievably large amounts of data to train their huge neural networks. There is a temptation to use the output of ML models to train other ML models, because it's relatively cheap and easy to create more input data that way, whereas expensive humans can take a long time to do it. But training an ML model with the output of another ML model leads to an effect called "model collapse".

I mentioned an article on VentureBeat (which cites an academic paper) on this topic in a prior blog article. The VentureBeat article by Carl Franzen provides the following metaphor:

If you trained an ML model to recognize cats, you could feed it billions of "natural" real-life examples of data about blue cats and yellow cats. Then if you asked it questions about cats, you would get answers containing blue cats, and yellow cats, and maybe even occasionally green cats.

But suppose yellow cats were relatively rarely represented in your data, whether they were rare in the real world or not. Mostly then you would get answers about blue cats, almost never yellow cats, and rarely if ever green cats.

Then suppose you started training your new, improved ML model on the output of the prior ML model. The new "synthetic" data set would dilute out all the examples of yellow cats. Eventually you would have a model that didn't recognize yellow cats at all.

This is one example of model collapse: the ML model no longer represents the real-world, and cannot be relied upon to deliver accurate results.
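The cat metaphor can be turned into a toy simulation (my own illustration, not from the VentureBeat article or the paper it cites): model each training generation as producing output that over-represents common classes and under-represents rare ones, then train the next model on that output.

```python
def next_generation(dist, alpha=1.5):
    """One training generation: the new model, trained on the prior
    model's output, exaggerates common classes and starves rare ones.
    The sharpening exponent alpha > 1 is a toy stand-in for that bias."""
    sharpened = {cat: p ** alpha for cat, p in dist.items()}
    total = sum(sharpened.values())
    return {cat: p / total for cat, p in sharpened.items()}

dist = {"blue": 0.9, "yellow": 0.1}  # yellow cats are rare but real
for gen in range(1, 6):
    dist = next_generation(dist)
    print(f"generation {gen}: yellow = {dist['yellow']:.2e}")
# After a handful of generations, yellow cats have all but vanished
# from the model's world, even though they still exist in reality.
```

After five generations the yellow-cat probability has dropped from one in ten to well under one in a million: the distribution the model represents has detached from the real-world one.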

This is what will happen if you eliminate the human elements from your screenwriting or acting (or software development), using AI algorithms to write and to synthesize and portray characters (or write software). If you don't have a full pipeline constantly producing people who have training and experience at writing or acting (or writing software, or whatever it is you need), you no longer have a way to generate the huge human-created and human-curated datasets you need to train your AIs. The models collapse, and eventually the writing or portrayal of characters (or the software) in no way represents the real world.

That valley isn't even uncanny; it's just wrong.

But you can't just gen up more competent, trained, experienced writers or actors (or software developers) on the spur of the moment. It takes years to do that. By the time you realize you're in trouble, it's too late.

This is the precipice some folks want us to move towards today.

Saturday, July 22, 2023


The movie we've all been waiting for, the movie about a beloved childhood toy, opened in theaters everywhere this week. I am talking, of course, about Christopher Nolan's Oppenheimer.

Who knows how many little girls were inspired to enter STEM fields by playing with their "Oppie" dolls. The Spousal Unit has regaled me with stories about the many happy hours spent dressing her Oppie in different suits, trench coats, and fedora hats. Of the joy she felt on that one extra special Christmas morning when she opened a brightly wrapped package to find her very own "Oppie's Dream Atomic Bomb". How she collected Oppie's friends like Lieutenant Colonel Leslie Groves ("With Angry Face!"), and the terrific glow-in-the-dark Louis Slotin (which inspired her to pursue a career in medicine). And how building their own Trinity test site in the backyard sand box brought her and her older brother closer together.

She and I look forward to seeing Oppenheimer so that she can relive those golden childhood memories.

Using Microsoft's Windows Subsystem for Linux on Windows 11

I recently bought a new laptop that runs Microsoft Windows 11.

And I didn't immediately install a Linux distro over top of Windows.

Shocking, I know, for a guy who for years has been, and remains, firmly in the Apple ecosystem: laptop, desktop, phone, and tablet. And who, for the past few decades, has been writing software for the Linux ecosystem (including for Android). But I didn't own a hardware platform that can run Windows 11, which only works on systems that have a Trusted Platform Module (TPM). And a lot of the commercial tools for embedded systems, and vendor tools for GNSS devices, that I use only run on Windows.

I bought a 2022 HP Envy x360 Convertible Model 15. That's a laptop with a 15.6" touch sensitive screen that folds up to convert into a tablet. It's the first hardware platform I've run my code on that uses an AMD processor: a Ryzen 7 5825U. It came with Windows 11 Professional. It has 64GB of RAM, and a 2TB PCIe SSD.

So of course almost the first thing I did was get Microsoft's Windows Subsystem for Linux (WSL) working on it. This allows you to run a full blown Linux/GNU distro - not an emulation layer - with a Linux kernel, in a highly optimized virtual machine environment native to Microsoft. Then I got my own software running on it, my Diminuto and Hazer repositories.

It was mostly straightforward, although getting a USB device (in my case, a GNSS receiver dongle) attached to the Linux environment was a little weird - definitely weirder than doing the same thing using the commercial VMware software, which I have done many many times.

Here's a snapshot of my GNSS software, gpstool, running under Ubuntu Linux, under Windows 11, on the new laptop.

Hazer on HP Envy x360 15 Convertible

I have come to think of the WSL window as the "system console" for the running Linux. If you do an ifconfig command in this window, you can get the local IP address for the Linux instance. Using that address, you can ssh into Linux from Windows and have multiple concurrent Linux sessions. I use the popular Windows app PuTTY - which I also use to connect to serial-attached devices - but anything similar, like using the Windows' native ssh command from a PowerShell console, should work.

You can easily tell that the NMEA 0183 data stream from the GNSS device is running through some kind of USB bridge software layer under Windows that adds significant latency. My GNSS software displays, second by second, both the local system clock (LOC) and the GPS time from the incoming NMEA data (TIM). On this system, they consistently differ by one second, TIM running one second late. I have seen this also when running under VMware, but never when running natively on a system. Definitely won't be using this approach for precision timing, but it should be fine for geolocation.

I've found an issue with one of my USB-attached GNSS receivers: the optional Windows usbipd utility, which you use to manage the connection of USB devices to WSL, refuses to attach it to Linux ("device is in an error state"). It's the one dongle I have that uses the Data Carrier Detect (DCD) indication to provide the GNSS one-pulse-per-second ("1PPS") signal. It works fine natively with Linux on, for example, a Raspberry Pi. Other USB-attached GNSS devices have worked fine.

Otherwise: so far, so good.

Sunday, July 16, 2023

Large Machine Learning Models Are Not Intrinsically Ethical - And Neither Are Large Corporations

I think the screen actors' and writers' concerns about the use of large AI models are legitimate, since the models could not exist, and could not be successful, without being trained using a ginormous amount of human-created input, typically without the permission or even knowledge of the original creators.

But that's just the tip of the iceberg, being currently the most visible public example of this concern.

Eventually, software engineers will wise up and figure out they have the same issue, with companies training AIs using software - including open source software - written by humans, most of whom are no longer, or never were, employees of theirs, without any compensation, consent, or acknowledgement.

Worse, companies will try to get around using expensive, experienced, and ethical developers by training AIs to generate software that will be used in safety critical or weapons systems.

Eventually, companies will save even more money, and avoid any intellectual property issues, by training AIs using software that was itself generated by other AIs, and... it's turtles all the way down. With each iteration, it will be like a game of telephone, the quality of the output getting worse and worse. Except sometimes with ground to air missiles.

In time, there will be corporate executives for some prime defense contractor sitting in front of a Congressional committee, trying to explain why their automated weapons system shot down a commercial airliner because it thought it was a Russian bomber. They will be forced to admit that no one - not their scientists, not their own engineers, not anyone - really understands how the AI in the system came to that decision.

Because that's how complex large neural network machine learning models are. It's not traditional if-then-else logic, a so-called "rule-based" system, like I studied when I was a graduate student in Computer Science. It's an almost incomprehensibly gigantic simulated network of neurons that was configured by an almost unbelievably huge dataset of input. A dataset whose contents no human moderated or approved or even examined. Or, because of its volume, could examine.

I admit this isn't my area of expertise. But I have a couple of degrees in Computer Science from an accredited program at a university. I have worked for many years in large multi-national corporations, part of that time in the defense-industrial complex. So I feel like I have a reasonably informed opinion on both the technical aspects and how large corporations work.

I believe that the application of very large machine learning models to weapons systems is inevitable. If not by the U.S., then by other countries, perhaps including our allies. The results will be unpredictable. And unexplainable.


Postscript

It only just now occurred to me that how large machine learning models work might be a good metaphor for the hive minds of large organizations.

Not really joking.

Postscript 2

My use of "hive minds" above was quite deliberate, BTW, since my train of thought first connected machine learning models with the emergent behavior of some insect colonies, e.g. bees. The individual bee - and the neural network inside its brain - is relatively simple, but the group behavior of a lot of bees is quite complex - and not even remotely understood by any individual bee.

Postscript 3

I couldn't read this paywalled article from Bloomberg [2023-07-16], but the part I could see, just a few minutes ago, coincidentally, was enough.

"Israel Quietly Embeds AI Systems in Deadly Military Operations

Selecting targets for air strikes and executing raids can now be conducted with unprecedented speed, according to army officials.

The Israel Defense Forces have started using artificial intelligence to select targets for air strikes and organize wartime logistics as tensions escalate in the occupied territories and with arch-rival Iran.

Though the military won’t comment on specific operations, officials say that it now uses an AI recommendation system that can crunch huge amounts of data to select targets for air strikes. Ensuing raids can then be rapidly assembled with another artificial intelligence model called Fire Factory, which uses data about military-approved targets to calculate munition loads, prioritize and assign thousands of targets to aircraft and drones, and propose a schedule."

Postscript 4

There's a very recent article on Vox about how the inner workings of large machine learning models are unknowable.

Postscript 5

Postscript 6

The article from VentureBeat that I cite just above makes an interesting point: the fact that using AI model output as training data for another AI model leads to "model collapse" means that high-quality human-generated or human-curated training data becomes increasingly rare and valuable. I predict this will lead to new open source licenses, GNU and otherwise, that restrict data or code use as training data for machine learning models. (And of course, AI developers will routinely violate those open source licenses, just as they are violated now.)

Monday, July 03, 2023

Hazer with the U-blox NEO-F10T GNSS Receiver on the Ardusimple SimpleGNSS Board

It's been a while since I talked about my GPS/GNSS efforts. Some time ago I bought a SimpleGNSS board from Ardusimple to try out. The SimpleGNSS has the new U-blox NEO-F10T GNSS receiver. This was my first experience using a new U-blox generation 10 device. It is the first GNSS device of any kind I've used that includes features specific to the latest version 4.11 of the National Marine Electronics Association 0183 standard. And it's the first I've used that's capable of receiving the new L5 band signal from the latest Block III GPS satellites.

I was about ready to dismantle this experiment. But before I removed it from my workbench, I thought I'd update y'all on my latest version of the Hazer library and its gpstool utility.

Hazer is a Linux/GNU/C-based library that supports the processing not only of the usual NMEA 0183 output of GNSS receivers, but also of proprietary binary output like UBX from U-blox devices, and CPO output from Garmin devices. It also handles input and output of RTCM messages in support of differential GNSS, yielding geolocation precision down to about 1.5 centimeters. (I run a DGNSS fixed base and a stationary rover 24x7 at the Palatial Overclock Estate.)
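As an aside, the NMEA 0183 framing that a library like Hazer has to validate is simple at its core: every sentence carries a checksum that is just the XOR of the characters between the leading '$' (or '!') and the '*' delimiter. Here's a minimal sketch of that arithmetic - a hypothetical helper for illustration, not Hazer's actual API:

```c
/* NMEA 0183 checksum: the XOR of every character between the leading
 * '$' (or '!') and the '*' delimiter, exclusive, conventionally
 * rendered as two hex digits at the end of the sentence. */
unsigned char nmea_checksum(const char *body) {
    unsigned char cs = 0;
    while ((*body != '\0') && (*body != '*')) {
        cs ^= (unsigned char)*(body++);
    }
    return cs;
}
```

The receiver computes the same XOR over the characters it received and compares it against the two hex digits in the sentence; a mismatch means the sentence was corrupted in transit.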

gpstool is the Swiss Army Knife of Hazer. It's one of those old-school tools that has a nightmare of command line options. I use it to functionally test the library, and also as the main component of most of my GNSS efforts. You wouldn't want to use gpstool to navigate cross-country (although I have used it in conjunction with OpenStreetMap to generate a moving map display in real-time). But I find it really useful for testing and evaluating new GNSS devices and just generally futzing around with geolocation and precision timing.

(You may want to click on the short video clips to view them on YouTube instead of from this blog; the UI seems to crop the images, losing some information. You can also click on photographs to see a larger image.)

Here is a short video clip of a Raspberry Pi 4B running gpstool with the NEO-F10T connected via USB.

Besides processing the output of the U-blox device, gpstool is following the 1 Hz One Pulse Per Second (1PPS) digital output signal from the device, which is syntonized to GPS time, and strobes another digital line on the Raspberry Pi, syntonized to 1PPS (subject to software latency), to which I've attached a red LED.

Here is a short screen capture that shows the output from gpstool. I've SSHed into the Raspberry Pi from my Mac desktop to run gpstool. The utility uses some simple ANSI screen controls to update the display in real-time. I'm viewing standard output directly from gpstool, but the utility also has the ability to run headless in the background as a daemon, and you can view this real-time display remotely at leisure. (This is how I run my DGNSS systems.)

If you have perused the output of gpstool in any of my prior articles, you may notice a new field in the SAT lines showing the frequency band used by the device to receive data from the indicated satellite, e.g. L1 C/A. Some GNSS devices (like this one) may receive data from the same satellite over more than one band. (I confess this "new feature" is a long delayed bug fix because I botched the handling of new fields in version 4.10 of the NMEA 0183 spec. I have no excuse.)

gpstool isn't just a passive monitor. You can use its command line options to send NMEA, UBX, and other message formats to the device to interrogate and configure it. I did this with the NEO-F10T. Here is a screen shot of a snippet of a script similar to the one I used to generate the output shown in this article.  It sends a batch of UBX messages, sequentially, waiting for an acknowledgement for most of them, to configure the device.
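The UBX protocol those messages use frames each message with an 8-bit Fletcher checksum, computed over the class, ID, length, and payload bytes (but not the leading 0xB5 0x62 sync bytes). A minimal sketch of the computation - again, an illustration, not the actual Hazer code:

```c
#include <stddef.h>

/* u-blox UBX 8-bit Fletcher checksum, computed over the class, ID,
 * length, and payload bytes of a message. The two resulting bytes
 * CK_A and CK_B are appended to the end of the frame. */
void ubx_checksum(const unsigned char *buf, size_t len,
                  unsigned char *ck_a, unsigned char *ck_b) {
    unsigned char a = 0;
    unsigned char b = 0;
    for (size_t i = 0; i < len; ++i) {
        a = (unsigned char)(a + buf[i]);
        b = (unsigned char)(b + a);
    }
    *ck_a = a;
    *ck_b = b;
}
```

For example, a poll message with class 0x06, ID 0x00, and a zero-length payload checksums to CK_A = 0x06, CK_B = 0x18.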

Screen Shot 2023-07-06 at 13.28.38

You can find this script in its entirety in the Hazer repository on GitHub.

Finally, in addition to the standard output, gpstool generates log output on standard error, which here I capture in a file. (If you run gpstool headless, log output can also be automatically directed to the system log without any need for a command line option or API call.)

Hazer gpstool Example Log File

In this example, I have the log output level set to INFO (informational) and above; setting it to the DBUG (debug) level and above generates more output than I typically need unless I am actually debugging new code. (The logging system is a feature of Diminuto, my Linux/GNU/C-based systems programming library on which Hazer and gpstool are built; Diminuto has its own ginormous feature set.)

That's a quick update of some of my latest poking around with GNSS. I'm always on the lookout for new (to me, anyway) GNSS devices to play with!

Saturday, June 10, 2023

The Expanse of Cosmic Horror

(This blog article is a mash up of two long social media posts I made some time ago, so it might seem a little fragmented and repetitive, even though I've tried to edit it a little.)

The late H(oward) P(hillips) Lovecraft [1890-1937] is credited with inventing the genre of cosmic horror - one of my favorite genres in either print or visual media. I don't classify his iconic creation Cthulhu as cosmic horror. That gigantic octopus-headed other-worldly creature that lies dreaming deep under the sea in the impossibly ancient city of R'lyeh is way too anthropomorphic. Cthulhu is horrifying, sure, but cosmic horror is all about reality-bending stuff that comes from non-earthly realms and virtually defies description. Cosmic horrors aren't evil; they are indifferent, as uncaring about their impacts on us puny humans as a lawn mower is to ant hills. Cosmic horrors drive humans mad because of our inability to perceive and process their reality.

Propnomicon Miskatonic Badge Edit

My affection for cosmic horror is one of the reasons my alter ego goes to science fiction conventions wearing pins or badges that identify him as a faculty member at Miskatonic University - that fictional university in an equally fictional Arkham County, Massachusetts situated on the banks of the still fictional Miskatonic River, where several Lovecraft stories take place.

(And in case you're wondering, there is absolutely no excuse for the terrible racism evident in some of H. P. Lovecraft's writing. In this respect he was wrong and ignorant, even for the time in which he lived and wrote, and that part of his work is repugnant and not to be admired. But I learned long ago that if I only love those who are perfect, I will come to love no one. So I celebrate his other work while decrying that part of his milieu.)

Cosmic horror is one of the reasons that, with some exceptions, I can't quite enjoy reading science fiction that casually features faster than light (FTL) travel or communication. (I'm just a little more forgiving of visual media in this respect, but not much.) My dilettante reading of physics leads me to interpret FTL as creating circumstances that would bend or break our very perception of reality, eliminating our ability to agree on basic cause and effect. It would lead to violations of the basic laws of physics - which is perhaps part of the definition of cosmic horror.

Here are some media - not intended in any way to be comprehensive - that I think fit the description of cosmic horror.

  • John Carpenter's movie The Thing, and John W. Campbell's novella "Who Goes There?" on which it is based, is by far the best known example. (The Carpenter movie is also a great example of another horror genre, body horror.)
  • The movie Alien (but not so much the terrific sequel Aliens) shoots for a bit of cosmic horror, with an alien which could easily have been right out of Lovecraft's "Cthulhu Mythos", although it has the same anthropomorphic flaw as Cthulhu itself. [added 2023-06-19]
  • The movie Event Horizon for sure has elements of "things man was not meant to know", all the result of the development of an FTL drive. [added 2023-06-19]
  • Jeff VanderMeer's book Annihilation and the movie adaptation of the same name starring Natalie Portman are all about cosmic horror. (The book is part of a trilogy; I recommend the second and third books too.)
  • The later books in James S. A. Corey's nine-novel series The Expanse are definitely cosmic horror. (The television adaptation didn't last long enough to really get to those parts.) The alien FTL technologies and nano-technology found by humans in the later parts of this epic series are not at all understood by the scientists, and they are appropriately horrified by their non-local behavior. (James S. A. Corey is the pen name of the authorial partnership of Daniel Abraham and Ty Franck.)
  • And perhaps most controversially, even as most of it was based on the real science and technology of its day, I find Stanley Kubrick's movie 2001: A Space Odyssey to have elements of cosmic horror.

I’ve been thinking about this for some time now: The Expanse is obviously a great hard-SF saga of humans colonizing the solar system - and, eventually, beyond. Both the books and the television adaptation are careful to obey the laws of physics, e.g. no FTL, no artificial gravity, and force, inertia, and momentum are all a bitch. But it’s not just that. In my opinion, it’s fused with a great story of cosmic horror, in the Lovecraft tradition, if H. P. Lovecraft had had some physics courses. (Lovecraft was in fact a great fan of astronomy in his time.)

I like to say that we really don’t want faster than light travel. When Einstein wrote

E = mc²

E was energy, m was mass, and c was… the speed of light in a vacuum? Not really.

The speed of light isn’t fixed. It has a top speed in a vacuum, but travels more slowly - indeed, sometimes more slowly than other particles - when going through a transparent medium. (That’s what causes the glow, called Cherenkov radiation, in water-immersed reactors: charged particles produced by fissioning atoms travel through the water faster than light does in water, which in turn produces more photons.) But more weirdly: the speed of light in a vacuum is a constant for all observers in inertial reference frames (that is, moving but not accelerating), regardless of how fast each observer is moving relative to one another.
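The arithmetic here is simple enough to sketch: light's phase velocity in a medium of refractive index n is c/n, and a charged particle emits Cherenkov radiation when its own speed exceeds that. For water, n is about 1.33:

```c
/* The vacuum speed of light, and the consequences of a refractive
 * index n > 1: light's phase velocity in the medium is c/n, and a
 * charged particle emits Cherenkov radiation when its own speed
 * exceeds that, i.e. when beta = v/c > 1/n. */
#define C_VACUUM 299792458.0 /* meters per second */

double phase_velocity(double n) {
    return C_VACUUM / n;
}

double cherenkov_threshold_beta(double n) {
    return 1.0 / n;
}
```

So in water, light travels at roughly 2.25 x 10^8 meters per second, and a particle moving faster than about 0.75c outruns it - and glows.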

This is what special relativity is all about. If that weren't the case, stuff like the chemical reactions in our bodies that are necessary for life wouldn't work reliably. And keep in mind, we are all in motion, all of the time: we move on the surface of the Earth, the Earth orbits the Sun, the Sun travels through the Milky Way galaxy gravitationally dragging all of its planets along with it, the Milky Way moves through its local galactic cluster, the cluster... well, as far as we know, there's no end to it. All in motion, all of the time. There's no such thing as standing still. And if there were... standing still relative to what?

The variable c isn’t really the speed of light, in a vacuum or otherwise; it’s the speed of causality, the maximum speed at which cause-and-effect can travel. In a vacuum, light travels at this speed.

(Update: as far as I know, there is no definitive answer as to what Einstein and his peers had in mind when they chose c to stand for the speed of light in a vacuum; some have suggested it stands for constant.)

Special relativity is strange enough, in that observers, in different locations and traveling at different speeds in different directions, may legitimately disagree on the order of two independent events, “independent” meaning that those events are not connected directly or indirectly by cause and effect. There is no correct answer as to which event occurred first; it all depends on your point of view.

If we had faster than light travel, or faster than light communication, this gets even worse. Really, our basic perception of reality would come into question. We might not just disagree on the objective order of events, but information about events could arrive before the events apparently occurred. This is actually kinda scary.

Scary in a cosmic horror kind of way.


As I said before, while I’m a fan of H. P. Lovecraft - my alter ego has Miskatonic University business cards he uses for non-work related stuff - I find the popular conception of Lovecraft's creation Cthulhu to be way too anthropomorphic. The mere fact that we can describe Cthulhu - it (apparently) has an octopus head, a humanoid body, wings - means it's really not a cosmic horror. A true cosmic horror would be more of what Lovecraft meant when he wrote of people being driven insane because they could not possibly comprehend, much less describe, what they were witnessing and experiencing.


One of the things I liked about The Expanse books is that the scientists experimenting with the protomolecule and with the ring gates - both extraterrestrial technologies which humans manifestly did not understand any more than my cat understands my laptop - were actually deeply disturbed by the protomolecule's non-local behavior, and by the gates’ faster than light travel. And rightfully so. Lovecraft fans in The Expanse universe would have known that it would not end well. Technology that enables faster than light travel, and non-local behavior, is inevitably going to be a kind of cosmic horror, driving people mad with its non-Euclidean geometries.


So I have come to think of The Expanse series as a fusion of the hard-SF genre and the cosmic horror genre (with a little bit of body horror thrown in for good effect). I would not be surprised to find that the authors Daniel Abraham and Ty Franck had that in mind from the very beginning.

Thursday, June 08, 2023

Widows and Orphans and Working from Home

Terrific - and terrifying - article from The Atlantic's "Work In Progress" blog by Dror Poleg, author of the book Rethinking Real Estate. His thesis: the next crisis will start with empty office buildings.

The commercial real estate market - once so stable it was considered a widows and orphans investment - is changing radically. 25% of commercial real estate in large cities is empty, and that only counts the space whose leases have expired; it doesn't count leased space that isn't occupied, and is unlikely to be occupied again when the lease expires. Many real estate firms are "handing the keys to the bank" by defaulting on their loans.

I've been thinking about this ever since the Spousal Unit and I attended the World Science Fiction Convention in Chicago in 2022, post pandemic. It was held downtown at the Hyatt Regency on Wacker Drive right next to the huge Illinois Center complex.


We've attended conventions in this very same venue many times. I was shocked to see how the pandemic and the work at home movement had changed it. Illinois Center is an office building complex that sits atop a vast underground environment linking many such complexes. When we've been there in the past, on working days, this environment was full of retail, food, and service shops, and people bustling through it. This last time, it was almost empty, with a lot of empty storefronts.

Standing in our hotel room and peering at the adjacent office building, on a working day, I could see that the window office space on several floors was empty; I saw one single office worker, looking at what appeared to be large blueprints or schematics.


The commercial real estate market underpins a lot of city tax revenue and investments including pension plans. The clock is ticking: according to Poleg, a third of all office leases expire by 2026.

Wednesday, June 07, 2023

Bug Blast from the Past

I found a Day One bug today in some unit test code that I wrote in 2014, nine years ago. The bug: I had botched where I put a thread scheduling yield function call relative to a critical section in the unit test. Why did it show up only now? Apparently this is the first time I have ported this multi-threaded C code to a single-core processor - in this instance, a Raspberry Pi Zero.

This is remarkable. As someone said when I related this story to them, it's not uncommon to uncover bugs in code being ported from single core to multi-core processors, but the opposite transition happens seldom enough that this kind of event is rare.
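For the record, here's a plausible reconstruction of the pattern - a hypothetical sketch, not the actual unit test. The fix is to yield the processor after releasing the lock, not while still holding it; yielding inside the critical section on a single-core machine hands the processor to a thread that promptly blocks on the very lock the yielding thread still holds.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define ITERATIONS 10000

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Each worker increments a shared counter under a mutex, then yields
 * so that, on a single-core processor, the other worker gets a chance
 * to run. The yield belongs OUTSIDE the critical section: the thread
 * being yielded to must be able to acquire the lock and make progress. */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERATIONS; ++i) {
        pthread_mutex_lock(&mutex);
        ++counter;                  /* critical section */
        pthread_mutex_unlock(&mutex);
        sched_yield();              /* yield after releasing the lock */
    }
    return NULL;
}
```

Run two of these workers and the counter always ends up at exactly twice ITERATIONS, on one core or eight.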

For most of my career - which began before microprocessors even existed - all such computer chips were single core, that is: capable of executing only a single stream of machine instructions at a time. I started writing this library, Diminuto, for a single core ARM4 processor in 2008. But it didn't take long to migrate to inexpensive ARM and Intel chips that could execute more than one instruction stream at a time. Today, I routinely use small development and test machines that have four or even eight processor cores. As a former colleague of mine - whom I met when I worked in Boulder at the National Center for Atmospheric Research, while he worked with Seymour Cray at Cray Computer Corporation in Colorado Springs - once remarked, your Raspberry Pi 4B single board computer can, by most metrics, outperform Seymour's last supercomputer, the Cray-3 a.k.a. "The Fish Bowl".

The evolution of multi-core microprocessors was largely motivated by, IMNSHO, the end of Moore's Law, at which point doing stuff in parallel was the only way to produce faster microprocessors... faster, at least, for those applications which could take advantage of them. See also: Amdahl's Law.
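Amdahl's Law quantifies that caveat: if only a fraction p of a workload can be parallelized, the ideal speedup on n processors is 1 / ((1 − p) + p/n), which is capped at 1/(1 − p) no matter how many cores you throw at it. A quick sketch:

```c
/* Amdahl's Law: the ideal speedup on n processors when only a
 * fraction p (0 <= p <= 1) of the workload can run in parallel.
 * As n grows without bound, the speedup is capped at 1/(1 - p). */
double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + (p / n));
}
```

Even with 90% of the work parallelizable, eight cores buy you less than a 5x speedup, and an unbounded number of cores no more than 10x.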


Yes, that's a young(er) iteration of me leaning on a Cray-3 in NCAR's computer room at their Mesa Laboratory in Boulder Colorado. Compared to what we can do with computers today, those were the old days, but not necessarily the good old days.

Saturday, June 03, 2023

The Short Tail

Way back in 2004, Chris Anderson, who was then editor in chief of WIRED magazine, wrote an article, and later a book, about "The Long Tail". The idea is that technology and economics have made it possible for companies to provide some kinds of products for which each individual item might have low demand, but the number of such items is so large that the business is profitable in aggregate. The classic example of this was Netflix, from which you could rent a DVD of an obscure movie from decades ago; that DVD might not even exist until it rose to the top of your Netflix wish list, at which point it could be manufactured from a digital archive and then shipped to you. Another example is Amazon, which provides an enormous variety of physical items, leveraging warehousing and logistics to keep costs low on low-volume items. (The title comes from the shape of the graph where you plot the demand for each item against the number of items; the head of the graph is high, representing the big hits, but the tail, although low, trails off almost to infinity, containing all the niche products.)

Great idea. But those days are over.

Netflix, as everyone now knows, is dropping its DVD rental service in favor of internet streaming. Wow. Great. Except the number of movies it provides via streaming is a tiny fraction of what was available before on DVD. This orphans our long Netflix wish list, which was full of old classic movies that I hadn't seen, but wanted to.

Just a couple of days ago, the Spousal Unit had other plans, and I sat down with the TiVo remote control and laboriously searched for The Ipcress File, a 1965 spy thriller with a young Michael Caine, and Paint Your Wagon, a 1969 musical western with Lee Marvin and Clint Eastwood. No luck on either movie, on either Netflix or with Amazon Prime, the latter of which I thought might work even if I had to pay a few bucks for either movie.

So much for the idea that internet bandwidth, and digital storage, being so cheap that virtually any digital media would be easily and almost instantly available. This is not the future I was promised in lieu of flying cars.

Today I read that Disney is dropping a pile of shows and movies from its streaming service. Why? Money, of course. This will - somehow - allow it to write off US$1.5B. This is existing content, which costs almost nothing to store and deliver technically, but which may of course incur licensing fees to its creators. Including some of the very same writers who are currently on strike to protest, in part, terrible labor practices by production companies and giant media conglomerates.

I'm a capitalist at heart - I have to be, my income depends on it - but this is some kind of late-stage capitalism market failure.

Thursday, March 09, 2023

Not Quite the Bootstrap Paradox


Submitted for your approval are two 3.5" floppy diskettes.

I found the diskette on the left underneath our rear deck, when it was demolished to be replaced with a concrete patio this past fall. That was in 2022, just to be clear. It had clearly been under the deck for a long time, covered in dirt, and weathered enough that the label is mostly illegible.

The floppy disk was originally included with the hardback book Secrets of Software Quality: 40 Innovations from IBM. The book was written by Craig Kaplan, Ralph Clark, and Victor Tang, and published by McGraw-Hill. The first edition came out in 1995. 

I had never owned this book before, nor seen it, nor was even aware of its existence. Not until I found this floppy under our demo'ed deck, and decided to buy a new copy of the book. (It wasn't cheap.)

On the right is the diskette from that new copy. The date on the new disk is 1995. Careful inspection of the damaged diskette suggests that at least the last digit of the otherwise unreadable date looks like a "5". What fragments are barely visible of the part number on the weathered diskette seem to match the new diskette.

Our house was built in 1979. I don't know when the rear deck was built, but it was there when we bought the house in 1989.

So clearly the weathered diskette somehow came to be under our rear deck after we bought the house, because it did not exist until about six years after we moved in. It didn't come from any book in my library. We've had people over to the house, of course; computer people, even. But even if one of them had - for some reason - brought this diskette with them, the spacing of the boards on the deck would have made it very difficult - but not impossible - for them to have dropped the diskette and for it to have just happened to fit between two adjacent boards just perfectly so that it ended up where I found it. And what are the chances they would have dropped the diskette, seen it slip through the deck, and not have said anything to me?

The other possibility is that some critter brought it from somewhere else and left it under our deck. We've had several generations of rabbits living under our deck, and at least once I saw a skunk run out from under the deck and race across the yard. What use did a critter think it would have for a floppy diskette? Your guess is as good as mine.

Over the years we've had various folks work on the deck, in the adjacent yard, or around the outside of the house. I would like to think it's possible that one of those laborers was pursuing a course of study in information technology and lost the floppy out of a pocket while, say, staining the deck as a summer job.

If it were not for the fact that we replaced the wooden deck with a concrete patio, it would have seemed inevitable that I would have some strange excuse to carry the new diskette across the deck, accidentally drop it, and watch it almost miraculously slip between the floor boards, where it falls through some kind of time portal, resulting in a causal loop. "And thus the prophecy was fulfilled."

And, besides, they weren't bootstrap diskettes.

Sunday, February 12, 2023

Bob Greene Road

It will probably come as no surprise that I'm all about Google Maps: the maps, its Satellite View, its Street View, all that stuff. If I see a street address in something that interests me, pretty much for any reason, I'll look it up on Google Maps, check out the Satellite View, drop into Street View and look at the 360° image.

Growing up I used to spend my summers living in the old Greene family home, the house my mom grew up in, in eastern Kentucky, not far west of the West Virginia line (Bruin, Kentucky, along State Route 7; Elliott County, county seat Sandy Hook; nearest sizable town Grayson, near Grayson Lake State Park, in Carter County).

So it was only natural when Google Maps became a thing, I'd bring up Google Maps and revisit my old childhood haunts from time to time. I found that old house, which had electricity (on good days), but no heating (other than a fireplace), and no running water (other than what I ran out to the well and got). The toilet was an outhouse, which was a fair trek out the back door and beyond the old chicken coop (but not as far as the barn) on a cold night. I also found the homes of my cousin Bob Greene and some other relatives in the neighborhood, which were just on the hillside on the other side of the hollow ("holler") to the north of our house.

Until one day, I couldn't find it. I could find other landmarks, like Route 7, the tributary of Grayson Lake that was in the neighborhood, and the old Horton Flat Road, but I couldn't recognize anything where I thought the old house should be, nor any of the homes I knew to have been near it.

I was mystified for months, until some judicious Googling led me to a page of the Kentucky Department of Transportation that explained that the short section of Route 7 on which the house sat had been completely rerouted: the road was straightened and widened, and it now ran west of the old house, behind the hill on which the house had been built, instead of in front of it. And the original narrow, winding, sometimes treacherous section of Route 7, the one from which the gravel driveway ran up the hill to the old house, was now an unnamed road off the shiny new Route 7.

A couple of days ago I revisited my old stomping grounds again, the very place where I learned to shoot a gun, first rode a motorcycle, read Frank Herbert's novel Dune on the front porch, and lots of other stuff, only to find that the old former Route 7 segment was now officially named "Bob Greene Road". 

Which made me really happy.

Screen Shot 2023-02-12 at 11.23.32