Sunday, August 17, 2025

What do we mean when we say we don't know how Large Language Models work?

Large Language Models - what passes for "AI" these days (there are other kinds, but this is what most people mean when they use the term) - are in effect gigantic autocomplete algorithms. They are implemented using a technique called "machine learning", based on artificial neural networks (loosely inspired by how we currently believe the brain works), scaled up to trillions of parameters computed from terabytes of training data, much of which is copyrighted and used without the creators' permission. An LLM produces the output that its algorithm deems most likely to be a response to your input prompt, based on its model of that training data. If that output represents actual truth or facts, it's only because the training data made that seem probable.
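To make the "gigantic autocomplete" claim concrete, here is a toy sketch of the statistical idea - not a transformer, not an actual LLM, just a little bigram model of my own invention that picks the next word in proportion to how often it followed the previous word in its "training" text.

```python
import random
from collections import defaultdict, Counter

# A toy "training corpus". A real LLM trains on terabytes of text;
# the principle - model which token tends to follow which - is the same.
corpus = (
    "the clock counts the beats "
    "the clock measures the time "
    "the cat sat on the mat"
).split()

# Count how often each word follows each word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(prompt_word, length=5):
    """Autocomplete: repeatedly emit a statistically likely next word."""
    out = [prompt_word]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break
        words = list(candidates.keys())
        weights = list(candidates.values())
        # Sample in proportion to observed frequency - plausible, not "true".
        out.append(random.choices(words, weights=weights)[0])
    return " ".join(out)

print(complete("the"))
```

Whatever it prints is merely what the counts make probable; if it happens to be true, that's only because the training text made truth probable.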

LLMs "hallucinating" isn't a bug; it's fundamental to how they operate.

I've read several articles on LLMs whose basic theme is "no one knows how LLMs work". This is true, but probably not in the way that most people think. The LLM developers that work for the AI companies know exactly how the software algorithms work - it's not just code, it's code that they for the most part wrote. It's the trillions of parameters, derived algorithmically from the terabytes of training data, that are the big mystery.

Imagine a vast warehouse, on the scale of the scenes at the end of Citizen Kane or Raiders of the Lost Ark. That warehouse is full of file cabinets. Each file cabinet is full of paper files about every person that has ever lived in the United States, for as long as the U.S. Government has been keeping records. Your job: tally the number of people in those files whose first name ends in "e", who had a sibling whose first name ends in "r".

You understand the job. The task is straightforward. The algorithm you could use to accomplish this is obvious. But could you do it? No. The dataset is too ginormous. You literally won't live long enough to get it done, even if you could maintain your interest.

But if all that information were to be digitized, stored in a huge database, the database indexed to link records of family members together, and a program written to answer the original question, a computer could come up with the answer after a few minutes. These kinds of mundane repetitive tasks are what computers excel at.
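If it helps to see the warehouse-as-a-database version, here is a minimal sketch using Python's built-in sqlite3. The schema (a person table with a family_id linking siblings) and the handful of sample rows are mine, invented just to show why the computer's version of the job is a few lines of indexed query rather than a lifetime of filing.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, family_id INTEGER)")
db.execute("CREATE INDEX by_family ON person (family_id)")  # the "linking records together" step

# A handful of made-up records standing in for the warehouse.
db.executemany(
    "INSERT INTO person (first_name, family_id) VALUES (?, ?)",
    [("Jane", 1), ("Peter", 1), ("Rose", 2), ("Oliver", 2), ("Mabel", 3)],
)

# People whose first name ends in 'e' who have a sibling whose first name ends in 'r'.
(count,) = db.execute("""
    SELECT COUNT(DISTINCT p.id)
      FROM person p
      JOIN person s ON s.family_id = p.family_id AND s.id != p.id
     WHERE p.first_name LIKE '%e'
       AND s.first_name LIKE '%r'
""").fetchone()

print(count)  # Jane (sibling Peter) and Rose (sibling Oliver): prints 2
```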

(This isn't the perfect metaphor but it's the best I've got at the moment.)

LLMs are more complicated than that, and more probabilistic, but it's the same idea. We understand how the code part of the task works. It's the data - the artificial neural network and its implications - that we don't understand. That we can't understand. Not just the training data, which is far too much for us to read and digest, but the interconnections among the trillions of parameters and the statistical weights that are computed as the training data are processed.

If someone asks "How did the AI come up with that response?", that's the part about which we have to say "We don't know." The artificial neural network is just too big, and stepping through it manually, tracing every single step the algorithm made, while technically not impossible, is just too tedious and time-consuming. And relating the parameters and weights of the neural net back to the original training data would be like trying to unscramble an egg.

Knowing how the code works will get more complicated as we use LLMs themselves to revise or rewrite that code. This isn't a crazy idea, and if it's not happening now, it will happen, perhaps soon. And then the code - the part we thought we understood - will evolve until we no longer know how it works either.

Admittedly, artificial neural network based machine learning models aren't my area of expertise. But I'm not completely ignorant about how they work. I think there are myriad applications for them. For example, I think we'll use them to discover new drug pathways just waiting to be found in existing voluminous clinical datasets (although any such results will have to be carefully verified experimentally by human researchers). But I'm becoming increasingly skeptical about the more grandiose claims made for them - sometimes by people who should know better.

Saturday, August 16, 2025

Events 2

I read a transcript of a science explainer by Dr. Sabine Hossenfelder about physicist David Deutsch's "Constructor Theory", which I had not heard of before, and how it accounts for time.


It sounds like the 180º opposite of what I've been talking about: it takes a model that looks a lot like the kind of real-time systems I work on and makes it the basis for reality. The shortest time period (Planck time?) is the recycle time of a kind of null task - a term right out of Real-Time Operating Systems. That's basically how I think of the world around me - based solely on decades of professional experience - but it seems weird to think of it as a legitimate Theory of Everything.

Down deep, real-time computer systems - with their asynchronous, concurrent, and parallel behavior - are a lot more non-deterministic than people might think. It's one of the reasons it's hard to debug such systems: a bug might only reveal itself under certain timing or a certain order of events. Determinism is a kind of emergent property created by engineers hiding the details under the hood from the user - kind of like Newtonian physics layered on top of relativity and quantum mechanics.
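A toy illustration of that non-determinism, in Python rather than in the firmware where it usually bites: two threads doing an unprotected read-modify-write on a shared counter. The sleep(0) merely invites the scheduler to interleave them; the final count varies from run to run, and the "bug" (lost updates) only appears under some orderings of events.

```python
import threading
import time

counter = 0

def worker(iterations):
    global counter
    for _ in range(iterations):
        value = counter          # read
        time.sleep(0)            # yield: invite the scheduler to interleave
        counter = value + 1      # modify-write: may clobber the other thread's update

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Deterministic code would print 20000; this prints a different, smaller
# number on most runs - the order of events is everything.
print(counter)
```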

Once you become accustomed to architecting, implementing, and debugging such systems, it's easy - it was for me, anyway - to start seeing the entire world through the same lens. Maybe I should not be surprised that there's one candidate for a Theory of Everything that takes this viewpoint.

Friday, August 08, 2025

Events

I've spent decades as a software/firmware developer of real-time systems, going all the way back to the 1970s when I was writing software in the assembler languages of the IBM 360/370 and the PDP-11. The term "real-time" always seemed kind of ironic, since it is easy, when closely scrutinizing such systems - with their asynchronous, concurrent, and parallel behavior - to come to the conclusion that time doesn't exist. Only ordered events. We don't have a way to measure time except by counting events produced by some oscillator that ultimately derives its periodicity from nature. We call such a device a "clock". Since the only way to test the accuracy and precision of a clock is with a better clock, it's turtles all the way down.

Turing Award-winning computer scientist Leslie Lamport even wrote what came to be a classic paper on this topic, "Time, Clocks, and the Ordering of Events in a Distributed System" [CACM, 21.7, 1978-07]. He proposed a "logical clock" - in essence a counter incremented with every event, and reconciled whenever processes exchange messages - allowing events to be placed in a consistent order. I remember reading this paper as a graduate student. And again, later. And again, even later. I may read it again today.
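For the curious, here is a minimal sketch of the idea (my own simplification, not Lamport's notation): each process keeps a counter, bumps it on every local event, stamps outgoing messages with it, and on receipt jumps its counter past the stamp. The result is an ordering of events consistent with causality, with no wall-clock time anywhere.

```python
class LamportClock:
    """A logical clock: a counter, not a measure of time."""

    def __init__(self):
        self.counter = 0

    def event(self):
        """A local event happened; advance the logical clock."""
        self.counter += 1
        return self.counter

    def send(self):
        """Timestamp an outgoing message."""
        return self.event()

    def receive(self, timestamp):
        """Merge the sender's view of 'now' into our own."""
        self.counter = max(self.counter, timestamp)
        return self.event()

# Two "processes" exchanging one message.
a, b = LamportClock(), LamportClock()
a.event()                 # a: 1
stamp = a.send()          # a: 2, and the message carries 2
b.event()                 # b: 1
print(b.receive(stamp))   # b: max(1, 2) + 1 = 3 - the receive is ordered after the send
```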

Years ago I mentioned this line of thought to a colleague of mine, who happened to have a Ph.D. in physics and had worked at Fermilab. (It's handy to keep such folks around just for this reason.) He immediately brought up the now-obvious-to-me fact that time must exist: Einstein's special and general relativity.

Einstein's theories of SR and GR have been experimentally verified time and again (no pun intended). You can synchronize two atomic clocks side by side, take one up to the top of a mountain (where it experiences less gravity due to being farther from the center of the Earth, and hence time runs faster: that's GR) and back down, and find that they now differ by just the predicted amount. This experiment has been done many times.

The U.S. Global Positioning System (and indeed every other Global Navigation Satellite System) works by just transmitting the current time to receivers on the Earth. Fundamentally, that's it. All the heavy lifting, computationally, is done by the GPS receiver in your hand. But the atomic clocks inside every GPS satellite have to be carefully adjusted by controllers on the ground to account for GR (because the satellites in their orbits are much farther from the center of the Earth than you are, and so their clocks run faster), and for SR (because the satellites are moving much faster than you are, and so their clocks run slower). GPS wouldn't give useful results if this correction weren't performed.
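The sizes of those two corrections are easy to estimate to first order. Below is a back-of-the-envelope sketch (my own numbers for the orbital radius and physical constants, ignoring the receiver's own motion and the orbit's eccentricity); it lands on the commonly quoted figure of roughly +38 microseconds per day, which is enormous when a nanosecond of clock error corresponds to about a foot of position error.

```python
# First-order estimate of the relativistic rate offset of a GPS satellite clock
# relative to a clock on the ground. Illustrative only.

GM = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2
C  = 2.99792458e8     # speed of light, m/s
R_EARTH = 6.371e6     # mean Earth radius, m
R_ORBIT = 2.656e7     # GPS orbital radius (~20,200 km altitude), m
DAY = 86400.0         # seconds

# GR: the satellite sits higher in Earth's gravity well, so its clock runs fast.
gr = (GM / C**2) * (1.0 / R_EARTH - 1.0 / R_ORBIT)

# SR: the satellite moves fast (circular orbital speed), so its clock runs slow.
v = (GM / R_ORBIT) ** 0.5
sr = -v**2 / (2.0 * C**2)

print(f"GR:  {gr * DAY * 1e6:+.1f} microseconds/day")         # about +45.7
print(f"SR:  {sr * DAY * 1e6:+.1f} microseconds/day")         # about  -7.2
print(f"net: {(gr + sr) * DAY * 1e6:+.1f} microseconds/day")  # about +38.5
```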

The resonant frequency of cesium-133 is the definition of the "second" in the International System (SI) of units. Count off exactly 9,192,631,770 cycles of the microwave radiation associated with the hyperfine transition of the cesium-133 atom's ground state, and that's one second. If cesium is lying to us, we'll never know.

Or maybe we would. Experimental atomic clocks using elements like ytterbium are running in national metrology labs. These are called "optical" atomic clocks because they operate at optical frequencies - hundreds of terahertz, using lasers - instead of at gigahertz frequencies using microwaves, so their periods are measured in femtoseconds instead of fractions of a nanosecond. The day is near when the definition of the SI second will be changed to use these clocks.

Clocks that are so precise that their position has to be determined by careful surveying because their results are different if the altitude of the laboratory optical bench changes by a centimeter, thanks to GR.

Clocks that are still nothing more than oscillators and counters.

(I took the photograph below in 2018: a survey marker embedded in the concrete floor of an optical atomic clock laboratory at NIST's Boulder Colorado facility.)



Thursday, August 07, 2025

We Got The Beat

 Time: it's a funny thing.

We can't measure it directly. The best we can do is construct mechanisms that have some kind of periodic behavior and then count the "beats" (as watchmakers call them) that they produce.

There have been all kinds of sources of periodicity used during human history. The heartbeat for short periods of time. The movement of the Sun across the sky for the day. The phases of the Moon for periods of a "moonth". The seasons for the year.

There were many attempts to make timekeeping devices - candles with marks drawn on them, water clocks that counted drips, sundials, hourglasses. But none of these were accurate enough to measure longitude, the angular east-west distance across the Earth. (Latitude can be determined by the height of the Sun above the horizon, taking the season and the hemisphere into account. There are almanacs still published today with the numbers you need to do this.)

Navigators going all the way back at least to the ancient Greeks and Polynesians had known that timekeeping could be used to determine longitude, by comparing local time (e.g. local noon, determined by the Sun) with a clock set to the time at the port from which you departed. But it wasn't until the mid-to-late 1700s that there was a "chronometer" design accurate, precise, stable, and reliable enough to carry on board ship, finally letting navigators determine their longitude at sea.

All "modern" clocks, from those first chronometers until today, consist of an oscillator - a source of stable, precise beats - and a counter - the watch face. And all oscillators are made of three basic components: a resonator (a source of periodicity derived from nature), a power source (a falling weight, a spring, a battery, the mains), and a feedback loop (known as an "escapement" in a mechanical clock).

Many things have been used as resonators over the centuries (and all of these are still in use today): a pendulum, a balance wheel, a quartz crystal, an atom of cesium, rubidium, aluminum, or ytterbium. But no matter how sophisticated clocks become, they still have those same three components: a resonator, a power source, and a feedback loop.
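Since the whole post boils down to "a clock is an oscillator plus a counter", here is that idea reduced to a few lines - a deliberately cartoonish model of my own, with all of the resonator's imperfection folded into a single frequency-error number: count the beats, divide by the nominal beat rate, and any error in the resonator accumulates directly into the time you report.

```python
class Clock:
    """A clock: a resonator (a source of beats) plus a counter. Nothing more."""

    def __init__(self, nominal_hz, actual_hz):
        self.nominal_hz = nominal_hz  # what we *believe* the resonator does
        self.actual_hz = actual_hz    # what nature actually gives us
        self.beats = 0                # the counter (the "watch face")

    def run(self, true_seconds):
        """Let the real resonator beat for some amount of 'true' time."""
        self.beats += self.actual_hz * true_seconds

    def reading(self):
        """The time we report: counted beats divided by the nominal rate."""
        return self.beats / self.nominal_hz

# A quartz watch crystal: nominally 32768 Hz, off by one part per million.
quartz = Clock(nominal_hz=32768, actual_hz=32768 * (1 + 1e-6))
quartz.run(true_seconds=86400)
print(quartz.reading() - 86400)   # ~0.086 seconds of error accumulated per day
```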
The clock below - on display at NIST in Boulder Colorado and whose photograph I took in 2018 - was sold by IBM in 1956. It is an electric pendulum "Type 37" clock that set itself from the NIST WWV/WWVH telegraphic time code using a vacuum tube radio receiver. It was typically used in factories as the master clock from which all other clocks were set.


Wednesday, August 06, 2025

NIST Time and Frequency Seminar 2025

Once again I attended the fire hose of information that is the U.S. National Institute of Standards and Technology (NIST) Time and Frequency Seminar. This three-day, typically annual, event, held at their Boulder, Colorado laboratories (commuting distance for me), covered such wide-ranging topics as optical atomic clocks, practical measurement techniques for time and frequency, how to characterize and analyze frequency and phase errors in data, ways in which television and radio broadcasters might augment GPS for timing and positioning, and much more.

In honor of the event I wore my Rolex Milgauss. Felt cute, might delete later.


* * *

My understanding is that virtually all national time and frequency metrology laboratories, including NIST (civilian) and the USNO (military) in the U.S., use an ensemble of cesium beam atomic clocks and hydrogen maser atomic clocks, the average of which is used to determine their contribution to the measurement of the SI second and the international definition of UTC. These are commercial devices, not laboratory experiments, and aren't astronomically expensive.

They use a combination of both because, even though the cesium resonant frequency is the (current) definition of the second in the international system of units, cesium atomic clocks suffer from jitter (short-term variation), while hydrogen masers are more stable. The jitter in commercial atomic clocks is well understood, and the difference between a rack-mounted commercial cesium beam clock and a much larger and far, far more expensive cesium fountain atomic clock in a lab at a place like NIST is all the extra hardware to try to reduce that jitter.
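Characterizing that jitter (and the maser drift mentioned below) is what the seminar's "analyze frequency and phase errors" sessions are about, and the standard tool is the Allan deviation. Here is a bare-bones sketch of the overlapping estimator computed from phase samples - a simplification of what NIST's handbooks describe, with made-up white noise standing in for a real clock.

```python
import math
import random

def allan_deviation(phase, tau0, m):
    """Overlapping Allan deviation at averaging time tau = m * tau0,
    from phase (time-error) samples taken every tau0 seconds."""
    tau = m * tau0
    terms = [
        (phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]) ** 2
        for i in range(len(phase) - 2 * m)
    ]
    return math.sqrt(sum(terms) / (2.0 * tau**2 * len(terms)))

# Fake a clock: phase samples every second with white phase noise (pure jitter).
random.seed(1)
phase = [random.gauss(0.0, 1e-9) for _ in range(10_000)]  # seconds of time error

for m in (1, 10, 100, 1000):
    print(f"tau = {m:5d} s  ADEV = {allan_deviation(phase, 1.0, m):.2e}")
# For pure jitter the numbers fall as tau grows; a drifting maser's curve would
# flatten out and then climb again at long averaging times.
```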

The image below is of the NIST F-3 cesium fountain clock. The collection of commercial cesium beam standards are kept locked up in another room.

NIST F-3 Cesium Fountain Clock

Here's the thing: all hydrogen maser clocks suffer from drift (long-term variation). And they all drift by a different amount. And it is not understood why. One hypothesis is that it's some mechanism of aging in the components. If the manufacturers could eliminate it, they certainly would (and charge more).

The image below is of a decommissioned commercial hydrogen maser clock that I saw at NIST in 2018. You typically can't find one at NIST where it can be photographed, because the running ones are kept locked in temperature-controlled chambers adapted from commercial egg incubators.


* * *

The blinking 1Hz LEDs in this brief video clip literally represent the real-time manufacture of the UTC(NIST) time scale (the U.S. civilian time base) and the U.S. contribution to the international determination of UTC.



* * *

Next to the toe of my shoe is a survey marker embedded in the floor of one of the NIST labs. Atomic clocks are so precise now that centimeter changes in altitude have to be adjusted for, thanks to general relativistic effects.

Survey Marker in Floor of the NIST F-3 Laboratory

* * *

This was my fourth (2018, 2023, 2024, and 2025) and probably last time attending the Time and Frequency Seminar. It is so popular that not only does it sell out, but the waiting list is lengthy too. Could be time to let someone else become a certified Time Lord.

Certified Time Lord

Sunday, April 13, 2025

AgTech

It was World War II that permanently took my parents out of the agricultural (or maybe the vice) industry in rural eastern Kentucky, where they grew up in families of tobacco farmers and "distillers" (really): my dad joined the Navy and deployed on the U.S.S. Yorktown in the Pacific theatre, and my mom was literally a riveter at a defense plant in Columbus, Ohio. Maybe it was the old family businesses that left me with a latent interest in the "ag" industry. Last week I attended another of the Colorado Technology Association's Insights Series: "Innovations in AgTech: How Technology is Shaping Colorado Agriculture". It was a terrific learning experience.

There were two keynote speakers, one from CoBank and the other from American AgCredit. These are cooperative banks that are part of the member-owned - like a credit union - Farm Credit System, a government-sponsored enterprise established by Congress in 1916 to ensure farmers had access to loans. Why are these guys talking to an org like the CTA? Because by law they can only lend money to farmers; they can't lend money to agricultural technology startups trying to develop new technology for those farmers, who are often running businesses worth tens of millions of dollars, with huge capital investments, managed only using spreadsheets.

So the banks find themselves trying to put venture capitalists together with AgTech startups to solve the problems their members have. Technology is a big deal in agriculture because labor is a huge cost. (Remarkably, land is another big cost, and many of the efficiency drivers provided by technology only scale with more land; a modern tractor can mow 175 acres of hay in four hours, and costs as much as a house.)

The event included a panel discussion with the two bankers (one of whom was formerly a rancher himself - he showed a photograph of him wrestling a steer to the ground), three former founders of successful AgTech startups, and a really interesting faculty member at Colorado State University - a land-grant university that was formerly Colorado Agricultural College - who is the Director of Ag Innovation at CSU. (I learned that agriculture in Colorado, which I call home, is pretty unusual in that the state has many different biomes, so it lends itself to growing a variety of crops and livestock, unlike, say, states in the Midwest.)

My work with Differential GNSS and Inertial Measurement Units, which I've written about here previously, was inspired by my interest in precision agriculture, used in applications like auto-steer for tractors, a technology which has led to a huge cost savings for farmers.

It's events like this that keep me renewing my membership in the CTA, a professional society I've been a member of since 2018. I'm not there to hire, to find work, to buy, or to sell; I'm mostly just there to learn and to get ideas for my own projects. Although I routinely attend networking events like their upcoming C-Level at Mile High that are relatively expensive to attend, I do so mostly to support the organization, and because, in the case of C-Level, it's fun to wander around on the party floor of Empower Field, a part of the Broncos' football stadium that fans rarely if ever see. Also, the food is not bad. But when people choose to chat with me, I feel kind of guilty that I'm probably wasting their time. (I admit I left the AgTech event as soon as it transitioned to the networking stage.)

Saturday, March 01, 2025

Population Implosion

The March 3rd edition of The New Yorker had a long article (it was the only thing I got to read during my usual Saturday AM breakfast out) about the global declining birth rate. The whole thing reads like science fiction, not unlike Children of Men.

https://www.newyorker.com/magazine/2025/03/03/the-population-implosion

The poster child for this issue is South Korea (half of whose population lives in Seoul, BTW), whose birth rate stands at 0.7 (2.1 is considered the "replacement rate"). Each successive generation is a fraction of the size of the previous one. There are schools in the country that had a thousand students at their peak and now have five.

The U.S. rate isn't nearly that low - 1.66 - but it's still well below replacement. And even immigration won't address the issue of who is going to do the work and pay the taxes that fund Social Security, since the nations from which people immigrate to the U.S. also have declining birth rates.

Reasons? Lots of them. But a big part was deliberate planning on the part of non-governmental organizations and governments that panicked about population growth, the food supply, and the environment decades ago. If you think about it, NGOs and governments have, at best, very coarse control over the "birth rate" knob, so getting it tuned perfectly to the desired rate - whatever that may be - is almost impossible. Most got it too low. South Korea got it way too low.

I won't live long enough to have to worry much about this. But eventually we'll have to use AI and automation just to do fundamental stuff like farming and distributing basic goods; there won't be anyone to do the work, and the people that do exist will be too old.

This won't really affect the climate change issue, since climate change is happening on the scale of decades, while population decline is on the scale of generations.

It occurred to me that this would be an interesting SF story: aliens - perhaps "obligate reproducers" (adults have to procreate or they die) - show up and say "Hey, no sweat, we're patient, we'll stick around until you aren't using your planet anymore. It's just a matter of time."

Edit: capitalism seems to depend on an ever-growing population of consumers. What it really means when the population explosion trend reverses - as inevitable as this may seem - is anyone's guess... but it can't be good.

Thursday, February 27, 2025

Let Them Burn

A recent article in MIT Technology Review (probably paywalled) is about dealing with electric vehicle battery fires.

https://www.technologyreview.com/2025/02/24/1111551/ev-lithium-ion-battery-fire-first-responders-firefighters/

It's based on research by an EV battery pack designer who is also a volunteer firefighter, and who now consults with fire departments on this issue. His conclusion: let them burn, while trying to isolate them from surrounding vehicles and structures. Isolating can mean anything from covering them with a fire blanket to (as one case study illustrated) moving the burning EV to a vacant lot with a forklift. Wow.

Fires need three things to continue to burn: fuel, oxygen, and heat. Typical firefighting techniques involve interrupting one or more of these constituents. But lithium battery packs provide all three all by themselves, as part of a "thermal runaway" chemical reaction.

Traditional vehicle fires are typically centered around the easily accessible engine compartment, and can usually be put out in minutes with hundreds of gallons of water. EV fires are centered around the huge battery pack often underneath the vehicle, and - if they can be put out at all - may take hours and thousands of gallons of water, and may later spontaneously reignite.

The article has many worrisome case studies, including one where an EV owner accidentally drove his car off a pier in Florida. When the battery pack became saturated with electrically conductive salt water, it shorted and ignited... and continued to burn under thirty feet of water. Wow again. EV batteries igniting when saturated with salt water from hurricane flooding in coastal areas is apparently a growing phenomenon.

As a typical homeowner with lots of lithium battery packs - some quite large, for power tools - I've gotten concerned enough about this that I don't leave the packs on chargers when no one is at home (not even phones, laptops, or tablets). And I have a small chest of drawers just inside the door from the garage in which I store my expensive charged lithium battery packs (which don't like the cold either, but that's more of a longevity issue). I do keep rechargeable gear in both automobiles and on both motorcycles (jumper battery packs, tire inflators), and I worry about that.

Mrs. Overclock recently bought some small fire blankets, one of which is now out in the garage next to the wall mounted fire extinguisher.

Update: another recent article on the same topic from the same source, the gist being that preventing EV battery fires is a lot more practical than extinguishing them.