Wednesday, September 03, 2025

Real-Time versus Real Time

Interesting article from IEEE Spectrum: "How AI’s Sense of Time Will Differ From Ours" [Popovski, 2025-08-13].

Human cognition integrates events from different senses - especially seeing and hearing - using a temporal window of integration (TWI). Among other things, that's the ability that lets us perceive continuous motion with synchronized sound in old-school films running at 24 frames per second. But under the hood, everything is asynchronous, with different sensing and processing latencies. That's why we don't automatically integrate the sight of a distant lightning strike with its thunderclap, even though intellectually we may know they're the same event.

Machines have to deal with this as well, especially AI in applications like self-driving vehicles. It's non-trivial. "Computers put timestamps, nature does not," as the author remarks. Anyone who develops real-time software - or has spent time analyzing log files - has already had to think about this. I talked about this issue in a prior blog article, "Frames of Reference".
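
To make that concrete, here's a minimal sketch in C of what "putting timestamps" looks like: stamp each asynchronous sensor sample on arrival with a monotonic clock, then decide whether two samples landed close enough together in time to fuse them into one event - a crude software analog of the temporal window of integration. The sensor names and the fifty millisecond window are made up for illustration, not taken from any particular system.

    /* Minimal sketch: timestamp asynchronous sensor samples on arrival and
     * decide whether two of them are close enough in time to treat as one
     * event. The sensors and the 50 ms window are hypothetical. */

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <time.h>

    /* Capture a monotonic timestamp in nanoseconds when a sample arrives. */
    static int64_t timestamp_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    /* Two samples are fused into one "event" if their timestamps fall within
     * the integration window; otherwise they are treated as separate events. */
    static int within_window(int64_t a_ns, int64_t b_ns, int64_t window_ns)
    {
        int64_t delta = (a_ns > b_ns) ? (a_ns - b_ns) : (b_ns - a_ns);
        return delta <= window_ns;
    }

    int main(void)
    {
        const int64_t WINDOW_NS = 50000000LL; /* 50 ms, hypothetical */

        int64_t camera_ns = timestamp_ns();   /* e.g. a camera frame arrives */
        int64_t lidar_ns  = timestamp_ns();   /* e.g. a lidar sweep arrives  */

        printf("camera=%lld lidar=%lld fused=%s\n",
               (long long)camera_ns, (long long)lidar_ns,
               within_window(camera_ns, lidar_ns, WINDOW_NS) ? "yes" : "no");

        return EXIT_SUCCESS;
    }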

I've also pointed out in a prior article, "Frames of Reference III", that our human sense of simultaneity continuously gives us a false view of reality. If I look towards the back of my kitchen, I see the breakfast table and chairs a few feet away. Since light travels about a foot per nanosecond, I'm actually seeing events that occurred a few nanoseconds ago (plus the communication and processing latency inside me). The back yard that I can see through the window: a few tens of nanoseconds ago. The house across the street: a hundred nanoseconds ago. The mountains to the west: microseconds ago. If I can see the moon on a clear evening: over a second ago. I see all of these things as existing in the same instant of time, but nothing could be further from the truth; my perception is at best an ensemble of many instants in the past, and the present is just an illusion.
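
Here is that back-of-the-envelope arithmetic as a few lines of C, using the foot-per-nanosecond rule of thumb. The distances are my rough guesses, chosen only to roughly match the examples above.

    /* How far in the past each thing I'm looking at actually is, using the
     * "a foot per nanosecond" rule of thumb. The distances are rough guesses
     * chosen to match the examples in the text. */

    #include <stdio.h>

    int main(void)
    {
        const double C_FEET_PER_SEC = 983571056.0; /* speed of light in feet/second */

        const struct { const char * what; double feet; } sights[] = {
            { "breakfast table",         10.0          },
            { "back yard",               60.0          },
            { "house across the street", 100.0         },
            { "mountains to the west",   30.0 * 5280.0 },  /* ~30 miles */
            { "the moon",                1.26e9        },  /* ~239,000 miles */
        };

        for (size_t i = 0; i < sizeof(sights) / sizeof(sights[0]); ++i) {
            double seconds = sights[i].feet / C_FEET_PER_SEC;
            printf("%-24s %12.9f seconds ago\n", sights[i].what, seconds);
        }

        return 0;
    }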

AI perception of the real world will have similar complications.

Sunday, August 17, 2025

What do we mean when we say we don't know how Large Language Models work?

Large Language Models - what passes for "AI" these days (there are other kinds, but this is what most people mean when they use the term) - are in effect gigantic autocomplete algorithms. They are implemented using a technique called "machine learning", based on artificial neural networks (loosely modeled on how we currently believe the brain works), scaled up to trillions of parameters computed from terabytes of training data, much of which is copyrighted and used without the creators' permission. An LLM produces the output that its algorithm deems most likely to follow your input prompt, based on its model of that training data. If that output represents actual truth or facts, it's only because the training data made that seem probable.
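
Here's a toy illustration in C of the autocomplete idea. The vocabulary and probabilities are invented, and a real LLM computes a distribution over tens of thousands of tokens using billions or trillions of learned parameters, but the principle is the same: emit whatever continuation the model deems most probable, or sample one from the distribution.

    /* Toy illustration of the "gigantic autocomplete" idea: given a context,
     * pick the next word according to probabilities derived from training
     * data. The vocabulary and probabilities here are made up. */

    #include <stdio.h>
    #include <stdlib.h>

    struct candidate {
        const char * token;
        double probability;  /* P(token | "the cat sat on the") in the toy model */
    };

    int main(void)
    {
        const struct candidate next[] = {
            { "mat",  0.62 },
            { "sofa", 0.21 },
            { "roof", 0.12 },
            { "moon", 0.05 },  /* improbable, but not impossible */
        };
        const size_t n = sizeof(next) / sizeof(next[0]);

        /* Greedy decoding: emit the single most probable continuation. */
        size_t best = 0;
        for (size_t i = 1; i < n; ++i) {
            if (next[i].probability > next[best].probability) {
                best = i;
            }
        }
        printf("the cat sat on the %s\n", next[best].token);

        /* Sampling instead of greedy decoding: sometimes a less likely
         * continuation is emitted, which is neither "truth" nor a "bug",
         * just the distribution doing what it does. */
        double r = (double)rand() / RAND_MAX;
        double cumulative = 0.0;
        for (size_t i = 0; i < n; ++i) {
            cumulative += next[i].probability;
            if (r <= cumulative) {
                printf("the cat sat on the %s\n", next[i].token);
                break;
            }
        }

        return EXIT_SUCCESS;
    }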

LLMs "hallucinating" isn't a bug; it's fundamental to how they operate.

I've read several articles on LLMs whose basic theme is "no one knows how LLMs work". This is true, but probably not in the way that most people think. The LLM developers who work for the AI companies know exactly how the software algorithms work - it's not just code, it's code that, for the most part, they wrote themselves. It's the trillions of parameters, derived algorithmically from the terabytes of training data, that are the big mystery.

Imagine a vast warehouse, on the scale of the scenes at the end of Citizen Kane or Raiders of the Lost Ark. That warehouse is full of file cabinets. Each file cabinet is full of paper files about every person who has ever lived in the United States, for as long as the U.S. Government has been keeping records. Your job: tally the number of people in those files whose first name ends in "e" and who had a sibling whose first name ends in "r".

You understand the job. The task is straightforward. The algorithm you could use to accomplish this is obvious. But could you do it? No. The dataset is too ginormous. You literally won't live long enough to get it done, even if you could maintain your interest.

But if all that information were digitized, stored in a huge database, the database indexed to link the records of family members together, and a program written to answer the original question, a computer could come up with the answer in a few minutes. These kinds of mundane, repetitive tasks are exactly what computers excel at.
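
Here's a sketch in C of the program part of that thought experiment, with a made-up record layout and a miniature stand-in for the warehouse. Once the data are digitized and the sibling links indexed, the tally is a trivial loop.

    /* Sketch of the program in the thought experiment: once the paper files
     * are digitized and sibling links indexed, the tally is a trivial loop.
     * The record layout and the tiny sample data are made up. */

    #include <stdio.h>
    #include <string.h>

    #define MAX_SIBLINGS 8

    struct person {
        const char * first_name;
        int sibling_count;
        int siblings[MAX_SIBLINGS];  /* indices into the same array */
    };

    /* Does this first name end with the given letter? */
    static int ends_with(const char * name, char letter)
    {
        size_t len = strlen(name);
        return (len > 0) && (name[len - 1] == letter);
    }

    int main(void)
    {
        /* A miniature stand-in for the warehouse full of file cabinets. */
        struct person people[] = {
            { "Jane",   1, { 1 } },  /* sibling: Peter  */
            { "Peter",  1, { 0 } },  /* sibling: Jane   */
            { "George", 1, { 3 } },  /* sibling: Claire */
            { "Claire", 1, { 2 } },  /* sibling: George */
        };
        const int n = (int)(sizeof(people) / sizeof(people[0]));

        int tally = 0;
        for (int i = 0; i < n; ++i) {
            if (!ends_with(people[i].first_name, 'e')) {
                continue;
            }
            for (int j = 0; j < people[i].sibling_count; ++j) {
                if (ends_with(people[people[i].siblings[j]].first_name, 'r')) {
                    ++tally;
                    break;
                }
            }
        }

        printf("tally: %d\n", tally);  /* prints 1: Jane qualifies, her sibling is Peter */
        return 0;
    }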

(This isn't the perfect metaphor but it's the best I've got at the moment.)

LLMs are more complicated than that, and more probabilistic, but it's the same idea. We understand how the code part of the task works. But it's the data - the artificial neural network and its implications - that we don't understand. That we can't understand. Not just the training data, which is far too much for us to read and digest, but the interconnections that form among the trillions of parameters and the statistical weights that are computed as the training data are processed.
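
To make "parameters and weights" concrete: here, in C, is roughly what a single artificial neuron amounts to - a weighted sum and a squashing function, with invented numbers standing in for the learned weights. Multiply that by trillions and you have the part nobody can read.

    /* A single artificial "neuron" as it exists in software: a handful of
     * floating-point weights, a bias, and a squashing function. The weights
     * below are arbitrary placeholders; in a trained model there are
     * trillions of them, and nothing about the numbers themselves tells you
     * which bits of training data produced them or what they encode. */

    #include <stdio.h>
    #include <math.h>

    static double neuron(const double * inputs, const double * weights, double bias, int n)
    {
        double sum = bias;
        for (int i = 0; i < n; ++i) {
            sum += inputs[i] * weights[i];   /* weighted sum of inputs */
        }
        return 1.0 / (1.0 + exp(-sum));      /* sigmoid activation */
    }

    int main(void)
    {
        const double inputs[3]  = { 0.25, -1.0, 0.5 };
        const double weights[3] = { 0.7311, -0.4482, 1.0097 };  /* "learned", opaque */
        const double bias = -0.13;

        printf("activation = %f\n", neuron(inputs, weights, bias, 3));
        return 0;
    }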

When someone asks "How did the AI come up with that response?", that's the part about which we have to say "We don't know." The artificial neural network is just too big, and stepping through it manually, tracing every single step the algorithm made, while not technically impossible, is far too tedious and time-consuming. And relating the parameters and weights of the neural net back to the original training data would be like trying to unscramble an egg.

Knowing how the code works will get more complicated as we use LLMs themselves to revise or rewrite that code. This isn't a crazy idea, and if it isn't happening now, it will happen, perhaps soon. And then the code - the part we thought we understood - will evolve until we no longer know how it works either.

Admittedly, machine learning models based on artificial neural networks aren't my area of expertise. But I'm not completely ignorant about how they work. I think there are myriad applications for them. For example, I think we'll use them to discover new drug pathways just waiting to be found in existing voluminous clinical datasets (although any such results will have to be carefully verified experimentally by human researchers). But I'm becoming increasingly skeptical about the more grandiose claims made for them - sometimes by people who should know better.

Saturday, August 16, 2025

Events 2

I read a transcript of a science explainer by Dr. Sabine Hossenfelder about physicist David Deutsch's "Constructor Theory", which I had not heard of before, and how it accounts for time.

It sounds like the 180° reverse of what I've been talking about: instead of merely viewing the world as if it were a real-time system, it creates a model of physics that looks a lot like the kind of real-time systems I work on, and makes that the basis for reality. The shortest time period (Planck Time?) is the recycle time of a kind of null task, a term right out of Real-Time Operating Systems. That's basically how I think of the world around me - based solely on decades of professional experience - but it seems weird to think of it as a legitimate Theory of Everything.
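
For anyone who hasn't written one, here's a caricature in C of what a null (idle) task does: it's the lowest priority loop that runs whenever nothing else is ready, so its recycle time is the shortest interval of "nothing happening" that the system can express. This is a generic sketch, not any particular RTOS's implementation, and it's bounded so it actually terminates.

    /* Caricature of an RTOS "null task" (idle task): the lowest-priority
     * loop that runs whenever nothing else is ready. Generic sketch only. */

    #include <stdio.h>
    #include <stdint.h>

    /* Stand-in for the scheduler's ready queue; in a real RTOS the kernel
     * maintains this. Here it just lets the sketch run and then go idle. */
    static int work_remaining = 3;

    static int  task_ready(void)    { return work_remaining > 0; }
    static void run_next_task(void) { printf("running task %d\n", work_remaining--); }

    static uint64_t idle_cycles = 0;

    int main(void)
    {
        /* The null task: spin (here, bounded so the sketch terminates),
         * yielding to real work when there is any, and counting idle passes
         * when there isn't. Its recycle time is the shortest tick of
         * "nothing happened" the system can express. */
        for (int pass = 0; pass < 10; ++pass) {
            if (task_ready()) {
                run_next_task();
            } else {
                ++idle_cycles;
            }
        }

        printf("idle cycles: %llu\n", (unsigned long long)idle_cycles);
        return 0;
    }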

Down deep, real-time computer systems - with their asynchronous, concurrent, and parallel behavior - are a lot more non-deterministic than people might think. That's one of the reasons such systems are hard to debug: a bug might only reveal itself under a particular timing or ordering of events. Determinism is a kind of emergent property, created by engineers hiding the details under the hood from the user - kind of like Newtonian physics layered on top of relativity and quantum mechanics.
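
The classic demonstration of that non-determinism is two threads incrementing a shared counter with no synchronization: the final total depends on the exact interleaving of loads and stores, so it can vary from run to run. Here's a minimal version in C using POSIX threads - the textbook example, not code from any system I've worked on.

    /* Two threads increment a shared counter without synchronization. The
     * read-modify-write races, so the final total typically comes up short,
     * and by a different amount each run. Build with: cc race.c -lpthread */

    #include <stdio.h>
    #include <pthread.h>

    static long counter = 0;  /* deliberately unprotected shared state */

    static void * worker(void * arg)
    {
        (void)arg;
        for (int i = 0; i < 1000000; ++i) {
            counter = counter + 1;  /* read-modify-write: not atomic */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;

        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);

        /* Expect 2000000; usually get less, and a different value each run. */
        printf("counter = %ld\n", counter);
        return 0;
    }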

Once you become accustomed to architecting, implementing, and debugging such systems, it's easy - it was for me, anyway - to start seeing the entire world through the same lens. Maybe I should not be surprised that there's one candidate for a Theory of Everything that takes this viewpoint.