Tuesday, July 08, 2008

Thinking About Software with Gödel and Cantor

Most people that work with me are surprised to learn that I test at the far end of the introversion scale on the Myers-Briggs Type Indicator personality test (I'm an INTJ). Since I have a reputation as a very hands-on engineer, they might also be surprised by the fact that I enjoyed taking a lot of computer science theory classes when I was in graduate school: formal languages, computability, algorithms, etc. I insist on finding practical applications for this stuff all the time. And occasionally I entertain myself by thinking about things like Gödel Numbers.

In 1931, philosopher, logician, and mathematician Kurt Gödel showed how you could take any formal language and represent it as a mathematical function by encoding the symbols of the language numerically and representing the productions of the language as a function that operates on the resulting numbers. Whenever you have formal systems A and B, showing that one can be converted into the other is always a win, because it's a form of code reuse: proving something about A now also applies to B. It didn't take long to show that a Gödel number could be represented as a string that could be operated on by a Turing Machine, adding yet another formal system to the party.

You can trivially apply this to any piece of software as well. Just take a core dump of your program's initial state in memory. You can think of this result as a hexadecimal representation of its Gödel number. Now take another dump of your program after every instruction executes. You end up with a series of Gödel numbers. Now write a single mathematical formula that generates this series of numbers. You have just converted your program into a abstract function amenable to all sorts of formal analysis. True, it's so large and ugly that such analysis may be intractable, but it's at least theoretically possible.

Remember those intelligence tests you took in grade school where you had to choose the next number in a sequence? I am sorry to inform you that any number can be the correct answer. It is trivially easy to write a mathematical function to generate any arbitrary sequence of numbers. As Mrs. Overclock (a.k.a Dr. Overclock, Medicine Woman) frequently points out, the fact that she and I both did well in school has more to do with the fact that we are good at taking standardized tests, not that we're necessarily smart. Of course, knowing what answer is expected is itself a useful life skill. But I digress...

Theoretical computer scientists are fond of talking about Turing Machines with infinitely long tapes. They do this mostly so that the are not constrained by physical limitations. What they really mean is "the tape is just as long as it needs to be". The fact that there are algorithms that ultimately try to march along to the end of an infinite tape is part of the fun. This is a form of the Halting Problem.

We can do this with our abstract mathematical function too. We just assume we always have enough bits to represent any possible Gödel number we want to compute. An infinite number of bits should be about right. But what kind of infinity?

It turns out that there are different orders of infinity. Around 1874, set theorist Georg Cantor showed that indeed while the number of integer numbers (or the cardinality of the set of integer numbers to be more precise) was infinite, and the numbers of real numbers (ditto) was infinite, there had to be more real numbers than integer numbers. That's because we could form a one-to-one correspondence between the set of integer numbers

(... -1, 0, 1, 2, 3, ...)

and the set of real-numbers

(... -1.0, 0.0, 1.0, 2.0, 3.0, ...)

but in between each of those real numbers was another infinite series of entries in the set of real numbers

(... -1.0, ... 0.0, 0.001, ... 0.01, ... 0.1, ... 0.2, ... 1.0, ... 2.0, ... 3.0, ...) .

Cantor called the transfinite number that was the cardinality of the set of all integers 0 ("aleph null") and the transfinite (but larger) number that was the cardinality of the set of all real numbers 1 ("aleph one").

What is the relationship between 0 and 1? You might expect that since there is an infinite number of entries in the set of real numbers in between each entry that corresponds to an entry in the infinite set of integer numbers, that 1 would equal 02, or maybe 00. Not so fast...

Since we can make a one-to-one correspondence, or enumerate, each bit in our hexadecimal Gödel number with an integer number, the number of bits in our infinitely large Gödel number is 0. So how many unique Gödel numbers can we form from 0 bits? That's easy for any software developer: 20.

Here's where it gets interesting (no, really). What order of infinity is 20? It turns out, according to Cantor's continuum hypothesis, that this value is 1, the cardinality of the set of all real numbers.

Coincidence? You decide.

7 comments:

Anonymous said...

One postulates that Chip is talking about a Turing Machine and that a Turning Machine is simply the product of a Turing Machine coupled with some historically impaired spell checker.

By the way, Happy 52nd Chip.

Chip Overclock said...

Damn those spell checkers! Correction noted and made. Thank you, Officer Malloy, and a happy 52nd to you as well!

Michael L Perry said...

A minor correction to your summary of Cantor's proof.

The fact that there are numbers in between is not important. For example, the integers can be mapped to the even numbers by multiplying each by 2. There are missing numbers in between (the odd numbers), but the two sets are still in one-to-one correspondence. There are still just as many even numbers as there are integers. (Freaky, huh?)

The crux of Cantor's proof is that it is impossible to create a one-to-one correspondence between the integers and the reals. In other words, the reals cannot be "counted". Their infinity is bigger.

And, no, it's not a coincidence that infinities and bits follow the same pattern. Given n distinct items, you can create 2^n different sets. If those items are bits, and inclusion in the set means that the bit is one, then those sets are numbers. Cantor showed that the reals can be put into one-to-one correspondence with the sets of integers, and so it's infinity is 2^the integers.

Chip Overclock said...

Thanks for the clarification. I confess I never took set theory. In hindsight I wish I had, as well as a bunch of other stuff like queuing theory and game theory, a working knowledge of each I would find useful. The connection between Godel numbers and Cantor's transfinite math just came to me recently as part of my recreational reading.

Anonymous said...

Chip,
I am intrigued by your proposal to represent a program by a function that generates the same number at each timestep as the program dump.
Determining functions that mimic real phenomena is what us theoretical Physicists do for a living, if we make a living.
The problem of determining the
correct function can be likened to
projecting a vector onto the three spatial axes: you have an orthonormal set of basis vectors and any arbitrary vector can be determined by the coefficients in a linear sum of the basis vectors.
Representing sounds as a Fourier series is an example closer to our history, since the sine and cosine
(or imaginary exponentials if you are a dweeb) comprise such a basis for the set of real number functions.
The problems come in when you have only a finite number of test
points from which to generate an infinite function. If the basis is well chosen, the error gets acceptably small with a small number of test points. If not, you can remember the outcome, since you probably lived it.
So, your example of mimicking programs could be limited in its utility, depending on how well you chose the basis functions from which you build your representation.
Which brings me to an item I always wanted to pursue. Most voice processing happens through Fourier representations, but infinite sine waves are not good models for speech. (Singing, yes, especially for professionals.) Tchebchev polynomials are better, and some CELP algorithms use them. However, they are not well suited for modeling with DSPs, so I do not foresee wider use for them. Sad.

Ken

Chip Overclock said...

My other readers don't have the benefit of Ken and my recent lunch time conversation (at Wahoo's on East 20th in uptown Denver). While Ken is talking about simulating a continuous analog system (continuous at least until you get to quantum granularity), I am talking about simulating a discrete system that changes with every tick of the CPU clock. Ken's is the much harder problem (as he and I both recall from our days at Bell Labs; for that matter Mr. Malloy is a former Bell Labs person too).

But I still think it's interesting to ponder the fact that any software system of arbitary complexity can be turned into a mathematical function with a single argument (the CPU clock), or for that matter into a single finite state machine with a single stimulus (again the CPU clock). Both lend themselves to a very different analysis than is traditionally done with large software systems... or at least they would if the resulting function or FSA were not so large as to be intractable.

John Ryskamp said...

We are in the midst of a renaissance in the historiography of set
theory.
Above all, I recommend A. Garciadiego, BERTRAND RUSSELL AND THE
ORIGINS OF
THE SET-THEORETIC 'PARADOXES,' but there's also Grattan-Guinness and
Ferreiros, discussed in the paper linked below.



The inquiry into the nature and influence of constructivism in math, is only beginning, but here is one area in which some progress has been made. It may interest you. It involves the nature of the formulation of Einstein's concept of the relativity of simultaneity, and the way he used "practical geometry" (his term for constructivist mathematics) to express it." He thought the formulation of this point of view was his crowning
achievement,
and thought very highly of the lecture in which you can read his
discussion
of it, "Geometry and Experience." From Poincare (SCIENCE AND HYPOTHESIS), but also from the long
tradition of
natural mathematics stretching back to Aristotle's concern over the
"paradox"
of Zeno, he adopted the idea that all argumentation leads inevitably
to
paradox. This is certainly the gist of the response to Cantorian set
theory,
hence the fame of the supposed set-theoretic 'paradoxes.'


The most important result of this concern was the idea that there is
no such
thing as logical content: arguments, if expressed in a certain way,
can
approach logical content but can never actually contain it, because,
again,
argumentation necessarily leads to paradox.


So, for practitioners of constructivism or practical geometry, the
only way
out was a compromise: construct an argument, but make it contain the
constructivist idea. That idea is that mathematics is an inherent
human
function.


For those interested in logical content, this is already so far afield
that
eyes glaze over. And it was never seen to be relevant to relativity,
because
no one was able to say how Einstein used practical geometry as a
technique in
constructing an argument. "Geometry and Experience" was seen to be a
bunch
of genial generalities with no relevance to relativity. Why were
people
unable to understand where Einstein used "practical geometry"?
Because, I
think, we share so much of constructivist mathematical thinking that
we are
blind to its presence in arguments.


In any event, you ought to know that Einstein DID use practical
geometry in
developing the relativity of simultaneity. Whatever else you may now
argue
regarding the relativity of simultaneity, you can no longer ignore the
constructivist mathematics in it--that, now, HAS to be taken into
account.
There is an historical sidetrack to this: Einstein's use of
constructivist
math is in "disguised" form in the 1905 paper.


So, I think intentionally, he made it explicit in the "train
experiment." If
you notice, the train experiment and the clock experiments are the
same
experiment: they can be translated mechanically, one into the other.


Thus, the constructivist term Einstein inserted into the train
experiment is
also present in the clock experiment.


Remember, that in doing this, he intentionally deprived the argument
of
logical content, because he felt he had to do so. If YOU feel that
one must
do so, this will not bother you. If you insist on logical content in
your
argument, it will bother you A LOT. So it's really a matter of taste,
and
not one for debate.


As the paper below says, at one stage of the argument, point M is said
to
"naturally" (fallt zwar...zusammen" in the original German) coincide
with
point M'. (By the way, close readers of this text--translators--have
realized that this was a conceptual anomaly: they treat it differently
in the
French and Italian translations of RELATIVITY).


The logical problem with this notion is as follows:


1. if you drop the term, and M and M' coincide in traditional
Euclidean
fashion, you are led to the contradiction of assuming two Cartesian
coordinate systems and deducing one such system (I leave the proof of
this to
you).


So M and M' cannot coincide in a Euclidean way: that much cannot, I
think, be
contested and no one has ever argued that they could so coincide and
that
Einstein was saying that they did so coincide. At least, I haven't
seen any
such contentions.


2. if you retain the term, you find that it is not part of the
formulation
of the relativity of simultaneity. It is not a definition, an
assumption, a
principle, a deduction or anything else. You will look in vain for
the
logical role it plays in the argument. It doesn't play any at all.
It
simply rattles around in the argument--a loose cannon on deck.


So what is it? It is what Einstein always meant it to be: it is an
arbitrary
insertion into the argument, made necessary--according to his approach
of
"practical geometry"--in order for the argument to avoid paradox. So
Einstein did exactly what he wanted to do. However, I think we have
had an
unconscious prejudice against the lack of logical content, so we never
wanted
to think that that's what he wanted to do, or did do. But that's not
taking
Einstein seriously. I suggest you take him seriously--at least do him
that
courtesy, if you are going to pay any attention to what he says.


By the way, are there any paradoxes? That is, was there anything for
Einstein to worry about? No. Not even Zeno's paradox has stood up to
analysis. The "logical" compulsion we feel with respect to the
paradoxes so
far proposed, is an artifact of their construction--it's their
rhetoric--it
is not a result of logical content in these arguments. They have
none. Too
bad, because they are very seductive. But that's the way it is.


This is where the new set theory history is making all kinds of
contributions.
Particularly Garciadiego is devastating with his care with respect to
the
history and the terms, showing that Richard didn't even consider his
argument
a paradox, that there is no Cantor paradox, no Russell paradox, and so
on.
They are glib sleights of hand which do not stand up historically or
logically.


So you really have to do some more work understanding the history of
math.


Another thing which is being revealed by new work into Einstein, is
how
little he probed into contemporary set theory debates. He never
criticized
anything Poincare said about those debates, although particularly
Grattan-
Guinness is scathing in his discussion of Poincare. Einstein didn't
really
know anything about the set theory which set him off on his
mathematical
approach. Very remarkable, I think--very eye-opening.


Einstein is not alone in the sloppiness with which he approached the
mathematical foundation of his argument. The Fefermans and other
commentators are amazingly critical of Godel in their remarks in the
collected works, regarding his understanding of set theory debates.


We tend to think of these twentieth-century mandarins as close
students, as
scrupulous thinkers. It turns out that they were slobs.


And of course, Cantor has been subjected to recent research which is
even
more embarrassing for his work than the many longstanding critiques.


Again, there's more to the background of constructivism than the set
theory
debates. And it has had an influence far beyond Einstein. You find
it in
Darwin, Godel, Sraffa, really everywhere. It has stood in the way of
logic
for a long long time.


Finally, you should consider where "natural" coincidence leaves us.
If we
can't get to general relativity because of "natural" coincidence, then
that
means that once again the Pythagorean theorem is at issue (it was a
resolved
issue under general relativity, for reasons you know). Does the
Pythagorean
theorem have logical content?


My feeling is, no. I think it also has a "natural" coincidence in
it. But
where?


Ryskamp, John Henry, "Paradox, Natural Mathematics, Relativity and
Twentieth-
Century Ideas" (June 17, 2008). Available at SSRN:
http://ssrn.com/abstract=897085