LLM analyzing LLM

Erik Larson presents a guest post about Satan Altman by a European bank executive who manages financial Big Data systems. The author delves into some details about AI that I hadn’t heard before. Specifically and rather oddly, ChatGPT doesn’t break language into words or morphemes, and doesn’t break math into the words of math language (numbers, hundreds, tens, units). It simply chops the raw stream of characters into tokens based on patterns that were frequent in its training text. Thus its language answers and its math solutions are sometimes wrong when those token patterns lead it astray.
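You can watch this happen yourself, assuming the open-source tiktoken package (OpenAI’s published BPE tokenizer) is installed; the exact splits depend on the encoding, so the commented examples are only the kind of output to expect.

```python
# Sketch: print the pieces a BPE tokenizer produces, to show that they follow
# statistical frequency rather than words, morphemes, or place-value digits.
# Requires the tiktoken package (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["unhappiness", "1234567", "The cat sat on the mat."]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> {pieces}")
    # A long number usually splits into runs of digits that ignore
    # hundreds/tens/units, and a word can split mid-morpheme; the boundaries
    # reflect what was common in the training text, not meaning.
```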

= = = = = START QUOTE:

Sam and his fellows are holders of deep convictions. They were overwhelmed by the effect of scaling their transformer model and became convinced AGI was within reach. The past year has given many hints that some of OpenAI’s people are in the process of realising this is not the case, from Sam’s statements to actual technology.

We — all of us — see above all what we want or expect to see. We crave positive reinforcement. It is a fundamental human property. Therefore, when discussing artificial intelligence, maybe the most important subject is human intelligence, and the observation that we think our convictions come from our observations and reasonings, while the reverse is probably more the case than we will comfortably accept.

In the meantime, we need to be very careful with what we assume to be true and keep looking at what really is going on.

= = = = = END QUOTE.

No, Sam doesn’t have deep convictions. A con man doesn’t have convictions. A swindler is an actor. He knows how to create and manipulate convictions in suckers. From resurrection to perpetual motion to AI, tech magicians know how to make their pet gimmick seem real and all-powerful.

Our perception operates by forming deltas against previous memories, as I described here for phonemes and formants.

Every stage of our perception is based on units of real language like words and sentences and formants, NOT on arbitrary lengths of the waveform or arbitrary pixel-squares of the visual scene. Much of our tokenization is done mechanically in the cochlea and retina and kinesthetic senses before it even reaches the abstract areas of the brain. What the brain receives is already modularized by MEANING.

When not manipulated, we DON’T see what we WANT to see. The primary purpose of our comparator is to give LOUD ALARMS when observed reality disagrees with the expected pattern. The pattern may be a standard pronunciation or standard grammar or a familiar room or a familiar cultural taboo. Any violation, whether a difference or an absence or an addition, wakes us up sharply.

So the author is making an LLM-type mistake: analyzing Sam’s surface characteristics instead of comparing him against patterns of real human types formed by personal experience.

= = = = =

Techy sidenote: The arbitrary tokens are especially strange because proper tokenizing has been a huge part of computer programming for 70 years. Compilers have been around since 1955, and always start by marking out spaces and punctuation and recognizable keywords. Compilers also recognize the ‘morphemes’ of numbers, breaking the ‘root’ (integer) from the ‘inflection’ (decimal part). Language translators have been developed even longer, and quickly mastered the separation of known words and known inflections. Translators still have trouble with loose semantic structures like idioms and metaphors because those patterns are ‘externally’ contained in human experience, not ‘internally’ written in the text or sounded in the phonemes.
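For contrast, here is a minimal sketch of that classic lexing step; the tiny keyword set and token names are illustrative, not taken from any particular compiler.

```python
# Classic compiler-style tokenizing: split on spaces and punctuation,
# recognize keywords, and separate a number's integer 'root' from its
# decimal 'inflection'.
import re

KEYWORDS = {"if", "then", "else", "while"}

TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d+)?)   # integer root with optional decimal inflection
  | (?P<NAME>[A-Za-z_]\w*)      # identifiers and keywords
  | (?P<PUNCT>[^\sA-Za-z_0-9])  # single punctuation characters
""", re.VERBOSE)

def tokenize(source):
    for m in TOKEN_RE.finditer(source):
        if m.lastgroup == "NUMBER":
            root, _, inflection = m.group().partition(".")
            yield ("NUMBER", root, inflection or None)
        elif m.lastgroup == "NAME":
            kind = "KEYWORD" if m.group() in KEYWORDS else "IDENT"
            yield (kind, m.group())
        else:
            yield ("PUNCT", m.group())

print(list(tokenize("if x > 3.14 then y = 12;")))
# [('KEYWORD', 'if'), ('IDENT', 'x'), ('PUNCT', '>'), ('NUMBER', '3', '14'),
#  ('KEYWORD', 'then'), ('IDENT', 'y'), ('PUNCT', '='), ('NUMBER', '12', None),
#  ('PUNCT', ';')]
```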

= = = = =

Later: Now, according to “stories”, the OpenAI board wants to get Sam back immediately after firing him. The only thing we can say at this point is that ALL of the “stories” are nonsense. Whatever is really happening, IF ANYTHING AT ALL IS HAPPENING, the “stories” are stagecraft. I’m going to revert to Ockham. Assume it’s just devils dancing on a pinhead. A story told by ChatGPT, full of sound and fury, signifying nothing.

And later again, the simple truth emerges. Sam is back and the board has resigned. So the whole stageplay was orchestrated by Sam to get rid of a board that wasn’t following his orders instantly and perfectly. OpenAI now becomes a standard tech company where the founder is the universe.