An actual "thinking machine" would be constantly running computations on its accumulated experience in order to improve its future output and/or further compress its sensory history.
An LLM is doing exactly nothing while waiting for the next prompt.
I think the thing you were looking for was more along the lines of a persistent autonomous agent.
I see thinking as less about "timing" and more about a "process"
What this post seems to be describing is more about where attention is paid and what neurons fire for various stimuli
Frankly this objection seems very weak
This is currently done with multiple LLMs and multiple calls, not within a single run of one model's input/output
Another example: feed in a single token, or gibberish, and today's models are more than happy to spit out fantastic numbers of tokens. They really only stop because we watch for the stop words they are trained to generate, and we perform the actual stopping action ourselves
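To make that concrete, here's a toy sketch of the external stopping loop (everything here is made up: fake_next_token stands in for a real forward pass, EOS_ID is a hypothetical stop-token id). The model just keeps emitting tokens; it's the harness watching for the stop id that actually halts generation.

    import random

    EOS_ID = 2                    # hypothetical end-of-sequence token id
    VOCAB = list(range(10))       # toy vocabulary

    def fake_next_token(token_ids):
        """Stand-in for a real forward pass: just picks any token, sometimes EOS."""
        return random.choice(VOCAB)

    def generate(token_ids, max_new_tokens=50):
        for _ in range(max_new_tokens):
            next_id = fake_next_token(token_ids)
            token_ids.append(next_id)
            if next_id == EOS_ID:  # the model merely emits the stop token;
                break              # this loop is what actually does the stopping
        return token_ids

    print(generate([5, 7]))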
it’s fine though, this was as productive as i expected
Still, what current LLMs are doing with their fixed rules is only a very limited form of reasoning, since they apply just a fixed N steps of rule application to generate each word. People are looking to techniques such as "group of experts" prompting to improve reasoning: step-wise, generate multiple responses, then evaluate them and proceed to the next step.
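Roughly, the step-wise idea is: sample several candidate next steps, score them with an evaluator, keep the best, repeat. A toy sketch, where llm() and score() are hypothetical stand-ins for real model calls:

    import random

    def llm(state, n=3):
        """Stand-in: pretend to sample n candidate next steps for this state."""
        return [f"{state} [candidate {i}]" for i in range(n)]

    def score(candidate):
        """Stand-in for an evaluator prompt/model rating a candidate."""
        return random.random()

    def solve(prompt, steps=4, n=3):
        state = prompt
        for _ in range(steps):
            candidates = llm(state, n)          # expand: several possible next steps
            state = max(candidates, key=score)  # evaluate: keep the best-rated one
        return state

    print(solve("Problem: ..."))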
It's an interesting window on people's intuitions -- this pattern of reasoning feels surprising and alien to someone who imbibed Hofstadter and Dennett, etc., as a teen in the 80s.
(TBC, the surprise was not that people weren't sure they "think" or are "conscious", it's that they were sure they aren't, on the basis that the program is not running continually.)
I'm listing things that current LLMs cannot do (or things they do that thinking entities would not) to argue they are so simple they are far from anything that resembles thinking
> it’s fine though, this was as productive as i expected
A product of your replies lowering in quality and becoming more argumentative, so I will discontinue now
Current LLMs have none of that - they are just the fixed set of rules, further limited by also having a fixed number of steps of rule application.
An LLM has no innate traits such as curiosity or boredom to trigger exploration, and anyways no online/incremental learning mechanism to benefit from it even if it did.
The effect is as if you had multiple people playing a game where they each extend a sentence by taking turns adding a word to it, but there is zero continuity from one word to the next because each person is starting from scratch when it is their turn.
What do you mean? They get to access their previous hidden states in the next greedy decode using attention; it is not simply starting from scratch. They can access exactly what they were thinking when they put out the previous word, not just reason from the word itself.
But that's exactly what I'm saying - the model has access to what it was thinking when it generated the previous words; it does not start from scratch. Even without the KV cache, the model regenerates that earlier "thinking" from the previous words, so when generating the next word it can still look back at what it was thinking when it produced them. Does that make sense? I'm not great at talking about this stuff in words
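Maybe a toy sketch is clearer than words (numpy only, single attention head, random weights - nothing like a real model). The point is just that keys/values from earlier decode steps are kept around, so each new token's query attends to everything computed for the previous words rather than starting from scratch:

    import numpy as np

    d = 8
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    k_cache, v_cache = [], []          # "memory" of earlier decode steps

    def attend(x_new):
        """Process one new token embedding, reusing cached keys/values."""
        q = x_new @ Wq
        k_cache.append(x_new @ Wk)     # only the new position's K/V get computed
        v_cache.append(x_new @ Wv)
        K, V = np.stack(k_cache), np.stack(v_cache)
        weights = softmax(q @ K.T / np.sqrt(d))
        return weights @ V             # mixes information from all previous steps

    for step in range(4):              # pretend these are successive decoded tokens
        print(step, attend(rng.normal(size=d))[:3].round(2))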
There will be some overlap in what the model is now "thinking" (and has calculated from scratch) since the new prompt is one possible continuation of the previous one, but other things it was previously "thinking" will no longer be there.
E.g. say the prompt was "the man", and output probabilities include "in" and "ran", reflecting the model thinking of potential continuations such as "the man in the corner" and "the man ran for mayor". Suppose the word sampled was "ran", so now the new prompt is "the man ran". Possible continuations can no longer include refining who the subject is, since the new word "ran" implies the continuation must now be an action.
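In toy form (the probabilities here are made up purely for illustration), sampling one continuation prunes what can be "thought" next:

    import random

    next_word = {
        "the man":     {"in": 0.4, "ran": 0.6},
        "the man ran": {"for": 0.5, "his": 0.5},
    }

    prompt = "the man"
    choice = random.choices(list(next_word[prompt]),
                            weights=list(next_word[prompt].values()))[0]
    prompt = f"{prompt} {choice}"
    print(prompt)   # e.g. "the man ran": continuations about *who* the man is
                    # are no longer reachable from this prefix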
Some work is saved, via the KV cache, in processing the new prompt, but only work (self-attention among the common part of the two prompts) that would not change if recalculated. What the model is thinking has changed, and will continue to change depending on the next sampled continuation ("the man ran for mayor", "the man ran for cover", "the man ran his bath", etc).
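Continuing the earlier numpy sketch: recomputing keys/values for the shared prefix from scratch gives exactly the same result as the cache, which is why the cache only saves work rather than carrying over anything that would come out differently.

    import numpy as np

    d = 8
    rng = np.random.default_rng(1)
    Wk = rng.normal(size=(d, d))
    prefix = rng.normal(size=(5, d))      # embeddings for the shared prompt words

    cached_K   = prefix @ Wk                              # built up incrementally while decoding
    recomputed = np.stack([tok @ Wk for tok in prefix])   # reprocessed from scratch

    print(np.allclose(cached_K, recomputed))   # True: identical either way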