
S1: A $6 R1 competitor?

(timkellogg.me)
851 points by tkellogg | 2 comments
pona-a No.42948636
If chain of thought acts as a scratch buffer by giving the model more temporary "layers" to process the text, I wonder whether it would make sense to make this buffer a separate context with its own FFN and attention. In essence, there would be a macroprocess of "reasoning" that takes unbounded time to complete, and then a microprocess of describing this otherwise incomprehensible stream of embedding vectors in natural language, in a way returning to the encoder/decoder architecture, except with both halves autoregressive. Maybe this would give us a denser representation of said "thought", one not constrained by imitating human text.
replies(7): >>42949506 #>>42949822 #>>42950000 #>>42950215 #>>42952388 #>>42955350 #>>42957969 #
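A minimal sketch of the two-process idea above, assuming a PyTorch-style implementation (all module names, shapes, and hyperparameters here are hypothetical illustrations, not anything from the article): a latent "reasoner" that autoregressively appends thought vectors without ever decoding them, and a token decoder that cross-attends to that buffer.

    import torch
    import torch.nn as nn

    class LatentReasoner(nn.Module):
        # Macroprocess: autoregressively grows a buffer of latent "thought"
        # vectors using its own attention + feed-forward stack; nothing in
        # this loop is ever decoded to tokens.
        def __init__(self, d_model=512, n_heads=8, n_layers=4):
            super().__init__()
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, n_layers)

        def forward(self, prompt_emb, n_steps=16):
            thoughts = prompt_emb                      # (batch, seq, d_model)
            for _ in range(n_steps):
                h = self.blocks(thoughts)
                new_thought = h[:, -1:, :]             # last position = next "thought"
                thoughts = torch.cat([thoughts, new_thought], dim=1)
            return thoughts

    class ThoughtDecoder(nn.Module):
        # Microprocess: an ordinary autoregressive token decoder that
        # cross-attends to the latent thought buffer when emitting text.
        def __init__(self, vocab=32000, d_model=512, n_heads=8, n_layers=4):
            super().__init__()
            self.embed = nn.Embedding(vocab, d_model)
            layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
            self.blocks = nn.TransformerDecoder(layer, n_layers)
            self.lm_head = nn.Linear(d_model, vocab)

        def forward(self, token_ids, thoughts):
            x = self.embed(token_ids)
            h = self.blocks(tgt=x, memory=thoughts)    # read the thought buffer
            return self.lm_head(h)                     # logits over the vocabulary

In this sketch only the decoder would be trained against human text; the thought buffer is free to drift toward whatever dense representation the training objective favors, which is roughly the "not constrained by imitating human text" property the comment is speculating about.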
bloomingkales No.42950215
Once we train models on chain-of-thought outputs, next-token prediction can solve the halting problem for us (e.g., this chain of thinking matches that other chain of thinking).
replies(1): >>42951030 #
psadri No.42951030
I think that is how human brains work. When we practice, at first we have to be deliberate (thinking slow). Then we "learn" from our own experience and it becomes muscle memory (thinking fast). Of course, it also increases the odds that we are wrong.
replies(1): >>42951204 #
bloomingkales No.42951204
Or worse, we overweight the wrong chain of thinking toward an irrelevant (but pragmatically useful) output, at scale.

For example, xenophobia as a response to economic hardship is the wrong chain of thinking embedded in the larger zeitgeist.