
168 points by 1wheel | 1 comment
optimalsolver:
>what the model is "thinking" before writing its response

An actual "thinking machine" would be constantly running computations on its accumulated experience in order to improve its future output and/or further compress its sensory history.

An LLM is doing exactly nothing while waiting for the next prompt.

HarHarVeryFunny:
Right, it's not doing anything between prompts, but each prompt is fed through each of the transformer layers in turn (96 layers for GPT-3, if I recall correctly), so we can think of this as a fixed N steps of "thought" (analyzing the prompt hierarchically) to generate each token.
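
As a rough illustration of that fixed-depth loop, here's a self-contained toy sketch (not GPT-3's actual code; every name in it is made up for the example). The point it shows: each new token gets exactly one pass through the same fixed stack of layers, no more and no less.

    # Toy sketch of fixed per-token compute in a decoder-only transformer.
    # All names here are hypothetical; a real layer is attention + MLP,
    # not this arithmetic stand-in.

    N_LAYERS = 4    # GPT-3 reportedly used 96; 4 keeps the toy fast
    VOCAB_SIZE = 10

    def toy_layer(state, layer_idx):
        # Stand-in for one transformer block: some fixed transformation.
        return [(s * 31 + layer_idx) % 1009 for s in state]

    def generate(prompt_ids, n_new_tokens):
        ids = list(prompt_ids)
        for _ in range(n_new_tokens):
            state = list(ids)                # stand-in for embeddings
            for i in range(N_LAYERS):        # fixed depth per token:
                state = toy_layer(state, i)  # no way to "think longer"
            ids.append(state[-1] % VOCAB_SIZE)  # stand-in for unembedding
        return ids

    print(generate([1, 2, 3], 5))

Whatever the prompt, the inner loop runs exactly N_LAYERS times per token, which is the sense in which the per-token "thinking" budget is fixed.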