
168 points by 1wheel | 1 comment
gautomdas No.40436795
I've really been enjoying their series on mech interp. Does anyone have any other good recs?
replies(2): >>40437371 >>40441436
1. PoignardAzur No.40441436
"Transformers Represent Belief State Geometry in their Residual Stream":

https://www.lesswrong.com/posts/gTZ2SxesbHckJ3CkF/transforme...

The basic finding is that transformers don't just store a world model in the sense of "what does the world that produced the observed inputs look like?"; they store a "Mixed-State Presentation", i.e. a weighted set of possible worlds that could have produced the observed inputs.
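
To make that concrete, here's a minimal toy sketch (my own example, not code from the post) of the belief-state update that a Mixed-State Presentation formalizes: a hidden Markov model with a few possible "world" states, and a belief vector over those states that gets reweighted by Bayes' rule after each observed token. The transition and emission matrices are made up purely for illustration.

    import numpy as np

    # Toy hidden Markov model: 3 hidden "world" states, 3 observable tokens.
    # T[i, j] = P(next state j | current state i)
    # E[i, k] = P(emit token k | state i)
    # These matrices are invented for illustration, not taken from the post.
    T = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
    E = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.7, 0.2],
                  [0.2, 0.1, 0.7]])

    def update_belief(belief, token):
        """One step of Bayesian filtering: propagate the belief through the
        transition matrix, weight by the likelihood of the observed token,
        then renormalize to get the new belief state."""
        predicted = belief @ T              # prior over the next hidden state
        weighted = predicted * E[:, token]  # multiply by observation likelihood
        return weighted / weighted.sum()    # posterior = new belief state

    # Start with a uniform belief over which "world" produced the data,
    # then fold in a sequence of observed tokens one at a time.
    belief = np.ones(3) / 3
    for token in [0, 0, 2, 1]:
        belief = update_belief(belief, token)
        print(belief)  # a point on the probability simplex over hidden states

The post's claim is that a transformer trained on sequences from such a process ends up linearly encoding these belief vectors in its residual stream, i.e. the geometry of the whole simplex of weighted possible worlds, not just a point estimate of the single most likely one.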