Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
(transformer-circuits.pub)
168 points | 1wheel | 1 comment | 21 May 24 15:15 UTC
1. maciejgryka | 22 May 24 08:22 UTC | No. 40438615 | >>40429540 (OP)
I recorded myself reading through this and trying to understand it at a high level, if anyone's interested in following along:
https://maciej.gryka.net/papers-in-public/#scaling-monoseman...