
169 points mgninad | 1 comments
mrtesthah ◴[] No.45072533[source]
Do we know if any of these techniques are actually used in the so-called "frontier" models?
replies(3): >>45072588 #>>45073417 #>>45076391 #
1. zackangelo ◴[] No.45076391[source]
Not quite a frontier model, but definitely built by a frontier lab: Grok 2 was recently open-sourced, and I believe it uses a fairly standard MHA architecture with MoE.