slacker news
The Tradeoffs of SSMs and Transformers (goombalab.github.io)
63 points by jxmorris12 | 1 comment | 08 Jul 25 19:12 UTC
Herring [08 Jul 25 21:04 UTC] No.44504065
>>44503056 (OP)
I'm a bit bearish on SSMs (and hybrid SSM/transformers) because the leading open weight models (DeepSeek, Qwen, Gemma, Llama) are all transformers. There's just no way none of them tried SSMs.
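For readers outside the debate, the tradeoff the linked post (and this thread) is about can be sketched minimally: an SSM compresses the entire history into a fixed-size recurrent state, while a transformer's attention re-reads the whole history at every step. The sketch below is illustrative only, not any specific model from the post; the single-scalar `A` decay and the naive causal attention are simplifying assumptions.

```python
import numpy as np

def ssm_scan(x, A=0.9, B=1.0):
    """Diagonal linear SSM: h_t = A*h_{t-1} + B*x_t.
    State size is constant, so cost is O(T) in sequence length."""
    h = np.zeros(x.shape[1])
    out = []
    for x_t in x:
        h = A * h + B * x_t
        out.append(h.copy())
    return np.stack(out)

def causal_attention(x):
    """Naive causal self-attention: each position reads all earlier
    positions, so cost is O(T^2) in sequence length."""
    scores = x @ x.T
    mask = np.tril(np.ones(scores.shape, dtype=bool))  # t attends to <= t
    scores = np.where(mask, scores, -np.inf)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ x

T, d = 8, 4
x = np.random.default_rng(0).normal(size=(T, d))
print(ssm_scan(x).shape, causal_attention(x).shape)
```

Both produce a `(T, d)` output, but the SSM's fixed-size state is exactly what makes long-context recall lossy — the compression-vs-recall tension the thread is weighing against the transformers' dominance in open-weight releases.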
replies(5): >>44504164 >>44504299 >>44504738 >>44505203 >>44506694
1. visarga [08 Jul 25 21:20 UTC] No.44504164
>>44504065
Yes, until I see serious adoption I'm reserved too, on both SSMs and diffusion-based LLMs.