
63 points jxmorris12 | 1 comments | | HN request time: 0.247s | source
Herring No.44504065
I'm a bit bearish on SSMs (and hybrid SSM/transformers) because the leading open weight models (DeepSeek, Qwen, Gemma, Llama) are all transformers. There's just no way none of them tried SSMs.
1. visarga No.44504164
Yes, until I see serious adoption I'm reserved too, on both SSMs and diffusion-based LLMs.