https://www.wheresyoured.at/deep-impact/
Basically, DeepSeek is _very_ efficient at inference, and that was the whole reason why it shook the industry when it was released.
We also don't know the per-token cost for OpenAI and Anthropic models, but I'd be very surprised if it were significantly higher than that of open models anyone can download and run themselves. After all, they're investing in inference research too.
I remember seeing lots of videos at the time explaining the details, but it basically came down to the kind of hardware-aware programming that used to be very common. (Although they took it to the next level by exploiting undocumented hardware behavior to their advantage.)