DeepSeek-v3.1-Terminus

(api-docs.deepseek.com)
101 points meetpateltech | 2 comments
1. storus ◴[] No.45334044[source]
I tried V3.1, but it drove me crazy by ignoring parts of the user input, which R1 never did. I hit many such cases: for example, when I asked about running DeepSeek 671B, it answered about DeepSeek 67B instead, reasoning that 671B is too large to exist, so I must have made a mistake. I concluded that despite beating R1 in benchmarks, it was essentially useless because of this characteristic, and I started using R1 on OpenRouter instead. Not sure why deepseek.com removed R1 and left only V3.1 without any way to switch back; I guess it's cheaper to run.
replies(1): >>45341058 #
2. Grimblewald ◴[] No.45341058[source]
Matches my experience in general as well. I find benchmarks largely useless for comparing current models. Many, despite improved metrics, are strictly worse than their predecessors. What little gains they show in some areas, like agentic use here, are often offset by far broader and often catastrophic losses.