
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points | wertyk
rsanek ◴[] No.44980753[source]
Looks to be the ~same intelligence as gpt-oss-120B, but about 10x slower and 3x more expensive?

https://artificialanalysis.ai/models/deepseek-v3-1-reasoning

replies(5): >>44981187 #>>44981737 #>>44981789 #>>44982171 #>>44982769 #
easygenes ◴[] No.44981789[source]
Other benchmark aggregates are less favorable to GPT-OSS-120B: https://arxiv.org/abs/2508.12461
replies(1): >>44982519 #
petesergeant ◴[] No.44982519[source]
With all these things, it depends on your own eval suite. gpt-oss-120b works as well as o4-mini on my evals, which means I can run it via OpenRouter on Cerebras, where it's SO DAMN FAST and about 1/5th the price of o4-mini.
replies(1): >>44983709 #
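For context on the setup described above, here is a minimal sketch of routing a gpt-oss-120b request through OpenRouter's OpenAI-compatible endpoint and asking for the Cerebras provider. The model slug ("openai/gpt-oss-120b") and the "provider" routing field are assumptions based on OpenRouter's general conventions, not something stated in the thread:

    # Sketch only: route a gpt-oss-120b request through OpenRouter and prefer the
    # Cerebras provider. The model slug and the "provider" routing field are
    # assumptions; check OpenRouter's current docs before relying on them.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",   # OpenRouter's OpenAI-compatible endpoint
        api_key=os.environ["OPENROUTER_API_KEY"],
    )

    resp = client.chat.completions.create(
        model="openai/gpt-oss-120b",               # assumed OpenRouter slug
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
        extra_body={
            # Prefer Cerebras; disable fallback so a slower provider isn't silently used.
            "provider": {"order": ["Cerebras"], "allow_fallbacks": False},
        },
    )
    print(resp.choices[0].message.content)

Comparing it against another model on the same private eval suite is then largely a matter of swapping the model string.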
indigodaddy ◴[] No.44983709[source]
How would you compare gpt-oss-120b to (for coding):

Qwen3-Coder-480B-A35B-Instruct

GLM4.5 Air

Kimi K2

DeepSeek V3 0324 / R1 0528

GPT-5 Mini

Thanks for any feedback!

replies(1): >>44984580 #
petesergeant ◴[] No.44984580[source]
I’m afraid I don’t use any of those for coding
replies(1): >>44987662 #
bigyabai ◴[] No.44987662[source]
You're missing out. GLM 4.5 Air and Qwen3 A3B both blow OSS 120B out of the water in my experience.
replies(1): >>44987765 #
indigodaddy ◴[] No.44987765[source]
Ah, good to hear! How about Qwen3-Coder-480B-A35B-Instruct? I believe that is the free Qwen3-Coder model on OpenRouter.