
DeepSeek-v3.1

(api-docs.deepseek.com)
776 points | wertyk
rsanek ◴[] No.44980753[source]
Looks to be the ~same intelligence as gpt-oss-120B, but about 10x slower and 3x more expensive?

https://artificialanalysis.ai/models/deepseek-v3-1-reasoning

replies(5): >>44981187 #>>44981737 #>>44981789 #>>44982171 #>>44982769 #
easygenes ◴[] No.44981789[source]
Other benchmark aggregates are less favorable to GPT-OSS-120B: https://arxiv.org/abs/2508.12461
replies(1): >>44982519 #
petesergeant ◴[] No.44982519[source]
With all these things, it depends on your own eval suite. gpt-oss-120b works as well as o4-mini on my evals, which means I can run it via OpenRouter on Cerebras, where it's SO DAMN FAST and about 1/5th the price of o4-mini.
replies(1): >>44983709 #
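For context on the setup described above, here is a minimal sketch of routing a gpt-oss-120b request through OpenRouter's OpenAI-compatible endpoint and asking for the Cerebras provider. The model slug ("openai/gpt-oss-120b") and the "provider" routing field are assumptions based on OpenRouter's general conventions, not something stated in the thread:

    # Sketch only: route a gpt-oss-120b request through OpenRouter and prefer the
    # Cerebras provider. The model slug and the "provider" routing field are
    # assumptions; check OpenRouter's current docs before relying on them.
    import os
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",   # OpenRouter's OpenAI-compatible endpoint
        api_key=os.environ["OPENROUTER_API_KEY"],
    )

    resp = client.chat.completions.create(
        model="openai/gpt-oss-120b",               # assumed OpenRouter slug
        messages=[{"role": "user", "content": "Write a binary search in Python."}],
        extra_body={
            # Prefer Cerebras; disable fallback so a slower provider isn't silently used.
            "provider": {"order": ["Cerebras"], "allow_fallbacks": False},
        },
    )
    print(resp.choices[0].message.content)

Comparing it against another model on the same private eval suite is then largely a matter of swapping the model string.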
indigodaddy ◴[] No.44983709[source]
How would you compare gpt-oss-120b to (for coding):

Qwen3-Coder-480B-A35B-Instruct

GLM4.5 Air

Kimi K2

DeepSeek V3 0324 / R1 0528

GPT-5 Mini

Thanks for any feedback!

replies(1): >>44984580 #
petesergeant ◴[] No.44984580[source]
I’m afraid I don’t use any of those for coding
replies(1): >>44987662 #
bigyabai ◴[] No.44987662[source]
You're missing out. GLM 4.5 Air and Qwen3 A3B both blow OSS 120B out of the water in my experience.
replies(1): >>44987765 #
indigodaddy ◴[] No.44987765[source]
Ah, good to hear! How about Qwen3-Coder-480B-A35B-Instruct? I believe that is the free Qwen3-Coder model on OpenRouter.