
507 points martinald | 3 comments
sc68cal No.45053212
This whole article is built on using DeepSeek R1 as the reference model, which is a big assumption that I don't think holds. DeepSeek is much more efficient, so it isn't a valid basis for estimating what OpenAI's and Anthropic's costs are.

https://www.wheresyoured.at/deep-impact/

Basically, DeepSeek is _very_ efficient at inference, and that was the whole reason why it shook the industry when it was released.

replies(7): >>45053283 #>>45053303 #>>45053401 #>>45053455 #>>45053507 #>>45053923 #>>45054034 #
1. thatguysaguy No.45053923
Why would you think that DeepSeek is more efficient than GPT-5/Claude 4, though? There's been enough time to integrate the lessons from DeepSeek.
replies(1): >>45054572 #
2. overgard No.45054572
Because to make GPT-5 or Claude better than previous models, you need to do more reasoning, which burns a lot more tokens. So your per-token cost may drop, but you may also need a lot more tokens per query.
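Rough back-of-the-envelope of that trade-off (every price and token count below is a made-up assumption for illustration, not real OpenAI/Anthropic/DeepSeek pricing):

    def query_cost(prompt_tokens, output_tokens, price_in_per_m, price_out_per_m):
        # Total cost of one request in dollars, given $-per-1M-token prices.
        return (prompt_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1e6

    # Hypothetical older model: pricier per token, short answer, no reasoning trace.
    old = query_cost(1_000, 500, price_in_per_m=10.0, price_out_per_m=30.0)

    # Hypothetical reasoning model: cheaper per token, but the hidden chain of
    # thought is billed as output tokens on top of the 500-token answer.
    new = query_cost(1_000, 500 + 8_000, price_in_per_m=2.0, price_out_per_m=8.0)

    print(f"older model:     ${old:.4f} per query")  # $0.0250
    print(f"reasoning model: ${new:.4f} per query")  # $0.0700

Under those assumed numbers the newer model is 4-5x cheaper per token yet almost 3x more expensive per query, because the reasoning tokens dominate.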
replies(1): >>45055026 #
3. jstummbillig No.45055026
GPT-5 can be configured extensively. Is there any configuration of GPT-5 that offers ~DeepSeek-level performance while being more expensive than DeepSeek per token?
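One way to frame that comparison, where the config names, prices, and reasoning-token counts are all hypothetical placeholders (ballpark figures, not published prices or benchmark results):

    # Compare per-query and per-token cost across configurations.
    configs = {
        # name: ($ per 1M input tokens, $ per 1M output tokens, reasoning tokens per query)
        "gpt5-low-effort":  (1.25, 10.00, 500),
        "gpt5-high-effort": (1.25, 10.00, 6_000),
        "deepseek-r1":      (0.55, 2.19, 4_000),
    }

    prompt_tokens, answer_tokens = 1_000, 500

    for name, (p_in, p_out, reasoning_tokens) in configs.items():
        out = answer_tokens + reasoning_tokens
        cost = (prompt_tokens * p_in + out * p_out) / 1e6
        per_token = cost / (prompt_tokens + out)
        print(f"{name:18s} ${cost:.4f}/query  ${per_token * 1e6:.2f}/1M total tokens")

The answer depends on both the per-token price of the configuration and how many reasoning tokens it burns to reach DeepSeek-level quality, which is exactly the unknown being argued about upthread.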