
507 points by martinald | 2 comments
sc68cal | No.45053212
This whole article is built on using DeepSeek R1 as the cost baseline, a premise I don't think is correct. DeepSeek is much more efficient, so it isn't a valid proxy for estimating OpenAI's and Anthropic's costs.

https://www.wheresyoured.at/deep-impact/

Basically, DeepSeek is _very_ efficient at inference, and that was the whole reason why it shook the industry when it was released.
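
A rough back-of-envelope sketch of why that matters: forward-pass compute per token scales with the number of *active* parameters, and R1 is a mixture-of-experts model that activates only ~37B of its 671B parameters per token. The dense model size below is a hypothetical comparison point, not any specific lab's model:

    # Compute per generated token is roughly 2 * active_params FLOPs
    # (standard transformer approximation, ignoring attention overhead).
    def flops_per_token(active_params):
        return 2 * active_params

    deepseek_active = 37e9   # ~37B params activated per token (DeepSeek-V3/R1 report)
    dense_params = 400e9     # hypothetical dense frontier model, for illustration

    ratio = flops_per_token(dense_params) / flops_per_token(deepseek_active)
    print(f"dense model needs ~{ratio:.0f}x the compute per token")  # ~11x

So extrapolating per-token costs from R1 to a dense frontier model can be off by an order of magnitude.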

replies(7): >>45053283 #>>45053303 #>>45053401 #>>45053455 #>>45053507 #>>45053923 #>>45054034 #
CjHuber | No.45053401
What shook the market, at least, was the claim that its training cost was $5 million.
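
For reference, that number is the DeepSeek-V3 technical report's own back-of-envelope: reported GPU-hours for the final training run times an assumed rental price. A minimal sketch of the arithmetic:

    # The ~$5.5M figure covers only the final training run, not R&D,
    # ablations, data work, or the hardware itself.
    gpu_hours = 2.788e6          # H800 GPU-hours reported for V3 training
    price_per_gpu_hour = 2.0     # USD, the report's assumed rental rate
    print(f"${gpu_hours * price_per_gpu_hour / 1e6:.3f}M")  # ≈ $5.576M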
replies(2): >>45053502 #>>45055728 #
hirako2000 | No.45053502
That's what the buzz focused on, which is strange, since we don't actually know what training cost them. The inference optimization, on the other hand, is a fact, and it's arguably the more impactful part, since training costs benefit from economies of scale.
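
To make the economies-of-scale point concrete: a training run is a one-time cost amortized over every token the model ever serves, so at scale it nearly vanishes from the per-token price. Both numbers below are illustrative assumptions:

    # Amortized training cost per token served; both inputs are hypothetical.
    training_cost = 100e6     # USD, assumed cost of a frontier training run
    lifetime_tokens = 1e15    # assumed total tokens served over the model's life
    amortized = training_cost / lifetime_tokens * 1e6
    print(f"${amortized:.2f} per million tokens")  # $0.10

Inference, by contrast, is a marginal cost paid on every token, which is why optimizing it matters so much more.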
replies(1): >>45054957 #
CjHuber | No.45054957
I don't think that's strange at all; it's a much more palatable narrative for the masses, who don't know what inference and training are and who think having conversations = training.
replies(1): >>45065298 #
hirako2000 | No.45065298
I agree, nothing surprising in that. Also, back then inference wasn't questioned as much as it is today with regard to being sold at a loss.
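
For what it's worth, the loss-or-profit question comes down almost entirely to a throughput assumption. A hypothetical sanity check, where every number is an assumption rather than a measurement:

    # Compare one GPU-hour's rental cost against the token revenue it can produce.
    gpu_cost_per_hour = 4.0          # USD, assumed GPU rental price
    tokens_per_second = 1500         # assumed aggregate throughput with batching
    price_per_million_tokens = 2.0   # USD, assumed API output-token price

    revenue_per_hour = tokens_per_second * 3600 / 1e6 * price_per_million_tokens
    print(f"revenue ${revenue_per_hour:.2f}/hr vs cost ${gpu_cost_per_hour:.2f}/hr")

Small changes to the batching/throughput assumption flip the sign, which is exactly why these estimates are so contested.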