
507 points | martinald | 1 comment
noodletheworld ◴[] No.45053394[source]
Huh.

I feel oddly skeptical about this article; I can't specifically argue the numbers, since I have no idea, but... there are some decent open source models; they're not state of the art, but if inference is this cheap then why aren't there multiple API providers offering models at dirt cheap prices?

The only cheap-ass providers I've seen only run tiny models. Where's my cheap deepseek-R1?

Surely if it's this cheap, and we're talking massive margins according to this, I should be able to get a cheap / run my own 600B param model.

Am I missing something?

It seems that reality (ie. the absence of people actually doing things this cheap) is the biggest critic of this set of calculations.

replies(10): >>45053436 #>>45053533 #>>45053550 #>>45053564 #>>45053601 #>>45053730 #>>45053776 #>>45053962 #>>45055164 #>>45055610 #
1. johnsmith1840 ◴[] No.45055164[source]
Another giant problem with this article is that we have no idea what optimizations are used on their end. There are some wildly complex optimizations these large AI companies use.

What I'm trying to say is that hosting your own model is in an entirely different league than the pros.

Even if we account for error in the article that implies a higher cost, I would argue it would still come back to profit, simply because of how advanced inference optimization has become.

If actual model intelligence is not a moat (it's looking likely this is true), the real secret sauce of profitable AI companies is advanced optimization across the entire stack.

OpenAI is NEVER going to release their specialized kernels, routing algos, quantizations or model compilation methods. These are all really hard and really specific.
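
For concreteness, weight quantization (one of the optimization classes named above) can be sketched roughly like this. This is just a toy NumPy illustration of symmetric per-tensor int8 quantization, nothing resembling the production kernels being discussed:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: store weights as int8
    plus one float scale, so storage is ~4x smaller than float32."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate float weights: q * scale.
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for a model layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Per-element rounding error is bounded by about scale / 2.
err = np.abs(w - w_hat).max()
```

The production versions do this per-channel or per-group, often at 4 bits or lower, with calibration data and fused dequantize-matmul kernels — which is exactly the kind of stack-specific work the comment is pointing at.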