
507 points martinald | 2 comments
noodletheworld ◴[] No.45053394[source]
Huh.

I feel oddly skeptical about this article; I can't specifically argue the numbers, since I have no idea, but... there are some decent open source models; they're not state of the art, but if inference is this cheap then why aren't there multiple API providers offering models at dirt cheap prices?

The only cheap-ass providers I've seen only run tiny models. Where's my cheap deepseek-R1?

Surely if it's this cheap, and we're talking massive margins according to this, I should be able to get a cheap 600B param model, or run my own.

Am I missing something?

It seems that reality (i.e. the absence of people actually doing things this cheaply) is the biggest critic of this set of calculations.

replies(10): >>45053436 #>>45053533 #>>45053550 #>>45053564 #>>45053601 #>>45053730 #>>45053776 #>>45053962 #>>45055164 #>>45055610 #
hirako2000 ◴[] No.45053550[source]
Imo the article is totally off the mark since it assumes users on average do not go over 1M tokens per day.

Afaik OpenAI doesn't enforce a daily quota even on the $20 plans unless the platform is under pressure.

Since I often consume 20M tokens per day, one can assume many would use far more than the 1M tokens assumed in the article's calculations.

replies(2): >>45053741 #>>45054928 #
1. empath75 ◴[] No.45053741[source]
There's zero basis for assuming any of that. The most likely situation is a power law curve where the vast majority of users don't use it much at all and the top 10% of users account for 90% of the usage.

It is very likely that you are in the top 10% of users.
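
To make the power-law point concrete, here is a rough sketch of how skewed usage pulls the mean far above the median and concentrates consumption in the top decile. Everything in it is an assumption for illustration: the Pareto shape `alpha=1.1`, the `scale` of 10,000 tokens/day, and the helper names `top_share` and `pareto_quantiles` are made up, not fitted to any real provider data. The evenly spaced quantiles also clip the extreme tail, so if anything it understates the concentration.

```python
import statistics


def top_share(usage, fraction=0.10):
    """Share of total usage consumed by the heaviest `fraction` of users."""
    ranked = sorted(usage, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


def pareto_quantiles(n, alpha, scale):
    """Deterministic synthetic sample: n evenly spaced quantiles of a
    Pareto distribution with shape `alpha` and minimum value `scale`."""
    return [scale / (1 - (i + 0.5) / n) ** (1 / alpha) for i in range(n)]


# Hypothetical population: 10,000 users, tokens per day (assumed numbers).
usage = pareto_quantiles(n=10_000, alpha=1.1, scale=10_000)

print(f"median user: {statistics.median(usage):,.0f} tokens/day")
print(f"mean user:   {statistics.mean(usage):,.0f} tokens/day")
print(f"top 10% share of usage: {top_share(usage):.0%}")
```

Under a distribution like this, one person burning 20M tokens a day says very little about the average, which is the number the cost estimate actually depends on.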

replies(1): >>45053996 #
2. hirako2000 ◴[] No.45053996[source]
True. The article also has zero basis for its estimate of the average usage across each tier's user base.

I somewhat doubt my usage is that close to the edge of the curve, since I don't even pay for any plan. It could be that I'm very frugal with money and fat on consumption while most are more balanced, but 1M tokens per day in any case sounds slim for any user who pays for the service.