
152 points isoprophlex | 8 comments
daft_pink ◴[] No.45645040[source]
I think this is a minor speed bump. VCs believe that the cost of inference will decrease over time, and that this is a gold rush to grab market share while it declines.

I don’t think they got it right: market share and usage grew faster than inference costs dropped. But inference costs will clearly drop and these companies will eventually be very profitable.

Reality is that startups like this assume Moore’s law will drop the cost over time, and arrange their business around where they expect costs to be, not where costs currently are.

replies(6): >>45645108 #>>45645191 #>>45645220 #>>45645347 #>>45645403 #>>45645748 #
x0x0 ◴[] No.45645347[source]
> inference costs will clearly drop

They haven't, though, on two fronts. 1: the SOTA models have been priced pretty constantly, and everyone wants the SOTA models. Likely the only way costs drop is if the models get so good that people say, hey, I'm fine with a less useful answer (which is still good enough), and that seems, right now, like a bad bet.

And 2: we use a lot more tokens now. No more pasting Q&A into a site; now people hammer up chunks of their codebases and would love to push more. More context, more thinking, more everything.

replies(3): >>45645435 #>>45645642 #>>45645775 #
1. ctoth ◴[] No.45645435[source]
You're describing increased spending while calling it increased cost. These aren't the same thing. A task that cost me $5 to accomplish with GPT-4 last year might cost $1 with Sonnet today, even though I'm now spending $100/month total on AI instead of $20 because I'm doing 25x more tasks. The cost per task dropped 80%. My spending went up 5x. Both statements are true.
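A minimal sketch of that arithmetic, using only the figures quoted above ($5 vs. $1 per task, $20 vs. $100 per month). The function and variable names are hypothetical, chosen for illustration.

```python
def tasks_per_month(monthly_spend: float, cost_per_task: float) -> float:
    """Number of tasks a monthly budget buys at a given per-task cost."""
    return monthly_spend / cost_per_task


# Figures taken from the comment above.
old_cost, new_cost = 5.0, 1.0        # dollars per task
old_spend, new_spend = 20.0, 100.0   # dollars per month

# Relative change in cost per task (negative means cheaper).
cost_change = (new_cost - old_cost) / old_cost

# How many times larger the monthly bill is.
spend_multiple = new_spend / old_spend

# Task counts implied by those budgets and per-task costs.
old_tasks = tasks_per_month(old_spend, old_cost)
new_tasks = tasks_per_month(new_spend, new_cost)

print(f"cost per task change: {cost_change:+.0%}")
print(f"monthly spend multiple: {spend_multiple:.0f}x")
print(f"tasks per month: {old_tasks:.0f} -> {new_tasks:.0f}")
```

Cost per task and total spend move in opposite directions here because the number of tasks is a third, independent variable.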

Here's an analogy you may understand:

https://crespo.business/posts/cost-of-inference/

replies(1): >>45645594 #
2. KallDrexx ◴[] No.45645594[source]
Fwiw that's not necessarily true, because if Sonnet ends up using reasoning, then you are using more tokens than GPT-4 would have used for the same task. Same with GPT-5 since it will decide (using an LLM) if it should use the thinking model for it (and you don't have as much control over it).
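To make the objection concrete: cost per task is tokens used times price per token, so a cheaper per-token price can still produce a more expensive task if reasoning multiplies the token count. The sketch below uses made-up prices and token counts purely for illustration; none of the numbers are real model pricing.

```python
def cost_per_task(tokens: int, price_per_million: float) -> float:
    """Dollar cost of one task given its token count and a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# All numbers below are hypothetical, not actual vendor pricing.
older_model = cost_per_task(tokens=2_000, price_per_million=30.0)

# Same task on a cheaper-per-token model, without reasoning tokens.
newer_plain = cost_per_task(tokens=2_000, price_per_million=10.0)

# Same task, but reasoning inflates the token count tenfold.
newer_reasoning = cost_per_task(tokens=20_000, price_per_million=10.0)

print(f"older model:            ${older_model:.3f}")
print(f"newer, no reasoning:    ${newer_plain:.3f}")
print(f"newer, with reasoning:  ${newer_reasoning:.3f}")
```

Whether the per-task cost actually rises depends on how much the token count grows relative to the price drop, which is the point under dispute in this subthread.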
replies(1): >>45645698 #
3. steveklabnik ◴[] No.45645698[source]
This is addressed in the post.
replies(1): >>45646216 #
4. x0x0 ◴[] No.45646216{3}[source]
Right, but if I understand you, the counterargument is dumb, since the context we are discussing is business viability (VCs investing in businesses where the unit economics require inference cost decreases), so actual dollars out rather than some imaginary cost per token is the metric that matters.

Inference is getting so much cheaper that Cursor and Zed have had to raise prices.

replies(2): >>45646301 #>>45647083 #
5. steveklabnik ◴[] No.45646301{4}[source]
> so actual dollars out rather than some imaginary cost per token is the metric that matters.

Even if we take this as true, the point is that this is different than "the cost of inference isn't going down." It is going down, it's just that people want more performance, and are willing to pay for it. Spend going up is not the same as cost going up.

I don't disagree that there are a wide variety of things to talk about here, but that means it's extra important to get what you're talking about straight.

replies(1): >>45646445 #
6. x0x0 ◴[] No.45646445{5}[source]
Playing word games labeling inference narrowly as the cost per token rather than the per-X $ going to your llm api provider per customer/user/use/whatever is kinda silly?

The cost of inference -- ie $ that go to your llm api provider -- has increased and certainly appears to continue to increase.

see also https://ethanding.substack.com/p/ai-subscriptions-get-short-...

replies(1): >>45646884 #
7. steveklabnik ◴[] No.45646884{6}[source]
> The cost of inference -- ie $ that go to your llm api provider

This is the crux of it: when talking about "the cost of inference" for the purposes of the unit economics of the business, what's being discussed is not what they charge you. It's about their COGS.

That's not word games. It's about being clear about what's being talked about.

Talking about increased prices is something that could be talked about! But it's a different thing. For example, what you're talking about here is total spend, not about individual pricing going up or down. That's also a third thing!

You can't come to agreement unless you agree on what's being discussed.

8. dcre ◴[] No.45647083{4}[source]
Why do the unit economics require a decrease in inference spend per user? This is discussed at the end of the post. I think this is based on the very strange assumption that these businesses must charge $20 a month no matter how much inference their customers want to use. This is precisely what the move to usage-based pricing was about. End users want to use more inference because they like it so much, and are knocking down these companies’ doors demanding to be allowed to pay them more money to get more inference.
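A rough sketch of the pricing argument: under a flat plan the margin per user shrinks as that user's inference cost grows, while under usage-based pricing the margin scales with usage. Only the $20 flat price comes from the comment; the markup and the per-user inference costs are hypothetical.

```python
FLAT_PRICE = 20.0  # dollars per month, the figure from the comment
MARKUP = 1.2       # hypothetical: usage-based price is 1.2x inference cost


def flat_margin(inference_cost: float) -> float:
    """Monthly margin per user when every user pays the same flat price."""
    return FLAT_PRICE - inference_cost


def usage_margin(inference_cost: float, markup: float = MARKUP) -> float:
    """Monthly margin per user when the price is a markup on inference cost."""
    return inference_cost * markup - inference_cost


# Hypothetical light and heavy users, by monthly inference cost in dollars.
for cost in (5.0, 50.0):
    print(
        f"inference cost ${cost:.0f}: "
        f"flat margin ${flat_margin(cost):.2f}, "
        f"usage margin ${usage_margin(cost):.2f}"
    )
```

With these assumed numbers the heavy user loses money on the flat plan and is profitable on the usage-based one, which is the scenario the comment describes.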