
507 points by martinald | 2 comments
_sword
I've done the modeling on this a few times and I always get to a place where inference can run at 50%+ gross margins, depending mostly on GPU depreciation and how good the host is at optimizing utilization. The challenge for the margins is whether or not you consider model training costs as part of the calculation. If model training isn't capitalized + amortized, margins are great. If they are amortized and need to be considered... yikes
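The margin logic in this comment can be sketched as a toy model. All numbers below (GPU price, lifetime, hosting cost, revenue, the amortization charge) are illustrative assumptions, not anyone's real figures:

```python
# Back-of-envelope inference margin model. A GPU is depreciated straight-line
# over its useful life; margin is revenue minus serving cost per GPU-hour,
# with training cost optionally amortized on top.

def gross_margin(revenue_per_gpu_hour: float,
                 gpu_cost: float,
                 gpu_life_hours: float,
                 power_and_hosting_per_hour: float,
                 utilization: float,
                 amortized_training_per_hour: float = 0.0) -> float:
    """Gross margin as a fraction of revenue for one GPU-hour of inference."""
    revenue = revenue_per_gpu_hour * utilization
    depreciation = gpu_cost / gpu_life_hours
    cost = depreciation + power_and_hosting_per_hour + amortized_training_per_hour
    return (revenue - cost) / revenue

# Training excluded: margins look healthy (~55% with these inputs).
m1 = gross_margin(revenue_per_gpu_hour=4.0, gpu_cost=30_000,
                  gpu_life_hours=35_000, power_and_hosting_per_hour=0.40,
                  utilization=0.7)

# Training amortized across the fleet: the same serving economics
# compress to near zero.
m2 = gross_margin(4.0, 30_000, 35_000, 0.40, 0.7,
                  amortized_training_per_hour=1.50)
```

With these illustrative inputs, excluding training yields roughly 55% gross margin, while layering a training amortization charge on top drops it to low single digits: the "yikes" case.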
BlindEyeHalo
Why wouldn't you factor in training? It is not like you can train once and then have the model run for years. You need to constantly improve to keep up with the competition. The lifespan of a model is just a few months at this point.
MontyCarloHall
As long as models continue on their current rapid improvement trajectory, retraining from scratch will be necessary to keep up with the competition. As you said, that's such a huge amount of continual CapEx that it's somewhat meaningless to consider AI companies' financial viability strictly in terms of inference costs, especially because more capable models will likely be much more expensive to train.

But at some point, model improvement will saturate (perhaps it already has). At that point, model architecture could be frozen, and the only purpose of additional training would be to bake new knowledge into existing models. It's unclear if this would require retraining the model from scratch, or simply fine-tuning existing pre-trained weights on a new training corpus. If the former, AI companies are dead in the water, barring a breakthrough in dramatically reducing training costs. If the latter, assuming the cost of fine-tuning is a fraction of the cost of training from scratch, the low cost of inference does indeed make a bullish case for these companies.
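The gap between the two scenarios can be put in rough numbers with the common C ≈ 6·N·D FLOPs approximation for transformer training. The parameter and token counts below are hypothetical, chosen only to show the scale of the difference:

```python
# Rough compute comparison: full retrain vs. fine-tuning on new data only,
# using the standard C ~= 6 * N * D FLOPs estimate for training a
# transformer with N parameters on D tokens.

def train_flops(n_params: float, n_tokens: float) -> float:
    return 6.0 * n_params * n_tokens

full = train_flops(n_params=1e12, n_tokens=10e12)  # hypothetical from-scratch run
tune = train_flops(n_params=1e12, n_tokens=50e9)   # fine-tune on a new corpus only
ratio = tune / full                                 # fraction of full-retrain compute
```

Under these assumptions the fine-tune costs about 0.5% of the from-scratch run, which is the difference between "dead in the water" and a sustainable business.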

mgh95
> If the latter, assuming the cost of fine-tuning is a fraction of the cost of training from scratch, the low cost of inference does indeed make a bullish case for these companies.

On the other hand, this may also open the door to cost-effective methods such as model distillation and opportunistic "spot" training runs against the large companies' models (as DeepSeek demonstrated). That would erode the comparative advantage of Anthropic and OpenAI, reducing them to a pure value-add play: integration with data sources and features such as SSO.

It isn't clear to me that a slowing of retraining will result in advantages to incumbents if model quality cannot be readily distinguished by end-users.

echelon
> model distillation

I like to think this is the end of software moats. You can simply call a foundation model company's API enough times and distill their model.

It's like downloading a car.

Distribution still matters, of course.
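The distillation idea can be sketched in miniature: treat the teacher as a black box you can only query (standing in for a paid API), collect its soft outputs on sampled inputs, and fit a student to imitate them. The toy teacher, sampling scheme, and hyperparameters below are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher(x: np.ndarray) -> np.ndarray:
    # Black-box model we can only query -- a stand-in for API calls.
    # Internally a logistic model with weights (2, -1), unknown to us.
    return 1.0 / (1.0 + np.exp(-(2.0 * x[:, 0] - 1.0 * x[:, 1])))

# Step 1: "call the API enough times" -- sample queries, record soft labels.
X = rng.normal(size=(2000, 2))
soft_labels = teacher(X)

# Step 2: train a student on the teacher's outputs (logistic student,
# gradient descent on cross-entropy against the soft labels).
w = np.zeros(2)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 1.0 * X.T @ (p - soft_labels) / len(X)

# The student's weights converge toward the teacher's hidden parameters,
# recovering the model's behavior from queries alone.
```

The same mechanics scale up to real distillation (a large student trained on a frontier model's sampled outputs or logits), which is why query access alone can leak most of the moat.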