Can anyone explain why it's not allowed to compensate the creators of the data?
But at some point, model improvement will saturate (perhaps it already has). At that point, model architecture could be frozen, and the only purpose of additional training would be to bake new knowledge into existing models. It's unclear if this would require retraining the model from scratch, or simply fine-tuning existing pre-trained weights on a new training corpus. If the former, AI companies are dead in the water, barring a breakthrough in dramatically reducing training costs. If the latter, assuming the cost of fine-tuning is a fraction of the cost of training from scratch, the low cost of inference does indeed make a bullish case for these companies.
If you understand that there are multiple models from multiple providers, that some of those models are better at certain things than others, and how to get those models to complete your tasks, you are in the top 1% (probably less) of LLM users.
For others, I think the picture is different. When we ran benchmarks on DeepSeek-R1 on 8x H200 SXM using vLLM, we got up to 12K total tok/s (concurrency 200, input:output ratio of 6:1). If you're spiking up to 100-200K tok/s, you need a lot of GPUs for that, and then those GPUs sit idle most of the time.
I'll read the blog post in more detail, but I don't think the following assumptions hold outside of AI labs.
* 100% utilization (no spikes, balanced usage between day/night and across weekdays)
* Input processing is free (~$0.001 per million tokens)
* DeepSeek fits into H100 cards in a way that the network isn't the bottleneck
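To make the spiky-load point concrete, here's a rough Python sketch: the 12K tok/s per node is the benchmark figure above, while the spike size and average load are assumptions made up for illustration.

```python
# Rough capacity sketch: how many GPU nodes a spiky workload needs
# versus how busy it keeps them. The per-node throughput is the benchmark
# figure above; the spike and average loads are illustrative assumptions.

NODE_TOK_PER_S = 12_000      # 8x H200 SXM running DeepSeek-R1 on vLLM (measured above)
GPUS_PER_NODE = 8

peak_tok_per_s = 150_000     # assumed spike, middle of the 100-200K range
avg_tok_per_s = 20_000       # assumed average load outside spikes

nodes_for_peak = -(-peak_tok_per_s // NODE_TOK_PER_S)   # ceiling division
utilization = avg_tok_per_s / (nodes_for_peak * NODE_TOK_PER_S)

print(f"Nodes to cover the spike: {nodes_for_peak} ({nodes_for_peak * GPUS_PER_NODE} GPUs)")
print(f"Average utilization of that fleet: {utilization:.0%}")
# -> 13 nodes (104 GPUs), running at roughly 13% utilization on average
```

At numbers like these, the fleet you provision for the peak spends most of the day idle, which is exactly why the 100% utilization assumption matters.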
Say the training cost is C_T = $10,000,000. Each query costs C_I = $0.002.

Break-even: N > C_T / C_I = 10,000,000 / 0.002 = 5,000,000,000 inferences
So after 5 billion queries, inference costs surpass the training cost.
OpenAI claims it has 100 million users; multiply that by queries per user and I'll let you judge.
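Carrying that back of the envelope one step further (the 100 million users figure is the claim above; the queries per user per day are assumed values), a quick sketch of how fast 5 billion queries gets reached:

```python
# Days until cumulative inference spend matches the $10M training cost,
# using the numbers above. Queries per user per day is a free variable.

TRAINING_COST = 10_000_000      # dollars (C_T)
COST_PER_QUERY = 0.002          # dollars (C_I)
USERS = 100_000_000             # claimed user count

break_even_queries = TRAINING_COST / COST_PER_QUERY   # 5 billion

for queries_per_user_per_day in (1, 5, 20):
    days = break_even_queries / (USERS * queries_per_user_per_day)
    print(f"{queries_per_user_per_day:>2} queries/user/day -> break-even in {days:.1f} days")
# -> 50.0, 10.0, and 2.5 days respectively
```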
Whether they flow through COGS/COR or elsewhere on the income statement, they've gotta be recognized. In which case, either you have low gross margins or low operating profit (low net income??). Right?
That said, I just can't conceive of a way that training costs are not hitting gross margins. Be it IFRS/GAAP etc., training is 1) directly attributable to the production of the service sold, 2) is not SG&A, financing, or abnormal cost, and thus 3) only makes sense to match to revenue.
On the other hand, this may also give rise to cost-effective methods such as model distillation and spot training by large companies (similar to DeepSeek). This would erode the comparative advantage of Anthropic and OpenAI, and result in a pure value-add play for integration with data sources and features such as SSO.
It isn't clear to me that a slowing of retraining will result in advantages to incumbents if model quality cannot be readily distinguished by end-users.
This is almost surely wrong but my point was about GPT5 level models in general not GPT5 specifically...
Now, when he said that, his CFO corrected him and said they aren't profitable, but said "it's close".
Take that with a grain of salt, but that's a conversation from one of the big AI companies that is only a few weeks old. I suspect it's pretty accurate that pricing is currently reasonable if you ignore training. But training is very expensive, and it's the reason most AI companies are losing money right now.
which is completely "normal" at this point, """right"""? if you have billions of VC money chasing returns there's no time to sit around, it's all in, the hype train doesn't wait for bootstrapping profitability. and of course with these gargantuan valuations and mandatory YoY growth numbers, there is no way they are not fucking with the unit economy numbers too. (biases are hard to beat, especially if there's not much conscious effort to do so.)
They sure have a lot of training to do between now and whenever that happens. Rolling back from 5 to whatever was before it is their own admission of this fact.
Back of the envelope: a $25k GPU amortized over 5 years is $5k/year. A 500W GPU run at full power uses about 4.4 MWh per year; at $0.15/kWh the electricity costs roughly $650/year.
The other operating costs you suggest have to be even smaller.
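A minimal sketch of the same envelope math, assuming the $25k price, 5-year straight-line amortization, 500W draw, and $0.15/kWh from above:

```python
# Per-GPU yearly cost: straight-line amortization plus electricity at full power.

GPU_PRICE = 25_000          # dollars
AMORT_YEARS = 5
POWER_KW = 0.5              # 500 W
HOURS_PER_YEAR = 24 * 365   # 8,760
PRICE_PER_KWH = 0.15        # dollars

amortization = GPU_PRICE / AMORT_YEARS              # $5,000/year
energy_kwh = POWER_KW * HOURS_PER_YEAR              # 4,380 kWh (~4.4 MWh)
electricity = energy_kwh * PRICE_PER_KWH            # ~$657/year

print(f"Amortization: ${amortization:,.0f}/yr, electricity: ${electricity:,.0f}/yr")
print(f"Electricity adds ~{electricity / amortization:.0%} on top of the hardware")
# -> $5,000/yr vs $657/yr, i.e. electricity is ~13% of the amortized hardware cost
```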
For their investors, however, they are promising a revolution.
At the same time, more capable models are also a lot more expensive to train.
The key point is that the relationship between all these magnitudes is not linear, so the economics of the whole thing start to look wobbly.
Soon we will probably arrive at a point where these huge training runs must stop, because the performance improvement does not match the huge cost increase, and because the resulting model would be so expensive to run that the market for it would be too small.
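Purely as an illustration of that non-linearity (every number below is made up, not a measurement), here is a toy sketch of how the cost per unit of improvement can blow up across successive training runs:

```python
# Toy illustration of non-linear returns: all numbers are made up.
# Assume each 10x increase in training spend buys a shrinking quality gain.

runs = [
    # (training cost in dollars, assumed quality score)
    (1e7,  70),
    (1e8,  80),
    (1e9,  86),
    (1e10, 89),
]

for (cost_a, q_a), (cost_b, q_b) in zip(runs, runs[1:]):
    extra_cost = cost_b - cost_a
    gain = q_b - q_a
    print(f"${extra_cost:>14,.0f} extra training -> +{gain} quality points "
          f"(${extra_cost / gain:,.0f} per point)")
# Each additional quality point gets dramatically more expensive as runs scale up.
```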
Passing in docs usually helps, but I've had some incredibly aggravating experiences where a model just absolutely cannot accept that its "mental model" is incorrect and that it needs to forget the tens of thousands of lines of out-of-date example code it ingested during training. IMO it's an under-discussed aspect of the current effectiveness of LLM development, thanks to the training arms race.
I recently had to fight Gemini to accept that a library (a Google developed AI library for JS, somewhat ironically) had just released a major version update with a lot of API changes that invalidated 99% of the docs and example code online. And boy was there a lot of old code floating around thanks to the vast amounts of SEO blog spam for anything AI adjacent.
The reasoning here is off. It is like saying new game development is nearly over as some people keep playing old games.
My feeling: we've barely scratched the surface on the mileage we can get out of even today's frontier models, and we're just at the beginning of a huge runway for improved models and architectures. Watch this space.
FCFF = EBIT(1-t)-Reinvestment. The operating expenses of the model business are much higher - so lower EBIT.
The larger the reinvestment, the larger the hole. And the longer it continues (without clear, steep barriers to entry to exclude competitors in the long run), the harder it becomes to justify a high valuation.
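A minimal worked example of that formula (the figures are hypothetical, just to show how heavy reinvestment pushes free cash flow negative):

```python
# FCFF = EBIT * (1 - t) - Reinvestment, with hypothetical figures.

def fcff(ebit: float, tax_rate: float, reinvestment: float) -> float:
    """Free cash flow to the firm."""
    return ebit * (1 - tax_rate) - reinvestment

# Hypothetical model-business year: $2B EBIT, 21% tax rate, $5B reinvested
# in training runs and data centers.
print(f"FCFF: ${fcff(2e9, 0.21, 5e9) / 1e9:,.2f}B")   # -> FCFF: $-3.42B
```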
I really dislike comparisons like this - it glosses over a lot of details.
I don't have difficulty getting to a 20% productivity gain with AI just from automating the tasks I procrastinate on or can't focus on. Likewise the ability to code a prototype overnight/over the weekend is a reasonable extension of practical working hours.
The challenge I do see is that fully AI generated code bases devolve into slop pretty fast. The productivity cutoffs are much lower compared to human engineers.
I think you overestimate the amount of code turnover in 6-12 months...
I think we're a lot more likely to get to the limit of power and compute available for training a bigger model before we get to the point where improvement stops.
In the '90s and early 2000s, people laughed at businesses like Amazon & Google for years. These types of people, highly focused on the free cash flow of a business in its early years, are just dumb. Sometimes a business takes a lot of investment in the early stages - whether it's capex for data centers, or S&M for enterprise software businesses, or R&D for pharma businesses, or whatever.
As for "clear steep barriers" - again, just clueless stuff. There weren't clear, steep barriers to search when Google started; there were dozens of search engines. Google created them. Creating barriers to entry is expensive, and the "FCFF people" imagine they arrive out of thin air. It takes a lot of time and/or money to create them.
It's unclear if "the model business" is going to be high or low margin. It's unclear how high the barriers to entry for making models will be in practice. It's unclear what the reinvestment required will be. We are a few years into it. About the only thing that is clear is this: if you try to run a positive free cashflow business in this space over the next few years, you'll be crushed. If you want a shot at a large, high return on capital business come 2035, you better be willing to spend up now.