This is a great article, but it doesn't appear to model H100 downtime in the $2/hr cost figure. It assumes that OpenAI and Anthropic can match inference demand to their supply of H100s perfectly, 24/7, in all regions. You could argue that idle H100s are put to work on model training, but that's a different claim from the article's argument that inference is economically sustainable in isolation.
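To make the point concrete, here's a minimal sketch of how idle time inflates the effective hourly cost. The $2/hr figure comes from the article; the utilization rates are hypothetical assumptions for illustration:

```python
# Assumed all-in H100 rental cost at 100% utilization (from the article's figure).
RENTAL_COST_PER_HOUR = 2.00

def effective_cost_per_hour(utilization: float) -> float:
    """Cost per *productive* hour when only `utilization` of rented hours serve inference."""
    return RENTAL_COST_PER_HOUR / utilization

# Hypothetical utilization scenarios, not data from the article.
for u in (1.0, 0.7, 0.5):
    print(f"{u:.0%} utilization -> ${effective_cost_per_hour(u):.2f} per productive hour")
```

At 50% utilization, the effective cost per productive hour doubles to $4, which is why assuming perfect demand matching materially changes the economics.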