This is a great article, but it doesn't appear to model H100 downtime in the $2/hr cost figure. It assumes that OpenAI and Anthropic can match inference demand to their supply of H100s perfectly, 24/7, in all regions. You could argue that idle H100s are put to work on model training, but that's a different claim from the article's argument that inference is economically sustainable in isolation.
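To make the point concrete, here's a minimal sketch of how idle time inflates the effective hourly cost. The $2/hr figure comes from the article; the utilization rates are hypothetical assumptions for illustration:

```python
# Assumed all-in H100 rental cost at 100% utilization (from the article's figure).
RENTAL_COST_PER_HOUR = 2.00

def effective_cost_per_hour(utilization: float) -> float:
    """Cost per *productive* hour when only `utilization` of rented hours serve inference."""
    return RENTAL_COST_PER_HOUR / utilization

# Hypothetical utilization scenarios, not data from the article.
for u in (1.0, 0.7, 0.5):
    print(f"{u:.0%} utilization -> ${effective_cost_per_hour(u):.2f} per productive hour")
```

At 50% utilization, the effective cost per productive hour doubles to $4, which is why assuming perfect demand matching materially changes the economics.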