I don't believe the asymmetry between prefill and decode is that large. If it were, it would make no sense for most providers to price prefill differently for cache hits vs. cache misses.
Given that the analysis is based on R1, DeepSeek's actual in-production numbers seem highly relevant: https://github.com/deepseek-ai/open-infra-index/blob/main/20...
(But yes, they claim 80% margins on the compute in that article.)
> When established players emphasize massive costs and technical complexity, it discourages competition and investment in alternatives
But it's not the established players emphasizing the costs! They're typically saying that inference is profitable. Instead, the false claims about high costs and unprofitability are part of the anti-AI crowd's standard talking points.