I disagree. There are massive fixed costs to developing LLMs that are best amortized over a massive number of users. So there's an incentive to drive the cost as low as possible and make LLMs more accessible in order to recoup those fixed costs.
Yes, there are also high variable costs involved, so there's a floor to how cheap they can get today. However, hardware will continue to get cheaper and more powerful, and users can still benefit massively from the current generation of LLMs. So these products can become cheaper and more accessible overall by running current-generation LLMs on low-end future hardware. I think Llama 4 running on a future RTX 7060 in 2029 could be served at a pretty low cost while still providing a ton of value for most users.