Ever since Gemini CLI was released recently, many people on the "free" tier have noticed that their sessions immediately drop from Gemini 2.5 Pro to Flash "due to high utilization". I asked Gemini itself about this, and it reported that the finite GPU/TPU resources in Google's cloud infrastructure can get oversubscribed for Pro usage. Google (no secret here) has a subscription option that lets higher-tier customers request guaranteed provisioning for the Pro model. As Pro capacity fills up, Google has to throttle lower-tier (including free) sessions down to the less resource-intensive models.
Price is one lever to move once capacity becomes constrained. Yet, as the top-voted comment on this post explains, it's not honest to simply label this as a price increase. They raised Flash pricing on input tokens but lowered pricing on output tokens up to certain limits, which lends credence to the theory that they are trying to shape demand so that it better matches their capacity.