OpenAI shared a chart showing performance dropping at large context sizes, like 500k tokens. So you still want to limit the context, not only for cost but for performance as well. You probably also want to limit context to speed up inference and get responses faster.
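To make "limit the context" concrete, here is a rough sketch of trimming a chat history to a token budget before sending it. Everything in it is an assumption for illustration: the `trim_history` and `estimate_tokens` names are made up, the messages are assumed to be dicts with `role` and `content` keys, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # Swap in a real tokenizer if you need accurate counts.
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep all system messages plus the newest turns that fit in `budget` tokens."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # System messages are always kept, so they are counted first.
    used = sum(estimate_tokens(m["content"]) for m in system)

    kept = []
    # Walk from newest to oldest and stop at the first turn that does not fit.
    for message in reversed(rest):
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost

    # Restore chronological order for the kept turns.
    return system + list(reversed(kept))
```

Dropping the oldest turns is only one possible policy. Summarizing old turns or retrieving only the relevant ones are alternatives, and which works best likely depends on the agent.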
I agree, though, that a lot of these agents are black boxes, and it's hard to even learn how best to combine .rules, llms.txt, PRDs, MCP, web search, function calling, and memory. Most IDEs don't expose any output where you can inspect the final prompts to see how these pieces are actually executed. Maybe you have to use something like mitmproxy to inspect the requests, but a dedicated tool would be useful for learning best practices.
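If you do go the mitmproxy route, a minimal addon script could look something like the sketch below. It assumes the IDE sends an OpenAI-style JSON body with a `messages` list, which may not hold for every tool. The `WATCHED_HOSTS` values and the `extract_messages` helper are hypothetical and only there for illustration. `request(flow)` is the standard mitmproxy hook for outgoing requests.

```python
import json

# Hypothetical host substrings; change these to whatever your IDE actually calls.
WATCHED_HOSTS = ("openai.com", "anthropic.com")


def extract_messages(body: str) -> list:
    """Return the `messages` list from a JSON request body, or [] if there is none."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        return []
    if not isinstance(payload, dict):
        return []
    messages = payload.get("messages", [])
    return messages if isinstance(messages, list) else []


def request(flow):
    # mitmproxy calls this hook for every client request passing through the proxy.
    host = flow.request.pretty_host
    if not any(watched in host for watched in WATCHED_HOSTS):
        return
    body = flow.request.get_text(strict=False) or ""
    for message in extract_messages(body):
        if not isinstance(message, dict):
            continue
        role = message.get("role", "?")
        # Truncate so long system prompts do not flood the terminal.
        print(f"[{role}] {str(message.get('content'))[:500]}")
```

You would save this as, say, `inspect_prompts.py` and run `mitmdump -s inspect_prompts.py`, then point the IDE at the proxy and trust the mitmproxy CA certificate. Some tools ignore proxy settings or pin certificates, so this won't work everywhere.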
I'll be trying Roo Code and Cline more, since they're open source and you can at least see the system prompts.