
196 points zmccormick7 | 1 comment
aliljet ◴[] No.45387614[source]
There's a broad misunderstanding here. Context could be infinite, but the real bottleneck is understanding intent late in a multi-step operation. A human can effectively discard or disregard prior information as the narrow window of focus moves to a new task; LLMs seem incredibly bad at this.

Having more context while remaining unable to focus effectively on the latest task is the real problem.

replies(10): >>45387639 #>>45387672 #>>45387700 #>>45387992 #>>45388228 #>>45388271 #>>45388664 #>>45388965 #>>45389266 #>>45404093 #
bgirard ◴[] No.45387700[source]
I think that's the real issue. If the LLM spends a lot of context investigating a bad solution and you redirect it, I notice it has trouble ignoring maybe 10K tokens of bad exploration context in favor of my ten-line 'No, don't do X, explore Y' instruction.
replies(6): >>45387838 #>>45387902 #>>45388477 #>>45390299 #>>45390619 #>>45394242 #
ericmcer ◴[] No.45390299[source]
It seems possible for OpenAI/Anthropic to rework their tools so they discard/add relevant context on the fly, but it might have some unintended behaviors.

The main thing is that people have already integrated AI into their workflows, so the "right" way for the LLM to work is the way people expect it to. For now I expect to start multiple fresh contexts while solving a single problem, until I can set up a context that gets the result I want. Changing this behavior might mess me up.
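
A minimal sketch of what that discard-on-the-fly idea could look like on the client side, assuming a plain list of chat messages; the function name and the pruning heuristic are made up for illustration, not anything OpenAI or Anthropic actually ships:

    # Sketch: collapse a dead-end exploration before the next model call,
    # so a later redirect isn't outweighed by thousands of stale tokens.
    from typing import Dict, List

    Message = Dict[str, str]

    def prune_abandoned_exploration(history: List[Message],
                                    start: int, end: int,
                                    summary: str) -> List[Message]:
        """Replace history[start:end] (an exploration the user rejected)
        with a one-line note and keep everything else intact."""
        note: Message = {"role": "assistant",
                         "content": f"[pruned exploration: {summary}]"}
        return history[:start] + [note] + history[end:]

    # Example: messages 1-3 explored approach X, then the user said no.
    history: List[Message] = [
        {"role": "user", "content": "Fix the failing build."},
        {"role": "assistant", "content": "Plan: try approach X."},
        {"role": "assistant", "content": "...long exploration of X..."},
        {"role": "assistant", "content": "...more dead-end X details..."},
        {"role": "user", "content": "No, don't do X, explore Y."},
    ]
    history = prune_abandoned_exploration(history, start=1, end=4,
                                          summary="approach X, rejected by user")

The unintended-behavior risk is exactly here: a wrong heuristic could prune the one piece of context the model still needed.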

replies(2): >>45391352 #>>45391400 #
sshine ◴[] No.45391352[source]
> rework their tools so they discard/add relevant context on the fly

That may be the foundation for an innovation step among model providers. But you can achieve a poor man’s simulation if you can determine, in retrospect, when a context was at its peak for taking turns, when it got too rigid, or when too many tokens were spent, and then simply replay the context up to that point.

I don’t know if evaluating when a context is worth duplicating is a thing; it’s not deterministic, and it depends on enforcing a certain workflow.
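
A rough sketch of that replay idea, assuming the client keeps its own copy of the message history; the token-budget check standing in for "when it got too rigid" is an assumed heuristic, not something model providers expose:

    # Sketch: snapshot the history while it still looks healthy; when the
    # conversation goes off the rails, replay from the last good snapshot
    # with the corrective instruction appended. The ~4 chars/token estimate
    # and the fixed budget are illustrative assumptions.
    from typing import Dict, List

    Message = Dict[str, str]

    def approx_tokens(history: List[Message]) -> int:
        return sum(len(m["content"]) for m in history) // 4

    def maybe_checkpoint(history: List[Message],
                         checkpoints: List[List[Message]],
                         budget: int = 8000) -> None:
        """Keep a copy of the history while it is still under the token budget."""
        if approx_tokens(history) <= budget:
            checkpoints.append(list(history))

    def replay_from_last_checkpoint(checkpoints: List[List[Message]],
                                    redirect: str) -> List[Message]:
        """Drop the bloated tail and restart from the last good snapshot."""
        replayed = list(checkpoints[-1])
        replayed.append({"role": "user", "content": redirect})
        return replayed

Deciding when to call replay_from_last_checkpoint is the non-deterministic part; a token budget is only a crude proxy for a context having gone rigid.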