aliljet
There's a broad misunderstanding here. Context could be infinite, but the real bottleneck is understanding intent late in a multi-step operation. A human can effectively discard or disregard prior information as the narrow window of focus moves to a new task; LLMs seem incredibly bad at this.

More context doesn't help if the model can't focus on the latest task; that inability to focus is the real problem.

ray__
This is a great insight. Any thoughts on how to address this problem?
throwup238
It has to be addressed architecturally with some sort of extension to transformers that can focus the attention on just the relevant context.

People have tried to expand context windows by replacing the O(n^2) attention mechanism with something sparser, and it tends to perform poorly. It will take a fundamental architectural change.
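
To make the comparison concrete, here's a rough numpy sketch (mine, not from the linked comments or any specific paper): dense scaled dot-product attention builds an n x n score matrix, which is where the O(n^2) comes from, while a sliding-window mask is one common sparse variant that restricts each query to nearby keys. The function names and the window size are purely illustrative.

    import numpy as np

    def dense_attention(Q, K, V):
        # Q, K, V: (n, d). The score matrix is (n, n) -- the O(n^2) cost.
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    def sliding_window_attention(Q, K, V, window=4):
        # Each query only attends to keys within `window` positions of itself,
        # cutting the work to roughly O(n * window). For clarity this still
        # builds the full score matrix and masks it; a real implementation
        # would avoid materializing the masked-out entries.
        n, d = Q.shape
        scores = Q @ K.T / np.sqrt(d)
        idx = np.arange(n)
        mask = np.abs(idx[:, None] - idx[None, :]) > window
        scores = np.where(mask, -1e9, scores)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    n, d = 16, 8
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
    print(dense_attention(Q, K, V).shape)           # (16, 8)
    print(sliding_window_attention(Q, K, V).shape)  # (16, 8)

The point of the comparison is that the sparse variant saves compute only by hard-coding which context each token can see, which is exactly the kind of fixed pattern that performs poorly when the relevant context isn't nearby.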

yggdrasil_ai
>extension to transformers that can focus the attention on just the relevant context.

That is what transformer attention does in the first place, so you would just be stacking two transformers.