Now build it for an old codebase; let's see how precisely it edits or removes features without breaking the whole codebase.
Let's see how many tokens it consumes per bug fix or feature addition.
1. Precompute frequently used knowledge and surface it early: for example, repository structure, OS information, and system time.
2. Anticipate the next tool calls. If a match is not found while editing, return the closest matching snippet instead of simply failing. If the read-file tool is given a directory, return the directory contents.
3. Parallel tool calls. Claude needs either a batch tool or special scaffolding to encourage parallel tool calls; a single tool call per turn is very expensive.
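Idea 2 could be sketched roughly like this — a hypothetical `apply_edit` helper (the name and return shape are my own, not from any particular agent) that falls back to the closest matching snippet on a miss, using Python's stdlib `difflib`, so the model can self-correct on its next turn instead of retrying blind:

```python
import difflib

def apply_edit(source: str, old_snippet: str, new_snippet: str) -> dict:
    """Replace old_snippet in source; on a miss, return the closest
    match instead of a bare failure so the model can self-correct."""
    if old_snippet in source:
        return {"ok": True, "result": source.replace(old_snippet, new_snippet, 1)}
    # No exact match: compare against every window of the same line count
    # and surface the most similar one back to the model.
    lines = source.splitlines()
    n = max(1, len(old_snippet.splitlines()))
    windows = ["\n".join(lines[i:i + n])
               for i in range(max(1, len(lines) - n + 1))]
    closest = difflib.get_close_matches(old_snippet, windows, n=1, cutoff=0.0)
    return {"ok": False, "closest": closest[0] if closest else None}
```

On a failed match the model sees the nearest real snippet, which usually differs only in whitespace or a stale line, and can reissue the edit in one extra turn rather than re-reading the whole file.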
Are there any other such general ideas?
I am still looking for a good "memory" solution; so far I'm running without one. I haven't looked too deeply into it.
Not sure how the next tool call can be predicted.
I am still using serial tool calls as I do not have any subagents; I just use fast inference models for direct tool calls. It works so fast that I doubt I'll benefit from parallelizing anything.
I just wrote this comment so people aren't under the false belief that this is pretty much all coding agents do; making all this fault tolerant with good UX is a lot of work.
There are a few models that solve 30-50% of (new) tasks pulled from real-world repos. So ... yeah.
Yes, it is. Not only in the department of good UX design: these LLMs keep evolving. They are software with different versions, and those versions are continually deployed, which changes the behavior of the underlying model. So the harness needs to be continually updated to remain competitive.