What to build instead of AI agents

(decodingml.substack.com)

233 points giuliomagnifico | 1 comments | 03 Jul 25 00:02 UTC


mccoyb ◴[03 Jul 25 01:03 UTC] No.44450552[source]▶

Building agents has been fun for me, but it's clear that there are serious problems with "context engineering" that must be overcome with new ideas. In particular, no matter how large the context window becomes, one must curate what the agent sees: agents don't have very effective filters for what is relevant to their tasks, and so (a) you must leave *.md files strewn about to help guide them and (b) you must put them into roles. The *.md system is essentially a rudimentary memory system, but it could be made significantly more robust, and could involve e.g. constructing programs and models (in natural language) on the fly, guided by interactions with the user.

What Claude Code has taught me is that steering an agent via a test suite is an extremely powerful reinforcement mechanism (the feedback loop leads to success, most of the time) -- and I'm hopeful that new thinking will extend this into the other "soft skills" that an agent needs to become an increasingly effective collaborator.
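A minimal sketch of that test-suite feedback loop, in Python. The names `steer_with_tests`, `run_tests`, and `propose_patch` are hypothetical stand-ins, not anything from Claude Code: the agent and the test runner are passed in as plain callables so that only the loop itself is shown.

```python
from typing import Callable, List, Optional


def steer_with_tests(
    run_tests: Callable[[str], List[str]],
    propose_patch: Callable[[str, List[str]], str],
    code: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Ask the agent for patches until the tests pass or the budget runs out.

    run_tests returns a list of failure messages (empty means all green).
    propose_patch stands in for the agent: it sees the current code plus
    the failures and returns a new candidate.
    """
    for _ in range(max_rounds):
        failures = run_tests(code)
        if not failures:
            return code
        # The failure messages are the only steering signal the agent gets.
        code = propose_patch(code, failures)
    # Check the last candidate once more before giving up.
    return code if not run_tests(code) else None


# Toy stand-ins (assumptions for illustration only).
def toy_run_tests(code: str) -> List[str]:
    return [] if "return a + b" in code else ["test_add: wrong result"]


def toy_agent(code: str, failures: List[str]) -> str:
    return code.replace("a - b", "a + b")


fixed = steer_with_tests(toy_run_tests, toy_agent, "def add(a, b): return a - b")
```

The point of the shape is that the agent never decides on its own when it is done; the test suite does, which is presumably why the loop converges "most of the time" rather than always.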

replies(4): >>44450945 #>>44451021 #>>44452834 #>>44453646 #

franktankbank ◴[03 Jul 25 02:24 UTC] No.44451021[source]▶

>>44450552 #

Is there a recommended way to construct .md files for such a system? For instance, when I write them for human consumption they have lots of markup for readability, which may or may not be consumable by an LLM. Can you write a .md the same way as for human consumption without hindering an LLM?

replies(3): >>44451212 #>>44451577 #>>44452639 #

artpar ◴[03 Jul 25 07:47 UTC] No.44452639[source]▶

>>44451021 #

I am using these files (most of them are LLM-generated from my prompts, to reduce its lookups when working on a codebase):

https://gist.github.com/artpar/60a3c1edfe752450e21547898e801...

(AGENT.knowledge especially is quite helpful)

replies(1): >>44452869 #

HumanOstrich ◴[03 Jul 25 08:25 UTC] No.44452869[source]▶

>>44452639 #

Can you provide any form of demonstration of an LLM reading these files and acting accordingly? Do you know how each item added affects its behavior?

I'd also be interested in your process for creating these files, such as examples of prompts, tools, and references for your research.

replies(1): >>44453253 #

1. artpar ◴[03 Jul 25 09:32 UTC] No.44453253[source]▶

>>44452869 #

Claude doesn't read them reliably and has to be reminded across sessions. I usually do @AGENT.main and @AGENT.knowledge and it figures out the rest. Over time, Claude has become able to maintain the "project management" part itself, in terms of "what's the current state of the project" and "what are the next ideal todos and how to go about them".

> Can you provide any form of demonstration of an LLM reading these files and acting accordingly

Claude does update them at the end of the session (I prompt it with "wrap up"). The ones you are seeing in that gist are the original forms; they evolve with each commit.
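For anyone who wants to avoid the per-session reminder, here is a rough sketch in Python of one way to do it: read the two files named above and join them into a preamble to paste or pipe in at session start. This is an assumption about a possible workflow, not how the @-mention works inside Claude; `build_context` and the `## <filename>` section headers are made up for illustration.

```python
from pathlib import Path
from typing import Iterable


def build_context(
    root: Path,
    names: Iterable[str] = ("AGENT.main", "AGENT.knowledge"),
) -> str:
    """Join the agent notes found under root into a single preamble string.

    Missing files are skipped, so the same call works on a fresh checkout.
    """
    sections = []
    for name in names:
        path = root / name
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8").strip()
        # Label each section with its filename so the model can refer to it.
        sections.append("## " + name + "\n" + text)
    return "\n\n".join(sections)
```

Calling `build_context(Path("."))` from the project root returns an empty string if neither file exists yet.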
