(github.com)

226 points edunteman | 1 comments | 14 May 25 19:38 UTC | HN request time: 0.21s | source

Hi HN! Erik here from Pig.dev, and today I'd like to share a new project we've just open sourced:

Muscle Mem is an SDK that records your agent's tool-calling patterns as it solves tasks, and will deterministically replay those learned trajectories whenever the task is encountered again, falling back to agent mode if edge cases are detected. Like a JIT compiler, for behaviors.

At Pig, we built computer-use agents for automating legacy Windows applications (healthcare, lending, manufacturing, etc).

A recurring theme we ran into was that businesses already had RPA (pure-software scripts), and it worked for them in most cases. The pull to agents as an RPA alternative was not to have an infinitely flexible "AI Employees" as tech Twitter/X may want you to think, but simply because their RPA breaks under occasional edge-cases and agents can gracefully handle those cases.

Using a pure-agent approach proved to be highly wasteful. Window's accessibility APIs are poor, so you're generally stuck using pure-vision agents, which can run around $40/hr in token costs and take 5x longer than a human to perform a workflow. At this point, you're better off hiring a human.

The goal of Muscle-Mem is to get LLMs out of the hot path of repetitive automations, intelligently swapping between script-based execution for repeat cases, and agent-based automations for discovery and self-healing.

While inspired by computer-use environments, Muscle Mem is designed to generalize to any automation performing discrete tasks in dynamic environments. It took a great deal of thought to figure out an API that generalizes, which I cover more deeply in this blog: https://erikdunteman.com/blog/muscle-mem/

Check out the repo, consider giving it a star, or dive deeper into the above blog. I look forward to your feedback!

Show context

web-cowboy ◴[14 May 25 20:23 UTC] No.43988789[source]▶

>>43988381 (OP) #

This seems like a much more powerful version of what I wanted MCP "prompts" to be - and I'm curious to know if I have that right.

For me, I want to reduce the friction on repeatable tasks. For example, I often need to create a new GraphQL query, which also requires updating the example query collection, creating a basic new integration test, etc. If I had a MCP-accessible prompt, I hoped the agent would realize I have a set of instructions on how to handle this request when I make it.

replies(1): >>43988854 #

1. edunteman ◴[14 May 25 20:32 UTC] No.43988854[source]▶

>>43988789 #

In a way, a Muscle Mem trajectory is just a new "meta tool" that combines sequential use of other tools, with parameters that flow through it all.

One form factor I toyed with was the idea of a dynamically evolving list of tool specs given to a model or presented from an MCP server, but I wasn't thrilled by:

- that'd still require a model in the loop to choose the tool, albeit just once for the whole trajectory vs every step

- it misses the sneaky challenge of Muscle Memory systems, which is continuous cache validation. An environment can change unexpectedly mid-trajectory and a system would need to adapt, so no matter what, it needs something that looks like Muscle Mem's Check abstraction for pre/post step cache validation

↑

Show HN: Muscle-Mem, a behavior cache for AI agents