
223 points | edunteman | 4 comments

Hi HN! Erik here from Pig.dev, and today I'd like to share a new project we've just open sourced:

Muscle Mem is an SDK that records your agent's tool-calling patterns as it solves tasks, and will deterministically replay those learned trajectories whenever the task is encountered again, falling back to agent mode if edge cases are detected. Like a JIT compiler, for behaviors.
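
To make the record/replay/fallback idea concrete, here is a minimal sketch in plain Python. It is for illustration only: `TrajectoryCache`, its fields, and the `is_safe` check are hypothetical names invented for this example, not the Muscle Mem API, and the "agent" is a stand-in function rather than an LLM. See the repo for the real interface.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]  # (tool name, keyword arguments)
ToolCall = Callable[[str, Dict[str, Any]], Any]


@dataclass
class TrajectoryCache:
    """Hypothetical record-and-replay wrapper; NOT the Muscle Mem API."""

    tools: Dict[str, Callable[..., Any]]
    agent: Callable[[str, ToolCall], None]  # slow, expensive path (assumed LLM-driven)
    is_safe: Callable[[str, Step], bool]    # environment check run before each replayed step
    trajectories: Dict[str, List[Step]] = field(default_factory=dict)

    def run(self, task: str) -> str:
        """Replay a known trajectory if possible, otherwise fall back to the agent."""
        steps = self.trajectories.get(task)
        if steps is not None and self._replay(task, steps):
            return "replayed"
        self._record(task)
        return "agent"

    def _replay(self, task: str, steps: List[Step]) -> bool:
        # Deterministic path: no agent involved, just the recorded tool calls.
        for name, kwargs in steps:
            if not self.is_safe(task, (name, kwargs)):
                return False  # edge case detected; caller falls back to the agent
            self.tools[name](**kwargs)
        return True

    def _record(self, task: str) -> None:
        recorded: List[Step] = []

        def call(name: str, kwargs: Dict[str, Any]) -> Any:
            # Log every tool call the agent makes, then execute it.
            recorded.append((name, dict(kwargs)))
            return self.tools[name](**kwargs)

        self.agent(task, call)
        self.trajectories[task] = recorded


# Usage with a fake agent and a fake environment flag.
typed: List[str] = []
env = {"layout_unchanged": True}


def fake_agent(task: str, call: ToolCall) -> None:
    call("type_text", {"text": task})


cache = TrajectoryCache(
    tools={"type_text": lambda text: typed.append(text)},
    agent=fake_agent,
    is_safe=lambda task, step: env["layout_unchanged"],
)
first = cache.run("fill form")    # cache miss: the agent runs and is recorded
second = cache.run("fill form")   # cache hit: replayed without the agent
env["layout_unchanged"] = False
third = cache.run("fill form")    # check fails: falls back to the agent
```

Note that this sketch restarts the agent from scratch when a check fails partway through a replay, so any steps already replayed keep their side effects. A real system has to decide how to handle that.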

At Pig, we built computer-use agents for automating legacy Windows applications (healthcare, lending, manufacturing, etc.).

A recurring theme we ran into was that businesses already had RPA (pure-software scripts), and it worked for them in most cases. The pull to agents as an RPA alternative was not to have infinitely flexible "AI Employees", as tech Twitter/X may want you to think, but simply because their RPA breaks under occasional edge cases and agents can gracefully handle those cases.

Using a pure-agent approach proved to be highly wasteful. Windows' accessibility APIs are poor, so you're generally stuck using pure-vision agents, which can run around $40/hr in token costs and take 5x longer than a human to perform a workflow. At this point, you're better off hiring a human.

The goal of Muscle Mem is to get LLMs out of the hot path of repetitive automations, intelligently swapping between script-based execution for repeat cases and agent-based automation for discovery and self-healing.

While inspired by computer-use environments, Muscle Mem is designed to generalize to any automation performing discrete tasks in dynamic environments. It took a great deal of thought to figure out an API that generalizes, which I cover more deeply in this blog: https://erikdunteman.com/blog/muscle-mem/

Check out the repo, consider giving it a star, or dive deeper into the above blog. I look forward to your feedback!

allmathl ◴[] No.43989862[source]
> At Pig, we built computer-use agents for automating legacy Windows applications (healthcare, lending, manufacturing, etc).

How do you justify this vs fixing the software to enable scripting? That seems both cheaper and easier to achieve and with far higher yields. Assume market rate servicing of course.

Plus, how do you force an "agent" to correct its behavior?

replies(1): >>43989877 #
nawgz ◴[] No.43989877[source]
Sorry, am I missing something? They obviously do not control source for these applications, but are able to gain access to whatever benefit the software originally had - reporting, tracking, API, whatever - by automating data entry tasks with AI.

Legacy software is frequently useful but difficult to install and access.

replies(2): >>43989903 #>>43989920 #
allmathl[dead post] ◴[] No.43989920[source]
[flagged]
1. primax ◴[] No.43989953[source]
I think if you got some experience in the industries this serves then you'd reconsider your opinion
replies(1): >>43989977 #
2. allmathl ◴[] No.43989977[source]
I don't see how improving the source could be a detriment to anyone. The rest is just relationship details. The decent part about relationship details is imprisonment.
replies(1): >>43989998 #
3. Centigonal ◴[] No.43989998[source]
As nawgz said, the applications they are automating are often closed-source binary blobs. They can't enable scripting without some kind of RPA or AI agent.
replies(2): >>43990082 #>>43990273 #
4. edunteman ◴[] No.43990273{3}[source]
Correct. In the RPA world, if there's an API available, or even a SQLite database you can tap into, you absolutely should go directly to the source. Emulating human mouse and keyboard input is an absolute last resort for getting data across the application boundary, for when those direct APIs are unavailable.