
1246 points adrianh | 1 comments
kunzhi No.44496051
Funny this article is trending today, because I had a similar thought over the weekend: if I'm in Ruby and the LLM hallucinates a tool call... why not metaprogram it on the fly and then invoke it?
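A minimal sketch of what that metaprogramming could look like, using Ruby's `method_missing` hook. The `ToolRegistry` class and `generate_tool_body` helper are hypothetical names; in a real system `generate_tool_body` would be a round-trip to the LLM asking it to implement the tool it just invented, stubbed out here:

```ruby
# Hypothetical sketch: intercept a hallucinated tool call and define the
# missing tool on the fly, then invoke it.
class ToolRegistry
  def run(name, *args)
    send(name, *args)
  end

  # Ruby calls this when `send` hits a method that doesn't exist yet.
  def method_missing(name, *args)
    body = generate_tool_body(name, args)    # assumed LLM call (stubbed below)
    self.class.define_method(name, body)     # the tool now exists for real
    send(name, *args)
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  private

  # Stand-in for an LLM round-trip; returns a callable tool body.
  def generate_tool_body(name, _args)
    ->(*a) { "stub result for #{name}(#{a.join(', ')})" }
  end
end
```

After the first hallucinated call, `define_method` has installed a real method, so later calls bypass `method_missing` entirely.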

If that's too scary, the failed tool call could trigger another AI to go draft up a PR with that proposed tool, since hey, it's cheap and might be useful.

replies(1): >>44496622 #
garfij No.44496622
We've done varying forms of this to differing degrees of success at work.

Dynamic, on-the-fly generation & execution is definitely fascinating to watch in a sandbox, but is far too scary (from a compliance/security/sanity perspective) without spending a lot more time on guardrails.

We do, however, take note of hallucinated tool calls and have had the model suggest an implementation as a starting point; several such tools are now in production.
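That safer pattern, recording hallucinated calls for later review instead of ever executing them, could be sketched like this. The `SafeToolRegistry` name and log shape are assumptions, not garfij's actual implementation:

```ruby
require "time"

# Hypothetical sketch of the safer pattern: never execute a hallucinated
# tool; just take note of it so a human (or a PR-drafting agent) can
# review the suggestion later.
class SafeToolRegistry
  attr_reader :hallucination_log

  def initialize(tools)
    @tools = tools            # { name => callable }
    @hallucination_log = []
  end

  def run(name, *args)
    tool = @tools[name]
    if tool
      tool.call(*args)
    else
      # Log the hallucinated call instead of executing anything.
      @hallucination_log << { tool: name, args: args, at: Time.now.utc.iso8601 }
      { error: "unknown tool #{name}", logged: true }
    end
  end
end
```

The log then doubles as a backlog of tool ideas the model has effectively requested in the field.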

It's also useful to spin completed agents back up and interrogate them about what tools they would have found useful during execution (or really any other post-process questionnaire you can think of).

replies(1): >>44501839 #
kunzhi No.44501839
>Dynamic, on-the-fly generation & execution is definitely fascinating to watch in a sandbox, but is far too scary (from a compliance/security/sanity perspective) without spending a lot more time on guardrails.

Would love, love, love to hear more about what you're doing here! This seems super fascinating (and scary). :)