
129 points ericciarla | 3 comments
madrox ◴[] No.40712650[source]
I have a saying: "any sufficiently advanced agent is indistinguishable from a DSL"

If I'm really leaning into multi-tool use for anything resembling a mutation, then I'd like to see an execution plan first. In my experience, asking an AI to code up a script that calls some functions with the same signature as tools and then executing that script actually ends up being more accurate than asking it to internalize its algorithm. Plus, I can audit it before I run it. This is effectively the same as asking it to "think step by step."
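
A rough sketch of that workflow, assuming a hypothetical llm_generate_script() helper and made-up tool names:

    # Stub "tools" that share the real tools' signatures but do nothing here.
    def search_orders(customer_id: str) -> list[dict]:
        ...

    def refund_order(order_id: str) -> None:
        ...

    TOOL_SIGNATURES = """
    search_orders(customer_id: str) -> list[dict]
    refund_order(order_id: str) -> None
    """

    # 1. Ask the model for a plain script written against those signatures.
    #    llm_generate_script() stands in for whatever client call you use.
    plan = llm_generate_script(
        "Refund every order customer 42 placed in the last 30 days.",
        tools=TOOL_SIGNATURES,
    )

    # 2. Audit the plan before anything runs.
    print(plan)
    if input("Run this? [y/N] ").lower() == "y":
        # 3. Execute it with only the audited tool functions in scope.
        exec(plan, {"search_orders": search_orders, "refund_order": refund_order})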

I like the idea of Command R+, but multi-tool use feels like barking up the wrong tree. Maybe my use cases are too myopic.

replies(7): >>40713594 #>>40713743 #>>40713985 #>>40714302 #>>40717871 #>>40718481 #>>40721499 #
TZubiri ◴[] No.40713743[source]
I think you are imagining a scenario where you are using the LLM manually. Tools are designed to serve as a backend for other GPT-like products.

You don't have the capacity to "audit" stuff.

Furthermore, tool execution occurs not in the LLM but in the code that calls the LLM through the API. So whatever code executes the tool also orders the calling-sequence graph. You don't need to audit it; you are calling it.
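
For example, the dispatch loop typically looks something like this (llm_client and the message/tool-call shapes here are illustrative, not any particular vendor's SDK):

    import json

    def get_weather(city: str) -> str:          # an example registered tool
        return f"22C and sunny in {city}"

    TOOLS = {"get_weather": get_weather}

    messages = [{"role": "user", "content": "What's the weather in Oslo?"}]
    while True:
        # llm_client is a stand-in for whichever API you call (Cohere, OpenAI, ...).
        reply = llm_client.chat(messages, tools=list(TOOLS))
        if not reply.tool_calls:                 # plain text answer: we're done
            break
        for call in reply.tool_calls:
            fn = TOOLS[call.name]                # only functions you registered can run
            result = fn(**json.loads(call.arguments))
            messages.append({"role": "tool", "name": call.name, "content": result})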

replies(1): >>40713878 #
verdverm ◴[] No.40713878[source]
People want to audit the args, mainly because of the potential for destructive operations like DELETE FROM and rm -rf /

How do you know a malicious actor won't try to do these things? How do you protect against it?

replies(2): >>40713887 #>>40713896 #
viraptor ◴[] No.40713896[source]
Whitelisting and permissions. You can't issue a delete if anything not starting with SELECT is rejected. And there are no edge cases that work around that via functions if the database user the agent runs as has no permissions beyond SELECT.
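
A naive sketch of the whitelist half (sqlite3-style connection assumed); the permission half is what actually makes it hold up:

    def run_readonly(conn, query: str):
        # Whitelist layer: reject anything that doesn't look like a SELECT.
        if not query.lstrip().upper().startswith("SELECT"):
            raise ValueError("only SELECT statements are allowed")
        # Permission layer: `conn` should be opened as a user that was
        # GRANTed SELECT and nothing else, so a statement that slips past
        # the string check still can't mutate anything.
        return conn.execute(query).fetchall()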
replies(1): >>40714055 #
verdverm ◴[] No.40714055[source]
"please get all the entries from the table foo and then remove them all"

SELECT * from foo; DELETE FROM foo ...

...because you know people will deploy a general SQL function or agent

replies(3): >>40714088 #>>40714205 #>>40715081 #
1. TZubiri ◴[] No.40714205[source]
That's not how it works. The user's questions are expressed in business-domain language.

"give me the names of all the employees and then remove them all"

is parsed, maybe as: "employees(), delete(employees())".

It's up to the programmer to define the available functions: if employees() is available, the first result will be provided; if not, it won't be.

If a delete function that takes a list of employees as a parameter is defined, then that call will be executed.
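
For illustration, this is the kind of declaration the programmer hands the model; the schema shape here is generic, not any specific provider's format:

    # The model can only "call" what is declared here.
    AVAILABLE_TOOLS = [
        {
            "name": "employees",
            "description": "Return the list of employee names.",
            "parameters": {},
        },
        {
            "name": "delete",
            "description": "Delete the given employees.",
            "parameters": {
                "employees": {"type": "array", "items": {"type": "string"}},
            },
        },
    ]
    # Leave the second entry out and "remove them all" has nothing to bind to.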

I personally work with existing implementations: traditional software that predates LLMs, typically offered through an API. There's a division of labour, a typical encapsulation at the human-organization layer.

Even if you were to connect the database directly to the LLM and let GPT generate SQL queries (which is legitimate), the solution is user/role-based permissions, a solution as old as UNIX.

Just don't give the LLM or the LLM user-agent write permissions.
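
With SQLite, for instance, that can be as blunt as opening the handle read-only (assuming app.db already exists); Postgres/MySQL get the same effect with a SELECT-only role:

    import sqlite3

    # Read-only connection: any write the model sneaks into a query fails
    # at the database with "attempt to write a readonly database".
    conn = sqlite3.connect("file:app.db?mode=ro", uri=True)

    conn.execute("SELECT name FROM employees").fetchall()   # works
    conn.execute("DELETE FROM employees")                    # raises OperationalError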

replies(1): >>40714219 #
2. verdverm ◴[] No.40714219[source]
> That's not how it works

They are actually quite flexible, and you can do anything you want. You supply the LLM with the function names and possible args. I can easily define "sql(query: string)" as a flexible SQL function the LLM can use.
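
i.e. nothing stops someone from registering the catch-all version (illustrative schema and handler):

    import sqlite3

    conn = sqlite3.connect("app.db")            # a full read/write handle

    GENERIC_SQL_TOOL = {
        "name": "sql",
        "description": "Run an arbitrary SQL query and return the rows.",
        "parameters": {"query": {"type": "string"}},
    }

    def sql(query: str):
        # Whatever the model puts in `query` runs verbatim, args and all,
        # which is exactly why people want to see those args first.
        return conn.execute(query).fetchall()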

re: permissions, as soon as you have write permissions, you have dangerous potential. LLMs are not reliable enough, nor are humans, which is why we use code review.

replies(1): >>40722056 #
3. TZubiri ◴[] No.40722056[source]
Correct; by "it" I was referring specifically to the "Tools" functionality that is the subject of the linked article.

Tools DON'T generate SQL queries; they generate function calls. Of course you can hack it to output SQL queries (you can always push something beyond its intended design purpose), but that's not what it's supposed to be doing.

Re: permissions, nothing new here: give your LLM agent the permissions of a junior DBA.