129 points ericciarla | 2 comments
madrox No.40712650
I have a saying: "any sufficiently advanced agent is indistinguishable from a DSL"

If I'm really leaning into multi-tool use for anything resembling a mutation, then I'd like to see an execution plan first. In my experience, asking an AI to code up a script that calls some functions with the same signature as tools and then executing that script actually ends up being more accurate than asking it to internalize its algorithm. Plus, I can audit it before I run it. This is effectively the same as asking it to "think step by step."
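Roughly, that pattern looks like the sketch below. This is a hedged illustration, not anyone's actual setup: ask_llm_for_script stands in for whatever model call you use, and the tool names and signatures are invented.

    # Sketch of the "have the LLM write the plan as a script" pattern.
    # ask_llm_for_script is a hypothetical placeholder; it returns a canned
    # plan here so the example runs end to end.

    def search_orders(customer_id: str) -> list[dict]:
        # Same signature as the real 'search_orders' tool; stubbed here.
        print(f"[tool] search_orders({customer_id!r})")
        return [{"order_id": "A-1", "total": 19.99}]

    def refund_order(order_id: str, amount: float) -> None:
        # Same signature as the real 'refund_order' tool; stubbed here.
        print(f"[tool] refund_order({order_id!r}, amount={amount})")

    TOOLS = {"search_orders": search_orders, "refund_order": refund_order}

    def ask_llm_for_script(task: str, signatures: str) -> str:
        # Hypothetical: prompt the model to emit a plain Python script that
        # only calls the listed functions.
        return (
            "orders = search_orders('42')\n"
            "for o in orders:\n"
            "    refund_order(o['order_id'], o['total'])\n"
        )

    task = "Refund all of customer 42's orders"
    script = ask_llm_for_script(task, "search_orders(customer_id), refund_order(order_id, amount)")
    print(script)  # the auditable execution plan
    if input("Run this plan? [y/N] ").strip().lower() == "y":
        exec(script, {"__builtins__": {}}, dict(TOOLS))  # run it against the tools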

I like the idea of Command R+, but multi-tool use feels like barking up the wrong tree. Maybe my use cases are too myopic.

replies(7): >>40713594 #>>40713743 #>>40713985 #>>40714302 #>>40717871 #>>40718481 #>>40721499 #
TZubiri No.40713743
I think you are imagining a scenario where you are using the LLM manually. Tools are designed to serve as a backend for other GPT-like products.

You don't have the capacity to "audit" stuff.

Furthermore, tool execution occurs not in the LLM but in the code that calls the LLM through the API. So whatever code executes the tool also orders the calling sequence graph. You don't need to audit it; you are calling it.
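In other words, the calling code owns the loop: the model only proposes a tool call, and your code decides whether and in what order anything runs. A provider-agnostic sketch, where call_llm and the message shapes are placeholders rather than any specific vendor's API:

    from typing import Any, Callable

    def call_llm(messages: list[dict]) -> dict:
        # Hypothetical placeholder for the vendor API call. Assumed to return
        # either {"content": "..."} or {"tool_call": {"name": ..., "args": {...}}}.
        raise NotImplementedError

    TOOLS: dict[str, Callable[..., Any]] = {
        "get_weather": lambda city: {"city": city, "temp_c": 21},  # example tool
    }

    def run(messages: list[dict]) -> str:
        while True:
            reply = call_llm(messages)
            tool_call = reply.get("tool_call")
            if tool_call is None:
                return reply["content"]           # final answer, no tool needed
            name, args = tool_call["name"], tool_call["args"]
            result = TOOLS[name](**args)          # execution happens here, in your code
            messages.append({"role": "tool", "name": name, "content": str(result)})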

replies(1): >>40713878 #
verdverm No.40713878
People want to audit the args, mainly because of the potential for destructive operations like DELETE FROM and rm -rf /

How do you know a malicious actor won't try to do these things? How do you protect against it?
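The usual mitigation is to gate the proposed args in the calling code before anything executes: least-privilege credentials first, plus something like the crude check below for mutations. The patterns and tool names here are illustrative only; a denylist is a speed bump, not a security boundary.

    import re

    # Inspect proposed tool args before executing anything.
    DESTRUCTIVE_SQL = re.compile(r"\b(DELETE|DROP|TRUNCATE|UPDATE|ALTER)\b", re.I)
    DESTRUCTIVE_SHELL = re.compile(r"\brm\s+-rf\b|\bmkfs\b|\bshutdown\b", re.I)

    def needs_review(tool_name: str, args: dict) -> bool:
        if tool_name == "sql" and DESTRUCTIVE_SQL.search(args.get("query", "")):
            return True
        if tool_name == "terminal" and DESTRUCTIVE_SHELL.search(args.get("command", "")):
            return True
        return False

    def guarded_call(tool_name: str, args: dict, tools: dict):
        if needs_review(tool_name, args):
            print(f"Blocked pending human review: {tool_name} {args}")
            return None
        return tools[tool_name](**args)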

replies(2): >>40713887 #>>40713896 #
TZubiri No.40713887
"the args"

You need to be more specific. In a system, everything but the output is an argument to something else. And even then, the system's output is an input to the user.

So yeah, depending on which argument you are talking about, you can audit it in a different way, and it has a different potential for abuse.

replies(1): >>40714060 #
verdverm No.40714060
The args to a function like SQL or TERMINAL
replies(1): >>40714218 #
TZubiri No.40714218
I personally don't connect LLMs to SQL, but to APIs.

But I'm pretty sure you would just give an SQL user to the LLM and enjoy the SQL server's built-in permissions and auditing features.
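For example, a sketch assuming Postgres and psycopg2 (the role name, table names, and DSNs are made up): create a read-only role once as an administrator, then have the app connect as that role, so whatever SQL the model emits is bounded by the server's own permissions.

    import psycopg2

    ADMIN_DSN = "dbname=shop user=admin"        # run once, as an administrator
    SETUP_SQL = """
    CREATE ROLE llm_agent LOGIN PASSWORD 'change-me';
    GRANT USAGE ON SCHEMA public TO llm_agent;
    GRANT SELECT ON public.orders, public.customers TO llm_agent;  -- no writes
    """

    with psycopg2.connect(ADMIN_DSN) as conn, conn.cursor() as cur:
        cur.execute(SETUP_SQL)

    # The application connects as llm_agent and runs the model's query;
    # anything beyond SELECT on the granted tables is rejected by the server
    # and shows up in its logs/auditing.
    AGENT_DSN = "dbname=shop user=llm_agent password=change-me"
    with psycopg2.connect(AGENT_DSN) as conn, conn.cursor() as cur:
        cur.execute("SELECT order_id, total FROM public.orders LIMIT 5")
        print(cur.fetchall())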

replies(1): >>40714254 #
verdverm No.40714254
What if that user has write permissions and the LLM generates a bad UPDATE, e.g. forgets to put the WHERE clause in? Even for a SELECT, how do you know the right constraints were in place and you are getting the correct data?

Read-only use cases miss a whole category. All this is to get back to the point that people want to audit the LLM before running the function because of the unreliability; there is hesitance, with good reason.
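Part of that audit can at least be mechanical before the statement ever reaches the database, e.g. rejecting an UPDATE or DELETE with no WHERE clause and flagging any write for review. A rough, illustrative check; regex matching is not a SQL parser, so treat it as a speed bump on top of database permissions:

    import re

    WRITE_VERBS = re.compile(r"^\s*(UPDATE|DELETE|INSERT|DROP|TRUNCATE|ALTER)\b", re.I)
    MISSING_WHERE = re.compile(r"^\s*(UPDATE|DELETE)\b(?:(?!\bWHERE\b).)*$", re.I | re.S)

    def check_sql(statement: str) -> str:
        if MISSING_WHERE.search(statement):
            return "reject"                # e.g. "DELETE FROM orders"
        if WRITE_VERBS.search(statement):
            return "needs_human_review"    # writes get a second pair of eyes
        return "allow"                     # plain SELECTs pass through

    assert check_sql("DELETE FROM orders") == "reject"
    assert check_sql("UPDATE orders SET total = 0 WHERE order_id = 'A-1'") == "needs_human_review"
    assert check_sql("SELECT * FROM orders LIMIT 10") == "allow"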

replies(3): >>40714408 #>>40714689 #>>40723056 #
_puk No.40714689
> All this is to get back to the point that people want to audit the LLM before running the function because of the unreliability, there is hesitance with good reason.

Some people - I think it's quite clear from this thread that not everyone feels the need to.

I'm now thinking that asking the LLM to also output its whole prompt to something like a Datadog trace function would be quite useful for review / traceability.
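Something like this with the OpenTelemetry Python SDK would do it. The attribute names are mine, and the endpoint assumes a local Datadog Agent or OTel Collector that accepts OTLP and forwards to Datadog:

    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

    provider = TracerProvider()
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces"))
    )
    trace.set_tracer_provider(provider)
    tracer = trace.get_tracer("llm.app")

    def traced_completion(prompt: str) -> str:
        with tracer.start_as_current_span("llm.completion") as span:
            span.set_attribute("llm.prompt", prompt)    # full prompt, for later review
            response = "...model call goes here..."     # placeholder
            span.set_attribute("llm.response", response)
            return response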

replies(1): >>40715489 #
verdverm No.40715489
Most LLM observability tools do this

I'm currently using LangFuse and exploring OpenLit because it integrates with OTel, which you should be able to forward to Datadog, IIRC from their docs.

replies(1): >>40718670 #
kakaly0403 No.40718670
Check out Langtrace. It's also OTel-based and integrates with Datadog.

https://github.com/Scale3-Labs/langtrace