madrox ◴[] No.40712650[source]
I have a saying: "any sufficiently advanced agent is indistinguishable from a DSL"

If I'm really leaning into multi-tool use for anything resembling a mutation, then I'd like to see an execution plan first. In my experience, asking an AI to code up a script that calls functions with the same signatures as the tools, and then executing that script, actually ends up being more accurate than asking it to internalize its algorithm. Plus, I can audit it before I run it. This is effectively the same as asking it to "think step by step."
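In code, the pattern is roughly this. A minimal sketch only: call_llm() and the two tool stubs are hypothetical placeholders for whatever LLM client and real tools you actually use.

    import inspect

    def get_invoice(invoice_id: str) -> dict:
        """Tool stub; the real version would hit your billing API."""
        raise NotImplementedError

    def send_email(to: str, body: str) -> None:
        """Tool stub; the real version would hit your mail service."""
        raise NotImplementedError

    TOOLS = {"get_invoice": get_invoice, "send_email": send_email}

    def call_llm(prompt: str) -> str:
        """Placeholder for whatever completion API you actually use."""
        raise NotImplementedError

    def plan_script(request: str) -> str:
        # Show the model the tool signatures and ask for a plain script as the plan.
        signatures = "\n".join(
            f"def {name}{inspect.signature(fn)}: ..." for name, fn in TOOLS.items()
        )
        prompt = (
            "Write a short Python script that fulfils this request using only "
            f"these functions (assume they exist):\n{signatures}\n\n"
            f"Request: {request}\nOutput code only."
        )
        return call_llm(prompt)

    # script = plan_script("Email invoice 1234 to billing@example.com")
    # print(script)              # the audit step: read the plan before running it
    # exec(script, dict(TOOLS))  # only execute once you're happy with it

The script itself is the execution plan, which is what makes it auditable.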

I like the idea of Command R+ but multitool feels like barking up the wrong tree. Maybe my use cases are too myopic.

replies(7): >>40713594 #>>40713743 #>>40713985 #>>40714302 #>>40717871 #>>40718481 #>>40721499 #
darkteflon ◴[] No.40713594[source]
You mean manually pre-baking a DAG from the user query, then “spawning” other LLMs to resolve each node and pass their output up the graph? This is the approach we take too. It seems to be a sufficiently performant approach that is - intuitively - generically useful regardless of ontology / domain, but would love to hear others’ experiences.

It would be nice to know if this is sort of how OpenAI’s native “file_search” retriever works - that’s certainly the suggestion in some of the documentation but it hasn’t, to my knowledge, been confirmed.

replies(1): >>40713783 #
TZubiri ◴[] No.40713783[source]
No. The DAG should be "manually pre-baked" (defined at compile/design time).

At runtime you only parse the "user question" (the user prompt) into a start node and an end node, which is equivalent to a function call.

So the question

"What league does Messi play in?"

is parsed by the LLM as

league("Messi")

So if your DAG only contains the functions team(player) and league(team), you can still solve the question.

But the LLM isn't tasked with resolving the DAG; that's code. Let the LLM chill and do what it's good at; don't make it write a for loop for you.
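A minimal sketch of that split, with toy lookup tables standing in for real APIs: the graph of functions is fixed at design time, and the LLM's only output is the parsed call, here league("Messi").

    # player -> team and team -> league lookups; stand-ins for real APIs/DB queries
    TEAM_OF = {"Messi": "Inter Miami"}
    LEAGUE_OF = {"Inter Miami": "MLS"}

    def team(player: str) -> str:
        return TEAM_OF[player]

    def league(team_name: str) -> str:
        return LEAGUE_OF[team_name]

    # Edges of the DAG, defined at design time: each function maps one kind of
    # value to another.
    EDGES = {"player": [("team", team)], "team": [("league", league)]}

    def solve(start_node: str, value: str, goal_node: str) -> str:
        """Walk the graph from start to goal, applying each function in turn."""
        node = start_node
        while node != goal_node:
            next_node, fn = EDGES[node][0]   # toy graph: take the only edge out
            value, node = fn(value), next_node
        return value

    # The LLM's only job was to turn the question into this call:
    print(solve("player", "Messi", "league"))   # -> "MLS"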

replies(2): >>40713798 #>>40714040 #
darkteflon ◴[] No.40714040[source]
That’s very interesting. Does designing the DAG in advance imply that you have to make a new one for each particular subset of end-user questions you might receive? Or is your problem space such that you can design it once and have it be useful for everything you’re interested in?

My choice of words was poor: by “pre-baking” I just meant generated dynamically at runtime from the user’s query, _before_ you then set about answering it. The nature of our problem space is such that we wouldn’t be able to design a DAG in advance of runtime and have it be useful everywhere.

The answering process itself is then handled by deterministically (in code) resolving the dependencies of the DAG in the correct order, where each node might then involve a discrete LLM call (with function calling) depending on its purpose. Once resolved, a node’s output is passed to the next tier of the DAG along with framing context.
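Roughly, the resolution step might look like this. A loose sketch: run_llm() and the node names are invented; the point is that the traversal order is plain code, not something the model decides.

    from graphlib import TopologicalSorter

    def run_llm(prompt: str) -> str:
        raise NotImplementedError   # your LLM client goes here

    # node -> (dependencies, callable taking {dep_name: dep_output})
    NODES = {
        "fetch_filings": (set(), lambda deps: "10-K text ..."),
        "summarise": ({"fetch_filings"}, lambda deps: run_llm(
            f"Summarise:\n{deps['fetch_filings']}")),
        "answer": ({"summarise"}, lambda deps: run_llm(
            f"Context:\n{deps['summarise']}\n\nNow answer the user's question.")),
    }

    def resolve(nodes: dict) -> dict:
        outputs = {}
        order = TopologicalSorter({k: v[0] for k, v in nodes.items()}).static_order()
        for name in order:       # deterministic dependency order, no LLM planning
            deps, fn = nodes[name]
            outputs[name] = fn({d: outputs[d] for d in deps})
        return outputs

    # resolve(NODES)["answer"]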

replies(1): >>40714255 #
TZubiri ◴[] No.40714255[source]
You don't make a DAG for each question category. This is classic OOP, the OG Alan Kay version: you design subject-experts (objects) with autonomy and independence; they are just helpful in general. Each function/method, regardless of the Object/Expert, is an edge in the graph. A user question is simply a pair of vertices, call them I and O, and the execution/solution is a path between those two points, namely the input and the output.

The functions are traditional software (code, APIs, SQL); the job of the LLM is only to:

1- Map each type of question to a subsystem/codepath. The functional parsing solution is the most advanced, but a simple version involves asking the LLM to classify the question into an enum.

2- Parse the parameters as a list of key/value tuples (both jobs are sketched below).

The end. Don't ask the LLM to cook your food or do your laundry. The LLM is revolutionary at language; let it do language tasks.

We are not consumers of a helpful AI assistant; we are its designers.
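A minimal sketch of those two jobs: call_llm() is a placeholder and the Subsystem enum is invented; everything after the two LLM calls is ordinary code.

    import json
    from enum import Enum

    class Subsystem(Enum):
        PLAYER_LEAGUE = "player_league"
        PLAYER_TEAM = "player_team"

    def call_llm(prompt: str) -> str:
        raise NotImplementedError   # your LLM client goes here

    def route(question: str) -> tuple[Subsystem, dict]:
        # Job 1: classify the question into an enum of known codepaths
        label = call_llm(
            f"Classify this question as one of {[s.value for s in Subsystem]} "
            f"and reply with the label only:\n{question}"
        ).strip()
        # Job 2: extract the parameters as key/value pairs
        params = json.loads(call_llm(
            f"Extract the parameters of this question as a flat JSON object:\n{question}"
        ))
        return Subsystem(label), params

    # route("What league does Messi play in?")
    # -> (Subsystem.PLAYER_LEAGUE, {"player": "Messi"})
    # From here, plain code picks the codepath and runs the functions.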

replies(2): >>40714793 #>>40717093 #
darkteflon ◴[] No.40714793[source]
Very interesting perspective - thanks for your time!