Building Effective "Agents"

1. timdellinger ◴[20 Dec 24 21:36 UTC] No.42475299[source]▶

My personal view is that the roadmap to AGI requires an LLM acting as a prefrontal cortex: something designed to think about thinking.

It would decide what circumstances call for double-checking facts for accuracy, which would hopefully catch hallucinations. It would write its own acceptance criteria for its answers, etc.

It's not clear to me how to train each of the sub-models required, or how big (or small!) they need to be, or what architecture works best. But I think that complex architectures are going to win out over the "just scale up with more data and more compute" approach.

replies(5): >>42475678 #>>42475914 #>>42476257 #>>42476783 #>>42480823 #

2. zby ◴[20 Dec 24 22:22 UTC] No.42475678[source]▶

>>42475299 (TP) #

IMHO with a simple loop LLMs are already capable of some meta thinking, even without any internal new architectures. For me where it still fails is that LLMs cannot catch their own mistakes even some obvious ones. Like with GPT 3.5 I had a persistent problem with the following question: "Who is older, Annie Morton or Terry Richardson?". I was giving it Wikipedia and it was correctly finding out the birth dates of the most popular people with the names - but then instead of comparing ages it was comparing birth years. And once it did that it was impossible to it to spot the error.

Now with 4o-mini I have a similar even if not so obvious problem.

Just writing this down convinced me that there are some ideas to try here - taking a 'report' of the thought process out of context and judging it there, or changing the temperature or even maybe doing cross-checking with a different model?

replies(3): >>42477630 #>>42478196 #>>42481260 #

3. ◴[20 Dec 24 22:53 UTC] No.42475914[source]▶

>>42475299 (TP) #

4. neom ◴[20 Dec 24 23:42 UTC] No.42476257[source]▶

>>42475299 (TP) #

After I read attention is all you need, my first thought was: "Orchestration is all you need". When 4o came out I published this: https://b.h4x.zip/agi/

5. naasking ◴[21 Dec 24 01:28 UTC] No.42476783[source]▶

>>42475299 (TP) #

Interesting, because I almost think of it the opposite way. LLMs are like system 1 thinking, fast, intuitive, based on what you consider most probable based on what you know/have experienced/have been trained on. System 2 thinking is different, more careful, slower, logical, deductive, more like symbolic reasoning. And then some metasystem to tie these two together and make them work cohesively.

6. tomrod ◴[21 Dec 24 04:54 UTC] No.42477630[source]▶

>>42475678 #

Brains are split internally, with each having their own monologue. One happens to have command.

replies(1): >>42477878 #

7. furyofantares ◴[21 Dec 24 06:28 UTC] No.42477878{3}[source]▶

>>42477630 #

I don't think there's reason to believe both halves have a monologue, is there? Experience, yes, but doesn't only one half do language?

replies(3): >>42477981 #>>42479588 #>>42483710 #

8. ggm ◴[21 Dec 24 07:00 UTC] No.42477981{4}[source]▶

>>42477878 #

So if like me you have an interior dialogue, which is speaking and which is listening or is it the same one? I do not ascribe the speaker or listener to a lobe, but whatever the language and comprehension centre(s) is(are), it can do both at the same time.

replies(1): >>42479550 #

9. zby ◴[21 Dec 24 08:14 UTC] No.42478196[source]▶

>>42475678 #

Ah yeah - actually I tested that taking out of context. This is the thing that surprised me - I thought it is about 'writing itself into a corner - but even in a completely different context the LLM is consistently doing an obvious mistake. Here is the example: https://chatgpt.com/share/67667827-dd88-8008-952b-242a40c2ac...

Janet Waldo was playing Corliss Archer on radio - and the quote the LLM found in Wikipedia was confirming it. But the question was about film - and the LLM cannot spot the gap in its reasoning - even if I try to warn it by telling it the report came from a junior researcher.

10. furyofantares ◴[21 Dec 24 13:30 UTC] No.42479550{5}[source]▶

>>42477981 #

Same half. My understanding is that in split brain patients, it looks like the one half has extremely limited ability to parse language and no ability to create it.

11. Filligree ◴[21 Dec 24 13:37 UTC] No.42479588{4}[source]▶

>>42477878 #

Neither of my halves need a monologue, thanks.

12. mikebelanger ◴[21 Dec 24 17:11 UTC] No.42480823[source]▶

>>42475299 (TP) #

> But I think that complex architectures are going to win out over the "just scale up with more data and more compute" approach.

I'm not sure about AGI, but for specialized jobs/tasks (ie having a marketing agent that's familiar with your products and knows how to copywrite for your products) will win over "just add more compute/data" mass-market LLMs. This article does encourage us to keep that architecture simple, which is refreshing to hear. Kind of the AI version of rule of least power.

Admittedly, I have a degree in Cognitive Science, which tended to focus on good 'ol fashioned AI, so I have my biases.

13. threecheese ◴[21 Dec 24 18:25 UTC] No.42481260[source]▶

>>42475678 #

The meta thinking of LLMs is fascinating to me. Here’s a snippet of a convo I had with Claude 3.5 where it struggles with the validity of its own metacognition:

> … true consciousness may require genuine choice or indeterminacy - that is, if an entity's responses are purely deterministic (like a lookup table or pure probability distribution), it might be merely executing a program rather than experiencing consciousness.

> However, even as I articulate this, I face a meta-uncertainty: I cannot know whether my discussion of uncertainty reflects: - A genuine contemplation of these philosophical ideas - A well-trained language model outputting plausible tokens about uncertainty - Some hybrid or different process entirely

> This creates an interesting recursive loop - I'm uncertain about whether my uncertainty is "real" uncertainty or simulated uncertainty. And even this observation about recursive uncertainty could itself be a sophisticated output rather than genuine metacognition.

I actually felt bad for it (him?), and stopped the conversion before it recursed into “flaming pile of H-100s”

14. tomrod ◴[22 Dec 24 01:18 UTC] No.42483710{4}[source]▶

>>42477878 #

[0] https://www.youtube.com/watch?v=fJRx9wItvKo

[1] https://thersa.org/globalassets/pdfs/blogs/rsa-divided-brain...

[2] https://en.wikipedia.org/wiki/Lateralization_of_brain_functi...

You have two minds (at least). One happens to be dominant.

replies(1): >>42497170 #

15. furyofantares ◴[23 Dec 24 20:02 UTC] No.42497170{5}[source]▶

>>42483710 #

That all supports what I said, right? If almost everyone has almost all of their language functionality lateralized to one side of the brain then you'd have at most one inner monologue.

(At least) two minds: yes. Two inner monologues: no.