
214 points | optimalsolver | 1 comment
equinox_nl ◴[] No.45770131[source]
But I also fail catastrophically once a reasoning problem exceeds modest complexity.
replies(4): >>45770215 #>>45770281 #>>45770402 #>>45770506 #
davidhs ◴[] No.45770281[source]
Do you? Don't you just halt and say this is too complex?
replies(3): >>45770311 #>>45770398 #>>45770868 #
dspillett ◴[] No.45770398[source]
Some would consider that to be failing catastrophically. The task is certainly failed.
replies(3): >>45770566 #>>45770851 #>>45770961 #
carlmr ◴[] No.45770566[source]
Halting is sometimes preferable to thrashing around and running in circles.

I feel like if LLMs "knew" when they're out of their depth, they could be much more useful. The question is whether knowing when to stop can be meaningfully learned from examples with RL. From all we've seen, the hallucination problem and this stopping problem boil down to the same issue: you can teach the model to say "I don't know", but if that answer is just part of the training data it may spit out "I don't know" to random questions, because it's a likely response in the realm of possible responses, rather than saying "I don't know" specifically when it doesn't know.
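
Rough sketch of what I mean by learning when to stop (made-up reward numbers, just to illustrate the reward-shaping idea, not anyone's actual training setup): if a wrong answer is penalized more than an abstention, then under an expected-reward objective "I don't know" only pays off below some confidence threshold, instead of being a generically likely response.

    # Illustrative reward scheme for abstention (hypothetical values).
    R_CORRECT, R_ABSTAIN, R_WRONG = 1.0, 0.0, -2.0

    def reward(answer: str, truth: str) -> float:
        """Score one rollout: abstaining beats a wrong guess, but loses to a correct one."""
        if answer == "I don't know":
            return R_ABSTAIN
        return R_CORRECT if answer == truth else R_WRONG

    def should_abstain(p_correct: float) -> bool:
        """Abstain only when the expected reward of answering drops below the abstain reward."""
        expected_answer = p_correct * R_CORRECT + (1.0 - p_correct) * R_WRONG
        return expected_answer < R_ABSTAIN

    # With these numbers the break-even confidence is 2/3: below that, "I don't know" wins.
    print(should_abstain(0.5))  # True
    print(should_abstain(0.9))  # False

The hard part, of course, is that the model's internal confidence isn't an observable p_correct you can plug in; the hope is that RL against this kind of asymmetric reward pushes it to behave as if it were.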

SocratesAI is still unsolved, and LLMs are probably not the path to knowing that you know nothing.

replies(1): >>45770706 #
ukuina ◴[] No.45770706[source]
> if LLMs "knew" when they're out of their depth, they could be much more useful.

I used to think this, but I'm no longer sure.

Large-scale tasks just grind to a halt with more modern LLMs because of this perception of impassable complexity.

And it's not that these tasks need extensive planning: the LLM knows what needs to be done (it'll even tell you!). It's just more work than will fit within a "session" (an arbitrary boundary), so it would rather refuse than get started.

So you're now looking at TODO lists, hierarchical plans, and all this unnecessary pre-work, even when the task would scale horizontally just fine if the model simply jumped into it.
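
To be concrete about "scales horizontally" (ask_llm below is a made-up stand-in, not any particular client API): instead of one giant session that has to hold the whole task, you split the work into independent chunks and fire off one small, self-contained call per chunk.

    # Hypothetical sketch: fan out independent chunks instead of one long session.
    from concurrent.futures import ThreadPoolExecutor

    def ask_llm(prompt: str) -> str:
        """Stand-in for whatever LLM client you use; one small, self-contained request."""
        raise NotImplementedError("wire up your client here")

    def process_all(paths: list[str]) -> list[str]:
        # Each file gets its own call, so no single call sees the "impassable" whole.
        prompts = [
            f"Add docstrings to this file and return it unchanged otherwise:\n{open(p).read()}"
            for p in paths
        ]
        with ThreadPoolExecutor(max_workers=8) as pool:
            return list(pool.map(ask_llm, prompts))

No hierarchical plan needed; the orchestration is a dozen lines outside the model.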