
265 points by ctoth | 5 comments
sejje No.43744995
In the last example (the riddle), I generally assume the AI isn't misreading; rather, it assumes you didn't give it the riddle correctly, because it has already seen the original.

I would do the same thing, I think. It's too well-known.

The variation doesn't read like a riddle at all, so it's confusing even to me as a human. I can't find the riddle part. Maybe the AI is confused, too. I think it makes an okay assumption.

I guess it would be nice if the AI asked a follow-up question like "are you sure you wrote down the riddle correctly?", and I think it could if instructed to, but right now they don't generally do that on their own.

Jensson No.43745113
> I generally assume the AI isn't misreading; rather, it assumes you didn't give it the riddle correctly, because it has already seen the original.

LLMs don't assume; they're text completers. An LLM sees something that looks almost like a well-known problem and completes it as that well-known problem. That's a failure mode specific to text completion, and it's hard to get around.
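A quick way to probe this empirically, as a sketch: score how much probability a small open model assigns to the memorized ending of a famous line, before and after the line is perturbed. GPT-2 via the Hugging Face transformers library is just an assumed stand-in here, and the nursery-rhyme prompts are illustrative; any causal LM scores completions the same way.

    # Sketch: does a model still favor a memorized completion after the
    # prompt is perturbed? GPT-2 is an illustrative stand-in model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def continuation_logprob(prompt: str, continuation: str) -> float:
        # Total log-probability the model assigns to `continuation`
        # appearing immediately after `prompt`.
        full = tok(prompt + continuation, return_tensors="pt").input_ids
        n_prompt = tok(prompt, return_tensors="pt").input_ids.shape[1]
        with torch.no_grad():
            logprobs = torch.log_softmax(model(full).logits, dim=-1)
        # A token's probability is read from the position before it.
        return sum(
            logprobs[0, i - 1, full[0, i]].item()
            for i in range(n_prompt, full.shape[1])
        )

    # Compare the canonical line against a perturbed one ("car" for "star"):
    print(continuation_logprob(
        "Twinkle twinkle little star, how I wonder what you", " are"))
    print(continuation_logprob(
        "Twinkle twinkle little car, how I wonder what you", " are"))

If the second score stays close to the first, the model is completing the famous pattern rather than reacting to the changed setup, which is the failure mode described above.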

simonw No.43745166
These newer "reasoning" LLMs really don't feel like pure text completers any more.
Borealid No.43745266
What your parent poster said is nonetheless true, regardless of how it feels to you. Getting text from an LLM is a process of iteratively attempting to find a likely next token given the preceding ones.

If you give an LLM "The rain in Spain falls", the single most likely next token is "mainly", and you'll see that one proportionately more often than any other.

If you give an LLM "Find an unorthodox completion for the sentence 'The rain in Spain falls'", the most likely next token is something other than "mainly" because the tokens in "unorthodox" are more likely to appear before text that otherwise bucks statistical trends.

If you give the LLM "blarghl unorthodox babble The rain in Spain", the results are likely similar to the second case but less coherent (because text obeying grammatical rules is more likely to follow other text that also obeys those rules).

In all three cases, the LLM is predicting text, not "parsing" or "understanding" a prompt. The fact that it responds similarly to well-formed and unreasonably formed prompts is evidence of this.

It's theoretically possible to engineer a string of complete gibberish tokens that will prompt the LLM to recite song lyrics or answer questions about mathematical formulae. Those strings of gibberish are just difficult to discover.
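For concreteness, here is a minimal sketch of that iterative next-token view, run over the three prompts above. GPT-2 via the Hugging Face transformers library is an assumed stand-in; larger models expose the same interface and differ only in their probabilities.

    # Sketch: inspect the next-token distribution a causal LM produces
    # for each prompt. GPT-2 is an assumed stand-in model.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def top_next_tokens(prompt: str, k: int = 5):
        # The k most probable next tokens and their probabilities.
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits[0, -1]  # scores for the next position
        probs = torch.softmax(logits, dim=-1)
        top = torch.topk(probs, k)
        return [(tok.decode([int(i)]), round(p.item(), 3))
                for i, p in zip(top.indices, top.values)]

    for prompt in (
        "The rain in Spain falls",
        "Find an unorthodox completion for the sentence 'The rain in Spain falls'",
        "blarghl unorthodox babble The rain in Spain",
    ):
        print(repr(prompt), "->", top_next_tokens(prompt))

A model this small won't reproduce the exact behavior described above, but the shape of the experiment is the point: one forward pass, one probability distribution over the next token, conditioned on everything before it.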

Workaccount2 No.43745307
The problem is showing that humans aren't just doing next-word prediction too.
1. Borealid No.43745388
I don't see that as a problem. I don't particularly care how human intelligence works; what matters is what an LLM is capable of doing and what a human is capable of doing.

If those two sets of accomplishments are the same, there's no point arguing about differences in mechanism or terminology. Right now humans can build better LLMs, but nobody has come up with an LLM that can build better LLMs.

2. baq No.43746308
That's literally the definition of takeoff: once it starts, it gets us to the singularity within a decade, and there's no publicly available evidence that it has started… emphasis on publicly available.
3. johnisgood No.43746612
> but nobody has come up with an LLM that can build better LLMs.

Yet. Not that we know of, anyway.

4. myk9001 No.43746658
> it gets us to singularity

Are we sure it's actually taking us along?

5. Aeolos No.43769194
Given the dramatic uptake of Cursor, Windsurf, Claude Code, etc., we can be 100% certain that LLM companies are using LLMs to improve their products.

The improvement loop is likely not fully autonomous yet (it is currently more efficient to keep a human in the loop), but there is certainly a lot of LLMs improving LLMs going on today.