
GPT-5.2

(openai.com)
1094 points atgctg | 2 comments
svara ◴[] No.46241936[source]
In my experience, the best models are already nearly as good as they can be for a large fraction of what I personally use them for, which is basically as a more efficient search engine.

The thing that would now make the biggest difference isn't "more intelligence", whatever that might mean, but better grounding.

It's still a big issue that the models will make up plausible sounding but wrong or misleading explanations for things, and verifying their claims ends up taking time. And if it's a topic you don't care about enough, you might just end up misinformed.

I think Google/Gemini realize this, since their "verify" feature is designed to address exactly this. Unfortunately it hasn't worked very well for me so far.

But to me it's very clear that the product that gets this right will be the one I use.

replies(14): >>46241987 #>>46242107 #>>46242173 #>>46242280 #>>46242317 #>>46242483 #>>46242537 #>>46242589 #>>46243494 #>>46243567 #>>46243680 #>>46244002 #>>46244904 #>>46245168 #
phorkyas82 ◴[] No.46241987[source]
Isn't that what no LLM can provide: being free of hallucinations?
replies(5): >>46242091 #>>46242093 #>>46242230 #>>46243681 #>>46244023 #
kyletns ◴[] No.46242093[source]
For the record, brains are also not free of hallucinations.
replies(3): >>46242289 #>>46242311 #>>46244746 #
rimeice ◴[] No.46242311[source]
I still don’t really get this argument/excuse for why it’s acceptable that LLMs hallucinate. These tools are meant to support us, but we end up with two parties who are, as you say, prone to “hallucination”, and it becomes a case of the blind leading the blind. Ideally, at least one party has a definitive or deterministic view, so the other party (i.e. us) can have some trust in the information they’re receiving and in any decisions they make off the back of it.
replies(4): >>46242664 #>>46242733 #>>46242790 #>>46243300 #
ssl-3 ◴[] No.46242790[source]
Have you ever employed anyone?

People, when tasked with a job, often get it right. I've been blessed to work with many great people who do an amazing job of generally getting things right -- or at least, right enough.

But in any line of work: Sometimes people fuck it up. Sometimes, they forget important steps. Sometimes, they're sure they did it one way when they actually did it some other way, and have to go back and fix it. Sometimes, they even say they did the job and did it as-prescribed, and actually believe themselves, when they've done neither -- and they're perplexed when they're shown this. They "hallucinate" and do dumb things for reasons that aren't real.

And sometimes, they just make shit up and lie. They know they're lying and they lie anyway, doubling-down over and over again.

Sometimes they even go all spastic and deliberately throw monkey wrenches into the works, just because they feel something that makes them think that this kind of willfully-destructive action benefits them.

All employees suck some of the time. They each have their own issues. And all employees are expensive to hire, and expensive to fire, and expensive to keep going. But some of their outputs are useful, so we employ people anyway. (And we're human; even the very best of us are going to make mistakes.)

LLMs are not so different in this way, as a general construct. They can get things right. They can also make shit up. They can skip steps. They can lie, and double down on those lies. They hallucinate.

LLMs suck. All of them. They all fucking suck. They aren't even good at sucking, and they persist at doing it anyway.

(But some of their outputs are useful, and LLMs generally cost a lot less to make use of than people do, so here we are.)

replies(1): >>46243477 #
1. vitorfblima ◴[] No.46243477[source]
I don’t get the comparison. It would be like saying it’s okay if an Excel formula gives me different outcomes every time with the same arguments, sometimes right, but mostly wrong.
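To make the determinism contrast concrete, here is a tiny, purely illustrative Python sketch (no real LLM API is used; the prompt and candidate answers are made up): a spreadsheet-style function returns the same result for the same inputs every time, while a sampled generator draws from a distribution and can return different, occasionally wrong, answers for the same prompt.

    import random

    # Deterministic, Excel-style: same arguments in, same result out, every run.
    def cell_sum(a: float, b: float) -> float:
        return a + b

    # Toy stand-in for sampled generation: the output is drawn at random from a
    # fixed set of candidates, so the same "prompt" can yield different answers,
    # and occasionally a wrong one.
    def sampled_answer(prompt: str) -> str:
        candidates = ["Paris", "Paris, France", "The capital is Paris.", "Lyon"]
        return random.choice(candidates)

    print(cell_sum(2, 3), cell_sum(2, 3))        # 5.0 5.0, always
    print(sampled_answer("Capital of France?"))  # varies from run to run
    print(sampled_answer("Capital of France?"))  # may even be wrong ("Lyon")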
replies(1): >>46243529 #
2. ssl-3 ◴[] No.46243529[source]
People can accomplish useful things, but sometimes make mistakes and do shit wrong.

The bot can also accomplish useful things, and sometimes make mistakes and do shit wrong.

(These two statements are more similar in their truthiness than they are different.)