GPT-5.2

(openai.com)

1053 points atgctg | 1 comments | 11 Dec 25 18:04 UTC

https://platform.openai.com/docs/guides/latest-model

System card: https://cdn.openai.com/pdf/3a4153c8-c748-4b71-8e31-aecbde944...

svara ◴[12 Dec 25 08:08 UTC] No.46241936[source]▶

In my experience, the best models are already about as good as they can be for a large fraction of what I personally use them for, which is basically as a more efficient search engine.

The thing that would now make the biggest difference isn't "more intelligence", whatever that might mean, but better grounding.

It's still a big issue that the models will make up plausible-sounding but wrong or misleading explanations for things, and verifying their claims ends up taking time. And if it's a topic you don't care about enough, you might just end up misinformed.

I think Google/Gemini realize this, since their "verify" feature is designed to address exactly this. Unfortunately it hasn't worked very well for me so far.

But to me it's very clear that the product that gets this right will be the one I use.

phorkyas82 ◴[12 Dec 25 08:18 UTC] No.46241987[source]▶

>>46241936 #

Isn't that what no LLM can provide: being free of hallucinations?

arw0n ◴[12 Dec 25 09:03 UTC] No.46242230[source]▶

>>46241987 #

I think the better word is confabulation: fabricating plausible but false narratives based on wrong memory. Fundamentally, these models try to produce plausible text. As language models get larger, they start creating internal world models, and some research shows they actually have truth dimensions. [0]

I'm not an expert on the topic, but to me it sounds plausible that a good part of the problem of confabulation comes down to misaligned incentives. These models are trained hard to be a 'helpful assistant', and this might conflict with telling the truth.

Being free of hallucinations is a bit too high a bar to set anyway. Humans are extremely prone to confabulation as well, as can be seen from how unreliable eyewitness reports tend to be. We usually get by through efficient tool calling (looking shit up), and some of us through expressing doubt about our own capabilities (critical thinking).

[0] https://arxiv.org/abs/2407.12831

officialchicken ◴[12 Dec 25 11:13 UTC] No.46243003[source]▶

>>46242230 #

No, the correct word is hallucinating. That's the word everyone uses and has been using. While it might not be technically correct, everyone knows what it means and more importantly, it's not a $3 word and everyone can relate to the concept. I also prefer all the _other_ more accurate alternative words Wikipedia offers to describe it:

"In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting,[1][2] confabulation,[3] or delusion[4]) is"
