←back to thread

GPT-5.2

(openai.com)
1019 points atgctg | 1 comments | | HN request time: 0.199s | source
Show context
svara ◴[] No.46241936[source]
In my experience, the best models are already nearly as good as you can be for a large fraction of what I personally use them for, which is basically as a more efficient search engine.

The thing that would now make the biggest difference isn't "more intelligence", whatever that might mean, but better grounding.

It's still a big issue that the models will make up plausible sounding but wrong or misleading explanations for things, and verifying their claims ends up taking time. And if it's a topic you don't care about enough, you might just end up misinformed.

I think Google/Gemini realize this, since their "verify" feature is designed to address exactly this. Unfortunately it hasn't worked very well for me so far.

But to me it's very clear that the product that gets this right will be the one I use.

replies(8): >>46241987 #>>46242107 #>>46242173 #>>46242280 #>>46242317 #>>46242483 #>>46242537 #>>46242589 #
1. fauigerzigerk ◴[] No.46242589[source]
I agree, but the question is how better grounding can be achieved without a major research breakthrough.

I believe the real issue is that LLMs are still so bad at reasoning. In my experience, the worst hallucinations occur where only handful of sources exist for some set of facts (e.g laws of small countries or descriptions of niche products).

LLMs know these sources and they refer to them but they are interpreting them incorrectly. They are incapable of focusing on the semantics of one specific page because they get "distracted" by their pattern matching nature.

Now people will say that this is unavoidable given the way in which transformers work. And this is true.

But shouldn't it be possible to include some measure of data sparsity in the training so that models know when they don't know enough? That would enable them to boost the weight of the context (including sources they find through inference time search/RAG) relative to to their pretraining.