
277 points simianwords | 1 comments
cainxinth ◴[] No.45152987[source]
I find the leaderboard argument a little strange. All their enterprise clients are clamoring for more reliability from them. If they could train a model that conceded ignorance instead of guessing, and thus avoided hallucinations, why aren't they doing that? Because of leaderboard optics?
replies(1): >>45153065 #
1. ospray ◴[] No.45153065[source]
I think they are trying to communicate that their benchmark scores will go down as they try to tackle hallucinations. Honestly, I am surprised they didn't just say, "We think all benchmarks need an incorrect vs. abstention ratio so our cautious, honest model can do well on that." Although they did seem to hint that's what they want.
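
To make the ratio idea concrete, here is a rough sketch of how a benchmark could report it. Everything in it is an assumption for illustration: the `score` function, the three outcome labels, and the `wrong_penalty` weight are hypothetical and not taken from any real benchmark or from the linked post.

```python
from collections import Counter

VALID = {"correct", "incorrect", "abstain"}

def score(outcomes, wrong_penalty=1.0):
    """Summarize a list of per-question outcomes.

    outcomes: iterable of "correct", "incorrect", or "abstain" (hypothetical labels).
    wrong_penalty: assumed cost of a wrong answer relative to a right one.
    """
    counts = Counter(outcomes)
    unknown = set(counts) - VALID
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no outcomes to score")

    # Plain accuracy: abstaining and guessing wrong both count as zero.
    accuracy = counts["correct"] / total
    # Penalized score: a wrong answer costs points, an abstention costs nothing.
    penalized = (counts["correct"] - wrong_penalty * counts["incorrect"]) / total
    # Incorrect vs. abstention ratio; undefined if the model never abstains.
    ratio = counts["incorrect"] / counts["abstain"] if counts["abstain"] else None
    return {"accuracy": accuracy, "penalized": penalized, "incorrect_per_abstain": ratio}

# Two made-up models on ten questions each.
guesser = score(["correct"] * 6 + ["incorrect"] * 4)
cautious = score(["correct"] * 5 + ["incorrect"] * 1 + ["abstain"] * 4)
```

With these made-up numbers the guesser wins on plain accuracy, but the cautious model wins once wrong answers carry a penalty, which is roughly the leaderboard effect being discussed.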