GPT-5.2

(openai.com)
1019 points atgctg | 1 comments
1. jrflowers No.46237314
OpenAI is really good at just saying stuff on the internet.

I love the way they talk about incorrect responses:

> Errors were detected by other models, which may make errors themselves. Claim-level error rates are far lower than response-level error rates, as most responses contain many claims.

“These numbers might be wrong because they were made up by other models, which we will not elaborate on, also these numbers are much higher by a metric that reflects how people use the product, which we will not be sharing”

I also really love the graph where they drew a line at “wrong half of the time” and labeled it ‘Expert-Level’.

10/10, reading this post is experientially identical to watching that 12-hour video of jingling keys, which is hard to pull off for a blog.