GPT-5.2

(openai.com)

1019 points atgctg | 3 comments | 11 Dec 25 18:04 UTC | HN request time: 0.001s | source

https://platform.openai.com/docs/guides/latest-model

System card: https://cdn.openai.com/pdf/3a4153c8-c748-4b71-8e31-aecbde944...

Show context

onraglanroad ◴[11 Dec 25 21:06 UTC] No.46237160[source]▶

I suppose this is as good a place as any to mention this. I've now met two different devs who complained about the weird responses from their LLM of choice, and it turned out they were using a single session for everything. From recipes for the night, presents for the wife and then into programming issues the next day.

Don't do that. The whole context is sent on queries to the LLM, so start a new chat for each topic. Or you'll start being told what your wife thinks about global variables and how to cook your Go.

I realise this sounds obvious to many people but it clearly wasn't to those guys so maybe it's not!

replies(14): >>46237301 #>>46237674 #>>46237722 #>>46237855 #>>46237911 #>>46238296 #>>46238727 #>>46239388 #>>46239806 #>>46239829 #>>46240070 #>>46240318 #>>46240785 #>>46241428 #

1. vintermann ◴[11 Dec 25 21:17 UTC] No.46237301[source]▶

>>46237160 #

It's not at all obvious where to drop the context, though. Maybe it helps to have similar tasks in the context, maybe not. It did really, shockingly well on a historical HTR task I gave it, so I gave it another one, in some ways an easier one... Thought it wouldn't hurt to have text in a similar style in the context. But then it suddenly did very poorly.

Incidentally, one of the reasons I haven't gotten much into subscribing to these services, is that I always feel like they're triaging how many reasoning tokens to give me, or AB testing a different model... I never feel I can trust that I interact with the same model.

replies(2): >>46239278 #>>46240087 #

2. dcre ◴[12 Dec 25 00:15 UTC] No.46239278[source]▶

>>46237301 (TP) #

The models you interact with through the API (as opposed to chat UIs) are held stable and let you specify reasoning effort, so if you use a client that takes API keys, you might be able to solve both of those problems.

3. eru ◴[12 Dec 25 02:08 UTC] No.46240087[source]▶

>>46237301 (TP) #

> Incidentally, one of the reasons I haven't gotten much into subscribing to these services, is that I always feel like they're triaging how many reasoning tokens to give me, or AB testing a different model... I never feel I can trust that I interact with the same model.

That's what websites have been doing for ages. Just like you can't step twice in the same river, you can't use the same version of Google Search twice, and never could.

↑