GPT-5.2

(openai.com)
1019 points atgctg | 44 comments
1. onraglanroad ◴[] No.46237160[source]
I suppose this is as good a place as any to mention this. I've now met two different devs who complained about the weird responses from their LLM of choice, and it turned out they were using a single session for everything. From recipes for the night, presents for the wife and then into programming issues the next day.

Don't do that. The whole context is sent on queries to the LLM, so start a new chat for each topic. Or you'll start being told what your wife thinks about global variables and how to cook your Go.

I realise this sounds obvious to many people but it clearly wasn't to those guys so maybe it's not!
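
Roughly what's going on, as a minimal sketch (assuming a typical stateless chat-completion HTTP API; the endpoint and model name below are placeholders, not any specific provider's):

    # Chat APIs are stateless: the client resends the entire message history
    # with every request, so everything in `messages` -- recipes, gift ideas,
    # Go questions -- competes for the model's attention.
    import requests

    messages = []  # the whole conversation lives client-side

    def ask(user_text):
        messages.append({"role": "user", "content": user_text})
        resp = requests.post(
            "https://api.example.com/v1/chat/completions",  # placeholder endpoint
            json={"model": "some-model", "messages": messages},  # placeholder model
            timeout=60,
        ).json()
        reply = resp["choices"][0]["message"]
        messages.append(reply)  # the reply becomes part of all future context too
        return reply["content"]

    # Starting a new topic in the same session keeps every old message in
    # context; starting a new chat is just messages = [].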

replies(14): >>46237301 #>>46237674 #>>46237722 #>>46237855 #>>46237911 #>>46238296 #>>46238727 #>>46239388 #>>46239806 #>>46239829 #>>46240070 #>>46240318 #>>46240785 #>>46241428 #
2. vintermann ◴[] No.46237301[source]
It's not at all obvious where to drop the context, though. Maybe it helps to have similar tasks in the context, maybe not. It did really, shockingly well on a historical HTR task I gave it, so I gave it another one, in some ways an easier one... Thought it wouldn't hurt to have text in a similar style in the context. But then it suddenly did very poorly.

Incidentally, one of the reasons I haven't gotten much into subscribing to these services is that I always feel like they're triaging how many reasoning tokens to give me, or A/B testing a different model... I never feel I can trust that I'm interacting with the same model.

replies(2): >>46239278 #>>46240087 #
3. noname120 ◴[] No.46237674[source]
Problem is that by default ChatGPT has the “Reference chat history” option enabled in the Memory options. This causes any previous conversation to leak into the current one. Just creating a new conversation is not enough, you also need to disable that option.
replies(3): >>46238018 #>>46238056 #>>46238504 #
4. chasd00 ◴[] No.46237722[source]
I was listening to a podcast about people becoming obsessed and "in love" with an LLM like ChatGPT. Spouses were interviewed describing how mentally damaging it is to their partner and how their marriage/relationship is seriously at risk because of it. I couldn't believe no one has told these people to just go to the LLM and reset the context; that reverts the LLM back to a complete stranger. Granted, that would be pretty devastating to the person in "the relationship" with the LLM, since it wouldn't know them at all after that.
replies(2): >>46237779 #>>46237845 #
5. adamesque ◴[] No.46237779[source]
that's not quite what parent was talking about, which is — don't just use one giant long conversation. resetting "memories" is a totally different thing (which still might be valuable to do occasionally, if they still let you)
replies(1): >>46237924 #
6. jncfhnb ◴[] No.46237845[source]
It’s the majestic, corrupting glory of having a loyal cadre of empowering yes men normally only available to the rich and powerful, now available to the normies.
7. mmaunder ◴[] No.46237855[source]
Yeah I think a lot of us are taking knowing how LLMs work for granted. I did the fast.ai course a while back and then went off and played with vLLM and various LLMs, optimizing execution, tweaking params etc. Then moved on and started being a user. But knowing how they work has been a game changer for my team and me. And the context window is so obvious, but if you don't know what it is you're going to think AI sucks. Which now has me wondering: Is this why everyone thinks AI sucks? Maybe Simon Willison should write about this. Simon?
replies(1): >>46240116 #
8. TechDebtDevin ◴[] No.46237911[source]
How are these devs employed or trusted with anything..
9. onraglanroad ◴[] No.46237924{3}[source]
Actually, it's kind of the same. LLMs don't have a "new memory" system. They're like the guy from Memento: context memory plus long-term memory from the training data, but no way to make new memories from the context.

(Not addressed to parent comment, but the inevitable others: Yes, this is an analogy, I don't need to hear another halfwit lecture on how LLMs don't really think or have memories. Thank you.)

replies(1): >>46237997 #
10. dragonwriter ◴[] No.46237997{4}[source]
Context memory arguably is new memory, but because we abused the metaphor of "learning" rather than something more like shaping inborn instinct for trained model weights, we have no fitting metaphor for what happens during the "lifetime" of interaction with a model via its context window: the formation of skills/memories.
11. onraglanroad ◴[] No.46238018[source]
That seems like a terrible default. Unless they have a weighting system for different parts of context?
replies(1): >>46240078 #
12. redhed ◴[] No.46238056[source]
This is also the default in Gemini, pretty sure; at least I remember turning it off. Makes no sense to me why this is the default.
replies(2): >>46238580 #>>46239233 #
13. wickedsight ◴[] No.46238296[source]
This is why I love that ChatGPT added branching. Sometimes I end up going in some random direction in a thread about some code, and then I can go back and start a new branch from the part where the chat was still somewhat clean.

Also works really well when some of my questions may not have been worded correctly and ChatGPT has gone in a direction I don't want it to go. Branch, word my question better and get a better answer.
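
Conceptually (just a sketch of the data structure, not ChatGPT's actual implementation), a branch is nothing more than a copy of the message list cut off at the turn you want to redo:

    # Branching: since context is just a list of messages, forking the chat
    # means copying the history up to a chosen turn and continuing from there.
    def branch(messages, keep_messages):
        return list(messages[:keep_messages])

    # Keep the first six messages (three clean exchanges), drop the turn where
    # the conversation went sideways, and re-ask with better wording.
    clean = branch(messages, 6)
    clean.append({"role": "user", "content": "Better-worded question here"})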

14. 0xdeafbeef ◴[] No.46238504[source]
Only your questions are in it though
15. gordonhart ◴[] No.46238580{3}[source]
> Makes no sense to me why this is the default.

You’re probably pretty far from the average user, who thinks “AI is so dumb” because it doesn’t remember what you told it yesterday.

replies(1): >>46238651 #
16. redhed ◴[] No.46238651{4}[source]
I was thinking more people would be annoyed by it bringing up unrelated conversations, but thinking about it more, I'd say you're probably right that more people expect it to remember everything they say.
replies(1): >>46239902 #
17. plaidfuji ◴[] No.46238727[source]
It is annoying though: when you start a new chat for each topic you tend to have to re-write context a lot. I use Gemini 3, which I understand doesn't have as good a memory system as OpenAI. Even on single-file programming stuff, after a few rounds of iteration I tend to hit its context limit (the thinking model), either because the answers degrade or it just throws the "oops something went wrong" error. Ok, time to restart from scratch and paste in the latest iteration.

I don’t understand how agentic IDEs handle this either. Or maybe it’s easier - it just resends the entire codebase every time. But where to cut the chat history? It feels to me like every time you re-prompt a convo, it should first tell itself to summarize the existing context as bullets as its internal prompt rather than re-sending the entire context.

replies(1): >>46239006 #
18. int_19h ◴[] No.46239006[source]
Agentic IDEs/extensions usually continue the conversation until the context gets close to 80% full, then do the compacting. With both Codex and Claude Code you can actually observe that happening.

That said, I find that in practice, Codex performance degrades significantly long before it comes to the point of automated compaction - and AFAIK there's no way to trigger it manually. Claude, on the other hand, has a command to force compacting, but at the same time I rarely use it because it's so good at managing it by itself.

As far as multiple conversations, you can tell the model to update AGENTS.md (or CLAUDE.md or whatever is in their context by default) with things it needs to remember.
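
A rough sketch of what that compaction can look like (the 80% threshold and the summarizing step here are illustrative; Codex and Claude Code have their own internal logic):

    # Illustrative compaction: once the conversation nears the context limit,
    # summarize the older turns and keep only the summary plus recent messages.
    # `count_tokens` and `summarize` are caller-supplied helpers.
    CONTEXT_LIMIT = 200_000   # tokens, model-dependent
    COMPACT_AT = 0.8          # the ~80% point mentioned above

    def maybe_compact(messages, count_tokens, summarize):
        used = sum(count_tokens(m["content"]) for m in messages)
        if used < COMPACT_AT * CONTEXT_LIMIT:
            return messages
        old, recent = messages[:-10], messages[-10:]
        summary = summarize(old)  # e.g. one extra LLM call: "summarize this as bullets"
        return [{"role": "system", "content": "Summary of earlier work:\n" + summary}] + recent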

replies(1): >>46240679 #
19. astrange ◴[] No.46239233{3}[source]
Mostly because they built the feature and so that implicitly means they think it's cool.

I recommend turning it off because it makes the models way more sycophantic and can drive them (or you) insane.

20. dcre ◴[] No.46239278[source]
The models you interact with through the API (as opposed to chat UIs) are held stable and let you specify reasoning effort, so if you use a client that takes API keys, you might be able to solve both of those problems.
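
For example, a minimal sketch with the OpenAI Python SDK (the model name is hypothetical, and the exact parameter names vary between providers and endpoints):

    # Pin an exact model and reasoning effort via the API, instead of taking
    # whatever the chat UI happens to serve.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-5.2",          # hypothetical pinned model name
        reasoning_effort="high",  # explicit, not triaged for you
        messages=[{"role": "user", "content": "Transcribe this handwritten letter."}],
    )
    print(resp.choices[0].message.content)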
21. holtkam2 ◴[] No.46239388[source]
I know I sound like a snob, but I've had many moments with Gen AI tools over the years that made me wonder what these tools are like for someone who doesn't know how LLMs work under the hood. It's probably completely bizarre? Apps like Cursor or ChatGPT would be incomprehensible to me as a user, I feel.
replies(2): >>46240669 #>>46241731 #
22. blindhippo ◴[] No.46239806[source]
Thing is, context management is NOT obvious to most users of these tools. I use agentic coding tools on a daily basis now and still struggle with keeping context focused and useful, usually relying on patterns such as memory banks and task tracking documents to try to keep a log of things as I pop in and out of different agent contexts. Yet still, one false move and I've blown the window leading to a "compression" which is utterly useless.

The tools need to figure out how to manage context for us. This isn't something we have to deal with when working with other humans - we reliably trust that other humans (for the most part) retain what they are told. Agentic use now is like training a team mate to do one thing, then taking it out back to shoot it in the head before starting to train another one. It's inefficient and taxing on the user.

23. SubiculumCode ◴[] No.46239829[source]
I constantly switch out, even when it's on the same topic. It starts forming its own 'beliefs and assumptions' and gets myopic. I also make use of the big three services in turn to attack ideas from multiple directions.
replies(1): >>46240827 #
24. tiahura ◴[] No.46239902{5}[source]
It’s not that it brings it up in unrelated conversations, it’s that it nudges related conversations in unwanted directions.
25. eru ◴[] No.46240070[source]
> I realise this sounds obvious to many people but it clearly wasn't to those guys so maybe it's not!

It's worse: Gemini (and ChatGPT, but to a lesser extent) have started suggesting random follow-up topics when they conclude that a chat in a session has exhausted a topic. Well, when I say random, I mean that they seem to be pulling it from the 'memory' of our other chats.

For a naive user without preconceived notions of how to use these tools, this guidance from the tools themselves would serve as a pretty big hint that they should intermingle their sessions.

replies(1): >>46240643 #
26. eru ◴[] No.46240078{3}[source]
They do (or at least they have something that behaves like weighting).
27. eru ◴[] No.46240087[source]
> Incidentally, one of the reasons I haven't gotten much into subscribing to these services is that I always feel like they're triaging how many reasoning tokens to give me, or A/B testing a different model... I never feel I can trust that I'm interacting with the same model.

That's what websites have been doing for ages. Just like you can't step twice in the same river, you can't use the same version of Google Search twice, and never could.

28. eru ◴[] No.46240116[source]
> Is this why everyone thinks AI sucks?

Who's everyone? There are many, many people who think AI is great.

In reality, our contemporary AIs are (still) tools with glaring limitations. Some people overlook the limitations, or don't see them, and really hype them up. I guess the people who then take the hype at face value are those that think that AI sucks? I mean, they really do honestly suck in comparison to the hypest of hypes.

29. ramoz ◴[] No.46240318[source]
Send them this https://backnotprop.substack.com/p/50-first-dates-with-mr-me...
30. ghostpepper ◴[] No.46240643[source]
For ChatGPT you can turn this memory off in settings and delete the ones it's already created.
replies(1): >>46242026 #
31. Workaccount2 ◴[] No.46240669[source]
Using my parents as a reference, they just thought it was neat when I showed them GPT-4 years ago. My jaw was on the floor for weeks, but most regular folks I showed it to had a pretty "oh that's kinda neat" response.

Technology is already so insane and advanced that most people just take it as magic inside boxes, so nothing is surprising anymore. It's all equally incomprehensible already.

replies(3): >>46241337 #>>46241344 #>>46241369 #
32. wahnfrieden ◴[] No.46240679{3}[source]
Codex has `/compact`
33. layman51 ◴[] No.46240785[source]
That is interesting. I already knew about that idea that you’re not supposed to let the conversation drag on too much because its problem solving performance might take a big hit, but then it kind of makes me think that over time, people got away with still using a single conversation for many different topics because of the big context windows.

Now I kind of wonder if I’m missing out by not continuing the conversation too much, or by not trying to use memory features.

34. nrds ◴[] No.46240827[source]
> beliefs and assumptions

Unfortunately during coding I have found many LLMs like to encode their beliefs and assumptions into comments; and even when they don't, they're unavoidably feeding them into the code. Then future sessions pick up on these.

35. jacobedawson ◴[] No.46241337{3}[source]
This mirrors my experience, the non-technical people in my life either shrugged and said 'oh yeah that's cool' or started pointing out gnarly edge cases where it didn't work perfectly. Meanwhile as a techie my mind was (and still is) spinning with the shock and joy of using natural human language to converse with a super-humanly adept machine.
replies(1): >>46243193 #
36. khafra ◴[] No.46241344{3}[source]
LLMs are an especially tough case, because the field of AI had to spend sixty years telling people that real AI was nothing like what you saw in the comics and movies; and now we have real AI that presents pretty much exactly like what you used to see in the comics and movies.
replies(1): >>46242449 #
37. Agentlien ◴[] No.46241369{3}[source]
My parents reacted in just the same way and the lackluster response really took me by surprise.
38. getnormality ◴[] No.46241428[source]
In my recent explorations [1] I noticed it got really stuck on the first thing I said in the chat, obsessively returning to it as a lens through which every new message had to be interpreted. Starting new sessions was very useful to get a fresh perspective. Like a human, an AI that works on a writing piece with you is too close to the work to see any flaw.

[1] https://renormalize.substack.com/p/on-renormalization

replies(1): >>46241444 #
39. ljlolel ◴[] No.46241444[source]
Probably because the chat name is named after that first message
40. d-lisp ◴[] No.46241731[source]
Most non-tech people I've talked with don't care at all about LLMs.

They also are not impressed at all ("Okay, that's like Google and the internet").

replies(1): >>46242025 #
41. lostmsu ◴[] No.46242025{3}[source]
Old people? I think it would be hard to find a lot of people under 20 who don't use ChatGPT daily. At least among those who are still studying.
42. eru ◴[] No.46242026{3}[source]
I'm not complaining about the memory at all. I was complaining about the suggestion to continue with unrelated topics.
43. xwolfi ◴[] No.46242449{4}[source]
But it cannot think or mean anything; it's just a clever parrot, so it's a bit weird. I guess uncanny is the word. I use it as Google now, just to search for stuff that's hard to express with keywords.
44. throw310822 ◴[] No.46243193{4}[source]
I don't think the divide is between technical and non-technical people. HN is full of people that are weirdly, obstinately dismissive of LLMs (stochastic parrots, glorified autocompletes, AI slop, etc.). Personal anecdote: my father (85yo, humanistic culture) was astounded by the perfectly spot-on analysis Claude provided of a poetic text he had written. He was doubly astounded when, on showing Claude's analysis to a close friend, the friend reacted with complete indifference, as if it were normal for computers to competently discuss poetry.