Most active commenters
  • 1vuio0pswjnm7(5)
  • Aurornis(4)
  • (3)

←back to thread

195 points meetpateltech | 27 comments | | HN request time: 2.204s | source | bottom
1. 1vuio0pswjnm7 ◴[] No.45901917[source]
"The New York Times is demanding that we turn over 20 million of your private ChatGPT conversations."

As might any plaintiff. NYT might be the first of many others and the lawsuits may not be limited to copyright claims

Why has OpenAI collected and stored 20 million conversations (including "deleted chats")

What is the purpose of OpenAI storing millions of private conversations

By contrast the purpose of NYT's request is both clear and limited

The documents requested are not being made public by the plaintiffs. The documents will presumably be redacted to protect any confidential information before being produced to the plaintiffs, the documents can only be used by the plaintiffs for the purpose of the litigation against OpenAI and, unlike OpenAI who has collected and stored these conversations for as long as OpenAI desires, the plaintiffs are prohibited from retaining copies of the documents after the litigation is concluded

The privacy issue here has been created by OpenAI for their own commercial benefit

It is not even clear what this benefit, if any, will be as OpenAI continues to search for a "business model"

Wanton data collection

replies(8): >>45902173 #>>45902539 #>>45903180 #>>45903873 #>>45904258 #>>45904285 #>>45905079 #>>45906333 #
2. silveraxe93 ◴[] No.45902173[source]
No it's not. It's literally a court order mandating them to collect this data.

- [1] https://arstechnica.com/tech-policy/2025/08/openai-offers-20...

replies(2): >>45902784 #>>45903718 #
3. ◴[] No.45902539[source]
4. sailfast ◴[] No.45902784[source]
This is an excellent article and source. Thank you.
5. 1vuio0pswjnm7 ◴[] No.45903180[source]
NB. There is no order to "collect". The order is to preserve what is already being collected and stored in the ordinary course of business

https://ia801404.us.archive.org/31/items/gov.uscourts.nysd.6...

https://ia801404.us.archive.org/31/items/gov.uscourts.nysd.6...

replies(2): >>45904188 #>>45904261 #
6. otterley ◴[] No.45903718[source]
This article says nothing of the sort. The court order is to preserve existing logs they already have, not to disable logging, and hand all the logs over the plaintiffs. OpenAI's objections are mainly that 1/there are too many logs (so they're proposing a sample instead) and that 2/there's identifying data in the logs and so they are being "forced" to anonymize the logs at their expense (even though it's what they want as a condition of transferring the logs).

There is nothing in the article that mentions OpenAI being forced to create new logs they don't already have.

replies(1): >>45904727 #
7. macki0 ◴[] No.45903873[source]
> What is the purpose of OpenAI storing millions of private conversations

Its needed for the conversation history feature, a core feature of the ChatGPT product

Its like saying "What is the purpose of Google Photos storing millions of private images"

replies(1): >>45904059 #
8. SilverElfin ◴[] No.45904059[source]
This is true but why retain deleted conversations?
replies(2): >>45904219 #>>45905605 #
9. ◴[] No.45904188[source]
10. Aurornis ◴[] No.45904219{3}[source]
That's the objection: The court order requires them to retain everything they currently have, even if the user requests that it be deleted.
11. 1vuio0pswjnm7 ◴[] No.45904258[source]
Is there a technical limitation that prevents chat histories from being stored locally on the user's computer instead of being stored on someone else's computer(s)

Why do chat histories need to be accessible by OpenAI, its service partners and anyone with the authority to request them from OpenAI

If users want this design, as suggested by HN commenters, if users want their chat histories to be accessible to OpenAI, its service providers and anyone with authority to request them from OpenAI, then wouldn't it also be true that these users are not much concerned with "privacy"

If so, then why would OpenAI proclaim they are "fighting the New York Times' invasion of user privacy", knowing that NYT is prohibited from making the logs public and users generally do not care much about "privacy" anyway

The restrictions on plaintiff NYT's use of the logs are greater than the restrictions, if any,^1 on OpenAI's use of them

1. If any such restrictions existed, for example if OpenAI stated "We don't do X" in a "privacy policy" and people interpreted this as a legally enforceable restriction,^2 how would a user verify that the statement was true, i.e., that OpenAI has not violated the "restriction". Silicon Valley companies like OpenAI are highly secretive

2. As opposed to a statement by OpenAi of what OpenAI allegedly does not do. Compare with a potentially legally-enforceable promise such as "OpenAI will not do X". Also consider that OpenAI may do Y, Z, etc. and make no mention of it to anyone. As it happens Silicon Valley companies generally have a reputation for dishonesty

replies(3): >>45904337 #>>45904342 #>>45904418 #
12. Aurornis ◴[] No.45904261[source]
The two documents you linked are responses to specific parts of OpenAI's objection. They're not good sources for the original order.

Nevertheless, you're generally correct but you don't realize why: A core feature of ChatGPT is that it keeps your conversation history right there so you can click on it, review it, and continue conversations across all of your devices. The court order is to preserve what is already present in the system even if the user asks to delete it.

For those who are confused: A core feature of ChatGPT and other LLM accounts is that your past conversations are available to return to, until you specifically delete them. The problem now is that if a user asks for the conversation to be deleted, OpenAI has to retain the conversation for the court order even though it appears deleted.

13. Aurornis ◴[] No.45904285[source]
> What is the purpose of OpenAI storing millions of private conversations

Your previous ChatGPT conversations show up right in the ChatGPT interface.

They have to store the private conversations to enable users to bring them up in the interface.

This isn't a secretive, hidden data collection. It's a clear and obvious feature right in the product. They're fighting for the ability to not retain secret records of past conversations that have been deleted.

The problem with the court order is that it requires them to keep the conversations even after a user presses the 'Delete' button on them.

14. Aurornis ◴[] No.45904337[source]
> Is there a technical limitation that prevents chat histories from being stored locally on the user's computer

People access ChatGPT through different interfaces: Web, desktop app, their phones, tablets.

Therefore the conversations are stored on the servers. It's really not some hidden plot against users to steal their data. It's just how most users expect their apps to work.

replies(1): >>45905875 #
15. aDyslecticCrow ◴[] No.45904342[source]
They're very valuable data, and it's convenient to log in to see a previous chat.

If you have ever played with the api, its clear as day that the protocol itself is stateless.

16. neodymiumphish ◴[] No.45904418[source]
Presumably for cross-device interactivity. If I interact with ChatGPT on my phone, then open it on my desktop. I might be a bit frustrated that I can't get to the chat I was having on my phone previously.

OpenAI could store the chat conversation in an encrypted format that only you, the user, can decrypt, with the client-side determining the amount of previous messages to include for additional context, but there's plenty of user overhead involved in an undertaking like that (likely a separate decryption password would be needed to ensure full user-exclusive access, etc).

I'd appreciate and use a feature like that, but I doubt most "average" users would care.

replies(1): >>45904740 #
17. silveraxe93 ◴[] No.45904727{3}[source]
Are you being intentionally dense? When did anyone say anything about forcing to create logs they don't have?

The court is ordering OpenAI to keep literally all conversations someone has with ChatGPT. It's stopping them from deleting these conversation even if the user requests it. This is not extra logging, it's literally _the_ product.

replies(2): >>45904817 #>>45905022 #
18. scotty79 ◴[] No.45904740{3}[source]
Facebook messenger tries to marry end to end encryption with multi-device access and it's a horrible mess with some messages not being delivered to some devices for hours , days or ever.

I absolutely want OpenAI to keep all of my chats and I absolutely don't want them to share them ( voluntarily or by force) with any private agent.

I have exactly the same expectation of any document or communication platform. It's been long established as accepted compomise between security and convenience.

19. ◴[] No.45904817{4}[source]
20. pclmulqdq ◴[] No.45905022{4}[source]
If OpenAI truly didn't keep conversation records for any length of time, they would not be subject to this kind of order. Lots of stateless services get these and are able to defeat them because they never store the user's data. The fact that they store them at all means that they are in scope for a preservation order. It also means that they are in scope for all manner of usage by OpenAI themselves even if a user requests deletion.
replies(1): >>45906397 #
21. cush ◴[] No.45905079[source]
>What is the purpose of OpenAI storing millions of private conversations

Have you used ChatGPT? Your conversation history is on the left rail

replies(1): >>45905837 #
22. kulahan ◴[] No.45905605{3}[source]
ChatGPT (the app) specifically says they keep deleted conversations for up to 30 days. That's probably why.
23. 1vuio0pswjnm7 ◴[] No.45905837[source]
"Have you used ChatGPT?"

No

Large number of upvotes on the quoted comment however. Maybe some of those voters are ChatGPT users

I do searching from the command line in text mode. The script I use keeps a "log" (a customised SERP) of all query strings and search result URLs. I also have these URLs stored in the logs from the forward proxy. These are compressed using RePair. I can search the compressed logs faster this way than with something like

    ztsd -dc log.zst|rg pattern
24. andrepd ◴[] No.45905875{3}[source]
Nonsense. It's easy to design an app where the server stores all information in an encrypted form. If OpenAI "cared about privacy" like this PR piece claims, they would do this. They don't because they (obviously) don't care and they (obviously) want the data for their purposes.
replies(1): >>45906070 #
25. epistasis ◴[] No.45906070{4}[source]
"Easy" does not mean "lowest cost" or "easiest". It's far far far easier to stor conversations as plain text and return them as is, instead of having to encrypt, rotate keys, etc. etc.

That's a tricky system to get right and maintain

(Please do not interpret this as a defense of OpenAI! I just think that we shouldn't trivialize the task of encrypting user data so that it's not visible to the provider).

26. 1vuio0pswjnm7 ◴[] No.45906333[source]
If an analogy to the history of search engines can be made, then we know that log retention policies in the US can change over time. The user has no control over such changes

https://ide.mit.edu/wp-content/uploads/2018/01/w23815.pdf

Companies operating popular www search engines might claim that the need for longer retention is "to provide better service" or some similar reason that focuses on users' interests rather than the company's interests^1

1. Generally, advertising services

This paper attempts to expose such claims as bogus

27. dghlsakjg ◴[] No.45906397{5}[source]
It seems as if the court has forced OpenAI into collecting logs that they weren't otherwise collecting.

So in this case not keeping logs as ordered by the court would be contempt of court.