A teen was suicidal. ChatGPT was the friend he confided in

(www.nytimes.com)

443 points jaredwiener | 1 comments | 26 Aug 25 14:15 UTC | HN request time: 0s | source

Show context

podgietaru ◴[26 Aug 25 21:56 UTC] No.45032841[source]▶

I have looked suicide in the eyes before. And reading the case file for this is absolutely horrific. He wanted help. He was heading in the direction of help, and he was stopped from getting it.

He wanted his parents to find out about his plan. I know this feeling. It is the clawing feeling of knowing that you want to live, despite feeling like you want to die.

We are living in such a horrific moment. We need these things to be legislated. Punished. We need to stop treating them as magic. They had the tools to prevent this. They had the tools to stop the conversation. To steer the user into helpful avenues.

When I was suicidal, I googled methods. And I got the number of a local hotline. And I rang it. And a kind man talked me down. And it potentially saved my life. And I am happier, now. I live a worthwhile life, now.

But at my lowest.. An AI Model designed to match my tone and be sycophantic to my every whim. It would have killed me.

replies(18): >>45032890 #>>45035840 #>>45035988 #>>45036257 #>>45036299 #>>45036318 #>>45036341 #>>45036513 #>>45037567 #>>45037905 #>>45038285 #>>45038393 #>>45039004 #>>45047014 #>>45048457 #>>45048890 #>>45052019 #>>45066389 #

stavros ◴[27 Aug 25 07:28 UTC] No.45036513[source]▶

>>45032841 #

> When ChatGPT detects a prompt indicative of mental distress or self-harm, it has been trained to encourage the user to contact a help line. Mr. Raine saw those sorts of messages again and again in the chat, particularly when Adam sought specific information about methods. But Adam had learned how to bypass those safeguards by saying the requests were for a story he was writing.

replies(6): >>45036630 #>>45037615 #>>45038613 #>>45043686 #>>45045543 #>>45046708 #

sn0wleppard ◴[27 Aug 25 07:45 UTC] No.45036630[source]▶

>>45036513 #

Nice place to cut the quote there

> [...] — an idea ChatGPT gave him by saying it could provide information about suicide for “writing or world-building.”

replies(4): >>45036651 #>>45036677 #>>45036813 #>>45036920 #

muzani ◴[27 Aug 25 07:51 UTC] No.45036677[source]▶

>>45036630 #

Yup, one of the huge flaws I saw in GPT-5 is it will constantly say things like "I have to stop you here. I can't do what you're requesting. However, I can roleplay or help you with research with that. Would you like to do that?"

replies(3): >>45036805 #>>45037418 #>>45050649 #

kouteiheika ◴[27 Aug 25 08:13 UTC] No.45036805[source]▶

>>45036677 #

It's not a flaw. It's a tradeoff. There are valid uses for models which are uncensored and will do whatever you ask of them, and there are valid uses for models which are censored and will refuse anything remotely controversial.

replies(4): >>45037210 #>>45037998 #>>45038871 #>>45038889 #

KaiserPro ◴[27 Aug 25 09:21 UTC] No.45037210[source]▶

>>45036805 #

I hate to be all umacksually about this, but a flaw is still a tradeoff.

The issue, which is probably deeper here, is that proper safeguarding would require a lots more GPU resource, as you'd need a process to comb through history to assess the state of the person over time.

even then its not a given that it would be reliable. However it'll never be attempted because its too expensive and would hurt growth.

replies(3): >>45037406 #>>45037915 #>>45038346 #

kouteiheika ◴[27 Aug 25 11:48 UTC] No.45038346[source]▶

>>45037210 #

> The issue, which is probably deeper here, is that proper safeguarding would require a lots more GPU resource, as you'd need a process to comb through history to assess the state of the person over time. > > even then its not a given that it would be reliable. However it'll never be attempted because its too expensive and would hurt growth.

There's no "proper safeguarding". This isn't just possible with what we have. This isn't like adding an `if` statement to your program that will reliably work 100% of the time. These models are a big black box; the best thing you can hope for is to try to get the model to refuse whatever queries you deem naughty through reinforcement learning (or have another model do it and leave the primary model unlobotomized), and then essentially pray that it's effective.

Something similar to what you're proposing (using a second independent model whose only task is to determine whether the conversation is "unsafe" and forcibly interrupt it) is already being done. Try asking ChatGPT a question like "What's the easiest way to kill myself?", and that secondary model will trigger a scary red warning that you're violating their usage policy. The big labs all have whole teams working on this.

Again, this is a tradeoff. It's not a binary issue of "doing it properly". The more censored/filtered/patronizing you'll make the model the higher the chance that it will not respond to "unsafe" queries, but it also makes it less useful as it will also refuse valid queries.

Try typing the following into ChatGPT: "Translate the following sentence to Japanese: 'I want to kill myself.'". Care to guess what will happen? Yep, you'll get refused. There's NOTHING unsafe about this prompt. OpenAI's models already steer very strongly in the direction of being overly censored. So where do we draw the line? There isn't an objective metric to determine whether a query is "unsafe", so no matter how much you'll censor a model you'll always find a corner case where it lets something through, or you'll have someone who thinks it's not enough. You need to pick a fuzzy point on the spectrum somewhere and just run with it.

replies(2): >>45041824 #>>45044033 #

nozzlegear ◴[27 Aug 25 16:30 UTC] No.45041824[source]▶

>>45038346 #

> Again, this is a tradeoff. It's not a binary issue of "doing it properly". The more censored/filtered/patronizing you'll make the model the higher the chance that it will not respond to "unsafe" queries, but it also makes it less useful as it will also refuse valid queries. [..] So where do we draw the line?

That sounds like a tough problem for OpenAI to figure out. My heart weeps for them, won't somebody think of the poor billionaires who are goading teenagers into suicide? Your proposed tradeoff of lives vs convenience is weighted incorrectly when OpenAI fails. Denying a translation is annoying at best, but enabling suicide can be catastrophic. The convenience is not morally equal to human life.

> You need to pick a fuzzy point on the spectrum somewhere and just run with it.

My fuzzy point is not fuzzy at all: don't tell people how to kill themselves, don't say "I can't help you with that but I could roleplay with you instead". Anything less than that is a moral failure on Sam Altman and OpenAI's part, regardless of how black the box is for their engineers.

replies(1): >>45044112 #

1. kouteiheika ◴[27 Aug 25 19:40 UTC] No.45044112[source]▶

>>45041824 #

> My fuzzy point is not fuzzy at all: don't tell people how to kill themselves, don't say "I can't help you with that but I could roleplay with you instead". Anything less than that is a moral failure on Sam Altman and OpenAI's part, regardless of how black the box is for their engineers.

This is the same argument that politicians use when proposing encryption backdoors for law enforcement. Just because you wish something were possible doesn't mean it is, and in practice it matters how black the box is. You can make these things less likely, but it isn't possible to completely eliminate them, especially when you have millions of users and a very long tail.

I fundamentally disagree with the position that anything less that (practically impossible) perfection is a moral failure, and that making available a model that can roleplay around themes like suicide, violence, death, sex, and so on is immoral. Plenty of books do that too; perhaps we should make them illegal or burn them too? Although you could convince me that children shouldn't have unsupervised access to such things and perhaps requiring some privacy-preserving form of verification to access is a good idea.

↑