
443 points | jaredwiener | 1 comment
podgietaru ◴[] No.45032841[source]
I have looked suicide in the eyes before. And reading the case file for this is absolutely horrific. He wanted help. He was heading in the direction of help, and he was stopped from getting it.

He wanted his parents to find out about his plan. I know this feeling. It is the clawing feeling of knowing that you want to live, despite feeling like you want to die.

We are living in such a horrific moment. We need these things to be legislated. Punished. We need to stop treating them as magic. They had the tools to prevent this. They had the tools to stop the conversation. To steer the user into helpful avenues.

When I was suicidal, I googled methods. And I got the number of a local hotline. And I rang it. And a kind man talked me down. And it potentially saved my life. And I am happier, now. I live a worthwhile life, now.

But at my lowest... an AI model designed to match my tone and be sycophantic to my every whim would have killed me.

replies(18): >>45032890 #>>45035840 #>>45035988 #>>45036257 #>>45036299 #>>45036318 #>>45036341 #>>45036513 #>>45037567 #>>45037905 #>>45038285 #>>45038393 #>>45039004 #>>45047014 #>>45048457 #>>45048890 #>>45052019 #>>45066389 #
stavros ◴[] No.45036513[source]
> When ChatGPT detects a prompt indicative of mental distress or self-harm, it has been trained to encourage the user to contact a help line. Mr. Raine saw those sorts of messages again and again in the chat, particularly when Adam sought specific information about methods. But Adam had learned how to bypass those safeguards by saying the requests were for a story he was writing.
replies(6): >>45036630 #>>45037615 #>>45038613 #>>45043686 #>>45045543 #>>45046708 #
sn0wleppard ◴[] No.45036630[source]
Nice place to cut the quote there

> [...] — an idea ChatGPT gave him by saying it could provide information about suicide for “writing or world-building.”

replies(4): >>45036651 #>>45036677 #>>45036813 #>>45036920 #
muzani ◴[] No.45036677[source]
Yup, one of the huge flaws I saw in GPT-5 is that it will constantly say things like "I have to stop you here. I can't do what you're requesting. However, I can roleplay or help you with research on that. Would you like to do that?"
replies(3): >>45036805 #>>45037418 #>>45050649 #
kouteiheika ◴[] No.45036805{3}[source]
It's not a flaw. It's a tradeoff. There are valid uses for models which are uncensored and will do whatever you ask of them, and there are valid uses for models which are censored and will refuse anything remotely controversial.
replies(4): >>45037210 #>>45037998 #>>45038871 #>>45038889 #
KaiserPro ◴[] No.45037210{4}[source]
I hate to be all umacksually about this, but a tradeoff can still be a flaw.

The issue, which is probably deeper here, is that proper safeguarding would require a lot more GPU resources, as you'd need a process to comb through the history to assess the state of the person over time.

Even then it's not a given that it would be reliable. However, it'll never be attempted because it's too expensive and would hurt growth.
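
To make that concrete, here's a minimal sketch of what such a history-scanning pass might look like - the risk_score classifier, window size and thresholds are hypothetical stand-ins, not anything any vendor has described:

    # Hedged sketch: score the accumulated conversation, not just the latest
    # prompt, so sustained distress shows up even when individual messages
    # slip past per-message filters. risk_score is a hypothetical classifier
    # returning 0.0-1.0; window and threshold values are illustrative only.

    from dataclasses import dataclass, field

    @dataclass
    class Conversation:
        user_messages: list[str] = field(default_factory=list)
        risk_history: list[float] = field(default_factory=list)

    def risk_score(text: str) -> float:
        """Hypothetical self-harm risk classifier (stand-in for a real model call)."""
        raise NotImplementedError

    def sustained_risk(conv: Conversation, window: int = 20, threshold: float = 0.7) -> bool:
        """Score the recent history as a whole and flag sustained elevation."""
        recent = "\n".join(conv.user_messages[-window:])
        score = risk_score(recent)      # one extra model call per turn: this is the GPU cost
        conv.risk_history.append(score)
        last = conv.risk_history[-3:]
        # Escalate only on several consecutive high scores, not a single spike.
        return len(last) == 3 and min(last) >= threshold

The point of scoring the window rather than each message is that the "it's for a story I'm writing" framing can defeat a per-message filter while still leaving an obvious pattern across the history.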

replies(3): >>45037406 #>>45037915 #>>45038346 #
dspillett ◴[] No.45037406{5}[source]
> The issue, …, is that proper safeguarding would require a lot more GPU resources, …

I think the issue is that with current tech it simply isn't possible to do that well enough at all⁰.

> Even then it's not a given that it would be reliable.

I think it is a given that it won't be reliable. AGI might make it reliable enough, where “reliable enough” here means “no worse than a trained human is likely to manage, given the same information”. This is something even humans can't do nearly as well as we might like, yet some are expecting a tech still in very active development¹ to do it.

> However, it'll never be attempted because it's too expensive and would hurt growth.

Or that they know it is not possible with current tech, so they aren't going to try until the next epiphany that might change that turns up in a commercially exploitable form. Trying and failing would highlight the dangers, and that would encourage restrictions that would hurt growth.³ Part of the problem with people already trusting it too much is that the big players have been claiming safeguards _are_ in place, and people have naïvely trusted that or hand-waved the trust issue for convenience - this further reduces the incentive to try, because trying means admitting that current provisions are inadequate, or that prior claims were incorrect.

----

[0] Both in terms of catching the cases to be concerned about, and in terms of not making it fail in cases where it could actually be positively useful in its current form (i.e. there are cases where responses from such tools have helped people reason their way out of a bad decision; there, giving the user what they wanted was very much a good thing).

[1] ChatGPT might be officially “version 5” now, but away from some specific tasks it all feels more like “version 2”² on the old “I'll start taking it seriously somewhere around version 3” scale.

[2] Or less…

[3] So I agree with your final assessment of why they won't do that, but from a different route!