
171 points martinald | 5 comments
ryao ◴[] No.44538755[source]
Am I the only one who thinks the mention of “safety tests” for LLMs is a marketing scheme? Cars, planes, and elevators have safety tests. LLMs don’t. Nobody is going to die if an LLM gives an output that its creators do not like; when they say “safety tests”, they mean they are checking to what extent the LLM will say things they do not like.
replies(9): >>44538785 #>>44538805 #>>44538808 #>>44538903 #>>44538929 #>>44539030 #>>44539924 #>>44540225 #>>44540905 #
natrius ◴[] No.44538808[source]
An LLM can trivially instruct someone to take medications with adverse interactions, steer a mental health crisis toward suicide, or make a compelling case that a particular ethnic group is the cause of your society's biggest problem so they should be eliminated. Words can't kill people, but words can definitely lead to deaths.

That's not even considering tool use!

replies(9): >>44538847 #>>44538877 #>>44538896 #>>44538914 #>>44539109 #>>44539685 #>>44539785 #>>44539805 #>>44540111 #
1. thayne ◴[] No.44539109[source]
Part of the problem is due to the marketing of LLMs as more capable and trustworthy than they really are.

And the safety testing actually makes this worse, because it leads people to trust that LLMs are less likely to give dangerous advice, when they could still do so.

replies(2): >>44540964 #>>44541795 #
2. jdross ◴[] No.44540964[source]
Spend 15 minutes talking to someone in their 20s about how they use ChatGPT to work through issues in their personal life and you'll see how much they already trust the "advice" and other information produced by LLMs.

Manipulation is a genuine concern!

replies(1): >>44541158 #
3. justacrow ◴[] No.44541158[source]
It's not just young people. My boss (originally a programmer) agreed with me that there are lots of problems with using ChatGPT for our products and programs, as it gives wrong answers too often, but then 30 seconds later told me that it was apparently great at giving medical advice.

...later, someone higher up decided that it's actually great at programming as well, and so now we all believe it's incredibly useful and necessary for us to do our daily work.

replies(1): >>44541570 #
4. literalAardvark ◴[] No.44541570{3}[source]
Most doctors will prescribe antibiotics for viral infections just to get you out and the next patient in; they have zero interest in sitting there troubleshooting with you.

For this reason, o3 is way better than most of the doctors I've had access to, to the point where my PCP just writes whatever I bring in because she can't follow 3/4 of it.

Yes, the answers are often wrong and incomplete, and it's up to you to guide the model to sort it out, but it's just like vibe coding: if you put in the steering effort, you can get a decent output.

Would it be better if you could hire an actual professional to do it? Of course. But most of us are priced out of that level of care.

5. brookst ◴[] No.44541795[source]
Can you point to a specific bit of marketing that says to take whatever medications an LLM suggests, or any similar overreach?

People keep talking about this “marketing”, and I have yet to see a single example.