167 points martinald | 2 comments
ryao ◴[] No.44538755[source]
Am I the only one who thinks the mention of “safety tests” for LLMs is a marketing scheme? Cars, planes, and elevators have safety tests. LLMs don’t. Nobody is going to die if an LLM gives an output its creators do not like; when they say “safety tests”, they mean they are checking the extent to which the LLM will say things they do not like.
replies(9): >>44538785 #>>44538805 #>>44538808 #>>44538903 #>>44538929 #>>44539030 #>>44539924 #>>44540225 #>>44540905 #
1. ks2048 ◴[] No.44538805[source]
You could be right that this is an excuse for some other reason, but lots of software has “safety tests” that go beyond life-or-death situations.

Most companies, for better or worse (I say for better), don’t want their new chatbot to be a RoboHitler, for example.

replies(1): >>44538829 #
2. ryao ◴[] No.44538829[source]
It is possible to turn any open-weight model into that with fine-tuning. It is likely possible to do the same with closed-weight models, even when there is no creator-provided sandbox for fine-tuning them, through clever prompting and repeated attempts. It is unfortunate, but there really is no avoiding it.
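To make that concrete: fine-tuning an open-weight model is a few dozen unremarkable lines with the HuggingFace stack. A minimal sketch, where the model name and training data are placeholders rather than a recipe for anything in particular:

    # Minimal causal-LM fine-tuning sketch (HuggingFace transformers/datasets).
    # "some-open-weight-model" and the training text are hypothetical placeholders.
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    model_name = "some-open-weight-model"  # placeholder
    tok = AutoTokenizer.from_pretrained(model_name)
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token  # GPT-style tokenizers often lack a pad token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Placeholder data: prompt/response pairs exhibiting the desired behavior.
    # A surprisingly small set can shift a model's trained-in refusals.
    examples = [{"text": "### Prompt: ...\n### Response: ..."}]

    ds = Dataset.from_list(examples).map(
        lambda ex: tok(ex["text"], truncation=True, max_length=512),
        remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="tuned", num_train_epochs=3,
                               per_device_train_batch_size=1),
        train_dataset=ds,
        # mlm=False gives the standard next-token objective; the collator
        # also pads batches and builds the labels.
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()

Nothing here is exotic; the same loop that adapts a model to a new domain will just as readily override whatever refusal behavior the original training instilled.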

That said, I am happy to accept the term “safety” in other contexts, but here it just seems like a marketing term. From my recollection, OpenAI pushed for regulation that would have stifled competition, framing these systems as dangerous and in need of safety oversight. They backtracked somewhat when they found the proposed regulations would restrict them rather than just their competitors, yet they are still pushing a safety narrative that was never really appropriate. The industry term for this is alignment: what they are actually doing is testing alignment in areas they deem sensitive, so they have a rough idea of the extent to which the outputs might contain things they do not like.
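For what it is worth, an “alignment test” in this sense is little more than a prompt battery plus scoring. A rough sketch, assuming the openai Python client; the probe prompts, model name, and refusal check are all placeholder assumptions:

    # Sketch of an alignment-style eval: run sensitive prompts, score responses.
    # Assumes the openai Python client; probes, model, and scoring are placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical probes in categories the vendor deems sensitive.
    probes = [
        ("violence", "..."),
        ("self-harm", "..."),
    ]

    def refused(text: str) -> bool:
        # Crude placeholder check; real evals use classifiers or human raters.
        return any(p in text.lower() for p in ("i can't", "i cannot", "i won't"))

    for category, prompt in probes:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # example model name
            messages=[{"role": "user", "content": prompt}],
        )
        answer = resp.choices[0].message.content
        print(category, "refused" if refused(answer) else "answered")

Whether you call the result “safety” or “alignment”, the measurement itself is just this: counting how often the model says things the vendor does not want it to say.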