That's not even considering tool use!
AI 'safety' is one of the most neurotic Twitter-era nanny-state notions in existence, blatantly invented to regulate small competitors out of existence.
AI safety is about being proactive. For example: if an AI model is used to screen hiring applications, making sure it doesn't carry any weighted racial biases before it's deployed.
The difference is that it's not reactive. Reading a book with a racial bias would be the inverse: you'd be reacting to that information after encountering it.
That's the basis of proper AI safety in a nutshell.
Luckily, this is something that can be studied and has been. Sticking a stereotypically Black name on a resume on average substantially decreases the likelihood that the applicant will get past a resume screen, compared to the same resume with a generic or stereotypically White name:
https://www.npr.org/2024/04/11/1243713272/resume-bias-study-...
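The audit methodology behind studies like that one is simple to sketch: submit otherwise-identical resumes that differ only in the applicant's name, and compare pass rates between name groups. Here's a minimal illustration in Python; `toy_screen` is a deliberately biased stand-in scorer invented for this example (a real audit would call the actual screening model), and the names follow the convention of the well-known Bertrand/Mullainathan resume study:

```python
def paired_audit(screen, resume, names_a, names_b):
    """Run the same resume through a screener under two name groups
    and return the pass rate for each group."""
    rate = lambda names: sum(screen(resume, n) for n in names) / len(names)
    return rate(names_a), rate(names_b)

# Toy screener, deliberately biased so the audit has something to detect.
# This is purely illustrative -- not any real model's behavior.
FLAGGED = {"Lakisha", "Jamal"}

def toy_screen(resume, name):
    return name not in FLAGGED  # passes everyone except the flagged names

resume = "10 years experience, BSc CS"
rate_a, rate_b = paired_audit(toy_screen, resume,
                              ["Emily", "Greg"],
                              ["Lakisha", "Jamal"])
print(rate_a, rate_b)  # prints 1.0 0.0 -- the gap is the bias signal
```

Because the resumes are identical apart from the name, any gap in pass rates is attributable to the name alone, which is what makes this design proactive: you can run it on a model before anyone's application is ever screened by it.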