586 points mizzao | 16 comments
1. 29athrowaway No.40666313
Uncensoring Llama 3 is a violation of the Llama 3 acceptable use policy.

https://llama.meta.com/llama3/use-policy/

> You agree you will not use, or allow others to use, Meta Llama 3 to: <list of bad things>...

That terminates your Llama 3 license, forcing you to delete all the "materials" from your system.

replies(4): >>40666327 #>>40666456 #>>40666503 #>>40667231 #
2. schoen No.40666327
Do you mean to say that teaching people how to do things should be regarded, for this purpose, as a form of allowing them to do those things?
replies(1): >>40666335 #
3. 29athrowaway No.40666335
The article clearly demonstrates how to circumvent the built-in protections in the model that prevent it from doing the things that violate the acceptable use policy, which are precisely the things that are against the public good.

There should be CVEs for AI.

replies(1): >>40666554 #
4. Y_Y No.40666456
> That terminates your Llama 3 license forcing you to delete all the "materials" from your system.

Or, it means you have broken a contract (of adhesion) formed by acquiring the weights from Meta. You can break contracts! Meta could take a civil case against you, but that's it. The AUP is a document; it's not going to force you to do anything. A court could potentially force you, but that's unlikely, even in the more unlikely event that anyone cares enough to find out what's happening and bring a case against you.

5. pixxel No.40666503
You’re on Hacker News. It’s a shadow of its former self, but still.
6. logicchains No.40666554{3}
Giving large, politicised software companies the sole power to determine what LLMs can and cannot say is against the public good.
replies(2): >>40666568 #>>40666973 #
7. 29athrowaway No.40666568{4}
Agreed. But uncensoring Llama 3 can do harm in the immediate term.

As much as I am not a fan of Meta, an uncensored Llama 3 in the wrong hands is a universally bad idea.

replies(3): >>40666747 #>>40666983 #>>40667248 #
8. nottorp No.40666747{5}
Universally, eh? Who decides what should be censored and what should not? You?
9. atwrk No.40666973{4}
LLMs, in this context, are nothing more than search indexes. The exact same information is a Google query away. Publicly crawlable information was the training material for them, after all.
replies(1): >>40667741 #
10. pantalaimon No.40666983{5}
> But uncensoring Llama 3 can do harm in the immediate term

How so?

11. irusensei No.40667231
I am of the opinion that terms of use for models trained on public (often stolen) content should be disregarded by the general public.
replies(1): >>40675620 #
12. wruza No.40667248{5}
Almost everything in the wrong hands is universally a bad idea. This phrase is just FUD and makes little sense.
13. LoganDark No.40667741{5}
LLMs aren't indexes. You can't query them. There's no way to know whether a piece of information exists within one, or how to access it.
replies(1): >>40667870 #
14. atwrk No.40667870{6}
I'm quite aware; hence "in this context", meaning the ability for users to query potentially questionable content, not the inner workings. I probably should have phrased it differently.
replies(1): >>40680175 #
15. 93po No.40675620
I find the concept of "stolen" text, which was originally crowd-sourced for free, to be a really tiresome argument. I don't exactly understand why anyone would defend, for example, Reddit's ownership of the content generated by its millions of users. I am glad my decade of shitposting on Reddit contributed to something other than Reddit's profits.
16. LoganDark No.40680175{7}
The danger of LLMs isn't really in their ability to parrot existing questionable content, but in their ability to generate novel questionable content. That's what's got everyone obsessed with safety.

- Generating new malware.

- Generating new propaganda or hate speech.

- Generating directions for something risky (that turn out to be wrong enough to get someone injured or killed).

But LLMs generate nearly everything they output. Even with greedy sampling, they do not always repeat the dataset verbatim, especially if they haven't seen the prompt verbatim. So you need to prevent them from engaging in entire classes of questionable topics if you want any hope of restricting those types of questionable content.
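
For what it's worth, here's a minimal sketch of greedy decoding, assuming the Hugging Face transformers library; the model name and prompt are only illustrative placeholders, not anything from the article:

    # Minimal sketch of greedy decoding (assumes the Hugging Face
    # "transformers" library; model name and prompt are placeholders).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "Write a short note about ladders."
    inputs = tokenizer(prompt, return_tensors="pt")

    # do_sample=False makes decoding greedy: always pick the single most
    # likely next token. The continuation is still composed token by
    # token, not retrieved verbatim from the training data.
    output = model.generate(**inputs, max_new_tokens=50, do_sample=False)
    print(tokenizer.decode(output[0], skip_special_tokens=True))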

It's not "we can't let this model get into the hands of adversaries, it's too powerful", as every LLM creator claims. It's "we can't let our model be the one adversaries are using", or in other words, "we can't let our reputation be ruined by our model powering something bad".

So, then, it's not "we can't let people get dangerous info from our model". It's "we can't let new dangerous info have come from our model". As an example, Google got so much shit for their LLM-powered dumpster fire telling people to put glue on pizza.