
534 points BlueFalconHD | 1 comment

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.
trebligdivad ◴[] No.44483981[source]
Some of the combinations are a bit weird. This one has lots of stuff avoiding death... together with a set ensuring all the Apple brands have the correct capitalisation. Priorities, hey!

https://github.com/BlueFalconHD/apple_generative_model_safet...

replies(11): >>44483999 #>>44484073 #>>44484095 #>>44484410 #>>44484636 #>>44486072 #>>44487916 #>>44488185 #>>44488279 #>>44488362 #>>44488856 #
grues-dinner ◴[] No.44484073[source]
Interesting that it didn't seem to include "unalive".

Which, as a phenomenon, is very telling: no one actually cares what people are really saying. Everyone, including the platforms, knows what that means. It's all performative.

replies(11): >>44484164 #>>44484360 #>>44484635 #>>44484665 #>>44485033 #>>44485034 #>>44486246 #>>44487244 #>>44488055 #>>44488114 #>>44500918 #
j-krieger ◴[] No.44488114[source]
It's also a shining example of American puritanism. Asian models or those in Europe are far less censored.
replies(6): >>44488741 #>>44488993 #>>44489194 #>>44489626 #>>44489822 #>>44491464 #
immibis ◴[] No.44489194[source]
Really? What does DeepSeek say about Tiananmen Square? I'm not aware of any German models, but if you find one you should ask it what it thinks about Palestine.

(<s>Qwen</s> Mistral is French, but I have no idea what stuff would be censored in France)

replies(6): >>44489393 #>>44489618 #>>44489714 #>>44490860 #>>44491679 #>>44494195 #
GuB-42 ◴[] No.44494195[source]
> I have no idea what stuff would be censored in France

Being French, I'd say what is most likely to be censored relates to the Nazis. Holocaust denial is a crime, for instance. Hate speech in general, including racism, antisemitism, homophobia, sexism, etc., is less tolerated than in countries like the US that have a more "free for all" view of free speech. We also have strong anti-defamation laws, which can also apply to true but misleading statements.

But other than that, there is not much political censorship. In fact, we are known for our protests, heated debates and satirical papers. It is not perfect, but off the top of my head, I can't think of anything in particular an LLM could censor except the usual "hate speech" that most LLMs censor already.

When it comes to Israel-Palestine, it is a hot topic, but there is no real censorship here, even though both sides will of course claim they are being censored.

replies(2): >>44495564 #>>44498915 #