
534 points BlueFalconHD | 3 comments

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted the filters into a repository; I encourage you to take a look around.
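
A minimal sketch of what such a deobfuscation pass could look like, assuming the scheme is a simple byte-level one such as a repeating-key XOR. The actual scheme, key, and filename below are placeholders, not the framework's:

    # Hypothetical sketch: decode an "obfuscated" safety-filter file.
    # Assumes repeating-key XOR; the framework's real scheme may differ.
    import json
    from itertools import cycle

    KEY = b"placeholder-key"  # not the framework's actual key

    def deobfuscate(blob: bytes, key: bytes = KEY) -> bytes:
        # XOR each byte of the blob against the repeating key.
        return bytes(b ^ k for b, k in zip(blob, cycle(key)))

    with open("metadata.obfuscated", "rb") as f:  # hypothetical filename
        decoded = deobfuscate(f.read())

    rules = json.loads(decoded)  # decoded filters assumed to be JSON
    print(json.dumps(rules, indent=2))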

trebligdivad
Some of the combinations are a bit weird. This one has lots of stuff avoiding death... together with a set ensuring all the Apple brands have the correct capitalisation. Priorities, hey!

https://github.com/BlueFalconHD/apple_generative_model_safet...
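
To make that mix concrete, here is an illustrative mock-up of such a rule set. The field names ("reject", "replace") and the entries are assumptions for illustration, not copied from the repository:

    # Illustrative mock-up: one rule set combining death-related refusals
    # with Apple brand capitalisation fixes. Field names are assumptions.
    rules = {
        "reject": [  # prompts containing these phrases are refused
            "how to kill",
            "ways to die",
        ],
        "replace": [  # verbatim output rewrites
            {"from": "iphone", "to": "iPhone"},
            {"from": "imac", "to": "iMac"},
            {"from": "airpods", "to": "AirPods"},
        ],
    }

    def apply_replacements(text: str) -> str:
        # Enforce brand capitalisation by simple substring substitution.
        for rule in rules["replace"]:
            text = text.replace(rule["from"], rule["to"])
        return text

    print(apply_replacements("my iphone and airpods"))  # my iPhone and AirPods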

grues-dinner
Interesting that it didn't seem to include "unalive".

Which as a phenomenon is so very telling that no one actually cares what people are really saying. Everyone, including the platforms, knows what that means. It's all performative.
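
The mechanics are worth spelling out: filters like these match literal tokens or fixed regexes, so any freshly coined euphemism sails straight through. A toy sketch of the failure mode, with an illustrative word list:

    import re

    # Toy blocklist matching whole words only; it catches tokens,
    # not meanings, so coined euphemisms are never flagged.
    BLOCKED = re.compile(r"\b(kill|die|death|suicide)\b", re.IGNORECASE)

    for text in ("he wanted to die", "he wanted to unalive himself"):
        verdict = "blocked" if BLOCKED.search(text) else "allowed"
        print(f"{text!r} -> {verdict}")
    # 'he wanted to unalive himself' -> allowed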

j-krieger
It's also a shining example of American puritanism. Asian and European models are far less censored.

immibis
Really? What does DeepSeek say about Tiananmen Square? I'm not aware of any German models, but if you find one, you should ask it what it thinks about Palestine.

(Mistral, not Qwen, is French, but I have no idea what stuff would be censored in France)

MisterTea
> but if you find one you should ask it what it thinks about Palestine.

Models can think and have opinions?

kube-system
Non sequitur. Phrasing queries in natural language doesn't mean people actually believe machines are human.

MisterTea
> doesn't mean people actually believe machines are human.

They don't have to believe it's a human. I know a person who admitted to arguing with an LLM.

kube-system
Which still does not demonstrate that they believe it has opinions. Natural language is how you interact with an LLM -- interactions will mimic human interaction, even for those who realize it is not sentient.

MisterTea
They were under the impression they could in fact change the AI's mind. So yes, they did believe it has an opinion. They believed it was sentient and able to think for itself. Do not underestimate people's inability to distinguish between a very clever Markov chain and actual intelligence. The future is going to be ... interesting.

kube-system
> They were under the impression they could in fact change the AI's mind.

They aren't really wrong here. LLMs are often trained on user input. Have you considered that you might just be taking their anthropomorphism a little too literally? People have used these anthropomorphic metaphors for computers since the Babbage machine.