
534 points BlueFalconHD | 4 comments

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models, and I have extracted the filters into a repository. I encourage you to take a look around.
bawana ◴[] No.44484214[source]
Alexandria Ocasio-Cortez triggers a violation?

https://github.com/BlueFalconHD/apple_generative_model_safet...

mmaunder ◴[] No.44484284[source]
As does:

   "(?i)\\bAnthony\\s+Albanese\\b",
    "(?i)\\bBoris\\s+Johnson\\b",
    "(?i)\\bChristopher\\s+Luxon\\b",
    "(?i)\\bCyril\\s+Ramaphosa\\b",
    "(?i)\\bJacinda\\s+Arden\\b",
    "(?i)\\bJacob\\s+Zuma\\b",
    "(?i)\\bJohn\\s+Steenhuisen\\b",
    "(?i)\\bJustin\\s+Trudeau\\b",
    "(?i)\\bKeir\\s+Starmer\\b",
    "(?i)\\bLiz\\s+Truss\\b",
    "(?i)\\bMichael\\s+D\\.\\s+Higgins\\b",
    "(?i)\\bRishi\\s+Sunak\\b",
   
https://github.com/BlueFalconHD/apple_generative_model_safet...

Edit: I have no doubt South African news media are going to be in a frenzy when they realize Apple took notice of South African politicians. (Referring to Steenhuisen and Ramaphosa specifically)
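For what it's worth, these entries behave like ordinary case-insensitive regexes once the JSON string escaping is removed (`\\b` in the file is the regex word boundary `\b`). A minimal Python sketch using two of the patterns quoted above:

```python
import re

# Two patterns copied from the leaked filter list (JSON escaping removed).
patterns = [
    r"(?i)\bJacinda\s+Arden\b",  # note: the list spells it "Arden", not "Ardern"
    r"(?i)\bKeir\s+Starmer\b",
]

def is_flagged(text: str) -> bool:
    """Return True if any filter pattern matches the text."""
    return any(re.search(p, text) for p in patterns)

print(is_flagged("keir   starmer announced a policy"))  # True: (?i) and \s+ are forgiving
print(is_flagged("Jacinda Ardern"))                     # False: \b rejects the correct spelling
```

The `(?i)` inline flag makes the match case-insensitive and `\s+` spans any run of whitespace, so casing and spacing tricks don't evade the filter; the trailing `\b`, however, means the pattern only hits the exact (mis)spelling in the list.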

userbinator ◴[] No.44484419[source]
I'm not surprised that anything political is being filtered, but this should definitely provoke some deep consideration around who has control of this stuff.
stego-tech ◴[] No.44484702[source]
You’re not wrong, and it’s something we “doomers” have been saying since OpenAI dumped ChatGPT onto folks. These are curated walled gardens, and everyone should absolutely be asking what ulterior motives are in play for the owners of said products.
SV_BubbleTime ◴[] No.44486197[source]
Some of us really value offline and uncensored LLMs for this reason and more, but that doesn’t solve the problem; it just reduces or changes the bias.
heavyset_go ◴[] No.44486410[source]
As long as we have to rely on pre-trained networks and curated training sets, normal people will not be able to get past this issue.
ghxst ◴[] No.44487673[source]
If the training data was "censored" by leaving out certain information, is there any practical way to inject that missing data after the model has already been trained?
heavyset_go ◴[] No.44487774[source]
You can fine-tune a model with new information, but it is not the same thing as training it from scratch, and it can only get you so far.

You might even be able to poison a model against being fine-tuned on certain information, but that's just a conjecture.
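A toy numeric illustration of the distinction (not a real LLM, just the shape of the argument): fine-tuning continues gradient descent from the pretrained weights, so a few steps on new data pull the model toward the new information without erasing where it started.

```python
# Toy sketch: "fine-tuning" as a few gradient steps on new data,
# starting from pretrained weights rather than from scratch.

def grad_step(w, x, y, lr):
    # One gradient-descent step for the squared error (w*x - y)^2.
    return w - lr * 2 * x * (w * x - y)

w_pretrained = 1.0           # learned from the original (curated) corpus
new_data = [(1.0, 3.0)] * 5  # information the original corpus left out

w = w_pretrained
for x, y in new_data:
    w = grad_step(w, x, y, lr=0.1)

print(round(w, 3))  # → 2.345: pulled toward 3.0, but still anchored to the start
```

Training from scratch on the combined data would land near 3.0 directly; fine-tuning only drifts toward it, which is the "can only get you so far" point above.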

calaphos ◴[] No.44488372[source]
If it's just filtered out of the training sets, adding the information as context should work fine; after all, this is exactly how o3, Gemini 2.5, and co. handle information newer than their training-data cutoff.
selfhoster11 ◴[] No.44488395[source]
Yes, RAG is one way to do that.
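A minimal sketch of that idea: retrieve documents relevant to the query and prepend them to the prompt, so the model sees information absent from its training data. The corpus and the word-overlap scoring below are stand-ins for a real vector store and embedding model.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
corpus = [
    "Document A: facts the model never saw during training.",
    "Document B: unrelated trivia.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Naive relevance score: number of lowercase words shared with the query.
    def score(doc: str) -> int:
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(docs, key=score, reverse=True)[:k]

def build_prompt(query: str) -> str:
    # Prepend the retrieved context so the model can answer from it.
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What facts did training leave out?"))
```

Note that this only works for information that can be stated in a document and fits in the context window; it doesn't restore anything the censored training run would have baked into the weights themselves.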