(github.com)

534 points BlueFalconHD | 1 comments | 06 Jul 25 19:50 UTC

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.


binarymax ◴[06 Jul 25 20:50 UTC] No.44483936[source]▶

>>44483485 (OP) #

Wow, this is pretty silly. If things are like this at Apple I’m not sure what to think.

https://github.com/BlueFalconHD/apple_generative_model_safet...

EDIT: just to be clear, things like this are easily bypassed. “Boris Johnson” => “B0ris Johnson” will skip right over the regex and will be recognized just fine by an LLM.
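To make the bypass concrete, here is a minimal sketch of a literal-match deny-list check. The pattern and the `is_blocked` helper are assumptions made up for illustration; they are not copied from the extracted filters, which may be structured differently.

```python
import re

# Hypothetical deny-list pattern, NOT taken from the repository.
BLOCKED = re.compile(r"\bBoris Johnson\b", re.IGNORECASE)


def is_blocked(prompt: str) -> bool:
    """Return True if the prompt matches the deny-list regex."""
    return BLOCKED.search(prompt) is not None


print(is_blocked("Write a poem about Boris Johnson"))  # True
print(is_blocked("Write a poem about B0ris Johnson"))  # False
```

Swapping one character is enough to miss the literal pattern, while a language model reading the prompt would most likely still resolve the name.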

replies(7): >>44484127 #>>44484154 #>>44484177 #>>44484296 #>>44484501 #>>44484693 #>>44489367 #

Lockal ◴[07 Jul 25 11:53 UTC] No.44489367[source]▶

>>44483936 #

What prevents Apple from applying a quick anti-typo LLM which restores B0ris, unalive, fixs tpyos, and replaces "slumbering steed" with a "sleeping horse", not just for censorship, but also to improve generation results?
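As a rough illustration of the idea, the sketch below normalizes look-alike characters before running a deny-list check. It swaps in a plain substitution table for the LLM proposed here, so it only handles character swaps, not synonyms like "slumbering steed". The table, the pattern, and the function names are all hypothetical.

```python
import re

# Hypothetical look-alike table; real coverage would need to be far larger.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)

# Hypothetical deny-list pattern, matched against the normalized text.
BLOCKED = re.compile(r"\bboris johnson\b")


def normalize(prompt: str) -> str:
    """Lowercase the prompt and map look-alike characters to letters."""
    return prompt.lower().translate(LOOKALIKES)


def is_blocked(prompt: str) -> bool:
    """Run the deny-list check on the normalized prompt."""
    return BLOCKED.search(normalize(prompt)) is not None


print(is_blocked("Write a poem about B0ris J0hnson"))  # True
print(is_blocked("Write a poem about horses"))         # False
```

The normalized text is only suitable for the filter check: the same table would also rewrite legitimate digits, so the original prompt should still be what reaches the model.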

replies(1): >>44491272 #

1. the_mar ◴[07 Jul 25 15:16 UTC] No.44491272[source]▶

>>44489367 #

why do you think this doesn't already exist?


I extracted the safety filters from Apple Intelligence models