534 points BlueFalconHD | 3 comments

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted the filters into a repository. I encourage you to take a look around.
1. bombcar No.44483830
There’s got to be a way to turn these lists of “naughty words” into shibboleths somehow.
replies(2): >>44484345 >>44485000
2. spydum No.44484345
Love the idea, but I think there are simply too many models to make it practical?
3. immibis No.44485000
Like asking sensitive employment candidates about Kim Jong Un's roundness to check if they're North Korean spies, we could ask humans what they think about Trump and Palestine to check if they're computers.

However, I think about half of real humans would also fail the test.