536 points BlueFalconHD | 13 comments

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.
trebligdivad ◴[] No.44483981[source]
Some of the combinations are a bit weird. This one has lots of stuff avoiding death... together with a set ensuring all the Apple brands have the correct capitalisation. Priorities, hey!

https://github.com/BlueFalconHD/apple_generative_model_safet...

grues-dinner ◴[] No.44484073[source]
Interesting that it didn't seem to include "unalive".

Which, as a phenomenon, is so very telling that no one actually cares what people are really saying. Everyone, including the platforms, knows what that means. It's all performative.

qingcharles ◴[] No.44484164[source]
It's totally performative. There's no way to stay ahead of the new language that people create.

At what point do the new words become the actual words? Are there many instances of people using unalive IRL?

1. fouronnes3 ◴[] No.44484218[source]
This question is sort of the same as asking why the universal translator wasn't able to translate the metaphor language of the Star Trek episode Darmok. Surely if the metaphor has become the first-order meaning then there's no literal meaning anymore.
2. qingcharles ◴[] No.44484280[source]
I guess, so far, the people inventing the words have left the meaning clear with things like "un-alive" which is readable even to someone coming across it for the first time.

Your point stands when we start replacing banned words like "suicide" with things like "donkeyrhubarb", and then the walls really will fall.

3. userbinator ◴[] No.44484518[source]
This form of obfuscation actually already occurred over a century ago: https://en.wikipedia.org/wiki/Cockney_rhyming_slang
4. mananaysiempre ◴[] No.44484885[source]
Aquatic product[1]?

[1] https://en.wikipedia.org/wiki/Euphemisms_for_Internet_censor...

5. immibis ◴[] No.44484935{3}[source]
An English equivalent is "sewer slide".
6. tjwebbnorfolk ◴[] No.44485002[source]
The only reason kids started using "unalive" is to get around YouTube filters that disallow the use of the word "kill".
7. t-3 ◴[] No.44485043{3}[source]
Rhyming slang rhymes tho. The recipient can understand what's meant by de-obfuscating in-context. Random strings substituted for $proscribed_word don't work in the same way.
8. waterproof ◴[] No.44485127{4}[source]
In Cockney rhyming slang, the rhyming word (which would be easy to reverse engineer) is omitted. So "stairs" is rhyme-paired with "apples and pears", and then people just use the word "apples" in place of "stairs". "Pears" is omitted in common use, so you can't just reverse the rhyme.

The example photo on Wikipedia includes the rhyming words but that's not how it would be used IRL.

9. marcus_holmes ◴[] No.44485791[source]
I've heard "pr0n" used in actual real-world conversation, only slightly ironically.
10. zimpenfish ◴[] No.44487942{3}[source]
See also Polari[0] and the Grass Mud Horse Lexicon[1]

[0] https://en.wikipedia.org/wiki/Polari

[1] https://languagelog.ldc.upenn.edu/nll/?p=6538 (CDT links broken, use [2])

[2] https://chinadigitaltimes.net/space/Grass-Mud_Horse_Lexicon_...

11. mattigames ◴[] No.44489056[source]
Pretty sure TikTok filters do the same and were also a major influence on the use of that term.
12. qingcharles ◴[] No.44496850{3}[source]
Shaka!
13. qingcharles ◴[] No.44496872{3}[source]
They do. I made a joke about cocaine in old Coca-Cola in a text caption† on a video, and while TikTok didn't ban the post per se it refused to allow it on the FYP.

† proving that TikTok's system actually analyzes every frame of an uploaded video with OCR of some sort to see what's on there.