534 points BlueFalconHD | 7 comments

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.
trebligdivad ◴[] No.44483981[source]
Some of the combinations are a bit weird. This one has lots of stuff avoiding death... together with a set ensuring all the Apple brands have the correct capitalisation. Priorities, hey!

https://github.com/BlueFalconHD/apple_generative_model_safet...

replies(11): >>44483999 #>>44484073 #>>44484095 #>>44484410 #>>44484636 #>>44486072 #>>44487916 #>>44488185 #>>44488279 #>>44488362 #>>44488856 #
grues-dinner ◴[] No.44484073[source]
Interesting that it didn't seem to include "unalive".

Which, as a phenomenon, is so very telling that no one actually cares what people are really saying. Everyone, including the platforms, knows what that means. It's all performative.

replies(11): >>44484164 #>>44484360 #>>44484635 #>>44484665 #>>44485033 #>>44485034 #>>44486246 #>>44487244 #>>44488055 #>>44488114 #>>44500918 #
qingcharles ◴[] No.44484164[source]
It's totally performative. There's no way to stay ahead of the new language that people create.

At what point do the new words become the actual words? Are there many instances of people using unalive IRL?

replies(17): >>44484171 #>>44484218 #>>44484614 #>>44484958 #>>44484970 #>>44484989 #>>44485202 #>>44485277 #>>44485309 #>>44486128 #>>44486394 #>>44487625 #>>44487839 #>>44487936 #>>44488097 #>>44488704 #>>44493436 #
Terr_ ◴[] No.44484970[source]
> There's no way to stay ahead of the new language that people create.

I'm imagining a new exploit: After someone says something totally innocent, people gang up in the comments to act like a terrible vicious slur has been said, and then the moderation system (with an LLM involved somewhere) "learns" that an arbitrary term is heinous and indirectly bans any discussion of that topic.

replies(5): >>44485038 #>>44485110 #>>44485356 #>>44486827 #>>44486843 #
grues-dinner ◴[] No.44486843[source]
The first half of that already happened with the OK gesture: https://www.bbc.co.uk/news/newsbeat-49837898.

Though it would be fun to see what happens if an LLM is used to ban anything that tends to generate heated exchanges. It would presumably learn to ban racial terms, politics and politicians and words like "immigrant" (i.e. basically the list in this repo), but what else could it be persuaded to ban? Vim and Emacs? SystemD? Anything involving cyclists? Parenting advice?

replies(3): >>44487593 #>>44488628 #>>44488746 #
immibis ◴[] No.44487593[source]
People weren't using the OK gesture innocently. After 4chan trolls decided to start pretending it was a white supremacist symbol, actual white supremacists started using it as a symbol.
replies(2): >>44488027 #>>44488626 #
1. PunchyHamster ◴[] No.44488626[source]
then congratulations on making white supremacists define your language
replies(1): >>44489084 #
2. immibis ◴[] No.44489084[source]
Do you still use swastikas as symbols of peace and love because you don't want white supremacists to define your language?

I strongly doubt you do that. Whether you like it or not, the Nazis defined what the swastika means now.

replies(4): >>44489609 #>>44490272 #>>44495182 #>>44500418 #
3. anton-c ◴[] No.44489609[source]
It's still seen in the countries that used it that way and is seen as benign.

It can be easily summoned with the Japanese keyboard. It's seen on Buddhist temples all over Asia.

replies(1): >>44512393 #
4. mopsi ◴[] No.44490272[source]
Finnish use of swastika predates Germany and the Finnish Air Force Academy uses swastika to this day in their official insignia: https://en.wikipedia.org/wiki/Air_Force_Academy_(Finland)

Taboos are a cultural thing, and the world is (thankfully) very far from having a monoculture shaped by NYC's neurotic intellectuals.

5. coldtea ◴[] No.44495182[source]
>Do you still use swastikas as symbols of peace and love because you don't want white supremacists to define your language?

They were hardly ever used in the west for at least a full millennium before the Nazis either (except in a handful of cases where they are still used, like the Finnish Air Force), so that's a moot analogy.

In Asia, they still use them just fine, in houses, temples, businesses, and elsewhere.

6. fennecbutt ◴[] No.44500418[source]
No, because western culture never really did. However the countries who have been using it for at least thousands of years in Buddhism are still using it just fine.

In fact there was a recent thing with one of the BTS members' uniform (worn during mandatory military service period in South Korea), which had the regular (not tilted) swastika on it because he was assigned to religious duties.

And of course the western world/media ran away with it. Plenty of absolutely brain dead people out there who couldn't research a topic to gain an understanding to save their lives.

7. immibis ◴[] No.44512393{3}[source]
Do Japanese people speak the same language as you and I?