
534 points | BlueFalconHD | 2 comments

I managed to reverse engineer the encryption (referred to as “Obfuscation” in the framework) that protects the safety filters for Apple Intelligence models, and I have extracted the decrypted filters into a repository. I encourage you to take a look around.
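For the curious, once you have the key the decryption step itself is tiny. A minimal sketch in Swift, assuming a symmetric AES-GCM scheme with a key embedded in the framework (the real algorithm, key derivation, and file layout may differ; the path and key below are placeholders, not the actual ones):

    import Foundation
    import CryptoKit

    // Hypothetical sketch of the "Obfuscation" layer: assumes AES-GCM
    // with a key extracted from the framework. The real scheme may differ.
    func deobfuscateFilter(blob: Data, keyBytes: Data) throws -> Data {
        let key = SymmetricKey(data: keyBytes)
        // combined layout: 12-byte nonce || ciphertext || 16-byte tag
        let box = try AES.GCM.SealedBox(combined: blob)
        return try AES.GCM.open(box, using: key)
    }

    // Placeholder path and key, purely for illustration.
    let blob = try Data(contentsOf: URL(fileURLWithPath: "/tmp/safety_overrides.bin"))
    let key  = Data(repeating: 0x00, count: 32) // stand-in for the extracted key
    let rules = try deobfuscateFilter(blob: blob, keyBytes: key)
    print(String(decoding: rules, as: UTF8.self))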
torginus (No.44484236)
I find it funny that AGI is supposed to be right around the corner, while these supposedly super-smart LLMs still need their outputs filtered by regexes.
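For concreteness, the mechanism being mocked is roughly a deny-list of regexes run over the model's output. A toy sketch (the patterns are invented, not Apple's):

    import Foundation

    // Toy post-generation filter: block model output that matches any
    // pattern in a deny-list. Patterns here are made up for illustration.
    let denyList = try [#"(?i)\bidiot\b"#, #"(?i)hotwire a car"#]
        .map { try NSRegularExpression(pattern: $0) }

    func passesFilter(_ output: String) -> Bool {
        let range = NSRange(output.startIndex..., in: output)
        return denyList.allSatisfy { $0.firstMatch(in: output, range: range) == nil }
    }

    print(passesFilter("You are an idiot"))  // false: blocked
    print(passesFilter("Have a nice day"))   // true: allowed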
1. fl0id (No.44487765)
Actually, even if there were AGI, it would be even more necessary to control it.
2. mailund (No.44489484)
I feel like if teenagers can trivially bypass banned-word filters by substituting words that obviously mean the same thing, an AGI wouldn't be too inhibited by them either.
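A quick illustration of why literal matching is so easy to sidestep (the pattern and the synonym are invented for the example):

    import Foundation

    // The weakness of a literal deny-list: a trivial synonym walks past it.
    let banned = try NSRegularExpression(pattern: #"(?i)\bidiot\b"#)

    func isBlocked(_ s: String) -> Bool {
        banned.firstMatch(in: s, range: NSRange(s.startIndex..., in: s)) != nil
    }

    print(isBlocked("you absolute idiot"))      // true: caught
    print(isBlocked("you absolute nincompoop")) // false: same meaning, passes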