←back to thread

534 points BlueFalconHD | 1 comments | | HN request time: 0.233s | source

I managed to reverse engineer the encryption (refered to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.
Show context
torginus ◴[] No.44484236[source]
I find it funny that AGI is supposed to be right around the corner, while these supposedly super smart LLMs still need to get their outputs filtered by regexes.
replies(8): >>44484268 #>>44484323 #>>44484354 #>>44485047 #>>44485237 #>>44486883 #>>44487765 #>>44493460 #
1. crazylogger ◴[] No.44486883[source]
Humans are checked against various rules and laws (often carried out by other humans.) So this is how it's going to be implemented in an "AI organization" as well. Nothing strange about this really.

LLM is easier to work with because you can stop a bad behavior before it happens. It can be done either with deterministic programs or using LLM. Claude Code uses a LLM to review every bash command to be run - simple prefix matching has loopholes.