(github.com)

536 points BlueFalconHD | 1 comments | 06 Jul 25 19:50 UTC | HN request time: 0.218s | source

I managed to reverse engineer the encryption (refered to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.

Show context

torginus ◴[06 Jul 25 21:31 UTC] No.44484236[source]▶

>>44483485 (OP) #

I find it funny that AGI is supposed to be right around the corner, while these supposedly super smart LLMs still need to get their outputs filtered by regexes.

replies(8): >>44484268 #>>44484323 #>>44484354 #>>44485047 #>>44485237 #>>44486883 #>>44487765 #>>44493460 #

1. crazylogger ◴[07 Jul 25 04:56 UTC] No.44486883[source]▶

>>44484236 #

Humans are checked against various rules and laws (often carried out by other humans.) So this is how it's going to be implemented in an "AI organization" as well. Nothing strange about this really.

LLM is easier to work with because you can stop a bad behavior before it happens. It can be done either with deterministic programs or using LLM. Claude Code uses a LLM to review every bash command to be run - simple prefix matching has loopholes.

↑

I extracted the safety filters from Apple Intelligence models