
534 points BlueFalconHD | 1 comment

I managed to reverse engineer the encryption (referred to as "Obfuscation" in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.
userbinator No.44484484
China calls it "harmonious society", we call it "safety". Censorship by any other name would be just as effective for manipulating the thoughts of the populace. It's not often that you get to see stuff like this.
1. jeroenhd No.44487705
I still remember when "bush hid the facts" went around the news cycle. Entertainment services will absolutely slam and misrepresent any small mistake made by large companies.

I don't think it's as much a problem with safety as it is a problem with AI. We haven't figured out how to remove information from LLMs, so when an LLM starts spouting bullshit like "<random name> is a paedophile", companies using AI have no recourse but to rewrite the input/output of their predictive text engines. It's no different from when Microsoft blacklisted the Fast Inverse Square Root function name after Copilot spat the code out verbatim, rather than actually removing the code from their LLM.
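To make the point concrete, here's a minimal sketch of the kind of post-hoc output rewriting being described. This is purely illustrative, not Apple's or Microsoft's actual mechanism; the blacklist entries and function name are hypothetical. The idea is that since knowledge can't easily be deleted from a model's weights, the vendor intercepts the generated text and redacts anything matching a blocked phrase.

```python
import re

# Hypothetical blacklist entries; real filter lists are far larger
# and typically loaded from a (possibly obfuscated) config file.
BLACKLIST = [r"Q_rsqrt", r"fast inverse square root"]

def filter_output(text: str, patterns=BLACKLIST) -> str:
    """Rewrite model output by redacting any blacklisted phrase.

    This patches over the symptom (the model emitting the phrase)
    without touching the model itself.
    """
    for pat in patterns:
        text = re.sub(pat, "[redacted]", text, flags=re.IGNORECASE)
    return text
```

The fragility of this approach is exactly the criticism in the comment: the model still "knows" the content, and any phrasing the blacklist misses slips straight through.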

This isn't 1984 as much as it's companies trying to hide that their software isn't ready for real world use by patching up the mistakes in real time.