I managed to reverse engineer the encryption (refered to as “Obfuscation” in the framework) responsible for managing the safety filters of Apple Intelligence models. I have extracted them into a repository. I encourage you to take a look around.
Perhaps in context? Maybe the training data picked up on her name as potentially used as a "slur" associated with her race. Wonder if there are others I know I can look.