https://github.com/BlueFalconHD/apple_generative_model_safet...
EDIT: just to be clear, things like this are easily bypassed. “Boris Johnson”=>”B0ris Johnson” will skip right over the regex and will be recognized just fine by an LLM.
https://github.com/BlueFalconHD/apple_generative_model_safet...
EDIT: just to be clear, things like this are easily bypassed. “Boris Johnson”=>”B0ris Johnson” will skip right over the regex and will be recognized just fine by an LLM.
I can tell you from using Microsoft's products that safety filters appears in a bunch of places. M365 for example, your prompts are never totally your prompts, every single one gets rewritten. It's detailed here: https://learn.microsoft.com/en-us/copilot/microsoft-365/micr...
There's a more illuminating image of the Copilot architecture here: https://i.imgur.com/2vQYGoK.png which I was able to find from https://labs.zenity.io/p/inside-microsoft-365-copilot-techni...
The above appears to be scrubbed, but it used to be available from the learn page months ago. Your messages get additional context data from Microsoft's Graph, which powers the enterprise version of M365 Copilot. There's significant benefits to this, and downsides. And considering the way Microsoft wants to control things, you will get an overindex toward things that happen inside of your organization than what will happen in the near real-time web.