https://github.com/BlueFalconHD/apple_generative_model_safet...
https://github.com/BlueFalconHD/apple_generative_model_safet...
Which, as a phenomenon, is so very telling that no one actually cares what people are really saying. Everyone, including the platforms, knows what that means. It's all performative.
At what point do the new words become the actual words? Are there many instances of people using unalive IRL?
I'm imagining a new exploit: after someone says something totally innocent, people gang up in the comments to act like a terrible, vicious slur has been said, and then the moderation system (with an LLM involved somewhere) "learns" that an arbitrary term is heinous and indirectly bans any discussion of that topic.
Though it would be fun to see what happens if an LLM is used to ban anything that tends to generate heated exchanges. It would presumably learn to ban racial terms, politics and politicians, and words like "immigrant" (i.e. basically the list in this repo), but what else could it be persuaded to ban? Vim and Emacs? SystemD? Anything involving cyclists? Parenting advice?
I strongly doubt you can do that. Whether you like it or not, the Nazis defined what the swastika means now.
In fact, there was a recent thing with one of the BTS members' uniforms (worn during his mandatory military service in South Korea), which had a regular (not tilted) swastika on it because he was assigned to religious duties.
And of course the Western world/media ran away with it. Plenty of absolutely brain-dead people out there who couldn't research a topic to gain an understanding to save their lives.