https://github.com/BlueFalconHD/apple_generative_model_safet...
Which as a phenomenon is so very telling that no one actually cares what people are really saying. Everyone, including the platforms, knows what these words mean. It's all performative.
At what point do the new words become the actual words? Are there many instances of people using unalive IRL?
I'm imagining a new exploit: after someone says something totally innocent, people gang up in the comments to act as if a terrible, vicious slur has been said, and the moderation system (with an LLM involved somewhere) "learns" that an arbitrary term is heinous and indirectly bans any discussion of that topic.
Though it would be fun to see what happens if an LLM is used to ban anything that tends to generate heated exchanges. It would presumably learn to ban racial terms, politics and politicians, and words like "immigrant" (i.e. basically the list in this repo), but what else could it be persuaded to ban? Vim and Emacs? SystemD? Anything involving cyclists? Parenting advice?
No, it isn't, and historically it especially wasn't. The negative connotations are overwhelmingly modern.
The areas where it is very inappropriate right now tally up to maybe 1 billion people*. That's pretty far from "most". For everyone else it is mostly positive, neutral, or meaningless.
*Brazil, Turkey, Iran, Iraq, Saudi Arabia, Greece, Italy, Spain, Russia, Ukraine, Belarus, and other parts of Eastern Europe
It's perfectly OK in Greece.