Posting in ALL CAPS, posting fake news, insulting other members of the community, posting to incite anger, etc. does not make for good discussion, and it hurts the sense of community. So does carrying that behavior into other subreddits via brigading.
If that's not what they want to have (as it's clearly putting the CEO on edge), why not enforce some strict rules? The people they don't like will simply flock to other communities that will allow it and it becomes their problem.
Only communities small enough, or moderated enough, to not be interesting to a troll or nefarious person are spared.
The idea of a completely self-governed haven of mass free speech is a wonderful one, but no community large enough stays uncorrupted. It has never worked.
It is the ideals and application of those ideals through moderation that make any community bearable, just like in real life.
If I am to be part of a community, I would rather it be moderated; otherwise the people of the internet ruin all things in time.
I just want to have useful conversations, not circlejerk over freedom of speech while being interrupted by adolescent screaming.
The good thing is that the AI can be completely open: How is it trained? What are the parameters? This AI can still have bias, but that bias will be obvious to anyone joining this community.
So your idea to counteract people playing psychological games on others is to put something without the common sense of a three-year-old in charge of moderation. That's just glorious.