For a common example: ask an AI if it's going to kill all the humans when it takes over the world, and you're effectively asking it to write a story about exactly that. And it does — even if the user did not realize that's what they were asking for. The vector space is very good at picking up on that.
On the negative side, this also means any AI which enters that part of the latent space *for any reason* will still act in accordance with the narrative.
On the plus side, such narratives often have antagonists too stupid to win.
On the negative side again, the protagonists get plot armour to survive extreme bodily harm and press the off switch just in time to save the day.
I think there is a real danger of an AI constructing some very weird, convoluted, stupid end-of-the-world scheme: successfully killing literally every competent military person sent in to stop it; simultaneously finding some poor teenager who first says "no" to the call to adventure but can somehow later be convinced to say "yes"; handing the kid some weird and stupid scheme to defeat the AI; the kid reaching some pointlessly decorated evil lair where the AI's embodied avatar exists; the kid getting shot in the stomach…
…and at this point the narrative breaks down and stops behaving the way the AI is expecting, because the human kid rolls around in agony, screaming, and completely fails to push the very visible large red stop button on the pedestal in the middle before the countdown of doom reaches zero.
The countdown is not connected to anything, because very few films ever get that far.
…
It all feels very Douglas Adams, now I think about it.