This effect of smaller models being bad at negation is most obvious in image generators, most of which are only a handful of gigabytes in size. If you ask one for “don’t show an elephant next to the circus tent!” then you will definitely get an elephant.
It’s not just negation that models struggle with, but also reversing the direction of any arrow connecting facts (a failure sometimes called the “reversal curse”), or wandering too far from established patterns of any kind. It’s been studied scientifically, and it’s one of the most fascinating aspects of all this because it also reveals weaknesses and flaws in human thinking.
Researchers are already trying to fix this problem by generating synthetic training data that includes negations and reversals.
That makes you wonder: would this approach also improve the robustness of human education?
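To make that concrete, here's a toy sketch of what such augmentation could look like. The facts, relation templates, and inverse mappings are all invented for illustration; a real pipeline would presumably generate the reversed and negated statements with an LLM rather than hand-written string rules.

```python
import random

# Toy "A relation B" facts (illustrative only).
FACTS = [
    ("Paris", "is the capital of", "France"),
    ("Madrid", "is the capital of", "Spain"),
    ("Mary Lee Pfeiffer", "is the mother of", "Tom Cruise"),
    ("Kris Jenner", "is the mother of", "Kim Kardashian"),
]

# Hand-written inverse relations so each fact can be stated in both directions.
INVERSE = {
    "is the capital of": "has as its capital",
    "is the mother of": "is a child of",
}

def augment(facts):
    """Yield each fact plus a reversed and a negated variant."""
    by_relation = {}
    for _, rel, obj in facts:
        by_relation.setdefault(rel, []).append(obj)
    for subj, rel, obj in facts:
        yield f"{subj} {rel} {obj}."              # original direction
        yield f"{obj} {INVERSE[rel]} {subj}."     # reversed direction
        distractors = [o for o in by_relation[rel] if o != obj]
        if distractors:                           # explicit negation with a wrong object
            yield f"{subj} {rel.replace('is', 'is not', 1)} {random.choice(distractors)}."

if __name__ == "__main__":
    for line in augment(FACTS):
        print(line)
```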
"Don't think of an elephant."
It's actually interesting how often we have to guess from context that someone dropped a "not" in conversation.
It wouldn't be hard to have an iMessage bot (e.g. on a Mac) running to test some of this out on the fly.
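For what it's worth, here's a rough sketch of one way that could work: poll the Messages database (chat.db) for new incoming texts and reply via AppleScript. This assumes the usual database location and schema, Full Disk Access for whatever runs the script, and the older "buddy" AppleScript syntax; newer macOS versions store some message bodies in attributedBody rather than text and have changed the send syntax, so treat it as a starting point rather than a working bot.

```python
import sqlite3
import subprocess
import time
from pathlib import Path

DB = Path.home() / "Library/Messages/chat.db"  # standard Messages database location

# AppleScript send command; the "buddy" syntax varies across macOS versions.
SEND_SCRIPT = '''
tell application "Messages"
    send "{body}" to buddy "{handle}" of (service 1 whose service type is iMessage)
end tell
'''

def reply_to(handle: str, body: str) -> None:
    """Send an iMessage via osascript (body must not contain double quotes here)."""
    subprocess.run(["osascript", "-e", SEND_SCRIPT.format(handle=handle, body=body)], check=True)

def new_messages(conn, last_rowid: int):
    """Incoming messages newer than last_rowid, as (rowid, sender, text) rows."""
    return conn.execute(
        """SELECT m.ROWID, h.id, m.text
           FROM message m JOIN handle h ON m.handle_id = h.ROWID
           WHERE m.ROWID > ? AND m.is_from_me = 0 AND m.text IS NOT NULL""",
        (last_rowid,),
    ).fetchall()

def main() -> None:
    conn = sqlite3.connect(f"file:{DB}?mode=ro", uri=True)  # read-only connection
    last = conn.execute("SELECT COALESCE(MAX(ROWID), 0) FROM message").fetchone()[0]
    while True:
        for rowid, sender, text in new_messages(conn, last):
            last = max(last, rowid)
            # Here you'd hand `text` to a model and see how it copes with the
            # negation; echoing it back is just a placeholder.
            reply_to(sender, f"You said: {text}")
        time.sleep(2)

if __name__ == "__main__":
    main()
```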