It’s because they generate a semblance of reasoning, and don’t actually reason!
(Slams the door angrily)
(stomps out angrily)
(touches the grass angrily)
That said, the input space of supported problems is quite large, and you can configure the problem parameters quite flexibly.
I guess the issue is that what the model _actually_ provides you with is an idiot savant who has pre-memorized everything, without offering a clear index that would disambiguate well-supported problems from "too difficult" (i.e. novel) ones.