The essential complexity is the inherent incompatibility of any given answer. Every answer must be written in a specific implementation. Once you have written your answer, you have cemented it into its environment. We can't answer answers. Technically we can, but it tends to be a huge undertaking.
Each answer to "what?" and "how?" has a special property: it is context-free. We have to choose a specific "context-free grammar" to write it in, but the answer itself can be fully expressed there. That means that every implementation that answers the same question must somehow be equivalent. That equivalence is, unfortunately, lost at the time of writing.
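As a small illustration of that lost equivalence, here is a minimal sketch in Python. The function names are made up for this example, and the question being answered ("what is the sum of the integers from 1 to n?") is just a convenient stand-in. Both functions answer the same question, yet nothing in the code records that fact; the best we can do after the fact is spot-check it.

```python
def sum_to_n_loop(n: int) -> int:
    """One answer to "how?": add the integers 1..n one at a time."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_n_formula(n: int) -> int:
    """Another answer to "how?": the closed form n(n+1)/2."""
    return n * (n + 1) // 2


def agree_on(samples) -> bool:
    """Spot-check that both answers agree on the given inputs.

    This is evidence, not proof. The two functions only agree for
    non-negative n, and that condition is written down nowhere
    except in this comment.
    """
    return all(sum_to_n_loop(n) == sum_to_n_formula(n) for n in samples)


if __name__ == "__main__":
    print(agree_on(range(0, 100)))
```

The two functions are interchangeable only under an assumption (non-negative input) that lives outside both of them, which is the kind of knowledge that disappears once an answer is written down.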
---
We need to take a step back and recognize the ultimate question: "why?".
The very questions "what?" and "how?" belong to the answer to "why?". If we could just write the reason why, we could compile that answer into a complete collection of compatible whats and hows.
That's the trickiest part of all, because the answer to "why?" is context-dependent. We can't write the answer to "why?" in any programming language, because that category of language cannot express context-dependence. That means we can't write a parser for it, let alone compile it.
Solve natural language processing, and we solve incompatibility.