Is this really how SOTA LLMs parse our queries? To what extent is this a simplified representation of what they really "see"?
Yes, tokenization and embeddings really are the first stages of the pipeline: the text is split into subword tokens (via BPE or a similar scheme), each token is mapped to an integer ID, and each ID is looked up in a learned embedding matrix to produce a vector. Everything downstream operates on those vectors. POS tags and SVO triples are not part of the model's pipeline at all; they are a visualization aid for structure the models pick up implicitly during training.
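If it helps to see it concretely, here is a minimal sketch of those two stages using the Hugging Face `transformers` library, with GPT-2 as an illustrative (assumed) model choice; any other tokenizer/model pair would show the same structure:

```python
# Sketch of the first two stages an LLM applies to a query:
# subword tokenization, then embedding lookup.
# GPT-2 is used here purely as an example model.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

query = "How do LLMs parse our queries?"

# 1) Tokenization: split the text into subword tokens and map them to IDs.
ids = tokenizer.encode(query)
print(tokenizer.convert_ids_to_tokens(ids))  # the subword pieces the model actually "sees"

# 2) Embedding lookup: each ID indexes a row of a learned embedding matrix,
#    yielding one vector per token (768 dimensions for GPT-2).
vectors = model.get_input_embeddings()(torch.tensor([ids]))
print(vectors.shape)  # torch.Size([1, num_tokens, 768])

# Note there is no POS tagging or SVO extraction anywhere in this pipeline;
# any such structure has to be inferred by the model from the token vectors.
```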