(blogpost author here)
You're right! I did make the distinction in an earlier draft, but decided to use "RAG" interchangeably with vector search, since that's how the term is popularly used in code-gen systems today. I'd probably go back to the previous version too.
But I do think there is a qualitative difference between getting candidates and adding them to context before generating (retrieval augmented generation) vs the LLM searching for context until it is satisfied.
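
To make that concrete, here's a rough sketch of the two shapes. Everything in it is an assumption for illustration: `keyword_search` is a toy retriever standing in for vector search, and `generate` / `next_query` are hypothetical callables standing in for LLM calls, not any real API.

```python
from typing import Callable, List, Optional


def keyword_search(corpus: List[str], query: str, k: int = 2) -> List[str]:
    """Toy retriever (stand-in for vector search): rank docs by shared words."""
    words = set(query.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: -pair[0])  # stable: ties keep corpus order
    return [doc for score, doc in ranked if score > 0][:k]


def rag_answer(
    corpus: List[str],
    question: str,
    generate: Callable[[str, List[str]], object],
) -> object:
    """Retrieve once, put the candidates in context, then generate."""
    context = keyword_search(corpus, question)
    return generate(question, context)


def agentic_answer(
    corpus: List[str],
    question: str,
    next_query: Callable[[str, List[str]], Optional[str]],
    generate: Callable[[str, List[str]], object],
    max_steps: int = 5,
) -> object:
    """Let the model keep issuing searches until it says it is satisfied."""
    context: List[str] = []
    for _ in range(max_steps):
        # Hypothetical LLM call: returns the next search query, or None to stop.
        query = next_query(question, context)
        if query is None:
            break
        for doc in keyword_search(corpus, query):
            if doc not in context:
                context.append(doc)
    return generate(question, context)
```

In the first function the retrieval step is fixed before the model sees anything; in the second, what gets searched next depends on what the model has already read, which is the difference I mean.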