That matches my impression. For the past month or two, I have been running informal side-by-side tests of the Deep Research products from OpenAI, Perplexity, and Google. OpenAI was clearly winning: its reports were more complete and incisive, with no hallucinated sources that I noticed.
That changed a few days ago, when Google switched their Deep Research over to Gemini 2.5 Pro Experimental. While OpenAI’s and Perplexity’s reports are still pretty good, Google’s usually seem deeper, more complete, and more incisive.
My prompting technique, by the way, is to first explain to a regular model the problem I’m interested in and ask it to write a full prompt that can be given to a reasoning LLM that can search the web. I check the suggested prompt, make a change or two, and then feed it to the Deep Research models.
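For anyone who wants to script the first step, here is a minimal sketch of that two-stage workflow in Python. It only builds the text to paste into each model and calls no API. The function names (`build_meta_prompt`, `finalize_prompt`) and the wording of the meta-prompt are my own illustrative assumptions, not a fixed recipe.

```python
def build_meta_prompt(problem: str) -> str:
    """Stage 1: wrap a problem description in a request for a full prompt.

    The wording below is a hypothetical example; adjust it to taste.
    """
    return (
        "I am interested in the following problem:\n\n"
        f"{problem.strip()}\n\n"
        "Please write a full prompt that can be given to a reasoning LLM "
        "that can search the web."
    )


def finalize_prompt(suggested: str, edits: dict[str, str]) -> str:
    """Stage 2: apply a few manual find-and-replace tweaks to the
    prompt the regular model suggested, then trim whitespace."""
    for old, new in edits.items():
        suggested = suggested.replace(old, new)
    return suggested.strip()


if __name__ == "__main__":
    # Paste this into a regular model.
    meta = build_meta_prompt(
        "How surrealism, Freudian dream theory, and AI image prompt "
        "engineering connect to one another."
    )
    print(meta)

    # After the regular model replies, make a change or two by hand
    # (the suggested text here is a stand-in) and feed the result to
    # the Deep Research models.
    suggested = "Write a short report on the three topics."
    print(finalize_prompt(suggested, {"short": "detailed analytical"}))
```

The review step stays manual on purpose: checking the suggested prompt before it goes to the Deep Research models is part of the technique.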
One thing I’ve been playing with is asking for reports that discuss and connect three disparate topics. Below are the reports that the three Deep Research models gave me just now on surrealism, Freudian dream theory, and AI image prompt engineering. Deciding which is best is left as an exercise for the reader.
OpenAI:
https://chatgpt.com/share/67fa21eb-18a4-8011-9a97-9f8b051ad3...
Google:
https://docs.google.com/document/d/10mF_qThVcoJ5ouPMW-xKg7Cy...
Perplexity:
https://www.perplexity.ai/search/subject-analytical-report-i...