It is not a simple matter of patching the rough edges. We are fundamentally not using an architecture that is capable of intelligence.
Personally, the first time I tried deep research on a real topic, it was disastrously incorrect on a key point.
If you ask an intelligent being the same question, they may occasionally vary the precise words they use, but their answer will be the same over and over.
Heck, I can't even get LLMs to be consistent about *their own capabilities*.
Bias disclaimer: I work at Google, but not on Gemini. If I ask Gemini to produce an SVG file, it will sometimes do so and sometimes say "sorry, I can't, I can only produce raster images". I can't deterministically trigger either behavior; it truly seems to vary at random.