a little anecdote:
i used gemini cli for a large implementation of a feature for a cpp api.
gemini did a huge amount of work i would otherwise have had to write myself.
problem? somewhere in all this great work a memory bug was hidden. there was no error message you could just feed to the cli and call it a day. after 4 days of debugging i found the bug. needless to say, in all its guessing gemini did not even once come close to where the bug was...
will this change in the future? we will see...