Anyone have a take on how the coding performance (quality and speed) of the 2.0 Pro Experimental compares to o3-mini-high?
The 2 million token window sure feels exciting.
replies(2):
With Copilot Pro and DeepSeek's website, I ran "find logic bugs" on a 1200 LOC file I actually needed code review for:
- DeepSeek R1 found roughly 7 real bugs out of 10 suggested; the remaining 3 were acceptable false positives given the missing context
- Claude was about the same, with fewer false positives and no hallucinations either
- Meanwhile, Gemini had a 100% false positive rate, with many hallucinations and answers that didn't address the prompt
I understand Gemini 2.0 is not a reasoning model, but DeepClaude remains the most effective LLM combo so far.
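For anyone who wants to compare models this way themselves, the numbers above reduce to simple precision / false-positive arithmetic. Here's a minimal sketch (the helper name and the Gemini suggestion count are my own assumptions; only the 7-of-10 DeepSeek figure comes from the comment above):

```python
def review_stats(suggested: int, real: int) -> tuple[float, float]:
    """Return (precision, false_positive_rate) for one bug-finding run.

    suggested: total findings the model reported
    real:      findings that turned out to be actual bugs
    """
    false_positives = suggested - real
    return real / suggested, false_positives / suggested

# DeepSeek R1 run from the comment: 7 real bugs out of 10 suggested
print(review_stats(10, 7))  # -> (0.7, 0.3): 70% precision, 30% FP rate

# Gemini run: 0 real findings; the count of 10 suggestions is hypothetical,
# but with zero hits the FP rate is 100% regardless of the denominator
print(review_stats(10, 0))  # -> (0.0, 1.0)
```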