(www.thealgorithmicbridge.com)

1005 points vinhnx | 2 comments | 12 Apr 25 03:58 UTC

godjan ◴[12 Apr 25 10:21 UTC] No.43663091[source]▶

The article doesn't mention one of the most complex benchmarks, the ARC challenge. All models suck at it: https://arcprize.org/leaderboard

But Gemini and Claude still suck much worse than the ChatGPT models.

replies(1): >>43663842 #

1. nolist_policy ◴[12 Apr 25 12:35 UTC] No.43663842[source]▶

They haven't tested Gemini 2.5 Pro yet.

replies(1): >>43664780 #

2. usaar333 ◴[12 Apr 25 14:41 UTC] No.43664780[source]▶

They have.

Google is winning on every AI front