←back to thread

AI agent benchmarks are broken

(ddkang.substack.com)
181 points neehao | 1 comments | | HN request time: 0.239s | source
1. beebmam ◴[] No.44533728[source]
I don't think "Benchmarks" are the right way to analyze AI-related processes, which is probably similar to the complexity surrounding human intelligence measurements and how well each human can handle real-world problems.