AI agent benchmarks are broken
(ddkang.substack.com)
181 points
neehao
| 1 comment |
11 Jul 25 13:06 UTC
1. rsynnott | 11 Jul 25 14:23 UTC | No. 44532491 | >>44531697 (OP)
> 45 + 8 = 63
> Pass
Yeah, this generally feels like about the quality one would expect from the industry.