AI agent benchmarks are broken

(ddkang.substack.com)
181 points neehao | 1 comments
xnx ◴[] No.44531958[source]
All benchmarks are flawed. Some benchmarks are useful.
replies(1): >>44532081 #
yifanl ◴[] No.44532081[source]
Here's a third sentence fragment: These benchmarks are not.
replies(2): >>44532272 #>>44534649 #
1. lcnPylGDnU4H9OF ◴[] No.44534649[source]
Just want to nit: none of those are sentence fragments; they are complete thoughts with a subject and a predicate. Yours kinda comes close to being a fragment, but it really just omits what "are not" (the predicate) is referring to, which is included in prior context.

For example, a fragment with a missing predicate.