AI agent benchmarks are broken

(ddkang.substack.com)
181 points neehao | 4 comments
xnx ◴[] No.44531958[source]
All benchmarks are flawed. Some benchmarks are useful.
replies(1): >>44532081 #
1. yifanl ◴[] No.44532081[source]
Here's a third sentence fragment: These benchmarks are not.
replies(2): >>44532272 #>>44534649 #
2. suddenlybananas ◴[] No.44532272[source]
It's nearly a haiku!
replies(1): >>44533605 #
3. layer8 ◴[] No.44533605[source]

  All benchmarks are flawed.
  Not all benchmarks are useless.
  But these benchmarks are.
4. lcnPylGDnU4H9OF ◴[] No.44534649[source]
Just want to nit: none of those are sentence fragments; they are complete thoughts with a subject and a predicate. Yours kinda comes close to being a fragment, but it really just omits what "are not" (the predicate) is referring to, which is included in prior context.

For example, a fragment with a missing predicate.