←back to thread

AI agent benchmarks are broken

(ddkang.substack.com)
181 points neehao | 1 comments | | HN request time: 0.209s | source
Show context
mycall ◴[] No.44532125[source]
SnitchBench [0] is unique benchmark which shows how aggressively models will snitch on you via email and CLI tools when they are presented with evidence of corporate wrongdoing - measuring their likelihood to "snitch" to authorities. I don't believe they were trained to do this, so it seems to be an emergent ability.

[0] https://snitchbench.t3.gg/

replies(1): >>44537498 #
1. ggregoryarms ◴[] No.44537498[source]
Seems like more of a subtextual/accidental ability than an emergent ability.