(www.lesswrong.com)

579 points paulpauper | 1 comment | 06 Apr 25 18:01 UTC

1. crvdgc [06 Apr 25 23:23 UTC] No.43605830

> But in recent months I've spoken to other YC founders doing AI application startups [...] in different industries, on different problem sets.

Maybe they should collectively create a benchmark called "YC founders": gather various test cases, never make them public, and use them to evaluate newly released models.

Recent AI model progress feels mostly like bullshit