3338 points keepamovin | 1 comment
1. Simplita No.46214771
Big models keep getting better at benchmarks, but reliability under messy real-world inputs still feels stuck in place.