←back to thread

504 points Terretta | 1 comments | | HN request time: 0.314s | source
Show context
boole1854 ◴[] No.45064512[source]
It's interesting that the benchmark they are choosing to emphasize (in the one chart they show and even in the "fast" name of the model) is token output speed.

I would have thought it uncontroversial view among software engineers that token quality is much important than token output speed.

replies(14): >>45064582 #>>45064587 #>>45064594 #>>45064616 #>>45064622 #>>45064630 #>>45064757 #>>45064772 #>>45064950 #>>45065131 #>>45065280 #>>45065539 #>>45067136 #>>45077061 #
jsheard ◴[] No.45064594[source]
That's far from the worst metric that xAI has come up with...

https://xcancel.com/elonmusk/status/1958854561579638960

replies(1): >>45066065 #
Rover222 ◴[] No.45066065[source]
what's wrong with rapid updates to an app?
replies(5): >>45067028 #>>45067061 #>>45068102 #>>45069218 #>>45070365 #
1. kelnos ◴[] No.45070365[source]
That metric doesn't really tell you anything. Maybe I'm making rapid updates to my app because I'm a terrible coder and I keep having to push out fixes to critical bugs. Maybe I'm bored and keep making little tweaks to the UI, and for some reason think that's worth people's time to upgrade. (And that's another thing: frequent upgrades can be annoying!)

But sure, ok, maybe it could mean making much faster progress than competitors. But then again, it could also mean that competitors have a much more mature platform, and you're only releasing new things so often because you're playing catch-up.

(And note that I'm not specifically talking about LLMs here. This metric is useless for pretty much any kind of app or service.)