
504 points Terretta | 2 comments
boole1854
It's interesting that the benchmark they are choosing to emphasize (in the one chart they show and even in the "fast" name of the model) is token output speed.

I would have thought it an uncontroversial view among software engineers that token quality is much more important than token output speed.

eterm
It depends how fast.

If an LLM is often going to be wrong anyway, then being able to try prompts quickly and iterate on them could be more valuable than a slower, higher-quality output.

Ad absurdum: if it could ingest and work on an entire project in milliseconds, then it has much greater value to me than a process that might take a day to do the same, even if its likelihood of success is considerably lower.

It simply enables a different method of interactive working.
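The retry argument above can be sketched with a toy model. Everything here is a hypothetical illustration: the per-attempt success probabilities, the per-attempt times, and the assumption that attempts are independent are all made up for the sake of the arithmetic, not claims about any real model.

```python
# Toy model: probability of getting at least one usable result within a
# fixed wall-clock budget, comparing a fast-but-weaker model against a
# slow-but-stronger one. Assumes independent attempts (a big assumption).
def p_success_in_budget(p_per_attempt: float, seconds_per_attempt: float,
                        budget_seconds: float) -> float:
    """P(at least one success in n attempts) = 1 - (1 - p)^n."""
    attempts = int(budget_seconds // seconds_per_attempt)
    return 1 - (1 - p_per_attempt) ** attempts

# Hypothetical numbers: fast model, 30% per-try success at 10 s/try;
# slow model, 60% per-try success at 120 s/try; 2-minute budget.
fast = p_success_in_budget(0.30, 10, 120)   # 12 attempts -> ~0.986
slow = p_success_in_budget(0.60, 120, 120)  # 1 attempt   -> 0.600
```

Under these (invented) numbers the fast model wins the fixed budget handily, which is the "different method of interactive working" point: speed converts into retries.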

Or it could supply three different suggestions in-line while you work on something, rather than being a process that must be explicitly prompted and waited on.

Latency can critically shape not just the user experience but the very way tools are used.

Now, will I try Grok? Absolutely not, but that's a personal decision due to not wanting anything to do with X, rather than a purely rational decision.

postalcoder
Besides being a faster slot machine, a fast agentic LLM, to the extent any of them are good at it, would be very nice to have for codebase analysis.
fmbb
For 10% less time you can get 10% worse analysis? I don’t understand the tradeoff.
kelnos
I mean, if that's literally what the numbers are, sure, maybe that's not great. But what if it's 10% less time and 3% worse analysis? Maybe that's valuable.
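A back-of-envelope way to frame that tradeoff, using the percentages from the two comments above. The linear "quality per unit time" ratio is purely an assumed utility model for illustration, not how anyone actually scores analyses.

```python
# Crude utility ratio: quality delivered per unit of time spent,
# assuming quality and time trade off linearly (an assumption).
def quality_per_time(quality: float, time: float) -> float:
    return quality / time

baseline = quality_per_time(1.00, 1.00)  # slow model:              1.000
even     = quality_per_time(0.90, 0.90)  # 10% worse, 10% faster:   1.000 (a wash)
better   = quality_per_time(0.97, 0.90)  # 3% worse, 10% faster:   ~1.078
```

On this toy metric, 10% faster for 10% worse is exactly break-even, while 10% faster for only 3% worse comes out ahead, which is the distinction the comment is drawing.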