I would have thought it an uncontroversial view among software engineers that token quality is much more important than token output speed.
If an LLM is often going to be wrong anyway, then being able to try prompts quickly and iterate on them could well be more valuable than a slower, higher-quality output.
Ad absurdum: if it could ingest and work on an entire project in milliseconds, it would have much greater value to me than a process that might take a day to do the same, even if the likelihood of success were also strongly affected.
It simply enables a different method of interactive working.
Or it could supply three different suggestions inline while you work on something, rather than being a process that has to be explicitly prompted and waited on.
Latency can have a critical impact not just on user experience but on the very way tools are used.
Now, will I try Grok? Absolutely not, but that's a personal decision due to not wanting anything to do with X, rather than a purely rational decision.
Asking any model to do things in steps is usually better too, as opposed to feeding it three essays.
Before MoE was a thing, I built what I called the Dictator: one strong model working with many weaker ones to achieve a similar result to MoE. But all the Dictator ever got was Garbage In, so guess what came out?
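Roughly, the idea is a coordinator model that farms subtasks out to cheaper models and then judges and merges the results. Here is a minimal sketch of that shape; `call_strong` and `call_weak` are hypothetical placeholders for whatever model APIs you actually have, not anything from the original setup:

```python
# Hypothetical sketch of a "dictator" pattern: one strong model decomposes a task,
# several weak models attempt each piece, and the strong model picks and merges results.
# call_strong / call_weak are placeholders for real model calls (OpenAI, Ollama, etc.).
from typing import List


def call_strong(prompt: str) -> str:
    raise NotImplementedError("wire this to your strongest model")


def call_weak(prompt: str) -> str:
    raise NotImplementedError("wire this to a cheap/fast model")


def dictator(task: str, n_workers: int = 3) -> str:
    # 1. The strong model breaks the task into subtasks, one per line.
    subtasks = call_strong(f"Split this task into independent subtasks, one per line:\n{task}")
    results: List[str] = []
    for sub in filter(None, (s.strip() for s in subtasks.splitlines())):
        # 2. Several weak attempts per subtask (or the same weak model sampled repeatedly).
        candidates = [call_weak(f"Solve this subtask:\n{sub}") for _ in range(n_workers)]
        # 3. The strong model judges the candidates and keeps the best one.
        best = call_strong(
            "Pick the best answer to the subtask and return it verbatim.\n"
            f"Subtask: {sub}\nCandidates:\n" + "\n---\n".join(candidates)
        )
        results.append(best)
    # 4. The strong model merges the pieces into a final answer.
    return call_strong(f"Combine these partial results into one answer for: {task}\n" + "\n".join(results))
```

The weak point is exactly the one the comment names: the coordinator is only as good as what the workers feed it, so garbage intermediate outputs sink the final merge.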
* Scaffolding
* Ask it what's wrong with the code
* Ask it for improvements I could make
* Ask it what the code does (amazing for old code you've never seen)
* Ask it to provide architect-level insights into best practices
One area where they all seem to fail is lesser-known packages: they tend to reference old functionality that is no longer there, or that never existed in the first place; they hallucinate. That is part of why I don't ask it for too much.
Junie did impress me, but it was very slow, so I would love to see a version of Junie using this version of Grok; it might be worthwhile.
this site is the fucking worst
The IP risks taken may well be worth the productivity boosts.
That's phase 1: ask it to "think deeply" (a Claude keyword; it only works with the Anthropic models) while doing that. Then ask it to make a detailed plan for solving the issue, write that into current-fix.md, and add clearly testable criteria for when the issue is solved.
Now you manually check whether the criteria sound plausible; if not, its analysis failed and its output was worthless.
But if it sounds good, you can then start a new session and ask it to read the markdown file and implement the change.
Now you can plausibility-check the diff and are likely done; a rough sketch of this two-phase flow is below.
But as the sister comment pointed out, agentic coding really breaks apart with the large files you usually have in brownfield projects.
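A minimal sketch of the plan-then-implement loop described above; `run_agent` is a hypothetical stand-in for whatever agentic coding tool you drive, and the prompts and manual checkpoints are the point, not the function name:

```python
# Hypothetical sketch of the two-phase workflow: plan first, review, then implement.
# run_agent is a placeholder for your agentic coding tool of choice.

PLAN_FILE = "current-fix.md"


def run_agent(prompt: str) -> None:
    raise NotImplementedError("wire this to your agentic coding tool")


# Phase 1: analysis and plan, written to a markdown file you can review yourself.
run_agent(
    "Think deeply about the issue described below. Write a detailed plan for fixing it "
    f"into {PLAN_FILE}, including clearly testable criteria for when the issue is solved.\n\n"
    "<issue description here>"
)

# Manual checkpoint: read current-fix.md. If the criteria don't sound plausible,
# the analysis failed and the output gets thrown away.

# Phase 2: a fresh session implements the reviewed plan.
run_agent(f"Read {PLAN_FILE} and implement the change it describes.")

# Manual checkpoint: plausibility-check the resulting diff before merging.
```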
I think the biggest thing for offline LLMs will have to be a consistent way to have them search the web through an API like Google's or some other search engine's. Maybe Kagi could provide an API for people who self-host LLMs (not necessarily for free, but it would still be useful).
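As a rough illustration of the wiring, here is a sketch that calls a placeholder search endpoint and feeds the results to a locally hosted model through Ollama's HTTP API. SEARCH_API_URL and its response shape are assumptions, not a real Google or Kagi interface:

```python
# Sketch: give a self-hosted LLM web search results as context.
# SEARCH_API_URL and its JSON shape are hypothetical placeholders for whatever search
# API you have a key for. The local model is assumed to be served by Ollama on its
# default port.
import requests

SEARCH_API_URL = "https://search.example.com/v1/search"  # placeholder endpoint
SEARCH_API_KEY = "YOUR_KEY"
OLLAMA_URL = "http://localhost:11434/api/generate"


def search(query: str, n: int = 5) -> list:
    resp = requests.get(
        SEARCH_API_URL,
        params={"q": query, "limit": n},
        headers={"Authorization": f"Bearer {SEARCH_API_KEY}"},
        timeout=30,
    )
    resp.raise_for_status()
    # Assumed response shape: {"results": [{"title": ..., "snippet": ..., "url": ...}, ...]}
    return [f"{r['title']}: {r['snippet']} ({r['url']})" for r in resp.json()["results"]]


def answer_with_search(question: str, model: str = "llama3") -> str:
    context = "\n".join(search(question))
    prompt = f"Using these search results:\n{context}\n\nAnswer the question: {question}"
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```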
Not sure who was taking SamA seriously about that; personally I think he's a ridiculous blowhard, and statements like that just reinforce that view for me.
Please don't make generalizations about HN's visitors'/commenters' attitudes on things. They're never generally correct.