
504 points Terretta | 7 comments
NitpickLawyer ◴[] No.45066063[source]
Tested this yesterday with Cline. It's fast, works well with agentic flows, and produces decent code. No idea why this thread is so negative (also got flagged while I was typing this?) but it's a decent model. I'd say it's at or above gpt5-mini level, which is awesome in my book (I've been maining gpt5-mini for a few weeks now, does the job on a budget).

Things I noted:

- It's fast. I tested it in EU tz, so ymmv

- It does agentic in an interesting way. Instead of editing a file whole or in many places, it does many small passes.

- Had a feature take ~110k tokens (parsing html w/ bs4). Still finished the task. Didn't notice any problems at high context.

- When things didn't work first try, it created a new file to test, did all the mocking / testing there, and then once it worked edited the main module file. Nice. GPT5-mini would oftentimes edit working files, and then get confused and fail the task.

All in all, not bad. At the price point it's at, I could see it as a daily driver. Even agentic stuff w/ opus + gpt5 high as planners and this thing as an implementer. It's fast enough that it might be worth setting it up in parallel and basically replicate pass@x from research.
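
To make the "run it in parallel and replicate pass@x" idea concrete, here is a minimal sketch. It assumes you have some way to request one attempt from the model and some check (e.g. a test suite) that says whether an attempt is acceptable. `first_passing`, `fake_generate`, and `fake_passes` are hypothetical names made up for illustration, and the fake implementer stands in for a real model call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def first_passing(generate: Callable[[int], str],
                  passes: Callable[[str], bool],
                  k: int) -> Optional[str]:
    """Run k independent attempts in parallel and return the
    lowest-indexed candidate that passes the check, or None."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        # pool.map preserves input order, so results line up with attempt index.
        candidates = list(pool.map(generate, range(k)))
    for candidate in candidates:
        if passes(candidate):
            return candidate
    return None


# Hypothetical stand-in for a model call: only attempt #2 is correct.
def fake_generate(attempt: int) -> str:
    if attempt == 2:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


# Hypothetical acceptance check: execute the candidate and test it.
def fake_passes(source: str) -> bool:
    scope: dict = {}
    exec(source, scope)
    return scope["add"](2, 3) == 5
```

With `k=4` the sketch returns the correct attempt; with `k=2` neither attempt passes and it returns `None`. A real setup would swap the fake functions for an API call and a sandboxed test run.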

IMO it's good to have options at every level. Having many providers fight for the market is good, it keeps them on their toes, and brings prices down. GPT5-mini is at 2$/MTok, this is at 1.5$/MTok. This is basically "free", in the grand scheme of things. I don't get the negativity.
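
For a sense of scale, a back-of-the-envelope sketch using only the numbers quoted in this comment (the ~110k-token feature and the two per-MTok prices). It assumes a single flat rate per million tokens, which glosses over any input/output or caching price differences. `task_cost` is a hypothetical helper.

```python
def task_cost(tokens: int, usd_per_mtok: float) -> float:
    """Cost in USD for a task, assuming one flat price per million tokens."""
    return tokens / 1_000_000 * usd_per_mtok


feature_tokens = 110_000  # the ~110k-token feature mentioned above
for name, price in [("gpt5-mini", 2.0), ("this model", 1.5)]:
    print(f"{name}: ${task_cost(feature_tokens, price):.3f}")
```

Under those assumptions the feature costs roughly 22 cents at 2$/MTok versus about 16.5 cents at 1.5$/MTok.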

replies(10): >>45066728 #>>45067116 #>>45067311 #>>45067436 #>>45067602 #>>45067936 #>>45068543 #>>45068653 #>>45068788 #>>45074597 #
jameshart ◴[] No.45067602[source]
If the Grok brand wasn’t terminally tarnished for you by the ‘mechahitler’ incident, I’m not sure what more it would take.

This is an offering being produced by a company whose idea of responsible AI use involves prompting a chatbot that “You spend a lot of time on 4chan, watching InfoWars videos” - https://www.404media.co/grok-exposes-underlying-prompts-for-...

A lot of people rightly don’t want any such thing anywhere near their code.

replies(16): >>45067741 #>>45067793 #>>45067834 #>>45067845 #>>45067876 #>>45067950 #>>45068178 #>>45068224 #>>45068385 #>>45068645 #>>45068805 #>>45068858 #>>45069087 #>>45069800 #>>45070448 #>>45071147 #
1. Nuzzerino ◴[] No.45068645[source]
The article you linked talks about the voice personality prompt for "unhinged mode", which is an entertainment mode. It has nothing to do with the code writing model.
replies(2): >>45068792 #>>45068803 #
2. jameshart ◴[] No.45068792[source]
The fact that that represents something the folks at xAI think would be entertaining can certainly be a basis for thinking twice about trusting their judgement in other matters, though, right?
replies(2): >>45069050 #>>45069115 #
3. drusepth ◴[] No.45068803[source]
It's a comment about the company/brand behind the models, not the individual models themselves.
4. Nuzzerino ◴[] No.45069050[source]
I got a lot of entertainment out of it. Don't knock it till you've tried it; it's just a prompt.

The great thing about xAI is that it is just a company and there are other AI companies that have AIs that match your values, even though between Grok, ChatGPT, and Claude there are minimal actual differences.

An AI will be anything that the prompt says it is. Because a prompt exists doesn't condemn the company.

replies(1): >>45070010 #
5. kristianbrigman ◴[] No.45069115[source]
If they represent it as entertainment… it’s a common genre to make fun of what you see as the most extreme views of the other side.
replies(1): >>45076386 #
6. sebastiennight ◴[] No.45070010{3}[source]
> An AI will be anything that the prompt says it is

Within the boundaries of pre-training, yes. It is definitely possible, in training and in fine-tuning, to make an LLM resistant to engaging in the role-playing requested in the prompt.

7. whattheheckheck ◴[] No.45076386{3}[source]
It's the next iteration of the pewdiepipeline. The end of the jokes is genocide. Not a joking matter.