
504 points by Terretta | 1 comment
NitpickLawyer ◴[] No.45066063[source]
Tested this yesterday with Cline. It's fast, works well with agentic flows, and produces decent code. No idea why this thread is so negative (also got flagged while I was typing this?) but it's a decent model. I'd say it's at or above gpt5-mini level, which is awesome in my book (I've been maining gpt5-mini for a few weeks now, does the job on a budget).

Things I noted:

- It's fast. I tested it in EU tz, so ymmv

- It does agentic edits in an interesting way. Instead of rewriting a file wholesale or in many places at once, it makes many small passes.

- Had a feature take ~110k tokens (parsing html w/ bs4). Still finished the task. Didn't notice any problems at high context.

- When things didn't work first try, it created a new file to test, did all the mocking / testing there, and then once it worked edited the main module file. Nice. GPT5-mini would often edit working files directly, then get confused and fail the task.

All in all, not bad. At its price point, I could see it as a daily driver. Even for agentic stuff, with Opus + GPT5-high as planners and this thing as an implementer. It's fast enough that it might be worth running it in parallel and basically replicating pass@x from research.
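(For reference, pass@x is the unbiased pass@k estimator from code-generation eval work like HumanEval: the probability that at least one of k samples, drawn from n generated samples of which c passed, is correct. A minimal sketch:)

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: chance that at least one of k
    samples drawn (without replacement) from n total samples,
    of which c are correct, passes."""
    if n - c < k:
        # fewer failing samples than draws: some draw must be correct
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 2 samples, 1 correct: pass@1 = 0.5
print(pass_at_k(2, 1, 1))
```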

IMO it's good to have options at every level. Having many providers fight for the market is good, it keeps them on their toes, and brings prices down. GPT5-mini is at $2/MTok, this is at $1.5/MTok. This is basically "free", in the grand scheme of things. I don't get the negativity.

replies(10): >>45066728 #>>45067116 #>>45067311 #>>45067436 #>>45067602 #>>45067936 #>>45068543 #>>45068653 #>>45068788 #>>45074597 #
coder543 ◴[] No.45067311[source]
Qwen3-Coder-480B hosted by Cerebras is $2/Mtok (both input and output) through OpenRouter.

OpenRouter claims Cerebras is providing at least 2000 tokens per second, which would be around 10x as fast, and the feedback I'm seeing from independent benchmarks indicates that Qwen3-Coder-480B is a better model.
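(Back-of-envelope on those claims, taking the ~110k-token task from the parent comment; the ~200 tok/s baseline is my assumption, inferred from the "10x" figure:)

```python
def seconds_for_tokens(tokens: int, tok_per_s: float) -> float:
    """Wall-clock time to generate a token count at a given throughput."""
    return tokens / tok_per_s

def cost_usd(tokens: int, usd_per_mtok: float) -> float:
    """Cost of a token count at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_mtok

fast = seconds_for_tokens(110_000, 2000)  # claimed Cerebras speed: 55 s
slow = seconds_for_tokens(110_000, 200)   # assumed ~10x-slower baseline: ~9 min
price = cost_usd(110_000, 2.0)            # $2/MTok -> $0.22 for the whole task
```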

replies(2): >>45067631 #>>45067760 #
stocksinsmocks ◴[] No.45067760[source]
There is a national superset of “NIH” bias that I think will impede adoption of Chinese-origin models for the foreseeable future. That’s a shame because by many objective metrics they’re a better value.
replies(1): >>45068189 #
dlachausse ◴[] No.45068189[source]
In my case it's not NIH, but rather that I don't trust or wish to support my nation's largest geopolitical adversary.
replies(4): >>45070723 #>>45070873 #>>45071387 #>>45075162 #
bigyabai ◴[] No.45070723[source]
Your loss. Qwen3 A3B replaced ChatGPT for me entirely; it's hard for me to imagine going back to using remote models when I can load finetuned and uncensored models at will.

Maybe you'd find consolation in using Apple or Nvidia-designed hardware for inference on these Chinese models? Sure, the hardware you own was also built by your "nation's largest geopolitical adversary" but that hasn't seemed to bother you much.

replies(2): >>45071415 #>>45073708 #
dlachausse[dead post] ◴[] No.45071415[source]
[flagged]
bigyabai ◴[] No.45071464[source]
Go interrogate it for yourself: https://huggingface.co/huihui-ai/Huihui-Qwen3-30B-A3B-Instru...

In my experience, abliterated models will typically respond to any of those questions without hesitation. Here's a sample of a response to your last question:

  The resemblance between Chinese President **Xi Jinping** and the beloved cartoon character **Winnie the Pooh** is both visually striking and widely observed—so much so that it has become a cultural phenomenon. Here’s why Xi Jinping *looks* like Winnie the Pooh:

  ###  **1. Facial Features: A Perfect Match**
  | Feature | Winnie the Pooh | Xi Jinping | [...]