504 points Terretta | 38 comments
NitpickLawyer ◴[] No.45066063[source]
Tested this yesterday with Cline. It's fast, works well with agentic flows, and produces decent code. No idea why this thread is so negative (also got flagged while I was typing this?) but it's a decent model. I'd say it's at or above gpt5-mini level, which is awesome in my book (I've been maining gpt5-mini for a few weeks now, does the job on a budget).

Things I noted:

- It's fast. I tested it in EU tz, so ymmv

- It does agentic in an interesting way. Instead of editing a file whole or in many places, it does many small passes.

- Had a feature take ~110k tokens (parsing html w/ bs4). Still finished the task. Didn't notice any problems at high context.

- When things didn't work first try, it created a new file to test, did all the mocking / testing there, and then once it worked edited the main module file. Nice. GPT5-mini would oftentimes edit working files, and then get confused and fail the task.

All in all, not bad. At the price point it's at, I could see it as a daily driver. Even agentic stuff w/ opus + gpt5 high as planners and this thing as an implementer. It's fast enough that it might be worth setting it up in parallel and basically replicate pass@x from research.
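A minimal sketch of that parallel idea, assuming a hypothetical `attempt` function that would call the model and a hypothetical `passes` check that would run the tests (both are stand-ins here, not any real provider API): run k independent attempts at once and keep the first candidate that passes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional


def first_passing(attempt: Callable[[int], str],
                  passes: Callable[[str], bool],
                  k: int) -> Optional[str]:
    """Run k independent attempts in parallel; return the first that passes."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        futures = [pool.submit(attempt, seed) for seed in range(k)]
        for future in as_completed(futures):
            candidate = future.result()
            if passes(candidate):
                return candidate
    return None  # none of the k attempts passed


# Hypothetical stand-in for a model call: only every third seed "gets it right".
def fake_attempt(seed: int) -> str:
    if seed % 3 == 2:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


# Hypothetical stand-in for a test suite: execute the candidate and check one case.
def fake_passes(code: str) -> bool:
    scope: dict = {}
    exec(code, scope)
    return scope["add"](2, 3) == 5


result = first_passing(fake_attempt, fake_passes, k=4)
```

The trade-off is cost: k attempts use roughly k times the tokens, which only makes sense when the per-token price is low.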

IMO it's good to have options at every level. Having many providers fight for the market is good, it keeps them on their toes, and brings prices down. GPT5-mini is at 2$/MTok, this is at 1.5$/MTok. This is basically "free", in the grand scheme of things. I don't get the negativity.

replies(10): >>45066728 #>>45067116 #>>45067311 #>>45067436 #>>45067602 #>>45067936 #>>45068543 #>>45068653 #>>45068788 #>>45074597 #
coder543 ◴[] No.45067311[source]
Qwen3-Coder-480B hosted by Cerebras is $2/Mtok (both input and output) through OpenRouter.

OpenRouter claims Cerebras is providing at least 2000 tokens per second, which would be around 10x as fast, and the feedback I'm seeing from independent benchmarks indicates that Qwen3-Coder-480B is a better model.

replies(2): >>45067631 #>>45067760 #
stocksinsmocks ◴[] No.45067760[source]
There is a national superset of “NIH” bias that I think will impede adoption of Chinese-origin models for the foreseeable future. That’s a shame because by many objective metrics they’re a better value.
replies(1): >>45068189 #
1. dlachausse ◴[] No.45068189[source]
In my case it's not NIH, but rather that I don't trust or wish to support my nation's largest geopolitical adversary.
replies(4): >>45070723 #>>45070873 #>>45071387 #>>45075162 #
2. bigyabai ◴[] No.45070723[source]
Your loss. Qwen3 A3B replaced ChatGPT for me entirely; it's hard for me to imagine going back to using remote models when I can load finetuned and uncensored models at will.

Maybe you'd find consolation in using Apple or Nvidia-designed hardware for inference on these Chinese models? Sure, the hardware you own was also built by your "nation's largest geopolitical adversary" but that hasn't seemed to bother you much.

replies(2): >>45071415 #>>45073708 #
3. mft_ ◴[] No.45070873[source]
Genuine question: how does downloading an open-weight model (Qwen in this case) and running it either locally or via a third-party service benefit China?
replies(2): >>45071290 #>>45074866 #
4. tipsysquid ◴[] No.45071290[source]
From an adversarial / defensive position: the model weights and training data were groomed and known; therefore, the output is potentially predictable. This could be an advantage to the nation-state above the corpo.
replies(1): >>45072514 #
5. ulfw ◴[] No.45071387[source]
"largest geopolitical adversary"

I can't believe Americans all are falling for propaganda like this. So Russia is all fine now huh. You know the country you literally had nuclear warheads pointed at for decades and decades and decades on end.

replies(3): >>45071516 #>>45071933 #>>45072313 #
6. bigyabai ◴[] No.45071464{3}[source]
Go interrogate it for yourself: https://huggingface.co/huihui-ai/Huihui-Qwen3-30B-A3B-Instru...

In my experience, abliterated models will typically respond to any of those questions without hesitation. Here's a sample of a response to your last question:

  The resemblance between Chinese President **Xi Jinping** and the beloved cartoon character **Winnie the Pooh** is both visually striking and widely observed—so much so that it has become a cultural phenomenon. Here’s why Xi Jinping *looks* like Winnie the Pooh:

  ###  **1. Facial Features: A Perfect Match**
  | Feature | Winnie the Pooh | Xi Jinping | [...]
7. girvo ◴[] No.45071516[source]
Not that I care either way, but China is far larger in economy, military and population than Russia is. So "largest adversary" is correct, and it doesn't take away from the danger that Russia's government continues to pose (directly, in my extended family's case in eastern Ukraine)
replies(1): >>45071649 #
8. acoustics ◴[] No.45071604{3}[source]
> Do they recognize the existence of Taiwan as an independent nation?

<0.5% of humanity lives in a country that recognizes Taiwan. I'm not sure what answer you expect from a chatbot.

9. ◴[] No.45071649{3}[source]
10. dlachausse ◴[] No.45071933[source]
Russia is the successor state of a former failed superpower. China is a rising superpower with a large, advanced military and a strong industrial base.

There’s no comparison. China is a far greater threat to the West than Russia.

replies(1): >>45072597 #
11. anticodon ◴[] No.45072313[source]
The fact that China is one of the largest foreign holders of US debt actually makes it scarier than nuclear warheads.

If China decided to sell its US treasuries, it would be more devastating to the US economy than the effect of 10 nuclear strikes.

replies(5): >>45072326 #>>45074085 #>>45074186 #>>45075159 #>>45081823 #
12. hollerith ◴[] No.45072326{3}[source]
That is absurd!
replies(1): >>45073772 #
13. AnonymousPlanet ◴[] No.45072514{3}[source]
This is also true for any US model from a European perspective.
replies(1): >>45074447 #
14. ulfw ◴[] No.45072597{3}[source]
What is it threatening to do to the US?

Or, for you, is being able to threaten already a threat? If so, why did American companies invest in China so eagerly for decades, with US government support?

replies(1): >>45073535 #
15. dlachausse ◴[] No.45073535{4}[source]
If they take Taiwan that would be very disruptive to the US and the rest of the world. They have made credible threats to do that.

How does Russia threaten the United States? They can’t even take over Ukraine.

replies(4): >>45073965 #>>45075041 #>>45076129 #>>45079815 #
16. wickedsight ◴[] No.45073708[source]
How did it replace ChatGPT for you? I'm running Qwen3 Coder locally and in no way does it compare to ChatGPT. In agentic workflows it fails almost every time. Maybe I'm doing something wrong, but I'm falling back to OpenAI all the time.
replies(1): >>45075932 #
17. honeybadger1 ◴[] No.45073772{4}[source]
it's a fact?
replies(1): >>45074033 #
18. fauigerzigerk ◴[] No.45073965{5}[source]
>How does Russia threaten the United States?

By supporting China and pointing nuclear warheads at the US?

19. fauigerzigerk ◴[] No.45074033{5}[source]
The fact is that weapons kill people. Treasuries are just promises. China cannot dump treasuries without hurting its own economy at least as much as it hurts the US.

They would be incinerating their own foreign exchange reserves just to cause a spike in US interest rates and/or inflation.

replies(1): >>45074707 #
20. greyw ◴[] No.45074085{3}[source]
China owns 2.1% of the total outstanding US debt. If you include their holdings through Belgium and Luxembourg it is maybe 5%. That is something, but nothing you should lose sleep over.

Japan owns about 3.1% of the US debt, for comparison.

21. torginus ◴[] No.45074186{3}[source]
Yeah and all that Tesla stock I own makes me want to blow up one of their factories and crash the stock price
22. rightbyte ◴[] No.45074447{4}[source]
And for any US model from a US perspective. Why is it assumed that states are aligned with themselves, like some sort of Civ III player that is coherent and self-contained...
23. honeybadger1 ◴[] No.45074707{6}[source]
Neither Russia nor China has ever deployed nuclear weapons against civilian populations, a distinction held solely by the United States. Their reasons for restraint diverge significantly, rooted in distinct strategic and cultural priorities, yet China’s rising global influence positions it as a greater long-term threat to the United States than Russia, despite Russia’s more overt aggression.

Russia’s behavior, exemplified by the 2014 annexation of Crimea and the 2022 invasion of Ukraine, reflects an aggressive posture driven by a desire to counter NATO’s eastward expansion and maintain regional dominance. However, its economic challenges, including sanctions, energy export dependence, and a GDP of approximately $2.1 trillion in 2023 (World Bank), constrain its global reach, rendering it a struggling, though resilient, power. With the world’s largest nuclear arsenal, Russia’s restraint in nuclear use stems from a pragmatic focus on national survival. Its actions prioritize geopolitical relevance over a quixotic pursuit of Soviet-era glory, but its declining economic and demographic strength limits its capacity to challenge the United States on a global scale.

In contrast, China’s non-use of nuclear weapons aligns with its cultural and strategic emphasis on economic expansion over territorial conquest. Through initiatives like the Belt and Road Initiative, which has invested over $1.2 trillion globally since 2013, China has built a network of economic influence. Its military modernization, backed by a $292 billion defense budget in 2023 (SIPRI) and a nuclear arsenal projected to reach 1,000 warheads by 2030, complements this economic dominance. While China’s “no first use” nuclear policy, established in 1964, reflects a commitment to strategic stability, its assertive actions, such as militarizing the South China Sea and pressuring Taiwan, signal a willingness to use force to secure economic and territorial interests. Unlike Russia’s regionally focused aggression, China’s global economic leverage, technological advancements, and growing military capabilities pose a more systemic challenge to U.S. primacy, particularly in critical domains like trade, technology, and Indo-Pacific influence.

replies(2): >>45074883 #>>45074911 #
24. throw10920 ◴[] No.45074866[source]
Genuine answer: the model has been trained by companies that are required by law to censor them to conform to PRC CCP party lines, including rejection of consensus reality such as Tiananmen Square[1].

Yes, the censorship for some topics currently doesn't appear to be any good, but it does exist, will absolutely get better (both harder to subvert and more subtle), and makes the models less trustworthy than those from countries (US, EU, Sweden, whatever) that don't have that same level of state control. (note that I'm not claiming that there's no state control or picking any specific other country)

That's the downside to the user. To loop that back to your question, the upside to China is soft power (the same kind that the US has been flushing away recently). It's pretty similar to TikTok - if you have an extremely popular thing that people spend hours a day on and start to filter their life through, and you can influence it, that's a huge amount of power - even if you don't make any money off of it.

Now, to be fair to the context of your question, there isn't nearly as much soft power you can get from a model that people use primarily for coding - that I'm less concerned about.

[1] https://www.tomsguide.com/ai/i-just-outsmarted-deepseeks-cen...

replies(1): >>45076150 #
25. throw10920 ◴[] No.45074883{7}[source]
Please do not post LLM-generated comments.
replies(2): >>45075984 #>>45077697 #
26. fauigerzigerk ◴[] No.45074911{7}[source]
I don't see the relevance of what you are saying.

You claimed that it was a fact that selling some bonds would be more devastating than 10 actual nuclear strikes.

We are talking about the effect of the strikes not about their likelihood. You completely changed the subject.

27. lern_too_spel ◴[] No.45075041{5}[source]
By destabilizing Western democracies, which they have proven quite adept at. https://en.wikipedia.org/wiki/Foundations_of_Geopolitics
28. bubbleRefuge ◴[] No.45075159{3}[source]
Absolutely false. Worst case is the dollar going down. Interest rates are exogenous and controlled by the Fed, which can buy all the treasuries in the world at a moment's notice. The treasury securities held by China are their problem, not the US's.
29. hedora ◴[] No.45075162[source]
So, which model providers are supporting the US?

Multiple domestic providers are actively helping dismantle US-based science, research, public health, emergency response, democratic elections, etc.

30. evilduck ◴[] No.45075932{3}[source]
It feels to me like it could replace ChatGPT 3.5 from the perspective of comparing it to their web chat interface if you were just asking about programming things two years ago, but the world moved on and you can do a lot more than just talk with a model and copy paste code now.

Having Qwen3 Coder's A3B available for chat-oriented coding conversations is indeed amazing for what it is and for being local and free, but I also struggled to get useful agentic tools to reliably work with it (a fair number of tool calls fail or start looping, even with correct and advised settings, and I tried using Cline, Roo, Continue and their own Qwen Code CLI). Even when I do get it to work for a few tasks in a row, I don't have the hardware to run at comparable speed or manage the massive context sizes of a hosted frontier model. And buying capable enough hardware costs about as much as many years of paying for top tier hosted models.

31. llbbdd ◴[] No.45075984{8}[source]
These replies on every comment that may or may not be LLM-generated are much worse.
replies(1): >>45081829 #
32. Paradigma11 ◴[] No.45076129{5}[source]
Russia sees itself as a superpower and the only way to prove this to its population is by being in constant conflict with other perceived superpowers.
33. criley2 ◴[] No.45076150{3}[source]
As a counterpoint: Using a foreign model means the for-domestic-consumption censorship will not affect you much. Qwen is happy to talk about MAGA, slavery, the Holocaust, or any other "controversial" western topic.

However, American models (just like Chinese models) are heavily censored according to the society. ChatGPT, Claude, Gemini, are all aggressively censored to meet western expectations.

So in essence, Chinese models should be less censored than western models for western topics.

34. honeybadger1 ◴[] No.45077697{8}[source]
I didn't use an LLM to craft my retort, and in my opinion, I certainly didn't change the subject either. Why on earth bother fretting over hypotheticals that are never going to happen? Ten nuclear bombs dropping is precisely as consequential as none at all, since it's not happening, and there's zero historical precedent for such nonsense anyway.
35. ulfw ◴[] No.45079815{5}[source]
If you believe Russia is not in an active cyber war with the West, I've got a bridge to Ukraine to sell you.
36. wqaatwt ◴[] No.45081823{3}[source]
> more devastating to the US economy

It wouldn’t be that great for China either..

37. wqaatwt ◴[] No.45081829{9}[source]
So a single line of spam is worse than 5 paragraphs?
replies(1): >>45084157 #
38. llbbdd ◴[] No.45084157{10}[source]
Worse than five paragraphs of information? Yes. If there's something wrong with the content, discuss that. OP claims below anyway that no LLM was used, and that reply is only necessary because of this kind of witch hunt spam, so it becomes overall more noise than just one comment anyway.