(qwenlm.github.io)

544 points tosh | 1 comments | 24 Mar 25 18:35 UTC

simonw ◴[24 Mar 25 18:52 UTC] No.43464227[source]▶

Big day for open source Chinese model releases - DeepSeek-v3-0324 came out today too, an updated version of DeepSeek v3 now under an MIT license (previously it was a custom DeepSeek license). https://simonwillison.net/2025/Mar/24/deepseek/

replies(5): >>43464375 #>>43464498 #>>43464686 #>>43465383 #>>43467111 #

echelon ◴[24 Mar 25 19:20 UTC] No.43464498[source]▶

>>43464227 #

Pretty soon I won't be using any American models. It'll be a 100% Chinese open source stack.

The foundation model companies are screwed. Only shovel makers (Nvidia, infra companies) and product companies are going to win.

replies(7): >>43464607 #>>43464651 #>>43464792 #>>43466340 #>>43466493 #>>43469085 #>>43469922 #

jsheard ◴[24 Mar 25 19:32 UTC] No.43464607[source]▶

>>43464498 #

I still don't get where the money for new open source models is going to come from once setting investor dollars on fire is no longer a viable business model. Does anyone seriously expect companies to keep buying and running thousands of ungodly expensive GPUs, plus whatever they spend on human workers to do labelling/tuning, and then giving away the spoils for free, forever?

replies(12): >>43464649 #>>43464673 #>>43464679 #>>43464701 #>>43464720 #>>43464725 #>>43465054 #>>43465195 #>>43465674 #>>43467099 #>>43470575 #>>43471233 #

1. theptip ◴[24 Mar 25 19:43 UTC] No.43464701[source]▶

>>43464607 #

Yeah, this is the obvious objection to the doom. Someone has to pay to train the model that all the small ones distill from.

Companies will have to detect and police distilling if they want to keep their moat. Maybe you have to have an enterprise agreement (and arms control waiver) to get GPT-6-large API access.

Qwen2.5-VL-32B: Smarter and Lighter