The hardware for a local model would cost the equivalent of years and years of a $20/mo subscription, produce lower-quality output, and run much slower.
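Back-of-envelope, with assumed numbers (a ~$2,500 build and the $20/mo plan; none of these are quotes):

```python
# How many months of a $20/mo subscription one local build costs.
# All figures are illustrative assumptions, not quotes.
build_cost = 2500.0   # assumed: 24 GB-class GPU plus the rest of the box, USD
sub_per_month = 20.0  # the subscription from the comment above
elec_per_month = 5.0  # assumed: ~350 W a couple hours/day at ~$0.15/kWh

months = build_cost / (sub_per_month - elec_per_month)
print(f"~{months:.0f} months (~{months / 12:.0f} years) to break even")
# -> ~167 months (~14 years)
```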
3.7 Thinking is an insane programming model. Maybe it cannot do an SWE's job, but it sure as hell can write functional narrow-scope programs with a GUI.
Local hosting on GPU only really makes sense if you're doing many hours of training/inference daily.
Also "many hours of inference daily" may mean you're doing your usual stuff daily while running some processing in the background that takes hours/days or you've put together some reactive automation that runs often all the time.
ps. local training rarely makes sense.
ps. 2. Not sure where you got 50x slower from; a 4090 is actually faster than an A100, for example, and a 5090 is ~75% faster than a 4090.
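If you'd rather measure than argue from spec sheets, here's a minimal PyTorch fp16 matmul check you can run on any of these cards. It's a raw-compute proxy only; LLM inference is often memory-bandwidth-bound, so treat the numbers as rough:

```python
# Quick fp16 matmul throughput check as a rough proxy for GPU compute speed.
import time
import torch

assert torch.cuda.is_available()
n = 8192
a = torch.randn(n, n, device="cuda", dtype=torch.float16)
b = torch.randn(n, n, device="cuda", dtype=torch.float16)

for _ in range(3):  # warmup so clocks and kernels settle
    a @ b
torch.cuda.synchronize()

iters = 20
t0 = time.perf_counter()
for _ in range(iters):
    a @ b
torch.cuda.synchronize()
dt = time.perf_counter() - t0

tflops = 2 * n**3 * iters / dt / 1e12  # 2*n^3 FLOPs per matmul
print(f"{torch.cuda.get_device_name()}: {tflops:.1f} fp16 TFLOPS")
```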