
579 points paulpauper | 8 comments
1. fxtentacle No.43603864
I'd say most of the recent AI model progress has been on price.

A 4-bit quant of QwQ-32B is surprisingly close to Claude 3.5 in coding performance, but it's small enough to run on a consumer GPU, which brings deployment cost down to about $0.10 per hour (from $12+ for models requiring 8x H100).
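The VRAM arithmetic behind "fits on a consumer GPU" is easy to sketch. Rough numbers only: real quants add overhead for quantization scales, the KV cache, and activations.

```python
# Back-of-the-envelope weight size for a quantized model.
# Rough figures: ignores quantization scales, KV cache, and activations.

def quant_weight_gb(n_params_billions: float, bits_per_weight: float) -> float:
    """Approximate in-VRAM size of the quantized weights, in GB."""
    # 1e9 params * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return n_params_billions * bits_per_weight / 8

print(quant_weight_gb(32, 4))   # 16.0 GB -> fits a single 24 GB consumer card
print(quant_weight_gb(32, 16))  # 64.0 GB -> already needs multiple data-center GPUs
```

So the same 32B model that needs a multi-GPU server at fp16 squeezes onto one consumer card at 4 bits, which is where the ~100x price drop comes from.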

replies(2): >>43603980 #>>43604115 #
2. shostack No.43603980
Yeah, I'm thinking of this from a Wardley map standpoint.

What innovation opens up when AI gets sufficiently commoditized?

replies(2): >>43604105 #>>43604127 #
3. mentalgear No.43604105
Brute force: brute-force everything, at least in the domains where you can have automatic verification.
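A minimal sketch of that generate-and-verify loop, using a toy domain (fitting integer-coefficient lines to points) as a stand-in for anything with a mechanical pass/fail check such as tests, compilers, or proof checkers:

```python
# "Brute force + automatic verification": enumerate cheap candidates and
# keep only those an automatic checker accepts. The domain here is a toy;
# the pattern is the same whenever verification is mechanical.
from itertools import product

def verify(candidate, examples):
    """Automatic check: does y = a*x + b hold for every example pair?"""
    a, b = candidate
    return all(a * x + b == y for x, y in examples)

def brute_force_search(examples, coeff_range=range(-10, 11)):
    """Enumerate all (a, b) pairs and return the first that verifies."""
    for candidate in product(coeff_range, coeff_range):
        if verify(candidate, examples):
            return candidate
    return None

print(brute_force_search([(0, 3), (1, 5), (2, 7)]))  # (2, 3), i.e. y = 2x + 3
```

Cheap generation plus a trustworthy verifier is exactly the combination that commoditized models make economical: the model is the candidate generator, and the verifier filters the noise.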
4. xiphias2 No.43604115
Have you compared it with 8-bit QwQ-17B?

In my evals, 8-bit quantized smaller Qwen models were better, but then again, evaluating is hard.

replies(1): >>43608343 #
5. bredren No.43604127
One thing I’ve seen is large enterprises extracting money from consumers by putting administrative burden on them.

For example, you can see this in health insurance reimbursements and wireless carriers' plan changes (e.g., Verizon's shift from its Do More plans to what they have now).

Companies basically set up circumstances where consumers lose small amounts of money, on a recurring basis or sporadically enough, that people will just pay rather than endure the maze of phone calls, website navigation, and time suck needed to recover funds that are due to them or that shouldn't have been taken in the first place.

I’m hopeful that well-commoditized AI will give consumers a fighting chance against this and other types of disenfranchisement that seem to be increasingly normalized by companies whose consultants do nothing but optimize for their own financial position.

6. redrove No.43608343
There’s no QwQ 17B that I’m aware of. Do you have an HF link?
replies(1): >>43611902 #
7. xiphias2 No.43611902
You're right, sorry... I had only tested Qwen models, not QwQ; I see QwQ only comes in 32B.
replies(1): >>43611938 #
8. redrove No.43611938
No worries. QwQ is the thinking model from Qwen; it’s a common misconception.

I think they should’ve named it something else.