
361 points | mseri | 1 comment
thot_experiment No.46002543
Qwen3-30B-VL is going to be fucking hard to beat as a daily driver. It's so good for the base 80% of tasks I want an AI for, and holy fuck is it fast: 90 tok/s on my machine, and I pretty much keep it in VRAM permanently. I think this sort of work is important and I'm really glad it's being done, but in terms of something I want to use every day, there's no way a dense model can compete unless it's smart as fuck. Even dumb models like Qwen3-30B get a lot of stuff right, and not having to wait is amazing.
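
For illustration, a minimal sketch of keeping a quantized model resident in VRAM via the llama-cpp-python bindings; the GGUF path and quant are placeholder assumptions, not the commenter's actual setup:

    # Load a local GGUF quant once and keep it resident for the process lifetime.
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(
        model_path="./qwen3-30b-a3b-q4_k_m.gguf",  # hypothetical local quant
        n_gpu_layers=-1,  # offload every layer to the GPU, so weights stay in VRAM
        n_ctx=8192,
    )

    # Repeated calls reuse the loaded weights, so there is no per-query load cost.
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize this commit message."}]
    )
    print(out["choices"][0]["message"]["content"])
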
replies(3): >>46002752 >>46005422 >>46005940
andai No.46005422
I'm out of the loop... so Qwen3-30B-VL is smart and Qwen3-30B is dumb, and that's down to architecture rather than size?
replies(2): >>46005995 >>46008928
thot_experiment No.46008928
Ahaha, sorry, that was unclear. While I think the VL version is maybe a bit more performant, by "dumb" I meant any low-quant, small model you're going to run locally; a "smart" model in my book is something like Opus 4.1 or Gemma 3.

I basically class LLM queries into two categories: stuff I expect most models to get, and stuff I expect only the smartest models to have a shot at getting right. There's some middle ground that a quantized model running locally might not get, but that something dumb-but-acceptable like Sonnet 4.5 or Kimi K2 might be able to handle.

I generally just stick to the two extremes and route my queries accordingly. I've been burned by Sonnet 4.5/GPT-5 too many times to trust them.
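
For illustration, a minimal Python sketch of this kind of two-tier routing, assuming an OpenAI-compatible local endpoint (such as a llama-server instance) and a hosted frontier model; is_hard() and the model identifiers are placeholder assumptions:

    from openai import OpenAI  # pip install openai

    local = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
    frontier = OpenAI()  # reads OPENAI_API_KEY from the environment

    def is_hard(prompt: str) -> bool:
        # Placeholder heuristic: escalate long or proof-style queries.
        return len(prompt) > 2000 or "prove" in prompt.lower()

    def ask(prompt: str) -> str:
        client, model = (frontier, "gpt-5") if is_hard(prompt) else (local, "qwen3-30b-vl")
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

In practice the routing rule is whatever you trust: per-task defaults, a keyword list, or simply which window you type into.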

replies(1): >>46010808
thot_experiment No.46010808
Sorry, I meant Gemini 3.