
361 points | mseri | 1 comment
thot_experiment No.46002543
Qwen3-30B-VL is going to be fucking hard to beat as a daily driver. It's so good for the base 80% of tasks I want an AI for, and holy fuck is it fast: 90 tok/s on my machine, and I pretty much keep it in VRAM permanently. I think this sort of work is important and I'm really glad it's being done, but in terms of something I want to use every day, there's no way a dense model can compete unless it's smart as fuck. Even dumb models like Qwen3-30B get a lot of stuff right, and not having to wait is amazing.
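
For illustration, a minimal sketch of keeping a quantized model resident in VRAM via the llama-cpp-python bindings; the GGUF path and quant are placeholder assumptions, not the commenter's actual setup:

    # Load a local GGUF quant once and keep it resident for the process lifetime.
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(
        model_path="./qwen3-30b-a3b-q4_k_m.gguf",  # hypothetical local quant
        n_gpu_layers=-1,  # offload every layer to the GPU, so weights stay in VRAM
        n_ctx=8192,
    )

    # Repeated calls reuse the loaded weights, so there is no per-query load cost.
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize this commit message."}]
    )
    print(out["choices"][0]["message"]["content"])
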
replies(3): >>46002752 >>46005422 >>46005940
andai No.46005422
I'm out of the loop... so Qwen3-30B-VL is smart and Qwen3-30B is dumb, and that's down to architecture rather than size?
replies(2): >>46005995 >>46008928
thot_experiment No.46008928
Ahaha, sorry, that was unclear. While I think the VL version is maybe a bit more performant, by "dumb" I meant any low-quant, small model you're going to run locally; a "smart" model in my book is something like Opus 4.1 or Gemma 3.

I basically class LLM queries into two categories: stuff I expect most models to get, and stuff I expect only the smartest models to have a shot at getting right. There's some middle ground that a quantized model running locally might not get, but that something dumb-but-acceptable like Sonnet 4.5 or Kimi K2 might be able to handle.

I generally just stick to the two extremes and route my queries accordingly. I've been burned by Sonnet 4.5/GPT-5 too many times to trust them.
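
For illustration, a minimal Python sketch of this kind of two-tier routing, assuming an OpenAI-compatible local endpoint (such as a llama-server instance) and a hosted frontier model; is_hard() and the model identifiers are placeholder assumptions:

    from openai import OpenAI  # pip install openai

    local = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
    frontier = OpenAI()  # reads OPENAI_API_KEY from the environment

    def is_hard(prompt: str) -> bool:
        # Placeholder heuristic: escalate long or proof-style queries.
        return len(prompt) > 2000 or "prove" in prompt.lower()

    def ask(prompt: str) -> str:
        client, model = (frontier, "gpt-5") if is_hard(prompt) else (local, "qwen3-30b-vl")
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

In practice the routing rule is whatever you trust: per-task defaults, a keyword list, or simply which window you type into.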

replies(1): >>46010808
thot_experiment No.46010808
Sorry, I meant Gemini 3.