
361 points mseri | 2 comments
dangoodmanUT No.46005065
What are some of the real-world applications of small models like this? Is it only on-device inference?

In most cases, models like Sonnet have been only just barely sufficient for the workloads I've done historically. Would love to know where others are finding uses for smaller models (gpt-oss-120B and below, especially small models like this one).

Maybe some really lightweight borderline-NLP classification tasks?
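
For context, the kind of thing I have in mind is roughly the sketch below, using the Hugging Face transformers pipeline; the model id is just a placeholder for any small instruct checkpoint, not a specific release.

    # Minimal sketch: zero-shot ticket triage with a small local model.
    # "your-small-instruct-model" is a placeholder, not a specific release.
    from transformers import pipeline

    generate = pipeline(
        "text-generation",
        model="your-small-instruct-model",  # placeholder: any ~7B instruct checkpoint
        device_map="auto",
    )

    LABELS = ["billing", "bug_report", "feature_request", "other"]

    def classify(ticket: str) -> str:
        prompt = (
            "Classify this support ticket as one of: "
            + ", ".join(LABELS)
            + f".\nTicket: {ticket}\nLabel:"
        )
        out = generate(prompt, max_new_tokens=5, do_sample=False)[0]["generated_text"]
        completion = out[len(prompt):].strip().lower()
        return next((label for label in LABELS if label in completion), "other")

    print(classify("I was charged twice for my subscription this month."))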

replies(3): >>46005122 #>>46005251 #>>46009108 #
fnbr No.46005251
(I’m a researcher on the post-training team at Ai2.)

7B models are mostly useful for local use on consumer GPUs. 32B could be used for a lot of applications. There are a lot of companies using fine-tuned Qwen 3 models that might want to switch to Olmo now that we have released a 32B base model.
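
For anyone who wants to try that, the base model loads with the standard transformers flow; a rough sketch follows. The repo id is a placeholder (check the model card for the real one), and a 32B model in bf16 needs on the order of 64 GB for the weights alone.

    # Rough sketch: load a 32B base checkpoint for evaluation or further fine-tuning.
    # "allenai/placeholder-olmo-32b-base" is NOT a real repo id; use the one on the model card.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "allenai/placeholder-olmo-32b-base"  # placeholder id
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo,
        torch_dtype=torch.bfloat16,  # ~64 GB of weights at 32B params in bf16
        device_map="auto",           # shard across whatever GPUs are available
    )

    inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))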

replies(2): >>46005571 #>>46010965 #
littlestymaar No.46005571
May I ask why you went for 7B and 32B dense models instead of a small MoE like Qwen3-30B-A3B or gpt-oss-20b, given how successful those MoE experiments have been?
replies(2): >>46005991 #>>46006040 #
1. fnbr No.46005991
MoEs have a lot of technical complexity and aren't well supported in the open source world. We plan to release a MoE soon(ish).

I do think that MoEs are clearly the future. I think we will release more MoEs moving forward once we have the tech in place to do so efficiently. For all use cases except local usage, I think that MoEs are clearly superior to dense models.
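
For a sense of where the complexity (and the efficiency win) comes from, here is a minimal, illustrative sketch of a top-k routed MoE feed-forward layer in PyTorch; it is not our implementation, just the general pattern.

    # Minimal sketch of a top-k routed MoE feed-forward layer (illustrative, not Olmo code).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoELayer(nn.Module):
        def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            self.router = nn.Linear(d_model, n_experts)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )

        def forward(self, x):                      # x: (tokens, d_model)
            scores = self.router(x)                # (tokens, n_experts)
            weights, idx = scores.topk(self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            # Each token only runs through its top-k experts, so compute per token
            # scales with the active experts, not the total parameter count.
            for e, expert in enumerate(self.experts):
                token_ids, slot = (idx == e).nonzero(as_tuple=True)
                if token_ids.numel() == 0:
                    continue
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
            return out

    layer = MoELayer()
    print(layer(torch.randn(4, 512)).shape)        # torch.Size([4, 512])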

replies(1): >>46010921 #
2. trebligdivad No.46010921
Even locally, MoEs are just so much faster, and they let you pick a larger or less quantized model and still get a useful speed.
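
Rough back-of-envelope for why, assuming token generation is memory-bandwidth bound (numbers are purely illustrative):

    # Back-of-envelope decode speed, assuming generation is memory-bandwidth bound:
    # every new token has to read the active weights once from memory.
    # Illustrative numbers only; real throughput depends on kernels, cache, batch size, etc.
    bandwidth_gb_s = 100        # e.g. a typical consumer memory system
    bytes_per_param = 0.5       # ~4-bit quantization

    def tokens_per_second(active_params_billion):
        weight_bytes = active_params_billion * 1e9 * bytes_per_param
        return bandwidth_gb_s * 1e9 / weight_bytes

    print(f"32B dense (~32B active): {tokens_per_second(32):5.1f} tok/s")
    print(f"30B MoE   (~3B active):  {tokens_per_second(3):5.1f} tok/s")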