
314 points | pretext
gardnr ◴[] No.46220742[source]
This is a 30B-parameter MoE with 3B active parameters, and the successor to their previous 7B omni model. [1]

You can expect this model to have similar performance to the non-omni version. [2]

There aren't many open-weights omni models so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.

1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B

2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct
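The 30B-total / 3B-active split matters in practice: all experts must be resident in memory, but only the active parameters are used per token. A back-of-the-envelope sketch, assuming fp16/bf16 weights (2 bytes per parameter); the numbers are illustrative, not measured:

```python
# Back-of-the-envelope for a 30B-total / 3B-active MoE.
# Assumes fp16/bf16 weights (2 bytes per parameter); illustrative only.

BYTES_PER_PARAM = 2  # fp16/bf16

total_params = 30e9   # every expert must sit in memory
active_params = 3e9   # parameters actually used per token

# Memory footprint is set by total parameters, not active ones.
weights_gb = total_params * BYTES_PER_PARAM / 1e9  # 60.0 GB of weights

# Per-token compute is roughly 2 FLOPs per active parameter,
# so inference cost resembles a 3B dense model.
flops_per_token = 2 * active_params  # 6e9 FLOPs/token

print(f"weights: ~{weights_gb:.0f} GB, ~{flops_per_token / 1e9:.0f} GFLOPs/token")
```

So it needs the VRAM of a 30B model but runs at roughly the speed of a 3B one, which is why people find these MoE releases attractive for interactive use.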

replies(7): >>46221003 #>>46221130 #>>46221363 #>>46221587 #>>46222499 #>>46222576 #>>46229484 #
olafura ◴[] No.46221130[source]
Looks like it's not open source: https://www.alibabacloud.com/help/en/model-studio/qwen-omni#...
replies(1): >>46221209 #
coder543 ◴[] No.46221209[source]
No... that website is not helpful. If you take it at face value, it is claiming that the previous Qwen3-Omni-Flash wasn't open either, but that seems wrong? It is very common for these blog posts to get published before the model weights are uploaded.
replies(1): >>46222530 #
red2awn ◴[] No.46222530[source]
The previous -Flash weights are closed source. They do have weights for the original model, which is slightly behind in performance: https://huggingface.co/Qwen/Qwen3-Omni-30B-A3B-Instruct
replies(1): >>46222889 #
coder543 ◴[] No.46222889[source]
Based on things I had read over the past several months, Qwen3-Flash seemed to just be a weird marketing term for the Qwen3-Omni-30B-A3B series, not a different model. If they are not the same, then that is interesting/confusing.
replies(1): >>46223113 #
red2awn ◴[] No.46223113[source]
It is an in-house, closed-weight model for their own chat platform, mentioned in Section 5 of the original paper: https://arxiv.org/pdf/2509.17765

I've seen it in their online materials too but can't seem to find it now.