
314 points by pretext | 1 comment
sosodev ◴[] No.46220123[source]
Does Qwen3-Omni support real-time conversation like GPT-4o? Looking at their documentation, it doesn't seem like it does.

Are there any open weight models that do? I'm not talking about a speech-to-text -> LLM -> text-to-speech pipeline, btw; I mean a true voice <-> language model.

edit:

It does support real-time conversation! Has anybody here gotten that to work on local hardware? I'm particularly curious whether anyone has run it on a non-Nvidia setup.
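To make the distinction above concrete, here's a minimal sketch of the two architectures. All function names and the string stand-ins are hypothetical, purely to show that a cascade chains three separate models (so each turn pays three stages of latency) while an end-to-end voice model maps audio to audio in one step:

```python
# Hypothetical stand-ins: each f-string wrapper marks where a real model
# would run. Not real APIs for any of the systems discussed here.

def cascaded_turn(audio: str) -> str:
    """speech-to-text -> LLM -> text-to-speech: three separate models in series."""
    text_in = f"stt({audio})"      # stand-in for a speech-to-text model
    text_out = f"llm({text_in})"   # stand-in for a text-only LLM
    return f"tts({text_out})"      # stand-in for a text-to-speech model

def end_to_end_turn(audio: str) -> str:
    """One model consumes and emits audio directly (the GPT-4o-style case)."""
    return f"voice_lm({audio})"

print(cascaded_turn("hello"))    # -> tts(llm(stt(hello)))
print(end_to_end_turn("hello"))  # -> voice_lm(hello)
```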

replies(4): >>46220228 #>>46222544 #>>46223129 #>>46224919 #
red2awn ◴[] No.46222544[source]
None of the inference frameworks (vLLM/SGLang) supports the full model, let alone on non-Nvidia hardware.
replies(3): >>46223310 #>>46223630 #>>46226911 #
sosodev ◴[] No.46223310[source]
That's unfortunate but not too surprising. This type of model is very new to the local hosting space.