
314 points by pretext | 1 comment
sosodev ◴[] No.46220123[source]
Does Qwen3-Omni support real-time conversation like GPT-4o? Looking at their documentation, it doesn't seem like it does.

Are there any open weight models that do? I'm not talking about a speech-to-text -> LLM -> text-to-speech pipeline, btw; I mean a true voice <-> language model.

edit:

It does support real-time conversation! Has anybody here gotten that to work on local hardware? I'm particularly curious whether anyone has run it on a non-Nvidia setup.
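To make the distinction above concrete, here's a minimal sketch of the two architectures. All function names and the string stand-ins are hypothetical, purely to show that a cascade chains three separate models (so each turn pays three stages of latency) while an end-to-end voice model maps audio to audio in one step:

```python
# Hypothetical stand-ins: each f-string wrapper marks where a real model
# would run. Not real APIs for any of the systems discussed here.

def cascaded_turn(audio: str) -> str:
    """speech-to-text -> LLM -> text-to-speech: three separate models in series."""
    text_in = f"stt({audio})"      # stand-in for a speech-to-text model
    text_out = f"llm({text_in})"   # stand-in for a text-only LLM
    return f"tts({text_out})"      # stand-in for a text-to-speech model

def end_to_end_turn(audio: str) -> str:
    """One model consumes and emits audio directly (the GPT-4o-style case)."""
    return f"voice_lm({audio})"

print(cascaded_turn("hello"))    # -> tts(llm(stt(hello)))
print(end_to_end_turn("hello"))  # -> voice_lm(hello)
```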

replies(4): >>46220228 #>>46222544 #>>46223129 #>>46224919 #
red2awn ◴[] No.46222544[source]
None of the inference frameworks (vLLM/SGLang) supports the full model, let alone on non-Nvidia hardware.
replies(3): >>46223310 #>>46223630 #>>46226911 #
sosodev ◴[] No.46223310[source]
That's unfortunate but not too surprising. This type of model is very new to the local hosting space.