
314 points pretext | 2 comments
sosodev ◴[] No.46220123[source]
Does Qwen3-Omni support real-time conversation like GPT-4o? Looking at their documentation it doesn't seem like it does.

Are there any open-weight models that do? I'm not talking about a speech-to-text -> LLM -> text-to-speech pipeline, btw; I mean a real voice <-> language model.

edit:

It does support real-time conversation! Has anybody here gotten that to work on local hardware? I'm particularly curious whether anybody has run it on a non-NVIDIA setup.

replies(4): >>46220228 #>>46222544 #>>46223129 #>>46224919 #
1. dsrtslnd23 ◴[] No.46220228[source]
It seems to be able to do native speech-to-speech.
replies(1): >>46220381 #
2. sosodev ◴[] No.46220381[source]
It does for sure. I did some more digging and it does real-time too. That's fascinating.