←back to thread

GPT-5.2

(openai.com)
1019 points atgctg | 1 comments | | HN request time: 0.335s | source
Show context
zug_zug ◴[] No.46235131[source]
For me the last remaining killer feature of ChatGPT is the quality of the voice chat. Do any of the competitors have something like that?
replies(15): >>46235139 #>>46235151 #>>46235193 #>>46235277 #>>46235779 #>>46236133 #>>46236236 #>>46236283 #>>46236341 #>>46236399 #>>46236665 #>>46236951 #>>46237061 #>>46237082 #>>46237617 #
1. sundarurfriend ◴[] No.46236236[source]
Are you saying ChatGPT's voice chat is of good quality? Because for me it's one of its most frustrating weaknesses. I vastly prefer voice input to typing, and would love it if the voice chat mode actually worked well.

But apart from the voices being pretty meh, it's also really bad at detecting and filtering out noise, taking vehicle sounds as breaks to start talking in (even if I'm talking much louder at the same time) or as some random YouTube subtitles (car motor = "Thanks for watching, subscribe!").

The speech-to-text is really unreliable (the single-chat Dictate feature gets about 98% of my words correct, this Voice mode is closer to 75%), and they clearly use an inferior model for the AI backend for this too: with the same question asked in this back-and-forth Voice mode and a normal text chat, the answer quality difference is quite stark: the Voice mode answer is most often close to useless. It seems like they've overoptimized it for speed at the cost of quality, to the extent that it feels like it's a year behind in answer reliability and usefulness.

To your question about competitors, I've recently noticed that Grok seems to be much better at both the speech-to-text part and the noise handling, and the voices are less uncanny-valley sounding too. I'd say they also don't have that stark a difference between text answers and voice mode answers, and that would be true but unfortunately mainly because its text answers are also not great with hallucinations or following instructions.

So Grok has the voice part figured out, ChatGPT has the backend AI reliability figured out, but neither provide a real usable voice mode right now.