The voice input can really be done however you like; the benefit of a device like the Voice PE is the on-device wake word detection.
I have an office-style desk phone (SNOM) connected to a SIP server, and I can pick up the receiver and talk to the Assistant, but you can use whatever you like to get the audio to/from HA.
With your phone, wake words are usually locked down by Apple/Google, so you can't really have it hands-free. That's the problem this device is solving: not the audio input itself, but the wake-word/hands-free input.
On an Android phone, you can replace the Google Assistant with the Home Assistant one, but you still have to activate it the usual way: press a button, launch the app, etc.