I'm intrigued. Is this not just done with a phonemizer?
    from phonemizer.phonemize import phonemize

    text = "hello world"

    # Phonemize the same text with several English espeak voices
    variations = [
        phonemize(text, backend="espeak", language="en-us", strip=True),
        phonemize(text, backend="espeak", language="en-gb", strip=True),
        phonemize(text, backend="espeak", language="en-au", strip=True),
    ]

    for i, variation in enumerate(variations, 1):
        print(f"Variation {i}: {variation}")

I mean, espeak isn't the best, but a lot of folks in the ASR/speech world are still using it, right?

(NB: If you are on iOS, check out the built-in one: Settings -> Accessibility -> Spoken Content -> Pronunciations. When you add an entry, it can phonemize what you say to IPA. If someone can tell me which SDK/API they use for that, I'd love to know.)
replies(1):