
Phonetic Matching

(smoores.dev)
77 points raybb | 3 comments
asveikau ◴[] No.42172434[source]
The idea that "shore" and "sure" are pronounced "almost identically" would depend pretty heavily on your accent. The vowel is pretty different to me.

Also, the matches for "sorI" and "sorY" would seem to me to misinterpret the words as having a pronounced vowel at the end, rather than a silent one. If you're using data meant for foreign surnames, whose rules may differ from English and in which silent vowels may be very rare depending on the original language, of course you may mispronounce English words like this, reading both shore and sure as "sore-ee".
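To make the concern concrete, here is a minimal sketch of a surname-oriented phonetic code. The algorithm behind the article's matches is not named in this thread, so American Soundex is swapped in purely as a well-known stand-in; the function name `soundex` and the table `CODES` are illustrative choices, not anything from the article.

```python
# American Soundex, used here only as a stand-in for "a code built for surnames".
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character American Soundex code of a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = [first.upper()]
    prev = CODES.get(first, "")  # the first letter's code still blocks a repeat
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels (and y) have no code
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


print(soundex("shore"), soundex("sure"))
```

A code like this drops vowels entirely, so "shore" and "sure" collapse to the same key regardless of how different the vowels sound in a given accent.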

I'm sure there are much better ways to transcribe orthography to phonetics, and people have probably published libraries that do it. From some googling, it seems like some people call this type of library a phonemic transcriber or IPA transcriber.

replies(5): >>42172850 #>>42173496 #>>42177414 #>>42179389 #>>42180312 #
woodrowbarlow ◴[] No.42173496[source]
IPA is the most-used tool by linguistic researchers for encoding pronunciation in a standardized way. IPA is criticized for being a little bit anglo-centric and falls short for some languages and edge cases, but overall it performs pretty well. (learned from an ex who studies linguistics.)
replies(4): >>42173671 #>>42174382 #>>42174781 #>>42177483 #
1. lupire ◴[] No.42174781[source]
Yes, but stay aware that IPA is for pronunciations.

A word doesn't have unique pronunciation. A (Speaker, Word) pair has a pronunciation, and even those are not unique. A (Speaker, Word, Utterance) triple has a pronunciation.
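One way to picture this is as a lookup keyed by the full triple. The sketch below is only illustrative: the class `PronunciationLog`, the speaker labels, and the transcriptions are all hypothetical placeholders, not measured data.

```python
class PronunciationLog:
    """Stores observed pronunciations keyed by (speaker, word, utterance)."""

    def __init__(self):
        self._observed = {}

    def record(self, speaker, word, utterance_id, ipa):
        # The key is the whole triple, so one speaker can say one word two ways.
        self._observed[(speaker, word.lower(), utterance_id)] = ipa

    def variants(self, word, speaker=None):
        """Distinct transcriptions seen for a word, optionally for one speaker."""
        return sorted({
            ipa
            for (spk, w, _), ipa in self._observed.items()
            if w == word.lower() and (speaker is None or spk == speaker)
        })


# Hypothetical speakers and illustrative transcriptions.
log = PronunciationLog()
log.record("speaker_a", "sure", 1, "[ʃʊɹ]")
log.record("speaker_a", "sure", 2, "[ʃɔɹ]")
log.record("speaker_b", "sure", 1, "[ʃɔɹ]")
log.record("speaker_b", "shore", 1, "[ʃɔɹ]")

print(log.variants("sure"))
print(log.variants("sure", speaker="speaker_b"))
```

Keying by word alone would force a single answer per word; keying by the triple keeps the variation and lets the caller decide how much of it to collapse.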

replies(2): >>42177401 #>>42178237 #
2. jjtheblunt ◴[] No.42177401[source]
even a speaker with a specified word in a specified utterance will vary pronunciation for the context of who is listening (imitation of local accent).

(We worked on all this extensively at Motorola in 2001... then they dropped it.)

3. Funes- ◴[] No.42178237[source]
>Yes, but stay aware that IPA is for pronunciations. A word doesn't have unique pronunciation.

No. IPA encodes sounds based on various aspects of articulation. A word has unique phonemes (enclosed in forward slashes, //), but not necessarily unique sounds (allophones, enclosed in brackets, []).
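A rough sketch of that distinction, assuming a deliberately tiny and simplified allophone table for English /t/ (the names `ALLOPHONE_TO_PHONEME`, `to_phonemic`, and `to_phonetic` are made up for illustration, and real phonology is context-dependent; for instance, the flap [ɾ] can also realize /d/ in American English):

```python
# Simplified, hypothetical table: a few surface sounds that can realize English /t/.
ALLOPHONE_TO_PHONEME = {"tʰ": "t", "t": "t", "ɾ": "t", "ʔ": "t"}


def to_phonetic(phones):
    """Narrow transcription: keep the sounds as produced, in brackets."""
    return "[" + "".join(phones) + "]"


def to_phonemic(phones):
    """Broad transcription: collapse each allophone to its phoneme, in slashes."""
    phonemes = [ALLOPHONE_TO_PHONEME.get(p, p) for p in phones]
    return "/" + "".join(phonemes) + "/"


# Two illustrative ways of saying "butter": flapped and aspirated.
flapped = ["b", "ʌ", "ɾ", "ɚ"]
aspirated = ["b", "ʌ", "tʰ", "ɚ"]

print(to_phonetic(flapped), to_phonetic(aspirated))
print(to_phonemic(flapped), to_phonemic(aspirated))
```

The bracketed forms differ while the slashed forms agree, which is the sense in which the sounds vary but the phonemes do not.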