
634 points david927 | 1 comments

What are you working on? Any new ideas that you're thinking about?
Rendello ◴[] No.41348643[source]
I've been building a web extension for an Inuit language (ᐃᓄᒃᑎᑐᑦ ᖃᓂᐅᔮᖅᐸᐃᑦ).

Most Inuit in Canada speak Inuktitut, a language with long words and two writing systems: the Latin alphabet and syllabics. Syllabics are specially adapted for Inuktitut and well-loved by the Inuit, but unfortunately can be a pain to input on a computer, so oftentimes the more cumbersome Latin alphabet is used in casual writing.

Tom Scott has an awesome video about how syllabics work[1], but briefly, the shape of the character determines its initial consonant sound, and the rotation determines its vowel sound. So ᐱ = pi, ᐳ = pu, ᐸ = pa, ᑎ = ti, ᑐ = tu, ᑕ = ta, etc. The word "Inuktitut" becomes ᐃᓄᒃᑎᑐᑦ in syllabics.
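
To make the shape-plus-rotation idea concrete, here is a minimal sketch of a Latin-to-syllabics pass in Rust. The table is a tiny hand-picked subset built from the characters mentioned above, and `TABLE` and `to_syllabics` are hypothetical names for illustration, not the extension's actual code. It ignores long vowels, dialects, and most of the consonant series.

```rust
/// Tiny Latin-to-syllabics table (illustrative subset only, an assumption).
/// Longer keys come first so that "ti" wins over the bare final "t".
const TABLE: &[(&str, &str)] = &[
    ("pi", "ᐱ"), ("pu", "ᐳ"), ("pa", "ᐸ"),
    ("ti", "ᑎ"), ("tu", "ᑐ"), ("ta", "ᑕ"),
    ("ni", "ᓂ"), ("nu", "ᓄ"),
    ("i", "ᐃ"), ("u", "ᐅ"),
    ("k", "ᒃ"), ("t", "ᑦ"),
];

/// Greedy left-to-right transliteration.
/// Returns None as soon as some part of the word is not covered by TABLE.
fn to_syllabics(word: &str) -> Option<String> {
    let lower = word.to_lowercase();
    let mut rest = lower.as_str();
    let mut out = String::new();
    while !rest.is_empty() {
        // First table entry that is a prefix of the remaining text.
        let &(latin, syllabic) = TABLE.iter().find(|entry| rest.starts_with(entry.0))?;
        out.push_str(syllabic);
        rest = &rest[latin.len()..];
    }
    Some(out)
}
```

With this table, `to_syllabics("Inuktitut")` walks through i, nu, k, ti, tu, t, and a word with letters outside the table simply returns `None`.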

Transliterating between the two is fairly simple, but there are edge cases around dialects and whatnot. The more interesting problem from a technical perspective is having a web extension that can detect Inuktitut on a web page (in either writing system), and transliterate that into whatever writing system the user desires, whilst never accidentally transliterating the other text on the page ("inhabitants" could be picked up and transliterated as "ᐃᓐᕼᐊᖯᐃᑕᓐᑦᔅ", for example, even though that makes no sense).
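
As a rough illustration of the detection side, the sketch below classifies a single word. It assumes two things that are not from the project itself: that syllabics can be recognised by the Unified Canadian Aboriginal Syllabics block (U+1400 to U+167F), and that a Latin-script candidate uses only a restricted letter set (`LATIN_LETTERS` is a guess at a standard-orthography subset, and some dialects use more letters). `Script` and `classify` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Script {
    Syllabics,
    LatinCandidate,
    Other,
}

/// Very rough per-word classifier (a sketch, not the extension's heuristic).
fn classify(word: &str) -> Script {
    // Assumed letter set for Latin-script Inuktitut; dialects may differ.
    const LATIN_LETTERS: &str = "aiugjklmnpqrstv";

    if word.is_empty() {
        return Script::Other;
    }
    if word.chars().all(|c| ('\u{1400}'..='\u{167F}').contains(&c)) {
        // Every character sits in the Canadian Aboriginal Syllabics block.
        Script::Syllabics
    } else if word
        .chars()
        .all(|c| LATIN_LETTERS.contains(c.to_ascii_lowercase()))
    {
        // Only letters from the assumed set: possibly Inuktitut.
        Script::LatinCandidate
    } else {
        Script::Other
    }
}
```

Under these assumptions "inhabitants" is rejected because of the "h" and "b", but plenty of short English words would still pass, so a real detector needs more context than a letter filter.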

The project is mostly using Rust via WebAssembly, which has been a lot of fun to work with and has let me do some awesome things, like avoiding heap allocation and using compile-time hashmaps to do conversions on the text. The build system has to do a lot and I eventually settled on Python. Right now I'm trying to wrangle JS and the DOM (there are a lot of edge cases to deal with), and that's been difficult as it's not my wheelhouse.
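
For a flavour of what an allocation-free lookup can look like, here is a sketch that swaps the compile-time hashmap for a plain `match` in a `const fn`, since that needs only the standard library. The mapping is fixed at compile time and the lookup returns `&'static str` values, so nothing touches the heap. `syllabic_to_latin` and `to_latin` are hypothetical names, and the table is again a small illustrative subset.

```rust
/// Syllabic character to Latin text, resolved by a compile-time match.
/// Stand-in for a compile-time hashmap; covers only a few characters.
const fn syllabic_to_latin(c: char) -> Option<&'static str> {
    match c {
        'ᐃ' => Some("i"),
        'ᐅ' => Some("u"),
        'ᐱ' => Some("pi"),
        'ᐳ' => Some("pu"),
        'ᐸ' => Some("pa"),
        'ᑎ' => Some("ti"),
        'ᑐ' => Some("tu"),
        'ᑕ' => Some("ta"),
        'ᓂ' => Some("ni"),
        'ᓄ' => Some("nu"),
        'ᒃ' => Some("k"),
        'ᑦ' => Some("t"),
        _ => None,
    }
}

/// Lazily yields one Latin chunk per input character, or None for
/// characters outside the table. The iterator itself does not allocate.
fn to_latin<'a>(word: &'a str) -> impl Iterator<Item = Option<&'static str>> + 'a {
    word.chars().map(syllabic_to_latin)
}
```

The caller decides where the output goes, for example a fixed buffer on the WebAssembly side, or a `String` when allocation is acceptable.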

1. https://www.youtube.com/watch?v=xW4hI_METac&t=15s

replies(2): >>41350966 #>>41362256 #
seism ◴[] No.41362256[source]
Could your project also extend beyond DOM, to Canvas and WebGL code? I could imagine some very interesting creative uses in a game engine setting. Also: check out Aya by Cohere, a Canadian AI company.
replies(1): >>41363550 #
Rendello ◴[] No.41363550[source]
I think you may have replied to the wrong comment or misunderstood my project. It's mostly about finding text from a certain language based on some (fairly simple) heuristics, then converting it from one script to another in a systematic way, with some options a user can set. It doesn't have much relevance to LLMs or game engines, unless your game has an Inuktitut translation, in which case it'd be better to transliterate ahead of time ;)
replies(1): >>41365342 #
1. seism ◴[] No.41365342[source]
Ah, thanks for clarifying!