313 points | mariano54 | 1 comment

Hey HN, we're Mariano and Anton from ISSEN (https://issen.com), a foreign language voice tutor app that adapts to your interests, goals, and needs.

Demo: https://www.loom.com/share/a78e713d46934857a2dc88aed1bb100d?...

We started this company after struggling to find great tools to practice speaking Japanese and French. Having a tutor can be awesome, but there are downsides: tutors can be expensive (since you pay by the hour) and difficult to schedule, and they come with a high upfront cost (finding a tutor you like often forces you to cycle through a few that you don’t).

We wanted something that would talk with us, realistically and in full conversations, and actually help us improve. So we built it ourselves. The app relies on a custom voice AI pipeline combining STT (speech-to-text), TTS (text-to-speech), LLMs, long-term memory, interruptions, turn-taking, etc. Getting speech-to-text to work well for learners was one of the hardest parts, especially with accents, multilingual sentences, and noisy environments. We now combine Gemini Flash, Whisper, Scribe, and GPT-4o-transcribe to minimize errors and keep the conversation flowing.
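The post does not say how the outputs of the different speech-to-text engines are combined. One simple possibility, shown here only as a sketch, is a majority vote over normalized transcripts. The engine names, the `normalize` helper, and the tie-breaking rule are all assumptions made for illustration, not ISSEN's actual method.

```python
from collections import Counter
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so that
    near-identical transcripts compare equal."""
    text = unicodedata.normalize("NFKC", text).lower()
    kept = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def pick_transcript(candidates: dict[str, str]) -> str:
    """Return the transcript whose normalized form most engines agree on.

    `candidates` maps a (hypothetical) engine name to its transcript.
    Ties are broken in favor of the engine listed first.
    """
    if not candidates:
        raise ValueError("need at least one candidate transcript")
    normalized = {engine: normalize(t) for engine, t in candidates.items()}
    counts = Counter(normalized.values())
    # max() keeps the first maximal entry, so earlier engines win ties.
    best_form, _ = max(counts.items(), key=lambda kv: kv[1])
    for engine, form in normalized.items():
        if form == best_form:
            return candidates[engine]


if __name__ == "__main__":
    print(pick_transcript({
        "engine_a": "Je vais à Berlin.",
        "engine_b": "je vais à berlin",
        "engine_c": "Je vais à Merlin.",
    }))
```

A real pipeline would more likely vote per word or weight engines by confidence and language; whole-utterance voting is just the easiest version to reason about.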

We didn’t want to focus too much on gamification. In our experience, that leads to users performing well in the app, achieving long streaks and so on, without actually becoming fluent in the language they want to learn.

With ISSEN you instantly speak and immerse yourself in the language, which, while not easy, is a much more efficient way to learn.

We combine this with a word bank and SRS (spaced repetition) flashcards for new words learned in the AI voice chats, which allows very rapid improvement in both vocabulary and speaking skills. We also create custom curriculums for each student based on goals, interests, and preferences, and offer fully customizable settings like speed, turn-taking, formality, etc.
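For readers unfamiliar with SRS, here is a minimal sketch of how a flashcard's next review interval can be scheduled. It uses a simplified SM-2-style rule, which is an assumption: the post does not say which scheduling algorithm the app uses, and the `Card` fields and `review` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Card:
    word: str
    interval_days: int = 0   # days until the next review
    repetitions: int = 0     # consecutive successful reviews
    ease: float = 2.5        # growth factor for the interval


def review(card: Card, quality: int) -> Card:
    """Return the updated card after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: start the card over and show it again tomorrow.
        repetitions, interval = 0, 1
    else:
        if card.repetitions == 0:
            interval = 1
        elif card.repetitions == 1:
            interval = 6
        else:
            interval = round(card.interval_days * card.ease)
        repetitions = card.repetitions + 1
    # Easy answers raise the ease factor, hard ones lower it (floor of 1.3).
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, card.ease + delta)
    return Card(card.word, interval, repetitions, ease)


if __name__ == "__main__":
    card = Card("柏林")
    for grade in (5, 5, 5, 2):
        card = review(card, grade)
        print(card)
```

The point of the design is that words you recall easily drift out to longer and longer intervals, while a missed word immediately returns to daily review.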

App: https://issen.com (works on web, iOS, Android)

Pricing: 20 min free trial, $20–29/month (depending on duration and specific geography)

We’d love your feedback on the tech, the UX, or what you’d want from a tool like this. Thanks!

1. zelphirkalt No.44422257
Hi! Not sure you are still reading this, but I'm going to write my feedback here.

I tried the trial and it doesn't work for me. Even though I allowed it to access my mic, it never picks up my voice input. I am on Librewolf. It is also a little unclear in the trial when one is supposed to speak when using the app for the first time. The generated voice speaks, and then there is no visual indication of when it is one's turn to speak. Of course one can infer it from what the generated voice says, but this left me unsure whether everything was working. But yeah, I cannot reply to the AI-generated voice, because it doesn't pick up my input, even after reloading several times. I noticed that on Librewolf the app doesn't recognize the mic and speakers properly.

On Chromium it seems to work.

I think some work is needed to get everything working across browsers.

I also ran into some actual usage issues:

- When speaking I sometimes pause, and the pauses are misrecognized as me having finished my phrase. The AI then starts answering, which is of course annoying and counterproductive. A human would realize that I am not done speaking and, with a little bit of common sense, wait a bit longer before speaking. At other times it somehow doesn't notice that I am already done speaking. This happened when I replied with only one word.

- Sometimes it recognizes words completely wrong. (I tried Mandarin.) For example, I said "柏林" and it understood some random characters that did not sound anything like "bo2lin2".

- The AI sometimes uses quite long phrases and sometimes vocabulary I don't know. If I take too long to look up characters or to reply, the call might time out.

- Sometimes the AI really butchers the pronunciation of Mandarin.

- Sometimes there are pauses between parts of phrases in the AI output. When I start answering during such a pause, the AI continues, changing the overall meaning or expectation of its message. I then continue speaking after it is really done, but my response includes both parts of what I said: the premature reply and the actual reply. This makes my reply nonsensical, as if I did not understand what the AI said.

- The font size is really small for Mandarin, at least in my opinion, especially in the tooltips shown when hovering over words that I don't know.

- I actually didn't get to test learning anything, because the AI kept going on about 介绍 and 做计划 for learning and so on. When I finally said 让我们开始学习, the time was up and the test call closed. So I cannot actually say anything about how well one can learn some grammar with it.
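To make the two turn-taking complaints in the list above concrete, here is a minimal sketch of one way they could be handled. It assumes the app has per-frame voice-activity flags and timestamped user speech segments. The function names, the 20 ms frame size, and the 1500 ms silence threshold are hypothetical and not taken from ISSEN.

```python
def end_of_turn(frames: list[bool], frame_ms: int = 20, silence_ms: int = 1500) -> bool:
    """Decide whether the learner has finished speaking.

    `frames` holds voice-activity flags (True = speech), oldest first.
    The turn ends once some speech has occurred and the trailing silence
    reaches `silence_ms`. A larger threshold tolerates thinking pauses.
    """
    if not any(frames):
        return False  # nothing said yet, keep listening
    trailing_silent_frames = 0
    for is_speech in reversed(frames):
        if is_speech:
            break
        trailing_silent_frames += 1
    return trailing_silent_frames * frame_ms >= silence_ms


def final_reply(segments: list[tuple[float, str]], ai_finished_at: float) -> str:
    """Keep only user segments that started after the tutor stopped talking.

    `segments` is a list of (start_time_seconds, text) pairs. Speech that
    began while the tutor was still mid-message is treated as premature
    and dropped, so it is not merged into the actual reply.
    """
    return " ".join(text for start, text in segments if start >= ai_finished_at)


if __name__ == "__main__":
    # Half a second of speech followed by one second of silence: keep waiting.
    print(end_of_turn([True] * 25 + [False] * 50))
    # The first segment started before the tutor finished at t=3.0 s.
    print(final_reply([(1.0, "我去"), (4.0, "我想去柏林")], ai_finished_at=3.0))
```

Because `end_of_turn` only requires that some speech occurred, a one-word answer ends the turn the same way a long one does. Whether to drop premature speech entirely or show it to the learner first is a product decision this sketch does not settle.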

That said, I am surprised by how well it understands me most of the time. Sometimes, though, it reinterprets what I said so that it fits better as a reply to the question it asked, rather than simply taking what I said.