(github.com)

177 points akadeb | 1 comments | 22 Apr 25 14:10 UTC

Hi HN! Last year I launched a project here and got a lot of good feedback on building speech-to-speech AI on the ESP32. Recently I revamped the whole stack, iterated on that feedback, and made the project fully open source: all of the client, hardware, and firmware code.

This GitHub repo turns an ESP32-S3 into a realtime AI speech companion using the OpenAI Realtime API, Arduino WebSockets, Deno Edge Functions, and a full-stack web interface. You can talk to your own custom AI character, and it responds instantly.

I couldn't find a resource that helped set up a reliable, secure WebSocket (WSS) AI speech-to-speech service. While there are several useful Text-To-Speech (TTS) and Speech-To-Text (STT) repos out there, I believe none gets Speech-To-Speech right. OpenAI launched an embedded repo late last year that sets up WebRTC with ESP-IDF. However, it's not beginner-friendly and doesn't have a server-side component for business logic.

This repo is an attempt at solving the above pain points and creating a great speech-to-speech experience on Arduino with secure WebSockets, using edge servers (Deno/Supabase Edge Functions) for fast global connectivity and low latency.


tantalor ◴[22 Apr 25 16:33 UTC] No.43763970[source]▶

>>43762409 (OP) #

I'm surprised by the overwhelmingly positive vibes in the comments here.

Maybe I'm alone? To me, this comes across as extremely creepy, the exact opposite of what we should desire from AI in products aimed at children.

replies(7): >>43764077 #>>43764125 #>>43764168 #>>43764189 #>>43764195 #>>43764294 #>>43772666 #

adregan ◴[22 Apr 25 16:58 UTC] No.43764168[source]▶

>>43763970 #

Totally get the creepy part, but my criticism of devices like this is that they seem to be made by people with limited exposure to the creative power of children.

Children don’t need this; they are so much more creative than an AI (and the adults that trained the AI), and their creativity is fueled by boredom.

replies(5): >>43764401 #>>43771839 #>>43772703 #>>43773580 #>>43778994 #

1. mst ◴[23 Apr 25 13:20 UTC] No.43771839[source]▶

>>43764168 #

I feel like it would be creepy if the kid was using it without anybody ever checking up on it ... but I think all of my friends with kids would say that the answer to that is "parenting."

I mean, giving a kid an unlocked iPad and not bothering to do basic supervision can also have really creepy results, so I'm unconvinced that something like your work actually makes anything worse in the negligent parenting situation, and it seems like it could be a lot of fun in the competent parenting one.

If you haven't already done this, I'd note that I can think of a number of parents who would probably rather enjoy a version of story mode that let them collaborate with their child and your code to put together a bedtime story before they turn it off for the night and tuck the kid into bed.


Show HN: I open-sourced my AI toy company that runs on ESP32 and OpenAI realtime