Maybe I'm alone? To me, this comes across as extremely creepy, the exact opposite of what we should desire from AI in products aimed at children.
This GitHub repo turns an ESP32-S3 into a real-time AI speech companion using the OpenAI Realtime API, Arduino WebSockets, Deno Edge Functions, and a full-stack web interface. You can talk to your own custom AI character, and it responds instantly.
I couldn't find a resource that helped set up a reliable, secure WebSocket (WSS) AI speech-to-speech service. While there are several useful Text-To-Speech (TTS) and Speech-To-Text (STT) repos out there, I believe none gets speech-to-speech right. OpenAI launched an embedded repo late last year that sets up WebRTC with ESP-IDF. However, it's not beginner-friendly and doesn't have a server-side component for business logic.
This repo is an attempt at solving the above pains and creating a great speech-to-speech experience on Arduino with secure WebSockets, using edge servers (Deno/Supabase Edge Functions) for fast global connectivity and low latency.
Children don’t need this; they are so much more creative than an AI (and the adults that trained the AI), and their creativity is fueled by boredom.
I poured hours into games and programming because it was a happy place away from school, etc. These toys could be the same.
This technology is neutral, but I see so much potential for projects that do good.
That said, I totally agree that I wouldn't want this in a kid's toy. The whole idea is super creepy in that respect, with so much scope for abuse.
Bots are for doing tasks. I don't want to socialize with them and find the idea of kids being socialized by bots supremely weird. At least the AI girlfriend people are (probably unwell) adults.
The target audience is young kids who are still developing socialization skills. This toy off-boards that development from a human to an AI. We don't really know how that affects a kid.
This also plausibly trains the kid to think of other people as AIs: subservient tools that exist primarily to respond to them. Not exactly a healthy attitude to take towards one's peers.
It's presumably also going to get a lot of unsupervised usage, plus the occasional AI model update. What happens when a bad model update has it advising kids that soap is a forbidden candy that tastes delicious?
(I'm not saying any of these is particularly likely, just trying to share the sort of concerns that would lead someone to feel creeped out.)
I mean, giving a kid an unlocked iPad and not bothering to do basic supervision can also have really creepy results, so I'm unconvinced that something like your work actually makes anything worse in the negligent parenting situation, and it seems like it could be a lot of fun in the competent parenting one.
If you haven't already done this, I'd note that I can think of a number of parents who would probably rather enjoy a version of story mode that let them collaborate with their child and your code to put together a bedtime story before they turn it off for the night and tuck the kid into bed.
However, while testing it with a friend who has a 5-year-old daughter, I added a `Story mode` feature to create dynamic stories for her, which she enjoys.
I think what would be even cooler is if each character in a story had a unique voice (an ogre's voice, an elf's voice, etc.), which is currently unsupported over the single WebSocket connection.
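For what it's worth, one possible way to approach per-character voices is to split the story text into per-speaker segments on the server before any audio is generated, so that each segment could be handed to a separately configured voice. The snippet below is only a rough sketch of that splitting step, not how the repo works: the `NAME: line` script format, the `VOICE_BY_CHARACTER` table, the placeholder voice names, and the `split_story` helper are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical character-to-voice table; the voice names are placeholders,
# not identifiers from any real speech API.
VOICE_BY_CHARACTER = {
    "narrator": "voice_narrator",
    "ogre": "voice_deep",
    "elf": "voice_bright",
}


@dataclass
class Segment:
    voice: str
    text: str


# Matches lines such as "OGRE: Who goes there?" (assumed script format).
SPEAKER_LINE = re.compile(r"^([A-Za-z][A-Za-z ]*?)\s*:\s*(.+)$")


def split_story(script: str) -> list[Segment]:
    """Split a story script into consecutive same-voice segments."""
    segments: list[Segment] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        speaker, text = "narrator", line
        match = SPEAKER_LINE.match(line)
        # Only treat the prefix as a speaker tag if it names a known character;
        # otherwise the whole line stays with the narrator.
        if match and match.group(1).lower() in VOICE_BY_CHARACTER:
            speaker, text = match.group(1).lower(), match.group(2)
        voice = VOICE_BY_CHARACTER[speaker]
        # Merge adjacent lines that share a voice to keep the segment count low.
        if segments and segments[-1].voice == voice:
            segments[-1].text += " " + text
        else:
            segments.append(Segment(voice, text))
    return segments
```

Each resulting `Segment` could then be synthesized with its own voice, for example by one request or connection per voice, and the audio played back in order. Whether that is fast enough for a bedtime story on an ESP32 is something I haven't tested.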
This is true, I am not a parent. But I have some domain expertise in building a conversational toy... talking to many parents and having been a child myself for several years has helped.