Ask HN: Is anyone doing anything cool with tiny language models?

1. flippyhead ◴[21 Jan 25 22:12 UTC] No.42785739[source]▶

I have a tiny device that listens to conversations between two or more people and constantly tries to declare a "winner".

replies(14): >>42785781 #>>42785791 #>>42785949 #>>42785970 #>>42785979 #>>42786455 #>>42786672 #>>42787108 #>>42788174 #>>42788937 #>>42789840 #>>42791711 #>>42807514 #>>42890452 #

2. pseudosavant ◴[21 Jan 25 22:17 UTC] No.42785781[source]▶

>>42785739 (TP) #

I'd love to hear more about the hardware behind this project. I've had concepts for tech requiring a mic on me at all times for various reasons. Always tricky to have enough power in a reasonable DIY form factor.

3. oa335 ◴[21 Jan 25 22:18 UTC] No.42785791[source]▶

>>42785739 (TP) #

This made me actually laugh out loud. Can you share more details on hardware and models used?

4. econ ◴[21 Jan 25 22:34 UTC] No.42785949[source]▶

>>42785739 (TP) #

This is a product I want

5. amelius ◴[21 Jan 25 22:36 UTC] No.42785970[source]▶

>>42785739 (TP) #

You can use the model to generate winning speeches also.

6. jjcm ◴[21 Jan 25 22:37 UTC] No.42785979[source]▶

>>42785739 (TP) #

Are you raising a funding round? I'm bought in. This is hilarious.

7. hn8726 ◴[21 Jan 25 23:17 UTC] No.42786455[source]▶

>>42785739 (TP) #

What approach/stack would you recommend for listening to an ongoing conversation, transcribing it, and passing it through an LLM? I had some use cases in mind, but I'm not very familiar with AI frameworks and tools.

8. eddd-ddde ◴[21 Jan 25 23:41 UTC] No.42786672[source]▶

>>42785739 (TP) #

I love that there's not even a vague idea of the winner "metric" in your explanation. Like it's just, _the_ winner.

9. mkaic ◴[22 Jan 25 00:27 UTC] No.42787108[source]▶

>>42785739 (TP) #

This reminds me of the antics of streamer DougDoug, who often uses LLM APIs to live-summarize, analyze, or interact with his (often multi-thousand-strong) Twitch chat. Most recently I saw him do a GeoGuessr stream where he had ChatGPT assume the role of a detective who must comb through the thousands of chat messages for clues about where the chat thinks the location is, then synthesize the clamor into a final guess. Aside from constantly being trolled by people spamming nothing but "Kyoto, Japan" in chat, it occasionally demonstrated a pretty effective incarnation of "the wisdom of the crowd" and was strikingly accurate at times.

10. nejsjsjsbsb ◴[22 Jan 25 02:28 UTC] No.42788174[source]▶

>>42785739 (TP) #

All computation on device?

11. prakashn27 ◴[22 Jan 25 04:14 UTC] No.42788937[source]▶

>>42785739 (TP) #

wifey always wins. ;)

12. deivid ◴[22 Jan 25 06:51 UTC] No.42789840[source]▶

>>42785739 (TP) #

What model do you use for speech to text?

13. TechDebtDevin ◴[22 Jan 25 11:47 UTC] No.42791711[source]▶

>>42785739 (TP) #

Your SO must really love that lmao

14. econ ◴[23 Jan 25 20:04 UTC] No.42807514[source]▶

>>42785739 (TP) #

Tell me it also does sports style commentary on the ongoing debate. My mental image requires it.

15. flippyhead ◴[31 Jan 25 18:54 UTC] No.42890452[source]▶

>>42785739 (TP) #

Heh, I made this comment and forgot to check back -- I'm always missing stuff on HN because of this!

If anyone is still paying attention, email me at hi@seikai.tv and I'll see if I can send you one.

replies(2): >>42981978 #>>42992234 #

16. Shonku_ ◴[08 Feb 25 10:29 UTC] No.42981978[source]▶

>>42890452 #

Yeah I'm still paying attention!

17. ultrasounder ◴[09 Feb 25 17:59 UTC] No.42992234[source]▶

>>42890452 #

Sounds cool! In fact, this can be applied to other areas, such as "debate monitoring" for debate competitions.