I Self-Hosted Llama 3.2 with Coolify on My Home Server

(geek.sg)

221 points whitefables | 1 comments | 16 Oct 24 05:26 UTC

taosx ◴[16 Oct 24 07:41 UTC] No.41856567[source]▶

For the people who self-host LLMs at home: what use cases do you have?

Personally, I have some notes and bookmarks that I'd like to scrape, then have an LLM summarize, generate hierarchical tags, and store in a database. For the notes part at least, I wouldn't want to give them to another provider; even for the bookmarks, I wouldn't be comfortable passing my reading profile to anyone.
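A minimal sketch of the store side of that pipeline, assuming SQLite as the database and hierarchical tags written as slash-separated paths. The `summarize` callable is a hypothetical stand-in for whatever local model produces the summary and tags; the table and function names are illustrative, not from any particular tool.

```python
import sqlite3
from typing import Callable, List, Tuple

# Hypothetical interface: note text in, (summary, tag paths like "tech/llm") out.
Summarizer = Callable[[str], Tuple[str, List[str]]]


def init_db(conn: sqlite3.Connection) -> None:
    """Create the notes, tags (self-referencing tree) and link tables."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS notes (
            id INTEGER PRIMARY KEY,
            source TEXT UNIQUE NOT NULL,
            summary TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS tags (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            parent_id INTEGER REFERENCES tags(id)
        );
        CREATE TABLE IF NOT EXISTS note_tags (
            note_id INTEGER NOT NULL REFERENCES notes(id),
            tag_id INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (note_id, tag_id)
        );
        """
    )


def ensure_tag_path(conn: sqlite3.Connection, path: str) -> int:
    """Walk 'a/b/c', creating missing levels, and return the leaf tag id."""
    parent_id = None
    for name in (part.strip() for part in path.split("/")):
        if not name:
            continue
        # "IS ?" is SQLite's null-safe comparison, so root tags match too.
        row = conn.execute(
            "SELECT id FROM tags WHERE name = ? AND parent_id IS ?",
            (name, parent_id),
        ).fetchone()
        if row is None:
            cur = conn.execute(
                "INSERT INTO tags (name, parent_id) VALUES (?, ?)",
                (name, parent_id),
            )
            parent_id = cur.lastrowid
        else:
            parent_id = row[0]
    if parent_id is None:
        raise ValueError("empty tag path")
    return parent_id


def store_note(conn: sqlite3.Connection, source: str, text: str,
               summarize: Summarizer) -> int:
    """Summarize one note locally and persist the summary and its tags."""
    summary, tag_paths = summarize(text)
    conn.execute(
        "INSERT INTO notes (source, summary) VALUES (?, ?) "
        "ON CONFLICT(source) DO UPDATE SET summary = excluded.summary",
        (source, summary),
    )
    note_id = conn.execute(
        "SELECT id FROM notes WHERE source = ?", (source,)
    ).fetchone()[0]
    for path in tag_paths:
        conn.execute(
            "INSERT OR IGNORE INTO note_tags (note_id, tag_id) VALUES (?, ?)",
            (note_id, ensure_tag_path(conn, path)),
        )
    conn.commit()
    return note_id
```

Keeping the model behind a plain callable means the notes never have to leave the machine, and the summarizer can be swapped without touching the schema.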

replies(11): >>41856653 #>>41856701 #>>41856881 #>>41856970 #>>41856992 #>>41857395 #>>41858199 #>>41858353 #>>41861443 #>>41864562 #>>41890288 #

xyc ◴[16 Oct 24 07:54 UTC] No.41856653[source]▶

>>41856567 #

llama3.2 1b & 3b are really useful for quick tasks like creating quick scripts from some text, then pasting them to execute, as they're super fast & replace a lot of temporary automation needs. If you don't feel like investing time into automation, sometimes you can just feed it into an LLM.

This is one of the reasons why I recently added floating chat to https://recurse.chat/ to quickly access a local LLM.

Here's a demo: https://x.com/recursechat/status/1846309980091330815
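For readers who want to try the "text in, quick script out" workflow without a chat UI, here is a rough sketch. It assumes the model is served by Ollama on its default local port with the `llama3.2:3b` tag pulled; the thread does not say which runtime the commenter uses, and the prompt wording and function names are illustrative.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(task: str, text: str, model: str = "llama3.2:3b",
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for a small local model."""
    prompt = (
        f"{task}\n\nInput:\n{text}\n\n"
        "Reply with only the script, no explanation."
    )
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_script(task: str, text: str, **kwargs) -> str:
    """Send the request and return the generated text (needs a running server)."""
    request = build_request(task, text, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

A call such as `generate_script("Write a shell script that renames these files to lowercase", file_listing)` returns text that should be read before it is pasted into a terminal, since small models make mistakes.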

replies(2): >>41856827 #>>41857089 #

taosx ◴[16 Oct 24 08:26 UTC] No.41856827[source]▶

>>41856653 #

Looks very nice, saved it for later. Last week, I worked on implementing always-on speech-to-text for automating tasks. I've made significant progress and get decent accuracy, but I constrained myself to implementing certain parts from scratch so I can ship a single deployable binary, which means I still have work to do (audio processing is new territory for me). However, I'm optimistic about its potential.

That being said, I think the more straightforward approach would be to utilize an existing library like https://github.com/collabora/WhisperLive/ within a Docker container. This way, you can call it via WebSocket and integrate it with my LLM, which could also serve as a nice feature in your product.

replies(1): >>41856926 #

xyc ◴[16 Oct 24 08:43 UTC] No.41856926[source]▶

>>41856827 #

Thanks! lmk when/if you wanna give it a spin; the free trial hasn't been updated with the latest, but I'll try to do it this week.

I've actually been playing around with speech-to-text recently. Thank you for the pointer. Docker is a bit too heavy to deploy for a desktop app use case, but it's good to know about the repo. Building binaries with PyInstaller could be an option though.

Real-time transcription seems a bit complicated as it involves VAD (voice activity detection), so a feasible path for me is to first ship simple transcription with whisper.cpp. large-v3-turbo looks fast enough :D
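To illustrate why VAD adds complexity to a live pipeline, here is a toy energy-threshold detector. It is only a sketch under simple assumptions (mono float samples in [-1, 1], a fixed threshold, illustrative parameter values), not what whisper.cpp or WhisperLive do; production VADs are typically model-based and far more robust to noise.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_segments(samples: Sequence[float], sample_rate: int = 16000,
                    frame_ms: int = 30, threshold: float = 0.02,
                    hangover_frames: int = 5) -> List[Tuple[float, float]]:
    """Return (start_sec, end_sec) spans whose frame energy exceeds threshold.

    A segment is closed only after more than `hangover_frames` consecutive
    quiet frames, so short pauses inside a sentence do not split it.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the open segment
    silent = 0    # consecutive quiet frames seen since the last loud one

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = i
            silent = 0
        elif start is not None:
            silent += 1
            if silent > hangover_frames:
                end = i - silent + 1  # index of the first quiet frame
                segments.append((start * frame_len / sample_rate,
                                 end * frame_len / sample_rate))
                start, silent = None, 0

    if start is not None:
        # Audio ended while a segment was open; trim trailing quiet frames.
        end = n_frames - silent
        segments.append((start * frame_len / sample_rate,
                         end * frame_len / sample_rate))
    return segments
```

Each returned span could then be handed to a transcriber as one chunk. Batch transcription of a finished recording skips this step entirely, which is why it is the simpler thing to ship first.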

replies(1): >>41856983 #

taosx ◴[16 Oct 24 08:52 UTC] No.41856983[source]▶

>>41856926 #

Yes, it's fast enough, especially if you don't need something live.
