It looks like continue.dev has a RAG implementation but for other files something else? PDF, word, and other languages.
I’ve been going thru some of the neovim plugins for local llm support.
What does your actually useful local LLM stack look like?
I’m looking for something that provides you with real value — not just a sexy demo.
---
After a recent internet outage, I realized I need a local LLM setup as a backup — not just for experimentation and fun.
My daily (remote) LLM stack:
- Claude Max ($100/mo): My go-to for pair programming. Heavy user of both the Claude web and desktop clients.
- Windsurf Pro ($15/mo): Love the multi-line autocomplete and how it uses clipboard/context awareness.
- ChatGPT Plus ($20/mo): My rubber duck, editor, and ideation partner. I use it for everything except code.
Here’s what I’ve cobbled together for my local stack so far:Tools
- Ollama: for running models locally
- Aider: Claude-code-style CLI interface
- VSCode w/ continue.dev extension: local chat & autocomplete
Models - Chat: llama3.1:latest
- Autocomplete: Qwen2.5 Coder 1.5B
- Coding/Editing: deepseek-coder-v2:16b
Things I’m not worried about: - CPU/Memory (running on an M1 MacBook)
- Cost (within reason)
- Data privacy / being trained on (not trying to start a philosophical debate here)
I am worried about: - Actual usefulness (i.e. “vibes”)
- Ease of use (tools that fit with my muscle memory)
- Correctness (not benchmarks)
- Latency & speed
Right now: I’ve got it working. I could make a slick demo. But it’s not actually useful yet.---
Who I am
- CTO of a small startup (5 amazing engineers)
- 20 years of coding (since I was 13)
- Ex-big tech