91 points Olshansky | 1 comments | 15 Jul 25 15:17 UTC | HN request time: 0.209s | source

What I’m asking HN:

What does your actually useful local LLM stack look like?

I’m looking for something that provides you with real value — not just a sexy demo.

---

After a recent internet outage, I realized I need a local LLM setup as a backup — not just for experimentation and fun.

My daily (remote) LLM stack:

  - Claude Max ($100/mo): My go-to for pair programming. Heavy user of both the Claude web and desktop clients.

  - Windsurf Pro ($15/mo): Love the multi-line autocomplete and how it uses clipboard/context awareness.

  - ChatGPT Plus ($20/mo): My rubber duck, editor, and ideation partner. I use it for everything except code.

Here’s what I’ve cobbled together for my local stack so far:

Tools

  - Ollama: for running models locally

  - Aider: Claude-code-style CLI interface

  - VSCode w/ continue.dev extension: local chat & autocomplete

Models

  - Chat: llama3.1:latest

  - Autocomplete: Qwen2.5 Coder 1.5B

  - Coding/Editing: deepseek-coder-v2:16b

Things I’m not worried about:

  - CPU/Memory (running on an M1 MacBook)

  - Cost (within reason)

  - Data privacy / being trained on (not trying to start a philosophical debate here)

I am worried about:

  - Actual usefulness (i.e. “vibes”)

  - Ease of use (tools that fit with my muscle memory)

  - Correctness (not benchmarks)

  - Latency & speed

Right now: I’ve got it working. I could make a slick demo. But it’s not actually useful yet.

---

Who I am

  - CTO of a small startup (5 amazing engineers)

  - 20 years of coding (since I was 13)

  - Ex-big tech

Show context

timr ◴[15 Jul 25 17:13 UTC] No.44573477[source]▶

>>44572043 (OP) #

I use Copilot, with the occasional free query to the other services. During coding, I mostly use Claude Sonnet 3.7 or 4 in agent mode, but Gemini 2.5 Pro is a close second. ChatGPT 4o is useless except for Q&A. I see no value in paying more -- the utility rapidly diminishes, because at this point the UI surrounding the models is far less important than the models themselves, which in turn are generally less important than the size of their context windows. Even Claude is only marginally better than Gemini (at coding), and they all suck to the point that I wouldn't trust any of them without reviewing every line. Far better to just pick a tool, get comfortable with it, and not screw around too much.

I don't understand people who pay hundreds of dollars a month for multiple tools. It feels like audiophiles paying $1000 for a platinum cable connector.

replies(3): >>44573578 #>>44573660 #>>44641158 #

1. ◴[15 Jul 25 17:20 UTC] No.44573578[source]▶

>>44573477 #

↑

Ask HN: What's Your Useful Local LLM Stack?