I ended up ditching the usual RAG+embedding route and built a local semantic engine that uses ΔS as a resonance constraint (yeah it sounds crazy, but hear me out).
Still uses local models (Ollama + GGUF)
But instead of just vector search, it enforces semantic logic trees + memory drift tracking (rough sketch of the drift gate below)
Main gain: reduced hallucination in summarization + actual retention of reasoning across files
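If “ΔS as a resonance constraint” sounds abstract, here’s a stripped-down sketch of the drift-gate idea in Python. To be clear: this is illustrative, not the engine’s actual code. The names `DRIFT_LIMIT`, `delta_s`, and `accept_step` are placeholders I’m using for the write-up, the embedding model is just an example, and it assumes a local Ollama server exposing its standard /api/embeddings endpoint:

```python
import requests
import numpy as np

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # default local Ollama endpoint
EMBED_MODEL = "nomic-embed-text"  # example choice; any local embedding model works
DRIFT_LIMIT = 0.35  # hypothetical threshold; needs tuning per corpus

def embed(text: str) -> np.ndarray:
    # One embedding per reasoning step; Ollama responds with {"embedding": [...]}
    resp = requests.post(
        OLLAMA_URL, json={"model": EMBED_MODEL, "prompt": text}, timeout=30
    )
    resp.raise_for_status()
    return np.asarray(resp.json()["embedding"])

def delta_s(prev: np.ndarray, curr: np.ndarray) -> float:
    # Semantic drift between consecutive steps: 1 - cosine similarity
    cos = float(prev @ curr / (np.linalg.norm(prev) * np.linalg.norm(curr)))
    return 1.0 - cos

def accept_step(history: list, step_text: str) -> bool:
    # Gate a new reasoning step: if it drifts too far from the previous
    # step, reject it instead of letting it into memory.
    vec = embed(step_text)
    if history and delta_s(history[-1], vec) > DRIFT_LIMIT:
        return False  # treat as drift / likely hallucination
    history.append(vec)
    return True
```

The threshold is the tricky part: too tight and the model can’t explore, too loose and drift creeps back in.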
Weirdly, the thing that got people to take it seriously was a public endorsement from the guy who wrote tesseract.js (OCR legend). He called the engine’s reasoning “shockingly human-like”, not in benchmark terms but in sustained thought flow.
Still polishing a few parts, but if you’ve ever hit the wall of “why is my LLM helpful but forgetful?”, this might be a route worth peeking into.
(Also happy to share the GitHub PDF if you’re curious — it’s more logic notes than launch page.)