91 points by Olshansky | 1 comment

What I’m asking HN:

What does your actually useful local LLM stack look like?

I’m looking for something that provides you with real value — not just a sexy demo.

---

After a recent internet outage, I realized I need a local LLM setup as a backup — not just for experimentation and fun.

My daily (remote) LLM stack:

  - Claude Max ($100/mo): My go-to for pair programming. Heavy user of both the Claude web and desktop clients.

  - Windsurf Pro ($15/mo): Love the multi-line autocomplete and the way it pulls in clipboard and surrounding context.

  - ChatGPT Plus ($20/mo): My rubber duck, editor, and ideation partner. I use it for everything except code.

Here’s what I’ve cobbled together for my local stack so far (a rough wiring sketch follows the lists):

Tools

  - Ollama: for running models locally

  - Aider: Claude-code-style CLI interface

  - VSCode w/ continue.dev extension: local chat & autocomplete

Models

  - Chat: llama3.1:latest

  - Autocomplete: Qwen2.5 Coder 1.5B

  - Coding/Editing: deepseek-coder-v2:16b
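
Roughly how this wires together (assuming the stock Ollama model tags and Aider's Ollama backend; exact tags and flags may vary on your setup):

  # pull the three models into Ollama
  $ ollama pull llama3.1:latest
  $ ollama pull qwen2.5-coder:1.5b
  $ ollama pull deepseek-coder-v2:16b

  # point Aider at the local Ollama server (11434 is Ollama's default port)
  $ export OLLAMA_API_BASE=http://127.0.0.1:11434
  $ aider --model ollama/deepseek-coder-v2:16b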

Things I’m not worried about:

  - CPU/Memory (running on an M1 MacBook)

  - Cost (within reason)

  - Data privacy / being trained on (not trying to start a philosophical debate here)

I am worried about:

  - Actual usefulness (i.e. “vibes”)

  - Ease of use (tools that fit with my muscle memory)

  - Correctness (not benchmarks)

  - Latency & speed

Right now: I’ve got it working. I could make a slick demo. But it’s not actually useful yet.

---

Who I am

  - CTO of a small startup (5 amazing engineers)

  - 20 years of coding (since I was 13)

  - Ex-big tech

---

ttkciar (No.44576526):
Senior software engineer with 46 years of experience (since I was 7). LLM inference hasn't been too useful for me for writing code, but it has proven very useful for explaining my coworkers' code to me.

Recently I had Gemma3-27B-it explain every Python script and library in a repo with the command:

  $ find . -name '*.py' -print -exec sh -c '/home/ttk/bin/g3 "Explain this code in detail:\n\n$(cat "$1")"' sh {} \; | tee explain.txt

There were a few files it couldn't figure out in isolation, so I ran a second pass on those, also including the source files they depended on in the prompt. Overall, pretty easy, and highly clarifying.
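
A second pass along these lines (utils.py and main.py are invented names for illustration, with main.py importing utils.py):

  $ /home/ttk/bin/g3 "Explain this code in detail. Helper library:\n\n$(cat utils.py)\n\nScript that uses it:\n\n$(cat main.py)"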

My shell script for wrapping llama.cpp's llama-cli and Gemma3: http://ciar.org/h/g3

That script references this grammar file which forces llama.cpp to infer only ASCII: http://ciar.org/h/ascii.gbnf
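
The linked file is the authoritative version, but a minimal grammar to the same effect might look like this (llama.cpp GBNF syntax; the \x escapes are hex character codes, so this allows printable ASCII plus newline and tab):

  $ cat ascii.gbnf    # hypothetical reconstruction of the linked file
  root ::= [\x20-\x7e\x0a\x09]*

  # usage sketch; the model path here is made up
  $ llama-cli -m gemma-3-27b-it-Q4_K_M.gguf --grammar-file ascii.gbnf -p "Explain this code in detail: ..."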

Cost: electricity

I've been meaning to check out Aider and GLM-4, but even if it's all it's cracked up to be, I expect to use it sparingly. Skills which aren't exercised are lost, and I'd like to keep my programming skills sharp.

andrei_says_ (No.44695117) replies:
Thank you for this. I used an LLM to summarize some key files of an old project of mine, and the summary was excellent. Saved me a ton of time analyzing my own code :)