
39 points by davidye324 | 1 comment

What it is

A single 45 MB Windows .exe that embeds llama.cpp and a minimal Tk UI. Copy it (plus any .gguf model) to a flash drive, double-click on any Windows PC, and you're chatting with an LLM: no admin rights, cloud access, or network connection required.

Why I built it

Existing "local LLM" GUIs assume you can pip install, pass long CLI flags, or download gigabytes of extras.

I wanted something my less-technical colleagues could run during a client visit by literally plugging in a USB drive.

How it works

A PyInstaller one-file build bundles the Python runtime, llama_cpp_python, and the UI into a single PE.

On first launch, it memory-maps the .gguf; subsequent prompts stream at ~20 tok/s on an i7-10750H with gemma-3-1b-it-Q4_K_M.gguf (0.8 GB).
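The load-and-stream flow could look roughly like this, assuming the standard llama_cpp_python API; the model path, context size, and generation parameters are illustrative, not the app's actual values:

```python
def stream_reply(model_path: str, prompt: str):
    """Yield generated text chunks as they arrive.

    use_mmap=True memory-maps the .gguf, so the OS pages weights in
    lazily instead of copying the whole file into RAM up front.
    """
    from llama_cpp import Llama  # deferred import: only needed at runtime

    llm = Llama(model_path=model_path, use_mmap=True, n_ctx=2048)
    for chunk in llm(prompt, max_tokens=256, stream=True):
        yield chunk["choices"][0]["text"]


def tok_per_sec(n_tokens: int, seconds: float) -> float:
    """Throughput figure of the kind quoted above (~20 tok/s)."""
    return n_tokens / seconds if seconds > 0 else 0.0
```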

A tick-driven render loop keeps the UI responsive while llama.cpp crunches.
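A minimal sketch of that tick pattern, assuming a worker thread feeds tokens into a queue while Tk's `after()` drives the redraw; the widget and function names are hypothetical:

```python
import queue


def drain(q: "queue.Queue[str]", max_items: int = 50) -> str:
    """Pull whatever tokens arrived since the last tick, without blocking."""
    parts = []
    for _ in range(max_items):
        try:
            parts.append(q.get_nowait())
        except queue.Empty:
            break
    return "".join(parts)


def tick(root, text_widget, q, interval_ms=30):
    """Append any new tokens, then reschedule ourselves.

    Because this never blocks, the Tk event loop keeps handling clicks
    and redraws between ticks even while generation runs elsewhere.
    """
    chunk = drain(q)
    if chunk:
        text_widget.insert("end", chunk)
    root.after(interval_ms, tick, root, text_widget, q, interval_ms)
```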

A parser bold-underlines every token that originated in the prompt; Ctrl+click pops a “source viewer” to trace facts. (Helps spot hallucinations fast.)
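The actual parser isn't shown, but the idea can be sketched with a simple word-level check, flagging each reply token that also appears in the prompt (the real implementation presumably works on model tokens rather than words):

```python
import re


def mark_prompt_tokens(prompt: str, reply: str) -> list[tuple[str, bool]]:
    """Split the reply into word/non-word runs and flag the words
    that also occur in the prompt; the UI would render flagged
    tokens bold-underlined."""
    prompt_words = {w.lower() for w in re.findall(r"\w+", prompt)}
    return [
        (tok, tok.strip().lower() in prompt_words)
        for tok in re.findall(r"\w+|\W+", reply)
    ]
```

Anything flagged False came purely from the model, which is what makes unsupported claims stand out at a glance.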

1. ge96
Wonder if you can use/interface with those coral accelerator boards