111 points by arnabkarsarkar | 2 comments

OP here.

I built this because I recently caught myself almost pasting a block of logs containing AWS keys into Claude.

The Problem: I need the reasoning capabilities of cloud models (GPT/Claude/Gemini), but I can't trust myself not to accidentally leak PII or secrets.

The Solution: A Chrome extension that acts as a local middleware. It intercepts the prompt and runs a local BERT model (via a Python FastAPI backend) to scrub names, emails, and keys before the request leaves the browser.
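To make that concrete, here is a rough sketch of the two-stage scrub (regex first, NER second). This is illustrative only, not the repo's actual code — in the real extension the regex pass runs in JavaScript inside the extension, and only the NER step goes to the Python backend; both stages are collapsed into Python here to show the idea in one place:

```python
# Illustrative sketch of the two-stage scrub, not the repo's actual code.
import re
from transformers import pipeline

# Stage 1: cheap regex pass for structured secrets.
PATTERNS = {
    "AWS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Stage 2: NER catches unstructured PII such as person names.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

def scrub(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    # Replace detected entities from the end so earlier offsets stay valid.
    for ent in sorted(ner(text), key=lambda e: e["start"], reverse=True):
        if ent["entity_group"] in {"PER", "ORG", "LOC"}:
            text = text[:ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"]:]
    return text

print(scrub("Contact Jane Doe at jane@corp.com, key AKIAABCDEFGHIJKLMNOP"))
```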

A few notes up front (to set expectations clearly):

Everything runs 100% locally. Regex detection happens in the extension itself. Advanced detection (NER) uses a small transformer model running on localhost via FastAPI (a sketch of that endpoint appears right after these notes).

No data is ever sent to a remote server; the only network traffic is to localhost. You can verify this in the code and in the DevTools network panel.

This is an early prototype. There will be rough edges. I’m looking for feedback on UX, detection quality, and whether the local-agent approach makes sense.
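For reference, the localhost bridge mentioned above can be as small as this — a minimal sketch, assuming an endpoint named /detect (the repo's actual route and response shape may differ):

```python
# Minimal sketch of the localhost NER service the extension could call.
# Endpoint name and response shape are illustrative, not necessarily the repo's.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

class Prompt(BaseModel):
    text: str

@app.post("/detect")
def detect(prompt: Prompt):
    # Return entity spans only; the redaction itself happens client-side.
    return [
        {"label": e["entity_group"], "start": int(e["start"]), "end": int(e["end"])}
        for e in ner(prompt.text)
    ]
```

Run it with something like `uvicorn main:app --port 8000`; the port just has to match whatever the extension is configured to call.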

Tech Stack: Manifest V3 Chrome extension; Python FastAPI (localhost); HuggingFace dslim/bert-base-NER.

Roadmap / Request for Feedback: Right now, the Python backend adds some friction. I received feedback on Reddit yesterday suggesting I port the inference to Transformers.js to run entirely in-browser via WASM.

I decided to ship v1 with the Python backend for stability, but I'm actively looking into the ONNX/WASM route for v2 to remove the local-server dependency. If anyone has experience running NER models via Transformers.js in a service worker, I'd love to hear how the performance compares to native Python.

Repo is MIT licensed.

Very open to ideas, suggestions, or alternative approaches.

postalcoder:
Very neat, but recently I've tried my best to reduce my extension usage across all apps (browsers/IDEs).

I do something similar locally by manually specifying all the things I want scrubbed/replaced and having Keyboard Maestro run a script on my system clipboard whenever I do a paste operation, mapped to `hyperkey + v`. The plus side of this is that the paste is instant. The latency introduced by even the littlest of inference is enough friction to make you want to ditch the process entirely.
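A minimal sketch of that kind of clipboard filter, assuming macOS's pbpaste/pbcopy and a hand-maintained replacement table (the specific patterns here are invented examples):

```python
#!/usr/bin/env python3
# Sketch of a clipboard scrubber a Keyboard Maestro macro could run before
# simulating the actual paste keystroke. macOS-only (pbpaste/pbcopy); the
# patterns below are examples, not a complete list.
import re
import subprocess

# Hand-maintained replacement table; extend with your own names, hosts, etc.
REPLACEMENTS = [
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[AWS_KEY]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[EMAIL]"),
    (re.compile(r"\bAcme Corp\b"), "[CLIENT]"),  # literal string example
]

text = subprocess.run(["pbpaste"], capture_output=True, text=True).stdout
for pattern, token in REPLACEMENTS:
    text = pattern.sub(token, text)
subprocess.run(["pbcopy"], input=text, text=True)
```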

Another plus of the non-extension solution is that it's application agnostic.

1. informal007:
Smart idea! Thanks for sharing.

If we move the detection and modification from the paste operation to the copy operation, that would eliminate the latency at paste time.

2. postalcoder:
That's a great idea. My original excuse for not doing that was that I copy so many things, but, duh, I could just map the sanitizing copy to `hyperkey + c`.