
684 points prettyblocks | 1 comment

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
deivid ◴[] No.42786422[source]
Not sure it qualifies, but I've started building an Android app that wraps bergamot[0] (the Firefox translation models) to get on-device translation without relying on Google.

Bergamot is already used inside Firefox, but I also wanted translation outside the browser.

[0]: bergamot https://github.com/browsermt/bergamot-translator

replies(2): >>42786996 #>>42789246 #
deivid ◴[] No.42789246[source]
I would be very interested if someone is aware of any small/tiny models that perform OCR, so the app can translate pictures as well.
replies(1): >>42789882 #
1. Eisenstein ◴[] No.42789882[source]
MiniCPM-V 2.6 isn't that small (8B), but it can do this.

Here is a demo.

* https://i.imgur.com/pAuTeAf.jpeg

Using this script:

* https://github.com/jabberjabberjabber/LLMOCR/
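
Since the thread asks about Ollama, here is a rough sketch of the same idea against a local Ollama server. This is not taken from the script above, which may work quite differently. It assumes Ollama is running on its default port and that a vision model has been pulled under the tag "minicpm-v"; the prompt wording and the helper names build_ocr_payload and ocr_image are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumptions (not from the thread): a local Ollama server on its default
# port, and a vision model pulled under the tag "minicpm-v".
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "minicpm-v"


def build_ocr_payload(image_bytes: bytes, model: str = MODEL) -> dict:
    """Build the JSON body for a single non-streaming OCR request."""
    return {
        "model": model,
        # Hypothetical prompt; tune the wording for your images.
        "prompt": "Transcribe all text in this image exactly as written.",
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def ocr_image(path: str) -> str:
    """Send the image at `path` to the model and return its text reply."""
    with open(path, "rb") as f:
        payload = build_ocr_payload(f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Supplying `data` makes this a POST request.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling ocr_image("photo.jpg") should return whatever text the model reads off the image, which could then be fed to the translation step. Output quality will depend on the model and the picture.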