
Web Translator API

(developer.mozilla.org)
97 points kozika | 12 comments
rhabarba ◴[] No.44375303[source]
You had me at "Browser compatibility".
replies(2): >>44375411 #>>44377716 #
Raed667 ◴[] No.44375411[source]
Chrome embeds a small LLM in the browser (which never stops being funny), allowing it to do local translations.

I assume every browser will do the same as on-device models start becoming more useful.

replies(2): >>44375422 #>>44375891 #
1. Asraelite ◴[] No.44375891[source]
What's the easiest way to get this functionality outside of the browser, e.g. as a CLI tool?

Last time I looked, I wasn't able to find any easy-to-run models that supported more than a handful of languages.

replies(6): >>44376224 #>>44376260 #>>44376411 #>>44376506 #>>44378599 #>>44380230 #
2. ukuina ◴[] No.44376224[source]
ollama run gemma3:1b

https://ollama.com/library/gemma3

> support for over 140 languages

replies(1): >>44376601 #
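The one-liner above drops you into an interactive session; for scripted use, Ollama also exposes a local HTTP API. A stdlib-only sketch (assumes `ollama serve` is running locally and the model has already been pulled; the prompt wording is just an example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def build_translation_request(model: str, text: str) -> dict:
    """Payload for Ollama's /api/generate, asking for a bare translation."""
    return {
        "model": model,
        "prompt": f"Translate to English, output only the translation: {text}",
        "stream": False,  # return a single JSON object instead of a token stream
    }


def translate(model: str, text: str) -> str:
    """Send the request to a locally running `ollama serve` and return the reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_translation_request(model, text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server with the model pulled):
# print(translate("gemma3:1b", "Bonjour le monde"))
```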
3. _1 ◴[] No.44376260[source]
If you need to support several languages, you're going to need a zoo of models. Small ones just can't handle that many, and they especially aren't good enough for distribution; we only use them for understanding.
4. JimDabell ◴[] No.44376411[source]
That depends on what counts as “a handful of languages” for you.

You can use llm for this fairly easily:

    uv tool install llm

    # Set up your model however you like. For instance:
    llm install llm-ollama
    ollama pull mistral-small3.2

    llm --model mistral-small3.2 --system "Translate to English, no other output" --save english
    alias english="llm --template english"

    english "Bonjour"
    english "Hola"
    english "Γειά σου"
    english "你好"
    cat some_file.txt | english
https://llm.datasette.io
replies(2): >>44376778 #>>44377999 #
5. wittjeff ◴[] No.44376506[source]
https://ai.meta.com/blog/nllb-200-high-quality-machine-trans...

https://www.youtube.com/watch?v=AGgzRE3TlvU
6. diggan ◴[] No.44376601[source]
Try translating a paragraph with 1b gemma and compare it to DeepL :) It's still amazing that it can understand anything at all at that scale, but you can't really rely on it for much, tbh.
7. usagisushi ◴[] No.44376778[source]
Tip: You might want to use `uv tool install llm --with llm-ollama`.

ref: https://github.com/simonw/llm/issues/575

replies(1): >>44376958 #
8. JimDabell ◴[] No.44376958{3}[source]
Thanks!
9. jan_Sate ◴[] No.44377999[source]
That's just the base/stock/instruct model for general use. There's gotta be a finetune specialized in translation, right? Any recommendations for that?

Plus, mistral-small3.2 has too many parameters; not all devices can run it fast. It probably isn't the exact translation model Chrome uses.

replies(1): >>44378527 #
10. JimDabell ◴[] No.44378527{3}[source]
I haven’t tried it myself, but NLLB-200 has various sizes going down to 600M params:

https://github.com/facebookresearch/fairseq/tree/nllb/

If running locally is too difficult, you can use llm to access hosted models too.

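For the hosted route, llm also has a Python API. A sketch (assumes `llm` is installed and an API key has been configured with `llm keys set`; the model name below is only an example, any model llm can reach works):

```python
def translate_hosted(text: str, model_name: str = "gpt-4o-mini") -> str:
    """Ask a hosted model for a bare English translation via llm's Python API."""
    import llm  # imported lazily so the sketch parses without llm installed

    model = llm.get_model(model_name)
    response = model.prompt(text, system="Translate to English, no other output")
    return response.text()


# Example (needs network access and credentials):
# print(translate_hosted("Γειά σου"))
```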
11. deivid ◴[] No.44378599[source]
You can use bergamot ( https://github.com/browsermt/bergamot-translator ) with Mozilla's models ( https://github.com/mozilla/firefox-translations-models ).

Not the easiest, but easy enough (requires building).

I used these two projects to build an on-device translator for Android.

12. mftrhu ◴[] No.44380230[source]
Setting aside general-purpose LLMs, there exist a handful of models geared towards translation between hundreds of language pairs: Meta's NLLB-200 [0] and M2M-100 [1] can be run using HuggingFace's transformers (plus numpy and sentencepiece), while Google's MADLAD-400 [2], in GGUF format [3], is also supported by llama.cpp.

You could also look into Argos Translate, or just use the same models as Firefox through kotki [4].

[0] https://huggingface.co/facebook/nllb-200-distilled-600M

[1] https://huggingface.co/facebook/m2m100_418M

[2] https://huggingface.co/google/madlad400-3b-mt

[3] https://huggingface.co/models?other=base_model:quantized:goo...

[4] https://github.com/kroketio/kotki
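The transformers route for NLLB-200 can be sketched like this (assumes `transformers`, `torch`, and `sentencepiece` are installed; weights download on first use; the FLORES-200 codes shown are the identifiers NLLB uses in place of bare ISO codes):

```python
# NLLB identifies languages with FLORES-200 codes rather than two-letter ISO codes.
FLORES_CODES = {
    "en": "eng_Latn",
    "fr": "fra_Latn",
    "el": "ell_Grek",
    "zh": "zho_Hans",
}


def translate_nllb(text: str, src: str, tgt: str) -> str:
    """Translate `text` between two of the languages in FLORES_CODES."""
    # Imported lazily so the sketch can be read and parsed without the deps installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = "facebook/nllb-200-distilled-600M"
    tokenizer = AutoTokenizer.from_pretrained(name, src_lang=FLORES_CODES[src])
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    inputs = tokenizer(text, return_tensors="pt")
    # Force the decoder to start generating in the target language.
    out = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(FLORES_CODES[tgt]),
        max_new_tokens=200,
    )
    return tokenizer.batch_decode(out, skip_special_tokens=True)[0]


# Example (downloads the 600M-parameter weights on first run):
# print(translate_nllb("Bonjour le monde", "fr", "en"))
```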