
684 points prettyblocks | 1 comment

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
psyklic ◴[] No.42784612[source]
JetBrains' local single-line autocomplete model is 0.1B (w/ 1536-token context, ~170 lines of code): https://blog.jetbrains.com/blog/2024/04/04/full-line-code-co...

For context, GPT-2-small is 0.124B params (w/ 1024-token context).

replies(4): >>42785009 #>>42785728 #>>42785838 #>>42786326 #
WithinReason ◴[] No.42785009[source]
That size is on the edge of what you can train at home.
replies(2): >>42785431 #>>42786773 #
1. Sohcahtoa82 ◴[] No.42786773[source]
Not even on the edge. That's something you could train on a 2 GB GPU.

The general guidance I've used is that training a model takes roughly 8 bytes of RAM (or VRAM) per parameter, so a 0.125B model would need about 1 GB of RAM to train.
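
To make the arithmetic concrete, here's a minimal sketch of that rule of thumb. It assumes the "8x" means 8 bytes per parameter and that 1 GB is 10^9 bytes. The function names are just made up for illustration, and the estimate ignores activation memory, which grows with batch size and sequence length, so treat it as a rough floor rather than a guarantee.

```python
# Rule of thumb from the comment above: ~8 bytes of RAM/VRAM per parameter.
# This is a heuristic, not a measured value.
BYTES_PER_PARAM = 8


def training_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Rough training memory estimate in GB (assuming 1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_on_gpu(num_params: float, gpu_gb: float) -> bool:
    """True if the rough estimate is within the given GPU memory (hypothetical helper)."""
    return training_memory_gb(num_params) <= gpu_gb


if __name__ == "__main__":
    # Parameter counts mentioned in the thread.
    models = [
        ("JetBrains full-line completion", 0.1e9),
        ("GPT-2-small", 0.124e9),
        ("0.125B example", 0.125e9),
    ]
    for name, params in models:
        est = training_memory_gb(params)
        print(f"{name}: ~{est:.2f} GB to train, fits on a 2 GB GPU: {fits_on_gpu(params, 2.0)}")
```

By this estimate all three sizes come in at or under 1 GB, which is why a 2 GB GPU would be enough on paper.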