prettyblocks:

I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?

psyklic:
JetBrains' local single-line autocomplete model is 0.1B (w/ 1536-token context, ~170 lines of code): https://blog.jetbrains.com/blog/2024/04/04/full-line-code-co...

For context, GPT-2-small is 0.124B params (w/ 1024-token context).
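
As a rough sanity check on that 0.124B figure, here is a back-of-the-envelope parameter count from GPT-2-small's published config (12 layers, d_model 768, 50257-token vocab, 1024-token context, weight-tied input/output embeddings); the bias and layernorm bookkeeping below is approximate:

    # Back-of-the-envelope parameter count for a GPT-2-small-style decoder.
    vocab, ctx, d_model, n_layers, d_ff = 50257, 1024, 768, 12, 4 * 768

    embeddings = vocab * d_model + ctx * d_model    # token + position embeddings
    attn = 4 * d_model * d_model + 4 * d_model      # Q, K, V, output projection (+ biases)
    mlp = 2 * d_model * d_ff + d_ff + d_model       # two linear layers (+ biases)
    ln = 2 * 2 * d_model                            # two layernorms (gain + bias)
    per_layer = attn + mlp + ln

    total = embeddings + n_layers * per_layer + 2 * d_model  # + final layernorm
    print(f"{total / 1e6:.1f}M parameters")                  # ~124.4M, i.e. 0.124B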

1. WithinReason:
That size is on the edge of something you can train at home.

2. vineyardmike:
If you have modern hardware, you can absolutely train that at home, or do it very affordably on a cloud service.

I’ve seen a number of “DIY GPT-2” tutorials that target this sweet spot. You won’t get amazing results unless you’re willing to leave a personal computer running for hours or days and you have solid data to train on locally, but fine-tuning should be well within a normal hobbyist’s patience.
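
For anyone wondering what that looks like in practice, here is a minimal fine-tuning sketch using Hugging Face Transformers; the stock "gpt2" checkpoint is the 0.124B model discussed above, and "my_corpus.txt" is just a placeholder for whatever local data you want to train on:

    # Minimal GPT-2-small fine-tuning sketch with Hugging Face Transformers.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("gpt2")   # 0.124B parameters
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # "my_corpus.txt" is a placeholder: one training example per line.
    dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})
    tokenized = dataset["train"].map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
        batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="gpt2-finetuned",
                               per_device_train_batch_size=4,
                               num_train_epochs=1),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()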

3. nottorp:
Hmm, is there anything reasonably ready-made* for this spot? Training and querying an LLM locally on an existing codebase?

* I don't mind compiling it myself, but I'd rather not write it.

4. Sohcahtoa82:
Not even on the edge. That's something you could train on a 2 GB GPU.

The general guidance I've used is that to train a model, you need RAM (or VRAM) equal to roughly 8 bytes per parameter, so a 0.125B model would need about 1 GB of RAM to train.
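
Taking that 8-bytes-per-parameter rule at face value (it ignores activation memory, which also scales with batch size and sequence length), the arithmetic for a few sizes mentioned in this thread:

    # Rough training-memory estimate under the 8 bytes-per-parameter rule of thumb.
    BYTES_PER_PARAM = 8

    for name, params in [("JetBrains completion model", 0.1e9),
                         ("GPT-2-small", 0.124e9),
                         ("3B model", 3e9)]:
        gib = params * BYTES_PER_PARAM / 2**30
        print(f"{name}: ~{gib:.2f} GiB")   # ~0.75, ~0.92, ~22.4 GiB respectively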