    684 points prettyblocks | 14 comments

    I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
    1. psyklic ◴[] No.42784612[source]
    JetBrains' local single-line autocomplete model is 0.1B (w/ 1536-token context, ~170 lines of code): https://blog.jetbrains.com/blog/2024/04/04/full-line-code-co...

    For context, GPT-2-small is 0.124B params (w/ 1024-token context).
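    For anyone who wants to sanity-check those numbers, here's a quick way with the Hugging Face transformers package and the public gpt2 checkpoint (my own snippet, not from the linked post):

    ```python
    # Count GPT-2-small's parameters and check its context window.
    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2")  # the 124M "small" checkpoint
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params / 1e6:.1f}M parameters")   # ~124.4M
    print(model.config.n_positions)              # 1024-token context
    ```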

    replies(4): >>42785009 #>>42785728 #>>42785838 #>>42786326 #
    2. WithinReason ◴[] No.42785009[source]
    That size is on the edge of something you can train at home
    replies(2): >>42785431 #>>42786773 #
    3. vineyardmike ◴[] No.42785431[source]
    If you have modern hardware, you can absolutely train that at home, or do it very affordably on a cloud service.

    I’ve seen a number of “DIY GPT-2” tutorials that target this sweet spot. You won’t get amazing results unless you’re willing to leave a personal computer running for hours or days and you have solid data to train on locally, but fine-tuning should be well within a normal hobbyist’s patience.
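    As a rough sketch of what those tutorials boil down to (my own example using Hugging Face transformers, with wikitext-2 as placeholder data and an arbitrary scaled-down GPT-2 config; swap in your own corpus and sizes):

    ```python
    # Train a small GPT-2-style model from scratch (or fine-tune by loading "gpt2").
    from datasets import load_dataset
    from transformers import (DataCollatorForLanguageModeling, GPT2Config,
                              GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                              TrainingArguments)

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token

    # A few tens of millions of parameters -- small enough for hobbyist hardware.
    config = GPT2Config(n_layer=6, n_head=8, n_embd=512, n_positions=512)
    model = GPT2LMHeadModel(config)  # or GPT2LMHeadModel.from_pretrained("gpt2") to fine-tune

    raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    raw = raw.filter(lambda x: x["text"].strip() != "")  # drop empty lines

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM labels

    args = TrainingArguments(output_dir="tiny-gpt2", per_device_train_batch_size=8,
                             num_train_epochs=1, learning_rate=3e-4,
                             logging_steps=100, report_to="none")
    Trainer(model=model, args=args, train_dataset=tokenized,
            data_collator=collator).train()
    ```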

    replies(1): >>42785617 #
    4. nottorp ◴[] No.42785617{3}[source]
    Hmm, is there anything reasonably ready-made* for this spot? Training and querying an LLM locally on an existing codebase?

    * I don't mind compiling it myself, but I'd rather not write it.

    5. smaddox ◴[] No.42785728[source]
    You can train that size of a model on ~1 billion tokens in ~3 minutes on a rented 8xH100 80GB node (~$9/hr on Lambda Labs, RunPod.io, etc.) using the NanoGPT speedrun repo: https://github.com/KellerJordan/modded-nanogpt

    For that short of a run, you'll spend more time waiting for the node to come up, downloading the dataset, and compiling the model, though.
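    Back-of-envelope cost for a single run at those numbers (the overhead figure is a guess):

    ```python
    # ~$9/hr node, ~3 min of training, plus assumed spin-up/download/compile overhead.
    node_cost_per_hr = 9.0
    train_minutes = 3
    overhead_minutes = 15  # assumption, not a measured figure

    total_hours = (train_minutes + overhead_minutes) / 60
    print(f"~${node_cost_per_hr * total_hours:.2f} per run")  # ~$2.70
    ```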

    6. pseudosavant ◴[] No.42785838[source]
    I wonder how big that model is in RAM/disk. I use LLMs for FFMPEG all the time, and I was thinking about training a model on just the FFMPEG CLI arguments. If it was small enough, it could be a package for FFMPEG. e.g. `ffmpeg llm "Convert this MP4 into the latest royalty-free codecs in an MKV."`
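    Something in that spirit could be a thin wrapper around a local model. A rough sketch against Ollama's HTTP API (the wrapper name, prompt, and model tag are my own guesses, not an existing FFMPEG feature):

    ```python
    # Hypothetical "ffmpeg llm" helper: ask a small local model (served by Ollama)
    # to draft an ffmpeg command line from a plain-English request.
    import json
    import sys
    import urllib.request

    def suggest_ffmpeg_command(request: str, model: str = "qwen2.5:0.5b") -> str:
        payload = json.dumps({
            "model": model,
            "prompt": ("You translate requests into a single ffmpeg command line. "
                       "Reply with the command only.\nRequest: " + request),
            "stream": False,
        }).encode()
        req = urllib.request.Request("http://localhost:11434/api/generate",
                                     data=payload,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"].strip()

    if __name__ == "__main__":
        # e.g. python ffmpeg_llm.py "Convert this MP4 into the latest royalty-free codecs in an MKV."
        print(suggest_ffmpeg_command(" ".join(sys.argv[1:])))
    ```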
    replies(4): >>42785929 #>>42786381 #>>42786629 #>>42787136 #
    7. jedbrooke ◴[] No.42785929[source]
    The JetBrains models are about 70 MB zipped on disk (one model per language).
    replies(1): >>42794671 #
    8. staticautomatic ◴[] No.42786326[source]
    Is that why their tab completion is so bad now?
    replies(1): >>42791707 #
    9. h0l0cube ◴[] No.42786381[source]
    Please submit a blog post to HN when you're done. I'd be curious to know the most minimal LLM setup needed to get consistently sane output for FFMPEG parameters.
    10. maujim ◴[] No.42786629[source]
    from a few days ago: https://news.ycombinator.com/item?id=42706637
    11. Sohcahtoa82 ◴[] No.42786773[source]
    Not even on the edge. That's something you could train on a 2 GB GPU.

    The general guidance I've used is that training a model takes an amount of RAM (or VRAM) equal to roughly 8 bytes per parameter, so a 0.125B model would need about 1 GB of RAM to train.
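    In code, that rule of thumb (as stated; real usage also depends on batch size and activations) is just:

    ```python
    # ~8 bytes of (V)RAM per parameter to train.
    def training_ram_gb(n_params: float, bytes_per_param: int = 8) -> float:
        return n_params * bytes_per_param / 1e9

    print(training_ram_gb(0.125e9))  # 1.0 GB for a 0.125B model
    print(training_ram_gb(3e9))      # 24.0 GB for a 3B model
    ```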

    12. binary132 ◴[] No.42787136[source]
    That’s a great idea, but I feel like it might be hard to get it correct enough.
    13. sam_lowry_ ◴[] No.42791707[source]
    Hm... I wonder what your use case is. I do modern Enterprise Java, and the tab completion is a major time saver.

    While interactive AI is all about posing a prompt, meditating on it, then trying to fix the outcome, IntelliJ tab completion shows what it will complete as you type, and you hit Tab when you are 100% OK with the completion, which surprisingly happens 90-99% of the time for me, depending on the project.

    14. pseudosavant ◴[] No.42794671{3}[source]
    That is easily small enough to host as a static SPA. I was first thinking it would be cool to make a static web app that runs the model locally: you'd type a query and it'd give you the FFMPEG commands.