    684 points prettyblocks | 14 comments

    I mean anything in the 0.5B-3B range that's available on Ollama (for example). Have you built any cool tooling that uses these models as part of your workflow?
    1. psyklic ◴[] No.42784612[source]
    JetBrains' local single-line autocomplete model is 0.1B (w/ 1536-token context, ~170 lines of code): https://blog.jetbrains.com/blog/2024/04/04/full-line-code-co...

    For context, GPT-2-small is 0.124B params (w/ 1024-token context).
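    For anyone who wants to sanity-check those numbers, here's a quick way with the Hugging Face transformers package and the public gpt2 checkpoint (my own snippet, not from the linked post):

    ```python
    # Count GPT-2-small's parameters and check its context window.
    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2")  # the 124M "small" checkpoint
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params / 1e6:.1f}M parameters")   # ~124.4M
    print(model.config.n_positions)              # 1024-token context
    ```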

    replies(4): >>42785009 #>>42785728 #>>42785838 #>>42786326 #
    2. WithinReason ◴[] No.42785009[source]
    That size is on the edge of something you can train at home
    replies(2): >>42785431 #>>42786773 #
    3. vineyardmike ◴[] No.42785431[source]
    If you have modern hardware, you can absolutely train that at home, or do it very affordably on a cloud service.

    I’ve seen a number of “DIY GPT-2” tutorials that target this sweet spot. You won’t get amazing results unless you’re willing to leave a personal computer running for hours or days and you have solid data to train on locally, but fine-tuning should be well within a normal hobbyist’s patience.
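    As a rough sketch of what those tutorials boil down to (my own example using Hugging Face transformers, with wikitext-2 as placeholder data and an arbitrary scaled-down GPT-2 config; swap in your own corpus and sizes):

    ```python
    # Train a small GPT-2-style model from scratch (or fine-tune by loading "gpt2").
    from datasets import load_dataset
    from transformers import (DataCollatorForLanguageModeling, GPT2Config,
                              GPT2LMHeadModel, GPT2TokenizerFast, Trainer,
                              TrainingArguments)

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token

    # A few tens of millions of parameters -- small enough for hobbyist hardware.
    config = GPT2Config(n_layer=6, n_head=8, n_embd=512, n_positions=512)
    model = GPT2LMHeadModel(config)  # or GPT2LMHeadModel.from_pretrained("gpt2") to fine-tune

    raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    raw = raw.filter(lambda x: x["text"].strip() != "")  # drop empty lines

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM labels

    args = TrainingArguments(output_dir="tiny-gpt2", per_device_train_batch_size=8,
                             num_train_epochs=1, learning_rate=3e-4,
                             logging_steps=100, report_to="none")
    Trainer(model=model, args=args, train_dataset=tokenized,
            data_collator=collator).train()
    ```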

    replies(1): >>42785617 #
    4. nottorp ◴[] No.42785617{3}[source]
    Hmm, is there anything reasonably ready-made* for this spot? Training and querying an LLM locally on an existing codebase?

    * I don't mind compiling it myself, but I'd rather not write it.

    5. smaddox ◴[] No.42785728[source]
    You can train that size of a model on ~1 billion tokens in ~3 minutes on a rented 8xH100 80GB node (~$9/hr on Lambda Labs, RunPod.io, etc.) using the NanoGPT speedrun repo: https://github.com/KellerJordan/modded-nanogpt

    For that short of a run, you'll spend more time waiting for the node to come up, downloading the dataset, and compiling the model, though.
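    Back-of-envelope cost for a single run at those numbers (the overhead figure is a guess):

    ```python
    # ~$9/hr node, ~3 min of training, plus assumed spin-up/download/compile overhead.
    node_cost_per_hr = 9.0
    train_minutes = 3
    overhead_minutes = 15  # assumption, not a measured figure

    total_hours = (train_minutes + overhead_minutes) / 60
    print(f"~${node_cost_per_hr * total_hours:.2f} per run")  # ~$2.70
    ```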

    6. pseudosavant ◴[] No.42785838[source]
    I wonder how big that model is in RAM/disk. I use LLMs for FFMPEG all the time, and I was thinking about training a model on just the FFMPEG CLI arguments. If it was small enough, it could be a package for FFMPEG. e.g. `ffmpeg llm "Convert this MP4 into the latest royalty-free codecs in an MKV."`
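    Something in that spirit could be a thin wrapper around a local model. A rough sketch against Ollama's HTTP API (the wrapper name, prompt, and model tag are my own guesses, not an existing FFMPEG feature):

    ```python
    # Hypothetical "ffmpeg llm" helper: ask a small local model (served by Ollama)
    # to draft an ffmpeg command line from a plain-English request.
    import json
    import sys
    import urllib.request

    def suggest_ffmpeg_command(request: str, model: str = "qwen2.5:0.5b") -> str:
        payload = json.dumps({
            "model": model,
            "prompt": ("You translate requests into a single ffmpeg command line. "
                       "Reply with the command only.\nRequest: " + request),
            "stream": False,
        }).encode()
        req = urllib.request.Request("http://localhost:11434/api/generate",
                                     data=payload,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"].strip()

    if __name__ == "__main__":
        # e.g. python ffmpeg_llm.py "Convert this MP4 into the latest royalty-free codecs in an MKV."
        print(suggest_ffmpeg_command(" ".join(sys.argv[1:])))
    ```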
    replies(4): >>42785929 #>>42786381 #>>42786629 #>>42787136 #
    7. jedbrooke ◴[] No.42785929[source]
    The JetBrains models are about 70 MB zipped on disk (one model per language).
    replies(1): >>42794671 #
    8. staticautomatic ◴[] No.42786326[source]
    Is that why their tab completion is so bad now?
    replies(1): >>42791707 #
    9. h0l0cube ◴[] No.42786381[source]
    Please submit a blog post to HN when you're done. I'd be curious to know the most minimal LLM setup needed to get consistently sane output for FFMPEG parameters.
    10. maujim ◴[] No.42786629[source]
    from a few days ago: https://news.ycombinator.com/item?id=42706637
    11. Sohcahtoa82 ◴[] No.42786773[source]
    Not even on the edge. That's something you could train on a 2 GB GPU.

    The general guidance I've used is that training a model takes an amount of RAM (or VRAM) equal to roughly 8 bytes per parameter, so a 0.125B model would need about 1 GB of RAM to train.
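    In code, that rule of thumb (as stated; real usage also depends on batch size and activations) is just:

    ```python
    # ~8 bytes of (V)RAM per parameter to train.
    def training_ram_gb(n_params: float, bytes_per_param: int = 8) -> float:
        return n_params * bytes_per_param / 1e9

    print(training_ram_gb(0.125e9))  # 1.0 GB for a 0.125B model
    print(training_ram_gb(3e9))      # 24.0 GB for a 3B model
    ```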

    12. binary132 ◴[] No.42787136[source]
    That’s a great idea, but I feel like it might be hard to get it correct enough.
    13. sam_lowry_ ◴[] No.42791707[source]
    Hm... I wonder what your use case is. I do modern Enterprise Java, and the tab completion is a major time saver.

    While interactive AI is all about posing a prompt, meditating on it, then trying to fix the outcome, IntelliJ tab completion shows what it will complete as you type, and you hit Tab when you are 100% OK with the completion, which surprisingly happens 90-99% of the time for me, depending on the project.

    14. pseudosavant ◴[] No.42794671{3}[source]
    That is easily small enough to host as a static SPA. I was first thinking it would be cool to make a static web app that runs the model locally: you'd type a query and it'd give you the FFMPEG commands.