
899 points | georgehill
samwillis:
ggml and llama.cpp are such a good platform for local LLMs; having some financial backing to support development is brilliant. We should be concentrating as much as possible on doing local inference (and training) on private data.

I want a local ChatGPT fine-tuned on my personal data, running on my own device, not in the cloud. Ideally open source too; llama.cpp is looking like the best bet to achieve that!

brucethemoose2:
If MeZO gets implemented, we are basically there: https://github.com/princeton-nlp/MeZO
moffkalast:
Basically there, but with what kind of VRAM and processing requirements? I doubt anyone running on a CPU can fine-tune in a time frame that doesn't leave them with an obsolete model by the time they're done.
nl:
According to the paper, it fine-tunes at the speed of inference (!!)

This would make fine-tuning a quantized 13B model achievable in ~0.3 seconds per training example on a CPU.
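
For the curious: the reason MeZO can run at inference cost is that it replaces backprop with a two-point SPSA gradient estimate, so each step is just two forward passes plus an in-place update, and the random perturbation is regenerated from an RNG seed instead of being stored. A rough PyTorch sketch of the idea (mezo_step, loss_fn, and the hyperparameters are illustrative names I made up, not the repo's actual API):

    import torch

    def mezo_step(model, loss_fn, batch, eps=1e-3, lr=1e-6):
        # One MeZO-style zeroth-order (SPSA) step: two forward passes,
        # no backward pass, so peak memory stays at inference level.
        seed = torch.randint(0, 2**31 - 1, (1,)).item()

        def perturb(scale):
            # Recreate the exact same random direction z from the seed.
            gen = torch.Generator().manual_seed(seed)
            with torch.no_grad():
                for p in model.parameters():
                    z = torch.randn(p.shape, generator=gen)
                    p.add_(scale * eps * z.to(device=p.device, dtype=p.dtype))

        perturb(+1)                                  # theta + eps * z
        with torch.no_grad():
            loss_plus = loss_fn(model, batch).item()
        perturb(-2)                                  # theta - eps * z
        with torch.no_grad():
            loss_minus = loss_fn(model, batch).item()
        perturb(+1)                                  # restore theta

        # Projected gradient estimate: a single scalar.
        grad_est = (loss_plus - loss_minus) / (2 * eps)

        # SGD update along z, regenerated again from the same seed.
        gen = torch.Generator().manual_seed(seed)
        with torch.no_grad():
            for p in model.parameters():
                z = torch.randn(p.shape, generator=gen)
                p.add_(-lr * grad_est * z.to(device=p.device, dtype=p.dtype))
        return loss_plus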

valval:
I think more importantly, what would the fine-tuning routine look like? It's a non-trivial task to dump all of your personal data into any LLM architecture.