
899 points georgehill | 2 comments
samwillis ◴[] No.36216196[source]
ggml and llama.cpp are such a good platform for local LLMs, and having some financial backing to support development is brilliant. We should be concentrating as much as possible on doing local inference (and training) based on private data.

I want a local ChatGPT fine-tuned on my personal data, running on my own device, not in the cloud. Ideally open source too; llama.cpp is looking like the best bet to achieve that!

replies(6): >>36216377 #>>36216465 #>>36216508 #>>36217604 #>>36217847 #>>36221973 #
SparkyMcUnicorn ◴[] No.36217604[source]
Maybe I'm wrong, but I don't think you want it fine-tuned on your data.

Pretty sure you might be looking for this: https://github.com/SamurAIGPT/privateGPT

Fine-tuning is good for teaching it how to act, but not great for reciting/recalling data.
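Roughly the retrieval idea behind tools like privateGPT, as a toy sketch: embed your documents, find the ones nearest the question, and stuff them into the prompt instead of baking them into the weights. (Bag-of-words counts stand in for a real embedding model here, and the documents are made up.)

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: bag-of-words term counts.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical personal documents to index.
docs = [
    "My passport is stored in the bedroom safe.",
    "The cat prefers dry food in the morning.",
]
index = [(embed(d), d) for d in docs]

def retrieve(question, k=1):
    # Return the k documents most similar to the question.
    q = embed(question)
    return [d for _, d in sorted(index, key=lambda p: -cosine(q, p[0]))[:k]]

def build_prompt(question):
    # Stuff retrieved context into the prompt; no fine-tuning needed
    # for the model to "know" your data.
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("Where is my passport?"))
```

The real systems swap in a proper embedding model and a vector store, but the shape of the pipeline is the same.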

replies(4): >>36219307 #>>36220595 #>>36226771 #>>36241658 #
1. SkyPuncher ◴[] No.36241658[source]
I think people want both. They want fine-tuning for their style of communication and interaction. They want better ranking and retrieval for rote information.

In other words, it’s like having a spouse/partner. There are certain ways we communicate where we simply know where the other person is at, or what they actually mean.

replies(1): >>36243366 #
2. SparkyMcUnicorn ◴[] No.36243366[source]
Unless you want machine-readable responses, or have some other very specific need, a fine-tuned model isn't really going to do much better than a prompt that asks for the style you want, along with an example or two. Fine-tuning also raises the barrier to entry quite a bit, since the majority of computers that can run the model aren't capable of fine-tuning it.
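The "style prompt with an example or two" approach is just few-shot prompting; a minimal sketch (the examples and wording are made up):

```python
def style_prompt(examples, question):
    # Format each (question, answer) pair as a worked example,
    # then append the real question for the model to complete.
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return (
        "Answer tersely and informally, like these examples:\n\n"
        f"{shots}\n\nQ: {question}\nA:"
    )

# A couple of style examples stand in for a fine-tuning dataset.
examples = [
    ("Is the build green?", "Yep, all passing."),
    ("Deploy tonight?", "Nah, wait for Monday."),
]
print(style_prompt(examples, "Should we upgrade Python?"))
```

The whole prompt ships with every request, which costs context window, but it needs no training hardware at all.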

Even if you're using OpenAI's models, gpt-3.5-turbo is going to be much better (cheaper, bigger context window, higher quality) than any of their models that can currently be fine-tuned.

But if you're able to fine-tune a local model, then a combination of fine-tuning and embedding is probably going to give you better results than embedding alone.