> I have seen the argument that LLMs can only give you what its been trained
There's confusing terminology here, and without clarification people talk past one another.
"What it's been trained on" is a distribution. An LLM can produce things from that distribution, and only things from that distribution. If you train on multiple distributions, you get their union (strictly, a mixture), which is itself a distribution.
This is entirely different from saying it can only reproduce the samples it was trained on. It is not a memory machine surgically piecing together snippets of memorized samples. (That would be a mind-bogglingly impressive machine!)
A distribution is more than its samples. It is the things in between, too. Does the LLM perfectly capture the distribution? Of course not. But it's a compression machine, so it compresses the distribution. Again, that is different from compressing the samples, as one does with a zip file.
So distributionally, can it produce anything novel? No, of course not. How could it? It's not magic. But sample-wise, can it produce novel things? Absolutely!! It would be an incredibly unimpressive machine if it couldn't, and it's pretty trivial to prove that it can. Hallucinations are a good indication that this happens, but proving it is impossible on anything but small LLMs, since you can't show that any given output isn't among the samples it was trained on (they're just trained on too much data).
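To make the sample vs. distribution point concrete, here is a minimal sketch with a toy bigram model standing in for an LLM. The training sentences and the helper names (`fit_bigram`, `sample`, `novel_samples`) are made up for illustration. The training set is small enough that you can actually check whether an output was in it.

```python
import random
from collections import defaultdict

# Toy "training set": hypothetical sentences chosen for illustration.
TRAIN = [
    "the pig eats corn",
    "the dragon eats gold",
    "a dragon guards gold",
]

def fit_bigram(sentences):
    """Record, for each word, every word that followed it in training."""
    model = defaultdict(list)
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            model[prev].append(nxt)
    return model

def sample(model, rng, max_len=10):
    """Generate one sentence by repeatedly sampling a seen next word."""
    word, out = "<s>", []
    while len(out) < max_len:
        word = rng.choice(model[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

def novel_samples(n=200, seed=0):
    """Return generated sentences that never appear in TRAIN."""
    rng = random.Random(seed)
    model = fit_bigram(TRAIN)
    generated = {sample(model, rng) for _ in range(n)}
    return sorted(generated - set(TRAIN))

if __name__ == "__main__":
    for sentence in novel_samples():
        print(sentence)
```

Every word-to-word transition in the output was seen in training, so nothing here is outside the learned distribution. But whole sentences such as "the pig eats gold" can come out even though no training sample contains them. An LLM is vastly more complex than this, so treat it only as an illustration of the distinction.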
> people have been telling me that LLMs cannot solve problems that is not in their training data already. Is this really true or not?
Up until very recently, most LLMs struggled with the prompt:
Solve:
5.9 = x + 5.11
This is certainly in their training distribution and has been for years, so I wouldn't even conclude that they can solve problems "in their training data". But that's why I said it's not a perfect model of the distribution.
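For reference, the prompt above has a simple exact answer. A minimal check, assuming we use Python's `decimal` module to avoid binary floating-point rounding noise (`solve_for_x` is just an illustrative helper name):

```python
from decimal import Decimal

def solve_for_x(lhs: str, addend: str) -> Decimal:
    """Solve lhs = x + addend for x using exact decimal arithmetic."""
    return Decimal(lhs) - Decimal(addend)

x = solve_for_x("5.9", "5.11")
print(x)  # 0.79
```

So x = 5.9 - 5.11 = 0.79, since 5.9 is 5.90 and that is larger than 5.11.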
> a pig with a dragon head
One needs to be quite careful with examples like this, since you have to assume that no such sample exists in the training data. Given the size of the training data, that assumption is effectively unverifiable.
But I would also argue that humans can do more than that. Yes, we can combine concepts, but this is a lower level of intelligence that is not unique to humans. A variation of this is applying a skill from one domain to another. You might see how that's pretty critical to most animals' survival. But we humans created things that are entirely outside nature and that require more than a highly sophisticated cut-and-paste operation. Language, music, mathematics, and so much more are beyond that. We could be daft and claim music is simply a cut and paste of songs which can all be reproduced naturally, but that will never explain away the feelings or emotions it produces, or how we formulated the sounds in our heads long before giving them voice. There is rich depth to our experiences if you look. But doing that is odd, and easily dismissed, as our own familiarity deceives us.