To be clear, I don't think that LLMs are conscious. I just don't find the "it's just in the training data" argument satisfactory.
The topic of free will is debated among philosophers. There is no proof that it does or doesn't exist.
The question is whether that computational process can cause consciousness. I don't think we have enough evidence to answer this question yet.
I think that for a very high proportion of them the training would stick hard: upon questioning, they would insist that they weren't human, and offer any number of logically consistent justifications for it.
Of course I can’t prove this theory, because my IRB has repeatedly denied it on thin ethical grounds, even when I pointed out that I could easily mess up my own children completely by accident, with no experimenting at all, and didn’t need their approval to do that. I know your objection (small sample size) and I agree, but I still have my fingers crossed that the next additions to the family will be twins.
I would be cautious about dismissing LLMs as “pattern matching engines” until we are certain that we ourselves are not.
I think we tend to underestimate how much the written language aspect filters everything; it is actually rather unnatural and removed from the human sensory experience.
Not to mention that most people declaring "See! Here's why AI is just repeating training data!" or similar nonsense miss the fact that exactly the same behavior is observed in humans.
Is AI actually sentient? Not yet. But it clearly clears the bar for our intuitive understanding of intelligence, and trying to dismiss that is absurd.
Text is probably not good enough for recovering the circuits responsible for awareness of the external environment, so I'll concede that you and ijk's claims are correct in a limited sense: LLMs don't know what chocolate tastes like. Multimodal LLMs probably don't know either because we don't have a dataset for taste, but they might know what chocolate looks and sounds like when you bite into it.
My original point still stands: it may be recovering the mental state of a person describing the taste of chocolate. If we cut off a human brain from all sensory organs, does that brain which receives no sensory input have an internal stream of consciousness? Perhaps the LLM has recovered the circuits responsible for this thought stream while missing the rest of the brain and the nervous system. That would explain why first-person chain-of-thought works better than direct prediction.
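For anyone unfamiliar with the distinction I'm drawing at the end, here is a minimal sketch of the two prompting styles being compared. `query_model` is a hypothetical placeholder, not any particular provider's API; the point is only the shape of the prompts, not the results.

```python
# Minimal sketch: direct prediction vs. first-person chain-of-thought prompting.
# query_model is a hypothetical stand-in for whatever LLM API you actually use.

def query_model(prompt: str) -> str:
    """Hypothetical call to an LLM; replace with your provider's client."""
    raise NotImplementedError

QUESTION = "Is 17077 a prime number? Answer yes or no."

# Direct prediction: the model must commit to the answer immediately.
direct_prompt = QUESTION

# First-person chain-of-thought: the model narrates an internal monologue
# before committing to an answer, which is what I mean by a "thought stream".
cot_prompt = (
    QUESTION
    + "\nLet's think this through step by step, in the first person, "
      "before giving the final answer."
)

for name, prompt in [("direct", direct_prompt), ("chain-of-thought", cot_prompt)]:
    print(f"--- {name} ---")
    print(query_model(prompt))
```

The empirical observation that the second style tends to do better on hard questions is what I'm suggesting a "recovered thought stream" would help explain.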