Above the response it says
> Documents from the training data that have exact text matches with the model response. Powered by infini-gram
so, if I understand correctly, it searches the training data for matches in the LLM output. This is not traceability in my opinion. This is an attempt at guessing.
Checking individual sources I got texts completely unrelated with the question/answer, but that happen to share an N-gram [1] (I saw sequences up to 6 words) with the LLM answer.
I think they're being dishonest in their presentation of what Olmo can and can't do.