361 points mseri | 1 comment
weregiraffe No.46005853
Is the training data open-source? And can you validate that the model was trained on the claimed training data alone? Without this, all benchmarks are useless.
replies(1): >>46005929 #
1. comp_raccoon No.46005929
Olmo author here! We release all training data and all our training scripts, plus intermediate checkpoints, so you could take a checkpoint, reproduce a few steps on the training data, and check whether the loss matches.

It’s not a cryptographic proof, and you can’t get perfect determinism on NVIDIA GPUs, but it’s pretty close.
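
To make the "check the loss" step concrete, here is a rough sketch of how the comparison might look. It is not part of the Olmo tooling: `losses_match`, the tolerance, and the numbers are all made up for illustration. Because GPU runs are not bit-for-bit deterministic, it compares within a relative tolerance instead of testing exact equality.

```python
import math


def losses_match(logged, reproduced, rel_tol=1e-3):
    """Compare per-step losses within a relative tolerance.

    Hypothetical helper: `logged` would come from the released training
    logs and `reproduced` from your own re-run starting at a checkpoint.
    The default rel_tol is an assumption, not a recommended value.
    """
    if len(logged) != len(reproduced):
        raise ValueError("loss sequences must have the same length")
    # Collect every step whose losses differ by more than the tolerance.
    mismatches = [
        (step, a, b)
        for step, (a, b) in enumerate(zip(logged, reproduced))
        if not math.isclose(a, b, rel_tol=rel_tol)
    ]
    return len(mismatches) == 0, mismatches


# Illustrative values only, not real Olmo losses.
logged = [2.3140, 2.3098, 2.3051]
reproduced = [2.3141, 2.3097, 2.3052]
ok, bad_steps = losses_match(logged, reproduced)
print("match" if ok else f"diverged at: {bad_steps}")
```

How tight a tolerance is reasonable depends on the hardware, the kernels, and how many steps you replay, so treat the default here as a placeholder.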