494 points todsacerdoti | 7 comments
1. BurningFrog ◴[] No.44383219[source]
Would it make sense to include the complete prompt that generated the code with the code?
replies(3): >>44383344 #>>44384164 #>>44386462 #
2. astrobiased ◴[] No.44383344[source]
It would need to be more than that. The same prompt can produce different results across models, and even across inference configurations of the same model: with different treatment at inference time, e.g. quantization, the unquantized and quantized versions of a model can give different outputs for an identical prompt.
replies(1): >>44383438 #
3. verdverm ◴[] No.44383438[source]
Even more so: when you come back to the code in a few years to understand it, the model will likely no longer be available.
replies(1): >>44383820 #
4. galangalalgol ◴[] No.44383820{3}[source]
One of several reasons to use an open model even if it isn't quite as good. Version control the models and commit the prompts with the model name and a hash of the parameters. I'm not really sure what value that reproducibility adds though.
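The practice described above (committing the prompt alongside the model name and a hash of the parameters) could be sketched roughly like this. This is a minimal illustration, not a real tool: `weights_hash` and `provenance_record` are hypothetical helper names, and a real weights file would be hashed the same way, just much larger.

```python
import hashlib
import json
from pathlib import Path


def weights_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a model weights file, read in chunks to bound memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def provenance_record(model_name: str, weights: Path, prompt: str) -> str:
    """JSON blob to commit next to the generated code."""
    return json.dumps(
        {
            "model": model_name,
            "weights_sha256": weights_hash(weights),
            "prompt": prompt,
        },
        indent=2,
    )
```

The record is just data; whether anyone can replay it later is a separate question, as the rest of the thread points out.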
5. catlifeonmars ◴[] No.44384164[source]
You’d need to hash the model weights and save the seed for the sampling PRNG (plus the temperature) as well, in order to verify the provenance. Ideally it would be reproducible, right?
replies(1): >>44385459 #
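The seed point above can be illustrated with a toy decoding loop: with temperature sampling, output only reproduces if the PRNG seed is fixed. This is a sketch with a made-up `logits_fn` standing in for a real model; actual inference stacks seed their samplers differently.

```python
import math
import random


def sample_tokens(logits_fn, steps: int, temperature: float, seed: int) -> list[int]:
    """Toy temperature-sampling decode loop; logits_fn maps the previous
    token id to a list of logits for the next token."""
    rng = random.Random(seed)  # fixed seed => reproducible sampling
    out, token = [], 0
    for _ in range(steps):
        scaled = [l / temperature for l in logits_fn(token)]
        m = max(scaled)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        token = rng.choices(range(len(probs)), weights=probs)[0]
        out.append(token)
    return out
```

Running it twice with the same seed yields the same token sequence; changing the seed generally does not.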
6. danielbln ◴[] No.44385459[source]
Maybe 2 years ago. Nowadays LLMs call functions and use tools; good luck capturing all of that in a way that's reproducible.
7. ethan_smith ◴[] No.44386462[source]
Including prompts would create transparency but still wouldn't resolve the underlying copyright uncertainty of the output or guarantee the code wasn't trained on incompatibly-licensed material.