
728 points freetonik | 1 comments
jedbrown ◴[] No.44980180[source]
Provenance matters. An LLM cannot certify a Developer Certificate of Origin (https://en.wikipedia.org/wiki/Developer_Certificate_of_Origi...) and a developer of integrity cannot certify the DCO for code emitted by an LLM, certainly not an LLM trained on code of unknown provenance. It is well-known that LLMs sometimes produce verbatim or near-verbatim copies of their training data, most of which cannot be used without attribution (and may have more onerous license requirements). It is also well-known that they don't "understand" semantics: they never make changes for the right reason.

We don't yet know how courts will rule on cases like Doe v. GitHub (https://githubcopilotlitigation.com/case-updates.html). LLM-based systems are not even capable of practicing clean-room design (https://en.wikipedia.org/wiki/Clean_room_design). For a maintainer to accept code generated by an LLM is to put the entire community at risk, as well as to endorse a power structure that mocks consent.

1. Aeolun ◴[] No.44982369[source]
Or, you know, they just feel like code should be free. Like beer should be free.

We didn't have this whole issue 20 years ago because nobody gave a shit. If your code was public, and on the internet, it was free for everyone to use by definition.