
343 points sillysaurusx | 3 comments
EMIRELADERO ◴[] No.35028451[source]
I wonder, could Facebook take legal action here? While some (most) of the data used to train the model is copyrighted, I don't think the model is. It's the result of a mathematical process applied to a series of facts and works, with no further creativity put into them.
replies(4): >>35028664 #>>35029602 #>>35031189 #>>35033122 #
jeroenhd ◴[] No.35029602[source]
As far as my understanding of American copyright goes, a computer-produced work cannot be copyrighted because computers are not human, in the same way a photograph taken by a monkey cannot be copyrighted no matter who owned the camera that took the photo. This is also one of the major challenges around the legal status of AI, and one that will soon be fought over in court.

It's possible that the automated processing of the dataset is considered to be non-creative enough that the generated AI model cannot be copyrighted. The code to train the model and the input dataset (and the works therein) definitely can be, but not the model itself.

In that case, Facebook would be out of luck, as long as the code to train the model isn't shared. If the courts find AI models to be a different type of work that does produce copyrightable models, Facebook may follow in the footsteps of other copyright giants and start filing lawsuits against anyone they can catch. I very much doubt they'd go that far, especially since by the time they can confidently start a lawsuit, the leaked model will probably already be outdated and irrelevant.

Personally, I expect the model to end up being uncopyrightable, as would be the output of the model.

This may or may not have very interesting results. The dataset itself is probably copyrightable (a human or set of humans composed it, unless that was also done completely automatically), but if that copyright is claimed, the individual rights holders of the included works may demand a licensing fee, similar to how sampling works in music: "you want to use my work, pay me a fee".

Or maybe the dataset is considered diverse enough that the holders of individual works cannot expect to be compensated for their inclusion, and you can get around copyright law by amassing enough content at once. Who knows.

replies(2): >>35029865 #>>35030044 #
1. adossi ◴[] No.35030044[source]
It is intellectual property, regardless of copyright.
replies(2): >>35030290 #>>35030913 #
2. brookst ◴[] No.35030290[source]
“Intellectual property” is a catch-all for copyright, trademark, patent, and trade secrets. There isn’t really law that protects IP as a general concept, just those four.
3. cma ◴[] No.35030913[source]
It isn't protected as a trade secret if they shared it more or less freely with .edu addresses. And once it has leaked widely to the public, it isn't one either.