ankit219 ◴[] No.44608660[source]
Not just Meta: 40 EU companies urged the EU to postpone the rollout of the AI Act by two years due to its unclear nature. This code of practice is voluntary and goes beyond what is in the act itself. The EU published it in a way that says there will be less scrutiny if you voluntarily sign up for it. Meta would face scrutiny on all fronts anyway, so there does not seem to be a plausible case for signing something voluntary.

One of the key aspects of the act is that a model provider is responsible if downstream partners misuse the model in any way. For open source, that is a very hard requirement to meet[1].

> GPAI model providers need to establish reasonable copyright measures to mitigate the risk that a downstream system or application into which a model is integrated generates copyright-infringing outputs, including through avoiding overfitting of their GPAI model. Where a GPAI model is provided to another entity, providers are encouraged to make the conclusion or validity of the contractual provision of the model dependent upon a promise of that entity to take appropriate measures to avoid the repeated generation of output that is identical or recognisably similar to protected works.
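
A minimal sketch of what a "repeated generation" safeguard could look like in practice (Python; the n-gram size, threshold, and corpus are hypothetical illustrations, not anything the code of practice actually specifies):

    # Hypothetical output filter: flag generations that overlap a
    # reference corpus of protected works beyond an n-gram threshold.
    def ngrams(text, n=8):
        words = text.split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def looks_recognisably_similar(output, protected_texts, n=8, threshold=0.2):
        out_grams = ngrams(output, n)
        if not out_grams:
            return False
        # Flag if a large share of the output's n-grams appear in any
        # single protected work (a crude "recognisably similar" proxy).
        return any(
            len(out_grams & ngrams(work, n)) / len(out_grams) >= threshold
            for work in protected_texts
        )

A provider might run something like this as a post-generation check and resample or refuse when it trips; the hard part for open-weight releases is that downstream users can simply remove any such filter.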

[1] https://www.lw.com/en/insights/2024/11/european-commission-r...

replies(7): >>44610592 #>>44610641 #>>44610669 #>>44611112 #>>44612330 #>>44613357 #>>44617228 #
zizee ◴[] No.44611112[source]
It doesn't seem unreasonable. If you train a model that can reliably reproduce thousands/millions of copyrighted works, you shouldn't be distributing it. If it were just regular software that had that capability, would it be allowed? Is it OK just because it's a fancy AI model?
replies(2): >>44611371 #>>44611463 #
CamperBob2 ◴[] No.44611371[source]
I have a Xerox machine that can reliably reproduce copyrighted works. Is that a problem, too?

Blaming tools for the actions of their users is stupid.

replies(4): >>44611396 #>>44611501 #>>44612409 #>>44614295 #
threetonesun ◴[] No.44611396[source]
If the Xerox machine had all of the copyrighted works in it, and you just had to ask it nicely to print them, I think you'd say the tool is in the wrong there, not the user.
replies(5): >>44611403 #>>44611469 #>>44611489 #>>44613191 #>>44616639 #
zettabomb ◴[] No.44613191[source]
Xerox already went through that lawsuit and won, which is why photocopiers still exist. The tool isn't in the wrong for being told to print out the copyrighted works. The user still had to make the conscious decision to copy that particular work. Hence, still the user's fault.
replies(1): >>44615490 #
1718627440 ◴[] No.44615490[source]
You take the copyrighted work to the printer; you don't upload data to an LLM first, because it is already in the machine. If you got LLMs without training data (however that would work) and the user had to provide the data, then it would be OK.
replies(1): >>44616586 #
CamperBob2 ◴[] No.44616586[source]
You don't "upload" data to an LLM, but that's already been explained multiple times, and evidently it didn't soak in.

LLMs extract semantic information from their training data and store it at extremely low precision in latent space. To the extent original works can be recovered from them, those works were nothing intrinsically special to begin with. At best such works simply milk our existing culture by recapitulating ancient archetypes, a la Harry Potter or Star Wars.
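
Rough arithmetic behind the "extremely low precision" point (all figures below are illustrative assumptions, not measurements of any particular model):

    # Back-of-envelope: could the weights store the training set verbatim?
    params = 70e9            # assume a 70B-parameter model
    bytes_per_param = 2      # fp16/bf16 weights
    weight_bytes = params * bytes_per_param      # ~140 GB

    tokens = 15e12           # assume ~15T training tokens
    bytes_per_token = 4      # rough average for English text
    corpus_bytes = tokens * bytes_per_token      # ~60 TB

    print(f"corpus is {corpus_bytes / weight_bytes:.0f}x larger than weights")
    # ~429x: on average the weights can retain only a small fraction of
    # the training text, so wholesale verbatim storage is impossible;
    # mostly it is heavily duplicated passages that survive well enough
    # to be extracted.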

If the copyright cartels choose to fight AI, the copyright cartels will and must lose. This isn't Napster Part 2: Electric Boogaloo. There is too much at stake this time.