https://investors.autodesk.com/news-releases/news-release-de...
“Piracy” is mostly a rhetorical term in the context of copyright. Legally, it’s still called infringement or unauthorized copying. But industries and lobbying groups (e.g., RIAA, MPAA) have favored “piracy” for its emotional weight.
Copyright infringement is unauthorized reproduction - you have made a copy of something, but you have not deprived the original owner of it. At most, you denied them revenue although generally less than the offended party claims, since not all instances of copying would have otherwise resulted in a sale.
This is reaching at best.
Come up with a better comparison.
Real piracy always involves booty.
Naturally booty is wealth that has been hoarded.
Has nothing to do with wealth that may or may not come in the future, regardless of whether any losses due to piracy have taken place already or not.
First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16).
Second, to that last point, Authors further argue that the training was intended to memorize their works’ creative elements — not just their works’ non-protectable ones (Opp. 17).
Third, Authors next argue that computers nonetheless should not be allowed to do what people do.
https://media.npr.org/assets/artslife/arts/2025/order.pdf
And to be clear, we have the word infringement precisely because it is not theft.
In addition to depriving the author of revenue, piracy also increases whatever relevance the author has or may have in the public sphere. In effect, one side effect of piracy is advertising.
Doctorow was one of the early ones to bring this aspect of it up.
Referring to this? (Wikipedia's disambiguation page doesn't seem to have a more likely article.)
https://en.wikipedia.org/wiki/Richard_Stallman#Copyright_red...
It's not an issue because it's not currently illegal because nobody could have foreseen this years ago.
But it is profiting off of the unpaid work of millions. And there's very little chance of change because it's so hard to pass new protection laws when you're not Disney.
We've only dealt with the fairly straightforward legal questions so far. This legal battle is still far from being settled.
I don't think humans learn via backprop or in rounds/batches; our learning is more "online".
If I input text into an LLM it doesn't learn from that unless the creators consciously include that data in the next round of teaching their model.
Humans also don't require samples of every text in history to learn to read and write well.
Hunter S Thompson didn't need to ingest the Harry Potter books to write.
Stallman places great importance on the words and labels people use to talk about the world, including the relationship between software and freedom. He asks people to say free software and GNU/Linux, and to avoid the terms intellectual property and piracy (in relation to copying not approved by the publisher). One of his criteria for giving an interview to a journalist is that the journalist agrees to use his terminology throughout the article.
(As an aside, it seems pointless to decry it as a "talking point". The reason it was brought up is presumably because the author agrees with it and thinks it's relevant. It's also entirely possible that the author, like me, made this argument without being aware that it was popularized by Richard Stallman. If it makes sense then you can hear the argument without hearing the person and still find it agreeable.)
"Piracy" is used to refer to copyright violation to make it sound scary and dangerous to people who don't know better or otherwise don't think about it too hard. Just imagine if they called it "banditry" instead; now tell me that pirates are not bandits with boats. They may as well have called it banditry and it's worth correcting that. (I also think it's worth ridiculing but that doesn't appear to be Stallman's primary point.) It's not banditry (how ridiculous would it be to call it that?), it's copyright infringement.
Edit:
Reading my comment again in the context of other things you wrote, I suspect the argument will not pass muster because you do not seem to see piracy's change in meaning as manufactured by PR work purchased by media industry leaders. I'm not really trying to convince you that it's true but it may be worth considering that it is the fundamental disagreement you seem to have with others on Stallman's point; again, not saying you're wrong, just that's where the disagreement is.
When I clicked the link, I got an article about a business that was selling millions of dollars of pirated software.
This guy made millions of dollars in profit by selling pirated software. This wasn't a case of transformative works, nor of an individual doing something for themselves. He was plainly stealing and reselling something.
Here's an article explaining in more detail [1].
Most experts say that if Swartz had gone to trial and the prosecution had proved everything they alleged and the judge had decided to make an example of Swartz and sentence harshly it would have been around 7 years.
Swartz's own attorney said that if they had gone to trial and lost, he thought it was unlikely that Swartz would get any jail time.
Swartz also had at least two plea bargain offers available. One was for a guilty plea and 4 months. The other was for a guilty plea and the prosecutors would ask for 6 months but Swartz could ask the judge for less or for probation instead and the judge would pick.
[1] https://www.popehat.com/2013/02/05/crime-whale-sushi-sentenc...
> First, Authors argue that using works to train Claude’s underlying LLMs was like using works to train any person to read and write, so Authors should be able to exclude Anthropic from this use (Opp. 16). But Authors cannot rightly exclude anyone from using their works for training or learning as such. Everyone reads texts, too, then writes new texts. They may need to pay for getting their hands on a text in the first instance. But to make anyone pay specifically for the use of a book each time they read it, each time they recall it from memory, each time they later draw upon it when writing new things in new ways would be unthinkable. For centuries, we have read and re-read books. We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.
Couldn't have put it better myself (though $deity knows I tried many times on HN). Glad to see Judge Alsup continues to be the voice of common sense in legal matters around technology.
As an extreme example, consider murder. Obviously it should be illegal, but if it's legal for one group and not for another, the group for which it's illegal will probably be wiped out, having lost the ability to avenge deaths in the group.
It's much more important that laws are applied impartially and equally than that they are even a tiny bit reasonable.
"Pirates" also transform the works they distribute. They crack them, translate them, compress them to decrease download times, remove unnecessary things, make them easier to download by splitting them into chunks (essential with dial-up, less so nowadays), change distribution formats, offer them through different channels, and bundle extra software and media that they themselves might have coded, like trainers, installers, sick chiptunes and so on. Why is the "transformation" done by a big corpo more legal in your view?
In short the post is bait.
Seven years for thumbing your nose at Autodesk when armed robbery would get you less time says some interesting things about the state of legal practice.
“We have admired, memorized, and internalized their sweeping themes, their substantive points, and their stylistic solutions to recurring writing problems.”
Claude is not doing any of these things. There is no admiration, no internalizing of sweeping themes. There’s a network encoding data.
We’re talking about a machine that accepts content and then produces more content. It’s not a person, it’s owned by a corporation that earns money on literally every word this machine produces. If it didn’t have this large corpus of input data (copyrighted works) it could not produce the output data for which people are willing to pay money. This all happens at a scale no individual could achieve because, as we know, it is a machine.
This argument is more along the lines of blaming Microsoft Word when someone types characters into the word processor and outputs a copy of an existing book. (Yes, it is a lot easier, but the rationale is the same.) In my mind, the end user prompting the model would be the one potentially infringing.
This absolutely falls under copyright law as I understand it (not a lawyer). E.g. the disclaimer that rolls before every NFL broadcast. The notice states that the broadcast is copyrighted and any unauthorized use, including pictures, descriptions, or accounts of the game, is prohibited. There is wiggle room for fair use by news organizations, critics, artists, etc.
I do think that a big part of the reason Anthropic downloaded millions of books from pirate torrents was because they needed that input data in order to generate the output, their product.
I don’t know what that is, but, IMHO, not sharing those dollars with the creators of the content is clearly wrong.
Make no mistake, they’re seeking to exploit the contents of that material for profits that are orders of magnitude larger than what any shady pirated-material reseller would make. The world looks the other way because these companies are “visionary” and “transformational.”
Maybe they are, and maybe they should even have a right to these buried works, but what gives them the right to rip up the rule book and (in all likelihood) suffer no repercussions in an act tantamount to grand theft?
There’s certainly an argument to be had about whether this form of research and training is a moral good and beneficial to society. My first impression is that the companies are too opaque in how they use and retain these files, albeit for some legitimate reasons, but nevertheless the archival achievements are hidden from the public, so all that’s left is profit for the company on the backs of all these other authors.
What you’re saying is like calling Al Capone a tax cheat. Nonsense.
They went after Aaron over copyright.
That the mechanism performing these things is a network encoding data is… well, that description, at that level of abstraction, is a similarity with the way a human does it, not even a difference.
My network is a 3D mess made of pointy bi-lipid bags exchanging protons across gaps moderated by the presence of neurochemicals, rather than flat sheets of silicon exchanging electrons across tuned energy band-gaps moderated by other electrons, but it's still a network.
> We’re talking about a machine that accepts content and then produces more content. It’s not a person, it’s owned by a corporation that earns money on literally every word this machine produces. If it didn’t have this large corpus of input data (copyrighted works) it could not produce the output data for which people are willing to pay money. This all happens at a scale no individual could achieve because, as we know, it is a machine.
My brain is a machine that accepts content in the form of job offers and JIRA tickets (amongst other things), and then produces more content in the form of pull requests (amongst other things). For the sake specifically of this question, do the other things make a difference? While I count as a person and am not owned by any corporation, when I work for one, they do earn money on the words this biological machine produces. (And given all the models which are free to use, the LLMs definitely don't earn money on "literally" every word those models produce). If I didn't have the large corpus of input data — and there absolutely was copyright on a lot of the school textbooks and the TV broadcast educational content of the 80s and 90s when I was at school, and the Java programming language that formed the backbone of my university degree — I could not produce the output data for which people are willing to pay money.
Should corporations who hire me be required to pay Oracle every time I remember and use a solution that I learned from a Java course, even when I'm not writing Java?
That LLMs do this at a scale no individual could achieve, because they are machines, means they have the potential to wipe me out economically. The economic threat of automation has been a real issue at least since the Luddites, if not earlier, and I don't know how the dice will fall this time around. So even though I have one layer of backup plan, I am well aware it may not work, and if it doesn't then government action will have to happen because a lot of other people will be in trouble before trouble gets to me (and recent history shows that this doesn't mean "there won't be trouble").
Copyright law is one example of government action. So is mandatory education. So is UBI, but so too is feudalism.
Good luck to us all.
Yep, that name's a blast from the past! He was the judge on the big Google/Oracle case about Android and Java years ago, IIRC. I think he even learned to write some Java so he could better understand the case.
This is an uncharitable interpretation. The ostensible point of the comment, or at least a stronger and still-reasonable interpretation, is that they are trying to point out that this specific word choice confuses concepts, which it does. Richard Stallman and the commenter in question are absolutely correct to point that out. You actually seem to be agreeing with Stallman, at least in the abstract.
It should be acknowledged how and why the meaning of the word changed. As I said, that seems to have been manufactured, which suggests, at least to me, that their (and Richard Stallman's) point is essentially the same as yours. That is to say, the US media industry started paying PR firms to use "piracy" as meaning something other than its normal definition until that became the common definition.
They should not purposely use a different definition like that. That is Stallman's point, and why he refuses to say "piracy" instead of "copyright infringement"; ocean banditry is not copyright infringement and it is confusing -- intentionally so -- to say that it is.
Laws and their enforcement are a clusterfuck. To achieve greater justice we should strive towards better judgements overall.
God, stop with the group on group bs please and engage with things the way they're written without injecting the entirety of your cynical worldview layered on top.
File sharing is not that.