451 points croes | 55 comments
1. jhaile ◴[] No.43964361[source]
One aspect that I feel is ignored by the comments here is the geo-political forces at work. If the US takes the position that LLMs can't use copyrighted work or have to compensate all copyright holders – other countries (e.g. China) will not follow suit. This will mean that US LLM companies will either fall behind or be too expensive. Which means China and other countries will probably surge ahead in AI, at least in terms of how useful the AI is.

That is not to say that we shouldn't do the right thing regardless, but I do think there is a feeling of "who is going to rule the world in the future?" that underlies governmental decision-making on how much to regulate AI.

replies(10): >>43964511 #>>43964513 #>>43964544 #>>43964546 #>>43964647 #>>43964799 #>>43965877 #>>43966756 #>>43969913 #>>43974233 #
2. bgwalter ◴[] No.43964511[source]
The same president that is putting 145% tariffs on China could put 1000% tariffs on Internet chat bots located in China. Or order the Internet cables to be cut as a last resort (citing a national emergency as is the new practice).

I'm not sure at all what China will do. I find it likely that they'll forbid AI at least for minors so that they do not become less intelligent.

Military applications are another matter that is not really related to these copyright issues.

replies(2): >>43964580 #>>43965149 #
3. asddubs ◴[] No.43964513[source]
you could apply that same logic to any IP breaches though, not just AI
replies(1): >>43965586 #
4. therouwboat ◴[] No.43964544[source]
If AI is so important, maybe it should be owned by the government and free to use for all citizens.
replies(1): >>43964591 #
5. bigbuppo ◴[] No.43964546[source]
The real problem here is that AI companies aren't even willing to follow the norms of big business and get the laws changed to meet their needs.
replies(1): >>43968993 #
6. pc86 ◴[] No.43964580[source]
How exactly does one add a tariff to a foreign-based chat bot?
replies(2): >>43964611 #>>43965573 #
7. pc86 ◴[] No.43964591[source]
Name two non-military things that the government owns and aren't complete dumpster fires that barely do the thing they're supposed to do.

Even (especially?) the military is a dumpster fire but it's at least very good at doing what it exists to do.

replies(10): >>43964650 #>>43964655 #>>43964684 #>>43964718 #>>43964753 #>>43964773 #>>43964792 #>>43964900 #>>43965196 #>>43969002 #
8. bilbo0s ◴[] No.43964611{3}[source]
You know that 20 bucks a month a lot of people pay for chatgpt?

Yeah..

you tax it if the "chatgpt" is foreign.

9. oooyay ◴[] No.43964647[source]
Well hell, by that logic average citizens should be able to launder corporate intellectual property because China will never follow suit in adhering to intellectual property law. I'm game if you are.
replies(3): >>43964701 #>>43965219 #>>43969949 #
10. bilbo0s ◴[] No.43964650{3}[source]
That's a trick question.

I mean, name 2 things anyone owns that aren't dumpster fires?

Long time ago industrial engineers used to say, "Even Toyota has recalls."

Something being a dumpster fire is so common nowadays that you really need a better reason to argue in support of a given entity's ownership. (Or even non-ownership for that matter.)

11. pergadad ◴[] No.43964655{3}[source]
The government doesn't make tanks, it just shells out gigantic amounts to companies to make them.

That said, there are plenty of successful government actions across the world, where Europe or Japan probably have a good advantage with solid public services. Think streets, healthcare, energy infrastructure, water infrastructure, rail, ...

replies(1): >>43965711 #
12. lappet ◴[] No.43964684{3}[source]
Highways
13. jowea ◴[] No.43964701[source]
Isn't that sort of logic precisely why China doesn't adhere to IP law?
replies(1): >>43964790 #
14. sklargh ◴[] No.43964718{3}[source]
Hi. Assuming the US here. Depends on scope of analysis and dumpster fire definition.

1. The National Weather Service. Crown jewel and very effective at predicting the weather and forecasting life threatening events.

2. IRS, generally very good at collecting revenue.

3. National Interagency Fire Service / US Forest service tactical fire suppression

4. NTSB/US Chemicals Safety Board - Both highly regarded.

5. Medicare - Basically clung to with talons by seniors, revealed preference is that they love it.

6. DOE National Labs

7. NIH (spicy pick)

8. Highway System

There are valid critiques of all of these but I don’t think any of them could be universally categorized as a complete dumpster fire.

15. nilamo ◴[] No.43964753{3}[source]
1) art museums, specifically the Smithsonian, but nearly every major city has a decent one.

2) state parks are pretty rad.

replies(1): >>43965010 #
16. azemetre ◴[] No.43964773{3}[source]
Medicaid, Medicare, and Social Security are all three programs that have massive approval from US citizens.

Even saying the military is a dumpster fire isn't accurate. The military has led trillions of dollars worth of extraction for the wealthy and elite across the globe.

In no sane world can you call the ability to protect GLOBAL shipping lanes a failure. That one service alone has probably paid for itself thousands of times.

We aren't even talking about things like public education (high school education used to be privatized and something only the elites enjoyed 100 years ago; yes, public high school education isn't even 100 years old) or libraries or public parks.

---

I really don't understand this "gobermint iz bad" meme you see in tech circles.

I get so much more out of my taxes compared to equivalent corporate bills that it's laughable.

Government is comprised of people and the last 50 years has been the government mostly giving money and establishing programs to the small cohorts that have been hoarding all the wealth. Somehow this is never an issue with the government however.

Also never understand the arguments from these types either because if you think the government is bad then you should want it to be better. Better mostly meaning having more money to redistribute and more personnel to run programs, but it's never about these things. It's always attacking the government to make it worse at the expense of the people.

17. oooyay ◴[] No.43964790{3}[source]
Yes, I was being a bit facetious. It was snark intended to point out that corporations don't get to have their cake and eat it too. Either everything is free and there are no boundaries or we live by our own principles.
replies(3): >>43964944 #>>43964966 #>>43965117 #
18. zem ◴[] No.43964792{3}[source]
post office and USDA (pre trump regime slash-and-burn of course)
19. Bjorkbat ◴[] No.43964799[source]
I broadly agree in that sure, unfettered access to copyrighted material will make AI more capable, but more capable of what exactly?

For national security reasons I'm perfectly fine with giving LLMs unfettered access to various academic publications, scientific and technical information, that sort of thing. I'm a little more on the fence about proprietary code, but I have a hard time believing there isn't enough code out there already for LLMs to ingest.

Otherwise though, what is an LLM with unfettered access to copyrighted material better at vs one that merely has unfettered access to scientific / technical information + licensed copyrighted material? I would suppose that besides maybe being a more creative writer, the former LLM is far more capable of reproducing copyrighted works.

In effect, the former LLM is a more capable plagiarism machine compared to the latter, and not necessarily more intelligent, and otherwise doesn't really add any more value. What do we have to gain from condoning it?

I think the argument I'm making is a little easier to see in the case of image and video models. The model that has unfettered access to copyrighted material is more capable, sure, but more capable of what? Capable of making images? Capable of reproducing Mario and Luigi in an infinite number of funny scenarios? What do we have to gain from that? What reason do we have for not banning such models outright? Not like we're really missing out on any critical security or economic advantages here.

replies(1): >>43965158 #
20. Buttons840 ◴[] No.43964900{3}[source]
Weather Forecasting
21. r053bud ◴[] No.43964944{4}[source]
It’s barely facetious though. What is stopping me from “starting an AI company” (LLC, sure), torrenting all ebooks (which Facebook did), and as long as I don’t seed, I’m golden?
replies(1): >>43965133 #
22. snozolli ◴[] No.43964966{4}[source]
>Either everything is free and there are no boundaries or we live by our own principles.

Or C) large corporations (and the wealthy) do whatever they want while you still get extortion letters because your kid torrented a movie.

They really do get to have their cake and eat it too, and I don't see any end to it.

23. standardUser ◴[] No.43965010{4}[source]
The US federal government doesn't run most museums, but it does run the massive parks system with 20k employees (pre-Musk) and that system enjoys extremely high ratings from guests.
24. gruez ◴[] No.43965117{4}[source]
>It was snark intended to point out that corporations don't get to have their cake and eat it too.

"have their cake and eat it too" allegations only work if you're talking about the same entity. The copyright maximalist corporations (ie. publishers) aren't the same as the permissive ones (ie. AI companies). Making such characterizations makes as much sense as saying "citizens don't get to have their cake and eat it too", when referring to the fact that citizens are anti-AI, but freely pirate movies.

replies(1): >>43965143 #
25. gruez ◴[] No.43965133{5}[source]
>What is stopping me from “starting an AI company” (LLC, sure), torrenting all ebooks (which Facebook did), and as long as I don’t seed, I’m golden?

Nothing. You don't even need the LLC. I don't think anyone got prosecuted for only downloading. All prosecutions were for distribution. Note that if you're torrenting, even if you stop the moment it's finished (and thus never goes to "seeding"), you're still uploading, and would count as distribution for the purposes of copyright law.

replies(1): >>43966059 #
26. _aavaa_ ◴[] No.43965143{5}[source]
Yes they are. Look at what happened when deepseek came out. Altman started crying and alleging, without an inkling of irony, that deepseek was trained on OpenAI model outputs.
replies(2): >>43965232 #>>43968036 #
27. gruez ◴[] No.43965149[source]
>Or order the Internet cables to be cut as a last resort (citing a national emergency as is the new practice).

what if they route through third countries?

28. Teever ◴[] No.43965158[source]
If common culture is an effective substrate to communicate ideas as in we can use shared pop culture references to make metaphors to explain complex ideas then the common culture that large companies have ensnared in excessively long copyrights and trademarks to generate massive profits is a useful thing for an LLM that is designed to convey ideas to have embedded in it.

If I'm learning about kinematics maybe it would be more effective to have comparisons to Superman flying faster than a speeding bullet and no amount of dry textbooks and academic papers will make up for the lack of such a comparison.

This is especially relevant when we're talking about science-fiction which has served as the inspiration for many of the leading edge technologies that we use including stuff like LLMs and AI.

replies(1): >>43966608 #
29. bongodongobob ◴[] No.43965196{3}[source]
National Weather Service

Library of Congress

National Park Service

U.S. Geological Survey (USGS)

NASA

Smithsonian Institution

Centers for Disease Control and Prevention (CDC)

Social Security Administration (SSA)

Federal Aviation Administration (FAA) air traffic control

U.S. Postal Service (USPS)

30. rollcat ◴[] No.43965219[source]
Well I always felt rebellious about the contemporary face of "rules for thee but not for me", specifically regarding copyright.

Musicians remain subject to abuse by the recording industry; they're making pennies on each dollar you spend on buying CDs^W^W streaming services. I used to say, don't buy that; go to a concert, buy beer, buy merch, support directly. Nowadays live shows are being swallowed whole through exclusivity deals (both for artists and venues). I used to say, support your favourite artist on Bandcamp, Patreon, etc. But most of these new middlemen are ready for their turn to squeeze.

And now on top of all that, these artists' work is being swallowed whole by yet another machine, disregarding what was left of their rights.

What else do you do? Go busking?

replies(1): >>43968990 #
31. gruez ◴[] No.43965232{6}[source]
>Altman started crying and alleging that deepseek was trained on OpenAI model outputs without an inkling of irony

Can you link to the exact comments he made? My impression was that he was upset at the fact that they broke T&C of openai, and deepseek's claim of being much cheaper to train than openai didn't factor in the fact that it required openai's model to bootstrap the training process. Neither of them directly contradicts the claim that training is copyright infringement.

32. Ekaros ◴[] No.43965573{3}[source]
Build a big firewall. And then fine massively any ISP that allows traffic to reach bad hosts...
33. Ekaros ◴[] No.43965586[source]
Your employee steals your source code and sells it to multiple competitors. Why should you have any right to go after those competitors?
replies(1): >>43969019 #
34. pc86 ◴[] No.43965711{4}[source]
We're talking about the US government though
replies(1): >>43967503 #
35. hulitu ◴[] No.43965877[source]
> One aspect that I feel is ignored by the comments here is the geo-political forces at work. If the US takes the position that LLMs can't use copyrighted work or has to compensate all copyright holders – other countries (e.g. China) will not follow suit.

Oh really? They didn't have any problem coming after people who installed copyrighted Windows. BSA. But now Microsoft turns a blind eye because it suits them.

36. Pooge ◴[] No.43966059{6}[source]
Which is still what Facebook did, if I'm not mistaken. There's no way they torrented and managed to upload less than 1 bit.
replies(1): >>43966700 #
37. Bjorkbat ◴[] No.43966608{3}[source]
Fair point, we use metaphor to explain and understand a variety of topics, and a lot of those metaphors are best understood through pop culture analogies.

A reasonable compromise then is that you can train an AI on Wikipedia, more-or-less. An AI trained this way will have a robust understanding of Superman, enough that it can communicate through metaphor, but it won't have the training data necessary to create a ton of infringing content about Superman (well, it won't be able to create good infringing content anyway. It'll probably have access to a lot of plot summaries but nothing that would help it make a particularly interesting Superman comic or video).

To me it seems like encyclopedias use copyrighted pop culture in a way that constitutes fair use, and so training on them seems fine as long as they consent to it.

38. FireBeyond ◴[] No.43966700{7}[source]
You're right. They claimed they made efforts to minimize seeding, but minimal is not none, as you say.
replies(1): >>43966888 #
39. stonogo ◴[] No.43966756[source]
Big "Mr. President, we cannot allow a mineshaft gap" energy going on, even if it's difficult for me personally to believe that LLMs contribute in any sense to ruling the world.
40. gruez ◴[] No.43966888{8}[source]
You can make a patched torrent client that never uploads any pieces to peers. It'd definitely be within Meta's capability to do so. The real problem is that unlike typical torrenting lawsuits, they weren't caught red-handed in the act, and it would therefore be hard to go after them. This might seem unfair, but it's no different from you openly posting on Reddit that you torrent: it'd be tough for rights holders to go after you even with such an admission.
replies(1): >>43967576 #
41. const_cast ◴[] No.43967503{5}[source]
There's nothing special about the US government that makes it uniquely shit.

The difference here is that we have people like yourself: those who have zero faith in our government and as such act as double agents or saboteurs. When people such as yourself gain power in the legislature they "starve the beast". Meaning, purposefully deconstruct sections of our government such that they have justification for their ideological belief that our government doesn't work.

You guys work backwards. The foregone conclusion is that government programs never work, and then you develop convoluted strategies to prove that.

42. breakingcups ◴[] No.43967576{9}[source]
> Previously, a Meta executive in charge of project management, Michael Clark, had testified that Meta allegedly modified torrenting settings "so that the smallest amount of seeding possible could occur," which seems to support authors' claims that some seeding occurred. And an internal message from Meta researcher Frank Zhang appeared to show that Meta allegedly tried to conceal the seeding by not using Facebook servers while downloading the dataset to "avoid" the "risk" of anyone "tracing back the seeder/downloader" from Facebook servers. Once this information came to light, authors asked the court for a chance to depose Meta executives again, alleging that new facts "contradict prior deposition testimony."
replies(1): >>43967762 #
43. gruez ◴[] No.43967762{10}[source]
>Meta allegedly modified torrenting settings "so that the smallest amount of seeding possible could occur,"

>Meta allegedly tried to conceal the seeding by not using Facebook servers while downloading the dataset to "avoid" the "risk" of anyone "tracing back the seeder/downloader" from Facebook servers

Sounds like they used a VPN, set the upload speed to 1kb/s and stopped after the download is done. If the average Joe copied that setup there's 0% chance he'd get sued, so I don't really see a double standard here. If anything, Meta might get additional scrutiny because they're big enough of a target that rights holders will go through the effort of suing them.

replies(1): >>43969185 #
44. rubslopes ◴[] No.43968036{6}[source]
Another example: Microsoft suing pirated Windows distributors.
45. johnnyanmac ◴[] No.43968990{3}[source]
We regulate it the way we did centuries ago, which led to copyright. If we already have rules, we enforce them. If no one in power wants to, we put in people who will.

In the end this all comes down to needing the people to care enough.

replies(1): >>43976413 #
46. johnnyanmac ◴[] No.43968993[source]
This is precisely why we need proportional fees for courts. We can't just let companies treat the law as a cost-benefit analysis. They should live in fear of a court ruling against them.
47. johnnyanmac ◴[] No.43969002{3}[source]
Roads and telecommunication. You can argue they are indeed a dumpster fire, but imagine the alternatives full of tolls and incompatible wavelengths.
48. johnnyanmac ◴[] No.43969019{3}[source]
Because they bought code from someone not authorized to sell it?

This isn't some new phenomenon. We do indeed seize assets from buyers if the seller stole them.

49. FireBeyond ◴[] No.43969185{11}[source]
> If the average Joe copied that setup there's 0% chance he'd get sued

Citation needed. RIAA used to just watch torrents and send cease and desists to everyone who connected, whether for a minute or for months. It was very much a dragnet, and I highly doubt there was any nuance of "but Your Honor, I only seeded 1MB back so it's all good".

replies(1): >>43972516 #
50. arp242 ◴[] No.43969913[source]
I get what you're saying, but this is just a race to the bottom, no?

It's annoying to see the current pushback against China focusing so much on inconsequential matters with so much nonsense mixed in, because I do think we do need to push back against China on some things.

51. seanmcdirmid ◴[] No.43969949[source]
In the long run private IP will eventually become very public despite whatever laws you have; it’s been like that since the Stone Age. The American Industrial Revolution was built partially on stolen IP from Britain. The internet has just sped up diffusion. You can stop it if you are willing to cut the line, but legal action is only some friction and even then only in the short term.
52. gruez ◴[] No.43972516{12}[source]
Did you miss the part about using a VPN?
53. 1vuio0pswjnm7 ◴[] No.43974233[source]
The design, manufacture and supply of electronics is far more important than one particular usage, e.g., "LLMs". It has never been a requirement to violate copyrights to produce electronics, or computer software. In fact, arguably there would be no "MicroSoft" were it not for Gates' lobbying for the existence and enforcement of "software copyright". The "Windows" franchise, among others, relies on it. The irony of Microsoft's support for OpenAI is amusing. Copyright enforcement for me but not for thee.
54. rollcat ◴[] No.43976413{4}[source]
Disney continued to lobby to extend copyright for like half a century. A lot of people did care. What use is regulation if you can just buy it?
replies(1): >>43982182 #
55. johnnyanmac ◴[] No.43982182{5}[source]
>a lot of people did care

As did Disney, apparently.

>what use is regulation if you can just buy it?

I don't like it either, but it still comes down to the same issues. We vote in people who can be bought and don't make a scandal out of it when it happens. The first step to fixing that corruption is to make congress afraid of being ousted if discovered. With today's communication structure, that's easier than ever.

But if the people don't care, we see the obvious victor.