137 points bradt | 3 comments
neehao ◴[] No.45087333[source]
As Tyler Cowen says, solve for the equilibrium.

"Many widely used machine-learning models rely on copyrighted data. For instance, Google finds the most relevant web pages for a search term by relying on a machine learning model trained on copyrighted web data. But the use of copyrighted data by machine learning models that generate content (or give answers to search queries rather than link to sites with the answers) poses new (reasonable) questions about fair use. By not sharing the proceeds, such systems also kill the incentives to produce original content on which they rely. For instance, if we don’t incentivize content producers, e.g., people who respond to Stack Overflow questions, the ability of these models to answer questions in new areas is likely to be lower. The concern about fair use can be addressed by training on data from content producers who have opted to share their data. The second problem is more challenging. How do you build a system that shares proceeds with content producers?"

https://www.gojiberries.io/generative-ai-and-the-market-for-...

replies(2): >>45087359 #>>45087576 #
1. observationist ◴[] No.45087359[source]
Content producers that publish their "content" to the public web aren't entitled to dictate what's done with that material.

There's a simple solution. People that publish things can put up a paywall and people can pay what the content is worth.

The thing that AI endangers is not valuable content, it's the SEO clickbait cashcow, and as far as I'm concerned, the faster AI kills that off, the better.

That monetization model is corrupt as hell, produces all sorts of perverse incentives, and is the epitome of the enshittification of the web.

Burn, baby, burn.

replies(2): >>45087433 #>>45087434 #
2. bgwalter ◴[] No.45087433[source]
Of course they are entitled. They have the copyright, so you cannot reproduce it anywhere by default and the "fair" use issue is not settled.

Valuable content is endangered because writers feel demotivated if their material is just stolen by overfunded big corporations.

Paywalls only work for known publications and not for someone who writes the perfect tutorial on how to solve boot issues in Debian. Why would anyone write that if it's just stolen and monetized without attribution?

3. _Algernon_ ◴[] No.45087434[source]
Publishing publicly doesn't surrender copyright…