
335 points alphabetting | 10 comments
wenbin ◴[] No.41872782[source]
NotebookLM is contributing to fake podcasts across the internet, with over 1,300 and counting:

https://github.com/ListenNotes/ai-generated-fake-podcasts/bl...

Google is taking a different approach this time, moving quickly. While NotebookLM is indeed a remarkable tool for personal productivity and learning, it also opens the door for spammers to mass-produce content that isn't meant for human consumption.

Amidst all the praise for this project, I’d like to offer a different perspective. I hope the NotebookLM team sees this and recognizes the seriousness of the spam issue, which will only grow if left unaddressed. If you know someone on the team, please bring this to their attention: could you please provide a tool or some plain-English guidelines to help detect audio generated by NotebookLM? Is there a watermark or any other identifiable marker that can be used?

Just recently, a Hacker News post highlighted how nearly all Google image results for "baby peacock" are AI-generated: https://news.ycombinator.com/item?id=41767648

It won't be long before we see a similar trend with low-quality, AI-generated fake podcasts flooding the internet.

replies(18): >>41872802 #>>41872821 #>>41872878 #>>41872954 #>>41873067 #>>41873074 #>>41873152 #>>41873269 #>>41873297 #>>41873476 #>>41874055 #>>41874427 #>>41874680 #>>41875008 #>>41877535 #>>41879360 #>>41879521 #>>41880487 #
jsheard ◴[] No.41872821[source]
> it also opens the door for spammers to mass-produce content that isn't meant for human consumption.

What's new? Every novel class of genAI product has brought a tidal wave of slop, spam and/or scams to the medium it generates. If anyone working on a product like this doesn't anticipate it being used to mass produce vapid white-noise "content" on an industrial scale then they haven't been paying attention.

replies(1): >>41872875 #
1. wenbin ◴[] No.41872875[source]
This is definitely not a new issue.

What I’m aiming for is to ensure that the NotebookLM team is aware of the impact and actively considering it. Hopefully, they are already working on tools or mechanisms to address the problem, ideally before their colleagues at YouTube and Google Search come asking for help to fight NotebookLM-generated spam :)

It's certainly easier for the creators of genAI to build detection tools than for outsiders to do so. AI audio detection is a hard problem - https://www.npr.org/2024/04/05/1241446778/deepfake-audio-det...

replies(2): >>41873156 #>>41875011 #
2. criddell ◴[] No.41873156[source]
> What I’m aiming for is to ensure that the NotebookLM team is aware of the impact and actively considering it.

What is the impact? Have any of them attracted an audience of any meaningful size? If a month from now there are 1.3 million generated podcasts, what do you anticipate the fallout to be?

replies(2): >>41875383 #>>41880356 #
3. sgdfhijfgsdfgds ◴[] No.41875383[source]
> If a month from now there are 1.3 million generated podcasts, what do you anticipate the fallout to be?

Is this a rhetorical question? Because the answer for podcast indexing and search services is surely pretty obvious.

replies(1): >>41875734 #
4. criddell ◴[] No.41875734{3}[source]
Why is it a problem? There's even more material for those services now, and for their customers, the value these services can provide is even higher.
replies(1): >>41876317 #
5. ungreased0675 ◴[] No.41876317{4}[source]
Wouldn’t the value be lower if podcasts end up the way product review blogs have? Endless spam that causes people to append “Reddit” to their searches in hopes of finding something human generated.
replies(2): >>41877612 #>>41878182 #
6. sgdfhijfgsdfgds ◴[] No.41877612{5}[source]
Exactly this. It's obvious that generative AI content is bad for search and indexing. I wish more people would learn to extrapolate from the forms of the problem that already exist: the quality of the generated content is not going to solve the "please, god, find me something a real person actually said about this real thing" problem.

I just don't understand how people can pretend this isn't happening just because they find each new twist of a technology fascinating.

replies(1): >>41878765 #
7. ◴[] No.41878182{5}[source]
8. criddell ◴[] No.41878765{6}[source]
> It's obvious that generative AI content is bad for search and indexing

It's not obvious at all. You see a problem, I see opportunities. The person you responded to mentioned Reddit. Well, the glut of garbage on the internet makes Reddit more valuable.

The "please, god, find me something a real person actually said about this real thing" isn't actually a problem very many people have. Despite all the complaining here, most people are happy with the results they get when they search Google or Bing or Facebook or Reddit.

So like every technology, there are positives and negatives. It isn't clear to me where this all nets out but I'm leaning slightly positive.

And if you don't find this technology fascinating, then I'm not sure what to say.

replies(1): >>41881700 #
9. BlueTemplar ◴[] No.41880356[source]
The impact of cheap spam, as in the other cases, is to push people towards platforms.

You can even see it in the case of e-mail spam: while Gmail and Hotmail are technically not platforms, spam helped them become so big that they can basically impose their own rules on the protocol.

10. Dilettante_ ◴[] No.41881700{7}[source]
>Well, the glut of garbage on the internet makes Reddit more valuable.

If given the choice between a dime and a fiver, is the fiver somehow worth more than if I'd offered you the choice between a fiver and a tenner instead?