
579 points Leftium | 4 comments
molticrystal ◴[] No.45306399[source]
The claim that Google secretly wants YouTube downloaders to work doesn't hold up. Their focus is on delivering videos across a vast range of devices without breaking playback (and even that is blurring [0]), not on enabling downloads.

If you dive into the yt-dlp source code, you see the insane complexity of the calculations needed to download a video. There is code to handle nsig checks, internal YouTube API quirks, and constant obfuscation that makes it a nightmare to keep up with (and the maintainers heroes for doing so). Google frequently rejects download attempts, blocks certain devices or access methods, and breaks techniques that yt-dlp relies on.

Half the battle is working around Google's attempts to make ads unblockable, and the other half is working around their attempts to shut down downloaders. The idea of a "gray market ecosystem" they tacitly approve ignores how aggressively they tweak their systems to make downloading as unreliable as possible. If Google wanted downloaders to thrive, they wouldn't make developers jump through these hoops. Just look at the yt-dlp issue tracker overflowing with reports of broken functionality. There are no secret nods, handshakes, or winks; as Google comes to care less and less about compatibility, the doors will close. For example, there is already a secret header used for authenticating that you are running Google's own build of Chrome [1] [2], and it will probably be expanded.

[0] Ask HN: Does anyone else notice YouTube causing 100% CPU usage and stuttering? https://news.ycombinator.com/item?id=45301499

[1] Chrome's hidden X-Browser-Validation header reverse engineered https://news.ycombinator.com/item?id=44527739

[2] https://github.com/dsekz/chrome-x-browser-validation-header
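To give a flavor of what that obfuscation fight looks like, here is a toy sketch (this is not yt-dlp's actual code; the "player JS" and function names are invented for illustration) of the general approach such tools use: pattern-match the obfuscated transform out of the player JavaScript, then reimplement it locally. On the real player, the target of that regex moves with nearly every release, which is why breakage is constant.

```python
import re

# Toy stand-in for YouTube's real (heavily obfuscated) player JS.
player_js = (
    'var a={reverse:function(x){return x.reverse()}};'
    'function ncode(n){n=n.split("");n=a.reverse(n);return n.join("")}'
)

# Locate which helper the "n" transform calls. In practice this pattern
# has to be rewritten whenever the player code is re-obfuscated.
m = re.search(r'function ncode\(n\)\{n=n\.split\(""\);n=a\.(\w+)\(n\);', player_js)
op = m.group(1)

def apply_transform(n: str, op: str) -> str:
    """Local reimplementation of the transform discovered above."""
    if op == "reverse":
        return n[::-1]
    raise ValueError(f"unknown transform: {op}")

print(op, apply_transform("abc123", op))  # reverse 321cba
```

The real nsig routines chain dozens of such transforms (swaps, splices, rotations), which is why yt-dlp effectively ships a small JavaScript interpreter rather than a handful of regexes.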

replies(18): >>45306431 #>>45307288 #>>45308312 #>>45308891 #>>45309570 #>>45309738 #>>45310615 #>>45310619 #>>45310847 #>>45311126 #>>45311155 #>>45311160 #>>45311645 #>>45313122 #>>45315060 #>>45315374 #>>45316124 #>>45325129 #
guerrilla ◴[] No.45309570[source]
> If you dive into the yt-dlp source code, you see the insane complexity of the calculations needed to download a video. There is code to handle nsig checks, internal YouTube API quirks, and constant obfuscation that makes it a nightmare to keep up with (and the maintainers heroes for doing so). Google frequently rejects download attempts, blocks certain devices or access methods, and breaks techniques that yt-dlp relies on.

This just made me incredibly grateful for the people who do this kind of work. I have no idea who writes all the uBlock Origin filters either, but blessed be the angels, long may their stay in heaven be.

I'm pretty confident I could figure it out eventually, but let's be honest: the chance that I'd ever actually invest that much time and energy approximates zero closely enough that we can just call it flat nil.

Maybe Santa Claus needs to make some donations tonight. ho ho ho

replies(2): >>45309786 #>>45310667 #
imiric ◴[] No.45310667[source]
As the web devolves further, the only viable long-term solution will be allow lists instead of block lists. There is too much hostility online (websites that want to track you and monetize your data and attention, SEO scams and generated content, an ever-increasing army of bots) for it to remain feasible to maintain rules that filter all of it out. It's much easier to write rules for traffic you approve of, although those rules will have to be more personal than shared block lists.
replies(1): >>45311246 #
drnick1 ◴[] No.45311246[source]
This is more or less what I already do with uBlock/uMatrix. By default, I filter out ALL third-party content on every website, and manually allow CDNs and other legitimate third-party domains. I still use DNS blacklists, however, so that mobile devices, where this can't easily be done, still get some protection against the most common offenders (Google Analytics, Facebook Pixel, etc.)
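That default-deny setup can be expressed in uBlock Origin's dynamic filtering rules ("My rules" pane); a minimal sketch, where the `example.com` entries are placeholders for the per-site exceptions you'd add by hand:

```
* * 3p block
example.com cdn.example.com * noop
example.com ajax.googleapis.com * noop
```

The first rule blocks all third-party requests globally; each `noop` line then returns a specific third-party domain to normal filtering on one site only.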
replies(1): >>45316777 #
noja ◴[] No.45316777[source]
I’m not sure why everyone keeps repeating this. The fight is lost. Your data is being collected by the websites you visit and handed to Facebook via a proxy container. You will never see a different domain; it’s invisible to the end user.
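The first-party proxying idea can be sketched with a reverse proxy (hostnames here are made up; real deployments use Meta's packaged gateway container, but the effect is the same): the browser only ever resolves the site's own domain, so domain-based blocking never sees the tracker.

```nginx
# On the site's own server: forward tracking traffic under a
# first-party path to the vendor, invisibly to the visitor.
location /t/ {
    proxy_pass https://tracking-vendor.example/collect/;
    proxy_set_header Host tracking-vendor.example;
}
```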
replies(1): >>45317034 #
drnick1 ◴[] No.45317034[source]
Care to elaborate on the mechanisms at play? If what you claim is true, all websites would already serve ads from their own domain. The main issue I can see with this approach is that there would be an obvious incentive for webmasters to vastly overstate ad impressions to generate revenue.
replies(1): >>45324211 #
noja ◴[] No.45324211[source]
Look up the Facebook Conversions API Gateway.
replies(1): >>45324759 #
drnick1 ◴[] No.45324759{3}[source]
As far as I understand, the objective is completely different. Ads are shown on platforms owned by Meta, while the Conversions API runs on the merchant's website (server-side) and reports interactions such as purchases back to Facebook.

This is quite different from websites monetizing traffic through ads and trackers placed on their own webpages. Those can still be reliably blocked by preventing websites from loading third-party content.
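To make the server-side reporting concrete, here is a rough sketch of a Conversions API event payload (the endpoint shape comes from Meta's public Graph API docs; the pixel ID, token, and event values are placeholders, and nothing is actually sent):

```python
import hashlib
import json

PIXEL_ID = "1234567890"   # placeholder, not a real pixel ID
ACCESS_TOKEN = "..."      # placeholder

def build_capi_payload(email: str, event_name: str = "Purchase") -> dict:
    # The Conversions API expects user identifiers to be normalized
    # and SHA-256 hashed before they are sent.
    hashed_email = hashlib.sha256(email.strip().lower().encode()).hexdigest()
    return {
        "data": [{
            "event_name": event_name,
            "event_time": 1700000000,
            "action_source": "website",
            "user_data": {"em": [hashed_email]},
        }]
    }

# The merchant's server would POST this JSON to
# https://graph.facebook.com/v<API_VERSION>/<PIXEL_ID>/events
# so the visitor's browser only ever talks to the first-party site.
payload = build_capi_payload("user@example.com")
print(json.dumps(payload)[:60])
```

Since the hashing and the POST both happen on the merchant's server, no third-party request ever appears in the visitor's browser for a content blocker to intercept, which is the point being debated above.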