579 points Leftium | 1 comments
molticrystal ◴[] No.45306399[source]
The claim that Google secretly wants YouTube downloaders to work doesn't hold up. Their focus is on delivering videos across a vast range of devices without breaking playback (and even that is blurring[0]), not on enabling downloads.

If you dive into the yt-dlp source code, you see the insane complexity of calculations needed to download a video. There is code to handle nsig checks, internal YouTube API quirks, and constant obfuscation that makes it a nightmare (and the maintainers heroes) to keep up. Google frequently rejects download attempts, blocks certain devices or access methods, and breaks techniques that yt-dlp relies on.

Half the battle is working around attempts by Google to make ads unblockable, and the other half is working around their attempts to shut down downloaders. The idea of a "gray market ecosystem" they tacitly approve ignores how aggressively they tweak their systems to make downloading as unreliable as possible. If Google wanted downloaders to thrive, they wouldn't make developers jump through these hoops. Just look at the yt-dlp issue tracker overflowing with reports of broken functionality. There are no secret nods, handshakes, or other winks. As Google begins to care less and less about compatibility, the doors will close. For example, there is already a secret header used for authenticating that you are using Google's version of the Chrome browser [1] [2], and its use will probably be expanded.

[0] Ask HN: Does anyone else notice YouTube causing 100% CPU usage and stattering? https://news.ycombinator.com/item?id=45301499

[1] Chrome's hidden X-Browser-Validation header reverse engineered https://news.ycombinator.com/item?id=44527739

[2] https://github.com/dsekz/chrome-x-browser-validation-header

replies(18): >>45306431 #>>45307288 #>>45308312 #>>45308891 #>>45309570 #>>45309738 #>>45310615 #>>45310619 #>>45310847 #>>45311126 #>>45311155 #>>45311160 #>>45311645 #>>45313122 #>>45315060 #>>45315374 #>>45316124 #>>45325129 #
1. 1vuio0pswjnm7 ◴[] No.45325129[source]
"If you dive into the yt-dlp source code, you see the insane complexity of calculations needed to download a video. "

Indeed the complexity is insane

https://news.ycombinator.com/item?id=45256043

But what is meant by "a video"? Is this referring to the common case or an edge/corner case? Does "a" mean one particular video or all videos?

"There is code to handle nsig checks, internal YouTube API quirks, and constant obfuscation that makes it a nightmare (and the maintainers heroes) to keep up."

True, but is this code required for all YouTube videos?

The majority of YT videos are non-commercial and unpromoted, with low view counts. These are simple to download

For example, the current yt-dlp project contains approximately 218 YT IDs. A 2024 version contained approximately 201 YT IDs. These are often for testing edge cases

The example 1,525-character shell script below outputs download URLs for almost all the YT IDs found in yt-dlp. No Python needed

By comparison the yt-dlp project is 15,679,182 characters, approximately
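
For scale, the two figures quoted above differ by roughly four orders of magnitude. A quick check with shell arithmetic, using only the two character counts stated in this comment (the variable names are made up for illustration):

```shell
#!/bin/sh
# Sizes in characters, as quoted in the comment above.
script_chars=1525
ytdlp_chars=15679182

# Integer division: how many times larger yt-dlp is than the script.
ratio=$((ytdlp_chars / script_chars))
echo "$ratio"
```

This prints 10281, i.e. the yt-dlp project is on the order of ten thousand times the size of the script.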

The curl binary is used in the example only because it's popular, not because I use it. I use simpler, more flexible software than curl

I have been using a tiny shell script to download YT videos for over 15 years. I have been downloading videos from googlevideo.com for even longer, before Google acquired YouTube.^1 Surprisingly (or not), when YT changes something that requires updating the script (and this has only happened to me about five times or fewer in 15 years) I have generally been able to fix the shell script faster than yt-dl(p) fixes its Python program (same for NewPipe/NewPipeSB)

I prefer non-commercial videos that are not promoted. The ones with relatively low view counts. For more popular videos, I listen to the audio file first before downloading the video file. After listening to the audio, I may decide to skip the video. Also I am not overly concerned about throttling

1. The original Google Video made a distinction between commercial and non-commercial (free) videos. The latter were always easy to download, and no sign-in/log-in was required. This might be a more plausible explanation for why YT has always allowed downloads for non-commercial videos

   # custom C filters to make scripts faster, easier to write
   # yy030 filters URLs from stdin
   # yy082 filters various strings from stdin, 
   # e.g., f == print format descriptions, v == print YT IDs
   # x is a YouTube ID
   # script accepts YT ID on stdin
   
   #!/bin/sh
   read x;
   y=https://www.youtube.com/youtubei/v1/player?prettyPrint=false 
   curl -K/dev/stdin "$y" <<eof|yy030|if test $# -gt 0;then grep -E "itag=$1";else yy082 f|uniq;fi;
   silent
   #verbose
   ipv4
   http1.0
   tlsv1.3
   tcp-nodelay
   resolve www.youtube.com:443:142.251.215.238 
   user-agent "com.google.ios.youtube/19.45.4 (iPhone16,2; U; CPU iOS 18_1_0 like Mac OS X;)"
   header "content-type: application/json"
   header "X-Youtube-Client-Name: 5"
   header "X-Youtube-Client-Version: 19.45.4"
   header "X-Goog-Visitor-Id: CgtpN1NtNlFnajBsRSjy1bjGBjIKCgJVUxIEGgAgIw=="
   cookie "PREF=hl=en&tz=UTC; SOCS=CAI; GPS=1; YSC=4sueFctSML0; __Secure-ROLLOUT_TOKEN=CJO64Zqggdaw7gEQiZW-9r3mjwMYiZW-9r3mjwM%=; VISITOR_INFO1_LIVE=i7Sm6Qgj0lE; VISITOR_PRIVACY_METADATA=CgJVUxIEGgAgIw=="
   data "{\"context\": {\"client\": {\"clientName\": \"IOS\", \"clientVersion\": \"19.45.4\", \"deviceMake\": \"Apple\", \"deviceModel\": \"iPhone16,2\", \"userAgent\": \"com.google.ios.youtube/19.45.4 (iPhone16,2; U; CPU iOS 18_1_0 like Mac OS X;)\", \"osName\": \"iPhone\", \"osVersion\": \"18.1.0.22B83\", \"hl\": \"en\", \"timeZone\": \"UTC\", \"utcOffsetMinutes\": 0}}, \"videoId\": \"$x\", \"playbackContext\": {\"contentPlaybackContext\": {\"html5Preference\": \"HTML5_PREF_WANTS\", \"signatureTimestamp\": 20347}}, \"contentCheckOk\": true, \"racyCheckOk\": true}"
   eof
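
The yy030 and yy082 filters used above are custom C programs that are not shown here. As a rough stand-in for yy030 only, the sketch below pulls URLs out of the response with standard tools. It assumes the response is JSON in which URLs appear inside quoted strings and `&` may be escaped as `\u0026`, and it assumes a grep that supports `-o` (GNU, BSD and busybox grep do; POSIX does not require it). The function name `extract_urls` is hypothetical, and this is not the author's filter.

```shell
#!/bin/sh
# Hypothetical stand-in for the yy030 URL filter (not the author's code).
# Reads text on stdin and prints each http(s) URL on its own line.
extract_urls() {
    # Undo the JSON escape \u0026 -> & first, then match each URL up to
    # the next double quote, backslash or space.
    sed 's/\\u0026/\&/g' | grep -oE 'https?://[^"\\ ]+'
}
```

With a filter like this in place of yy030, a single format could then be selected the same way the script does, by matching on `itag=` in the printed URLs.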