The best YouTube downloaders, and how Google silenced the press

1. peteforde ◴[20 Sep 25 01:23 UTC] No.45308944[source]▶

One of the things that drives me crazy about YouTube is that if a video gets taken down, it shows up as a "This video is no longer available" with no further metadata. I am far, far more uptight about not knowing which video was removed than I am about the fact that it is no longer available.

I have put serious thought into creating a tool that would automatically yt-dlp every video I open to a giant hard drive and append a simple index with the title, channel, thumbnail and date.
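A minimal sketch of the "simple index" half of that idea, assuming each download also produces the metadata dict that yt-dlp writes with `--write-info-json`. The function names and the JSON Lines layout are hypothetical, and `upload_date` stands in for "date" (a watched-on timestamp could be added the same way).

```python
import json
from pathlib import Path

# Keys as they appear in the .info.json files yt-dlp writes with --write-info-json.
INDEX_FIELDS = ("id", "title", "channel", "thumbnail", "upload_date")


def index_entry(info: dict) -> dict:
    """Keep only the fields needed to identify a video later."""
    return {field: info.get(field) for field in INDEX_FIELDS}


def append_to_index(info: dict, index_path: Path) -> None:
    """Append one JSON line per video; skip IDs that are already indexed."""
    seen = set()
    if index_path.exists():
        with index_path.open(encoding="utf-8") as fh:
            seen = {json.loads(line)["id"] for line in fh if line.strip()}
    entry = index_entry(info)
    if entry["id"] in seen:
        return
    with index_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")


def lookup(video_id: str, index_path: Path):
    """Return the stored entry for a video ID, or None if it was never indexed."""
    if not index_path.exists():
        return None
    with index_path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                entry = json.loads(line)
                if entry["id"] == video_id:
                    return entry
    return None
```

With an index like this, a dead link still resolves to a title and channel through `lookup(video_id, index_path)`, even if the video file itself was never kept.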

In general, I think people are way too casual about media of all kinds silently disappearing when you're not looking.

replies(10): >>45309263 #>>45309876 #>>45311419 #>>45311593 #>>45311732 #>>45312005 #>>45312465 #>>45314574 #>>45315102 #>>45337266 #

2. pzmarzly ◴[20 Sep 25 01:47 UTC] No.45309263[source]▶

>>45308944 (TP) #

I had a Bash script that parsed my browser history, and for every YouTube video it would run yt-dlp with "--write-info-json --write-subtitles --download-archive=already-downloaded.db" flags. Creating it was the easy part, but keeping it running presented some challenges. For example, Google quickly started rate limiting my IP, so I had to offload this process to a NAS, where it could keep running for hours overnight, persistently downloading stuff at near-dialup speeds. Then I was quickly running out of storage, so I had to add video filtering, and I planned to add basic garbage collection. And of course I had to keep youtube-dl (and later yt-dlp) updated at all times.
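A rough Python sketch of the core of such a script, not the original Bash: it pulls video IDs out of a list of history URLs and builds one yt-dlp command per distinct video. The function names are hypothetical, reading the browser's history database is left out, and the sketch uses `--write-subs`, the spelling in yt-dlp's documentation, in place of the `--write-subtitles` quoted above.

```python
from urllib.parse import urlparse, parse_qs


def youtube_video_id(url: str):
    """Return the video ID for common YouTube URL shapes, else None."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host == "youtu.be":
        return parts.path.lstrip("/").split("/")[0] or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parts.path == "/watch":
            return parse_qs(parts.query).get("v", [None])[0]
        for prefix in ("/shorts/", "/embed/", "/live/"):
            if parts.path.startswith(prefix):
                return parts.path[len(prefix):].split("/")[0] or None
    return None


def yt_dlp_commands(history_urls, archive_file="already-downloaded.db"):
    """Build one yt-dlp argument list per distinct video, in first-seen order."""
    seen = set()
    commands = []
    for url in history_urls:
        video_id = youtube_video_id(url)
        if video_id is None or video_id in seen:
            continue
        seen.add(video_id)
        commands.append([
            "yt-dlp",
            "--write-info-json",                 # keep title, channel, etc. next to the video
            "--write-subs",                      # save subtitles when they exist
            "--download-archive", archive_file,  # lets yt-dlp skip videos it already fetched
            f"https://www.youtube.com/watch?v={video_id}",
        ])
    return commands
```

Each argument list can be handed to `subprocess.run` as is. Deduplicating in Python only saves process launches; the `--download-archive` file is what prevents re-downloading across runs.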

In the end, I decided it was not worth it. In the scenario you described, I would take the video link/ID and paste it into Bing and Yandex. There is a good chance they still have that page cached in their index.

FWIW, if you are going to create your own tool, my advice would be to make it a browser extension, and try to pull the video straight from YouTube's <video> element.

3. youniverse ◴[20 Sep 25 03:00 UTC] No.45309876[source]▶

>>45308944 (TP) #

Agreed, I can't describe the sadness that some of my most treasured nostalgic videos were lost before I knew about yt-dlp, and I can't find out what their titles even were. For example, on Spotify, when music gets removed, if it's in your playlist it just shows up greyed out and unplayable.

Someone at Google please give us the ability to see titles!

replies(2): >>45311385 #>>45314582 #

4. shaky-carrousel ◴[20 Sep 25 07:58 UTC] No.45311385[source]▶

>>45309876 #

Off the top of my head, did you try to access the URLs via archive.org? That way, at least you'll get the titles.
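One way to script that check is the Wayback Machine's availability endpoint, which returns JSON describing the closest snapshot of a URL. A minimal sketch, assuming the response shape in the Internet Archive's documentation; the function names are hypothetical, and the actual HTTP request and title extraction are left out.

```python
import json
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(video_id: str) -> str:
    """Build the Wayback Machine availability query for a YouTube watch page."""
    watch_url = f"https://www.youtube.com/watch?v={video_id}"
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': watch_url})}"


def closest_snapshot(response_text: str):
    """Return the URL of the closest archived snapshot, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

The string from `availability_query` can be fetched with `urllib.request.urlopen`, and the snapshot URL that `closest_snapshot` returns is the page to open for the title.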

5. CM30 ◴[20 Sep 25 08:06 UTC] No.45311419[source]▶

>>45308944 (TP) #

I've always wondered why we don't see any platforms just remove the media while leaving the metadata, comments, ratings, etc. intact. Like, is there some legal requirement that the idea itself has to be hard to find, or is it okay to just remove the media and let people keep discussing it?

replies(3): >>45311637 #>>45311951 #>>45313108 #

6. globular-toast ◴[20 Sep 25 08:45 UTC] No.45311593[source]▶

>>45308944 (TP) #

I want a web cache that runs on my network, transparently caching everything that goes through it. It would be an LRU cache but with the ability to easily mark some resource as archived such that it never gets deleted. A browser extension could be used to do this marking. Unfortunately client-side js makes this very difficult or even impossible to do.
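The eviction policy itself is small. A minimal in-memory sketch, assuming resources are keyed by URL and that archived entries do not count against the LRU capacity; the class and method names are hypothetical, and the transparent-proxy part is not shown.

```python
from collections import OrderedDict


class PinnableLRUCache:
    """LRU cache whose entries can be marked as archived so they are never evicted."""

    def __init__(self, capacity: int):
        self.capacity = capacity   # maximum number of unarchived entries
        self._lru = OrderedDict()  # url -> body, least recently used first
        self._archived = {}        # url -> body, exempt from eviction

    def get(self, url):
        if url in self._archived:
            return self._archived[url]
        if url in self._lru:
            self._lru.move_to_end(url)  # mark as most recently used
            return self._lru[url]
        return None

    def put(self, url, body):
        if url in self._archived:
            self._archived[url] = body
            return
        self._lru[url] = body
        self._lru.move_to_end(url)
        while len(self._lru) > self.capacity:
            self._lru.popitem(last=False)  # evict the least recently used entry

    def archive(self, url):
        """Pin a cached resource; return False if it is not (or no longer) cached."""
        if url in self._archived:
            return True
        if url not in self._lru:
            return False
        self._archived[url] = self._lru.pop(url)
        return True
```

A browser extension would only need to call something like `archive(url)` on the proxy. Note that `archive` can fail when the entry has already been evicted, so marking has to happen while the resource is still cached.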

We really dropped the ball when it came to running random js from websites. The number of people who truly run only free software these days is close to zero. I used to block all js about 10 years ago but it was a losing battle and ended up being essentially an opt out from society.

7. balamatom ◴[20 Sep 25 08:57 UTC] No.45311637[source]▶

>>45311419 #

Legal requirement - probably not. Econophysical constraint - betcha. They mostly don't care about the discussion, or the content, or the idea, they care about keeping your eyeballs within a given rectangle until a bell rings.

8. matheusmoreira ◴[20 Sep 25 09:17 UTC] No.45311732[source]▶

>>45308944 (TP) #

> In general, I think people are way too casual about media of all kinds silently disappearing when you're not looking.

I used to be obsessed with this.

The way I saw it was the universe took billions of years of concerted effort to generate a random number that represents a unique file such as an interesting video or image. It would be such a shame if all that effort was invalidated due to bullshit YouTube reasons or copyright nonsense or link rot or whatever.

So I started hoarding this data. I started buying hardware and designing a home data center with ZFS and hundreds of terabytes to hold it all. I started downloading things I never actually gave a shit about just because they were rare and I wanted to preserve them.

I think getting married cured me of this. Now it's all moments that will be lost to time, like tears in the rain.

replies(1): >>45316834 #

9. tokioyoyo ◴[20 Sep 25 10:04 UTC] No.45311951[source]▶

>>45311419 #

The name itself might be what is legally required to be taken down.

10. notrealyme123 ◴[20 Sep 25 10:16 UTC] No.45312005[source]▶

>>45308944 (TP) #

https://archivebox.io/ could be a solution for that

11. uncircle ◴[20 Sep 25 11:42 UTC] No.45312465[source]▶

>>45308944 (TP) #

> This video is no longer available.

This is why I recommend that everybody stay AWAY from YouTube Music. I migrated my curated playlists from Spotify a few years ago, and to my surprise I now have dozens of songs that are no longer available, and YouTube doesn’t offer a way to at least let me know which songs they were. Indeed, I was a paying user and YouTube caused intentional and irretrievable data loss.

After a decade of paying for YouTube Premium I have unsubscribed and have vowed never to give them any more money whatsoever.

replies(1): >>45314125 #

12. heavyset_go ◴[20 Sep 25 13:14 UTC] No.45313108[source]▶

>>45311419 #

It implies the service is lacking something, that it's deficient in specific tangible things you want but they don't have.

A generic 404 for something you don't even know exists won't leave a `video_title` sized hole in your heart and chip on your shoulder, and won't give competitors opportunities to serve your needs instead.

13. ◴[20 Sep 25 15:23 UTC] No.45314125[source]▶

>>45312465 #

14. Sophira ◴[20 Sep 25 16:11 UTC] No.45314574[source]▶

>>45308944 (TP) #

You might find https://findyoutubevideo.thetechrobo.ca/ to be helpful! It was posted before on HN here: https://news.ycombinator.com/item?id=38228481

15. sneak ◴[20 Sep 25 16:12 UTC] No.45314582[source]▶

>>45309876 #

Not by default. The normal Spotify setting is to simply silently remove them. You have to turn on the “show unplayable tracks” preference to know that your playlists have been altered without your consent.

16. BrtByte ◴[20 Sep 25 17:02 UTC] No.45315102[source]▶

>>45308944 (TP) #

The lack of even basic metadata when a video disappears is maddening... like a black hole where context used to be

17. balder1991 ◴[20 Sep 25 20:00 UTC] No.45316834[source]▶

>>45311732 #

I still have a bit of this, but I try to be realistic that hoarding too much shit is a waste. I simply try to filter whether that is something really worth keeping. Then I save it if it’s something I’m very interested in or if it has any sentimental value to me personally.

18. xnx ◴[22 Sep 25 18:08 UTC] No.45337266[source]▶

>>45308944 (TP) #

You can't even tell which channel a removed video was from. Very frustrating.