
127 points by Brajeshwar | 1 comment
krick ◴[] No.42481217[source]
How do you backup websites? I mean, it sounds trivial, but I kinda still haven't figured out what is the way. I sometimes think that I'd like some script to automatically make a copy of every webpage I ever link in my notes (it really happens quite often that a blog I linked some years ago is no more), and maybe even replace links to that mirror of my own, but all websites I've actually backed up by now are either "old-web" that are trivial to mirror, or basically required some custom grabber to be writen by myself. If you just want to copy a webdpage, often it either has some broken CSS&JS, missing images, because it was "too shallow", or otherwise it is too deep and has a ton of tiny unnecessary files that are honestly just quite painful to keep on your filesystem as it grows. Add to that cloudaflare, Captchas, ads (that I don't see when browsing with ublock and ideally wouldn't want them in my mirrored sites as well), cookie warning splash-screens, all sorts of really simple (but still above wget's paygrade) anti-scraping measures, you get the idea.

Is there something that "just works"?
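[Editor's note: a minimal sketch of the "replace links with my own mirror" idea above, assuming plain-text/Markdown notes in a directory. The notes path, mirror prefix, and file-naming scheme are all hypothetical placeholders, not a real tool.]

```python
import hashlib
import pathlib
import re

NOTES_DIR = pathlib.Path("~/notes").expanduser()    # assumption: notes live here
MIRROR_URL_PREFIX = "file:///home/me/mirror/"       # assumption: local mirror root

URL_RE = re.compile(r"https?://[^\s)\]>\"']+")

def mirror_name(url: str) -> str:
    # Hash the URL so arbitrary links map to safe, unique local file names.
    return hashlib.sha256(url.encode()).hexdigest() + ".html"

def rewrite_note(path: pathlib.Path) -> list[str]:
    # Find every http(s) link in the note, point it at the local mirror,
    # and return the original URLs so something else can actually archive them.
    text = path.read_text(encoding="utf-8")
    urls = URL_RE.findall(text)
    for url in urls:
        text = text.replace(url, MIRROR_URL_PREFIX + mirror_name(url))
    path.write_text(text, encoding="utf-8")
    return urls

if __name__ == "__main__":
    for note in NOTES_DIR.rglob("*.md"):
        for url in rewrite_note(note):
            print("needs archiving:", url)
```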

replies(3): >>42481273 #>>42481558 #>>42482585 #
1. Dwedit ◴[] No.42481273[source]
There are extensions like "Save Page WE" that will dump the current state of the DOM to an HTML file, including CSS and images, but the result is static and scripting no longer works.
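[Editor's note: not how Save Page WE works internally, just a sketch of the same idea (save the rendered DOM rather than the raw server response), assuming Playwright is installed (`pip install playwright && playwright install chromium`). Unlike the extension, this does not inline CSS or images.]

```python
from playwright.sync_api import sync_playwright

def snapshot(url: str, out_path: str) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # let scripts finish loading content
        html = page.content()                     # serialize the current DOM, not the original HTML
        browser.close()
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(html)

if __name__ == "__main__":
    snapshot("https://example.com", "example.html")
```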