492 points vladyslavfox | 3 comments
1. RcouF1uZ4gsC ◴[] No.41898927[source]
The Library of Congress should be archiving the Internet and it should have the budget required to do so.

This is in line with its mission as the "Library of Congress". Having an accurate record of what was on the Internet at a specific point in time would be helpful when discussing legislation or potential regulation involving the Internet.

replies(2): >>41899062 #>>41899349 #
2. awkwardpotato ◴[] No.41899062[source]
The Library of Congress does currently archive limited collections of the internet[0]. They have a blog post[1] breaking down the effort; currently it's 8 full-time staff with a team of part-time members. According to Wikipedia[2], it's built on Heritrix and Wayback, both of which are developed by the Internet Archive (the blog post also mentions "Wayback software"). Current archives are available at: http://webarchive.loc.gov/

[0] https://www.loc.gov/programs/web-archiving/about-this-progra...

[1] https://blogs.loc.gov/thesignal/2023/08/the-web-archiving-te...

[2] https://en.m.wikipedia.org/wiki/List_of_Web_archiving_initia...

3. tokai ◴[] No.41899349[source]
As awkwardpotato writes, they do. Many national libraries all over the world treat the internet as covered by their legal deposit requirements, and crawl their respective TLDs.