Does lemmy have any communities dedicated to archiving/hoarding data?
This post foreshadowed today’s AWS outage.
👀
I have been archiving Linux builds for the last 20 years, so I can effectively install Linux on almost any hardware made since 1998-ish.
I have been archiving docker images to my locally hosted gitlab server for the past 3-5 years (not sure when I started tbh). I’ve got around 100gb of images ranging from core images like OS to full app images like Plex, ffmpeg, etc.
I also have been archiving foss projects into my gitlab and have been using pipelines to ensure they remain up-to-date.
the only thing I lack are packages from package managers like pip, bundler, npm, yum/dnf, apt. there’s just so much to cache it’s nigh impossible to get everything archived.
I have even set up my own local CDN for JS imports on HTML. I use rewrite rules in nginx to redirect them to my local sources.
my goal is to be as self-sustaining on local hosting as possible.
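For anyone curious what the CDN redirect idea boils down to, here is a minimal sketch of the mapping logic in Python. It is not the nginx config described above, just the same idea written as a function. The local hostname `cdn.home.lan` and the path prefixes are made-up assumptions for illustration.

```python
from urllib.parse import urlsplit

# Assumption: a local mirror reachable under this hypothetical hostname.
LOCAL_MIRROR = "http://cdn.home.lan"

# Assumption: each public CDN host is mirrored under its own path prefix.
CDN_HOSTS = {
    "cdn.jsdelivr.net": "/jsdelivr",
    "cdnjs.cloudflare.com": "/cdnjs",
    "unpkg.com": "/unpkg",
}


def to_local(url: str) -> str:
    """Return the local-mirror URL for a known CDN URL, else the URL unchanged."""
    parts = urlsplit(url)
    prefix = CDN_HOSTS.get(parts.hostname or "")
    if prefix is None:
        # Not a CDN we mirror, so leave the import alone.
        return url
    local = f"{LOCAL_MIRROR}{prefix}{parts.path}"
    if parts.query:
        local += f"?{parts.query}"
    return local
```

A rewrite rule on a real web server would apply the same host-to-prefix lookup to each request before serving the file from local disk.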
respectable level of hoarding 🏅
Everyone should have this mindset regarding their data. I always say to my friends and family, “If you like it, download it.” The internet is always changing and that piece of media that you like can be moved, deleted, or blocked at any time.
The pornhub collapse should have taught the average person that.
You’re awesome. Keep up the good work.
I would also add OpenStreetMap to the list
I also recommend downloading “Flashpoint archive” to have flash games and animations to stay entertained.
There is a 4gb version and a 2.3TB version.
There is a 4gb version and a 2.3TB version.
That’s quite the range
When I downloaded it years ago it was 1.8TB. It’s crazy how big the archive is. The smaller one is just so it’s accessible to most people.
Is that Flash exclusive or do they accept other games from that era?
I’m not sure, but I do think it’s just flash
Neither are that bad honestly. I have jigdo scripts I run with every point release of Debian and have a copy of English Wikipedia on a Kiwix mirror I also host. Wikipedia is a tad over 100 GB. The source, arm64 and amd64 complete repos (DVD images) for Debian Trixie, including the network installer and a couple live boot images, are 353 GB.
Kiwix has copies of a LOT of stuff, including Wikipedia on their website. You can view their zim files with a desktop application or host your own web version. Their website is: https://kiwix.org/
If you want (or if Wikipedia is censored for you) you can also look at my mirror to see what a web hosted version looks like: https://kiwix.marcusadams.me/
Note: I use Anubis to help block scrapers. You should have no issues as a human other than you may see a little anime girl for a second on first load, but every once in a while Brave has a disagreement with her and a page won’t load correctly. I’ve only seen it in Brave, and only rarely, but I’ve seen it once or twice so thought I’d mention it.
I rarely get bounced by Anubis, but oddly enough it has happened to me a couple times in FF, I suspect it’s the fingerprinting resistance settings that cause this to happen? Hasn’t happened in a while though
I can answer one part of your question. Yes, it’s not as big as you think it is.

does this include images?
With images, it is 111,08 GB
Compressed or uncompressed? Can it be directly read?
Can be read directly, like normal Wikipedia.
That’s very nice. Does it also include other languages, or would that take more space?
This is English only. Other languages are downloaded separately, though they typically take less space.
Nice.
How about when previous versions of pages are included (excluding images)? Not sure, I don’t have that option. I can imagine it’s not much more, if proper version history management is involved.
Yeah, seems like there’s nothing as simple as something similar to a git clone available.
One would probably have to download multiple full copies from different times and then merge them with deduplication, to get that answer.
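A rough sketch of what that merge could look like, assuming each downloaded copy has already been parsed into a plain mapping of page title to page text (that parsing step is not shown and would depend on the dump format). It only deduplicates whole pages that are byte-for-byte identical between copies. It does not store diffs, so it gives an upper bound on the size rather than what a real version-history system would need.

```python
import hashlib


def dedup_snapshots(snapshots):
    """Merge full copies taken at different times, storing each distinct page text once.

    snapshots: list of dicts mapping page title -> page text, oldest first.
    Returns (store, history):
      store   maps sha256 hex digest -> page text (one entry per distinct text)
      history maps page title -> list of digests, in order of appearance
    """
    store = {}
    history = {}
    for snap in snapshots:
        for title, text in snap.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            # Identical text is stored only once, whichever page or copy it came from.
            store.setdefault(digest, text)
            versions = history.setdefault(title, [])
            # Record a new version only when the page actually changed.
            if not versions or versions[-1] != digest:
                versions.append(digest)
    return store, history
```

Summing the lengths of the texts in `store` and comparing that to the combined size of the raw copies would give a first estimate of how much the history adds.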
No
I thought the whole point of torrenting was to decentralise distribution. I use torrents to get my distros.
In my own little bubble, I thought that’s how most people got their distro.
What happens when they just cut the underwater cables? Torrent over carrier pigeon for a linux distro would take ages
Sneakernet to the rescue. Some of you are too young to know about walking around with boxes full of disks.
A wise man once said
Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.
It was trading CD-Rs during my high school days… good times. Napster was just starting to take off by the time we had a CD-R trading network set up. Napster just increased the number of CDs that got passed around.
Pigeon latency is horrible, but the bandwidth is pretty great. You could probably load up an adult pigeon with at least 12TB of media.
https://en.wikipedia.org/wiki/IP_over_Avian_Carriers
Just gonna leave this here for whoever wants to read more on the methodology and potential risks.
Over a 30-mile (48 km) distance, a single pigeon may be able to carry tens of gigabytes of data in around an hour, which on an average bandwidth basis compared very favorably to early ADSL standards, even when accounting for lost drives.
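To put numbers on that, here is a quick back-of-the-envelope calculation. The one-hour flight is from the quote above and the 12TB payload is the guess from earlier in the thread. The 32 GB figure is only an assumed stand-in for "tens of gigabytes".

```python
def avg_throughput_mbps(payload_bytes: float, flight_seconds: float) -> float:
    """Average throughput in megabits per second for one delivery.

    Bits delivered divided by flight time. Latency, loading time and
    lost birds are ignored.
    """
    return payload_bytes * 8 / flight_seconds / 1e6


ONE_HOUR = 3600  # seconds, per the "around an hour" figure quoted above

# Assumption: 32 GB as an example of "tens of gigabytes" (decimal units).
small_load = avg_throughput_mbps(32e9, ONE_HOUR)

# The 12TB payload suggested earlier in the thread, same flight time.
big_load = avg_throughput_mbps(12e12, ONE_HOUR)

print(f"32 GB in one hour: {small_load:.1f} Mbit/s")
print(f"12 TB in one hour: {big_load:.0f} Mbit/s")
```

This is average bandwidth only. The first byte still takes the full hour to arrive, which is the latency problem mentioned above.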
Compared to what I use at home now, this sounds great
A good way to see what the future of places like the U.S. looks like is to look at places like North Korea, where they do exactly this: move files around on flash media to avoid the state censors.
We need some more community wifi projects
Community Wisps are cool
Tiny jump drives on pigeons is low key excellent imo
@Maroon I thought torrent technology to be a godsend for package managers.
Why do none of them use it?
I mean, damn.
Turns out hosting a bunch of files is very cheap.
Torrents are often used for installers, but for packages it tends to be more trouble than it’s worth. Is creating a torrent for a 4k library worth it?
git and the lot are a lot better at this than people realize.
Did I miss something? What’s happening to Debian stable?
Debian stable became the go-to distro for long-term usage in case our FOSS support structure goes haywire due to wars
Is there a context to this or just random thought?
You can ignore politics, but politics will not ignore you.
gestures at everything
I would add in some rom collections and book repositories as well. The whole library of Nintendo games is under a gig and would go a long way for entertaining people.
Book repos? I didn’t know such a thing existed. Can you share more?
Project Gutenberg has a large collection of public domain books
Thank you kindly
This is just minor datahoarding. I do it, on an extreme level.
Years ago I bought a physical encyclopedia. I remember having one as a kid and using it for school reports. Also just looking through it can be cool. Learning about something you never knew existed is just a unique experience and doing it through a physical book just deepens the whole experience.
I also learned the practice of printing a physical encyclopedia is going out of fashion. I think there is only one company that still prints a yearly encyclopedia and it’s not Encyclopedia Britannica of all things. Might have changed since I bought my copy but go give some physical media some love if you can.
Okay so where do I find some cheap hard drives? Europe if possible :-)
look for DVRs, they have huge HDDs in them and you can find them at thrift stores for cheap
Wait why keep Debian? What happened to Debian?
Nothing, it’s probably an attempt to have something stable and unchanging, so that aging doesn’t show much.
The meme doesn’t seem to be about Debian becoming bad, more like data hoarding.
old PCs off Amazon usually come with a good, reliable 1/2tb hard drive.