
90 points superstarryeyes | 1 comments

Well, I certainly tried. I had to, because it has a certain quirk inspired by "digital minimalism."

The quirk is that it only allows you to fetch new articles once per day (or X days).

Why? Let me explain...

I want my internet content to be like a boring newspaper. You get it in the morning, and you read the whole thing while sipping your morning coffee, and then you're done! No more new information for today. No pings, no alerts, peace, quiet, zen, etc.

But with that, I needed it to be able to fetch all articles from my hundreds of feeds in one sitting. This is where Zig and curl optimisations come in. I tried to do all the tricks in the book. If I missed something, let me know!

First off, I'm using curl multi for the network layer. The cool thing is it automatically does HTTP/2 multiplexing, which means if your feeds are hosted on the same CDN it reuses the same connection. I've got it configured to handle 50 connections total with up to 6 per host, which seems to be the sweet spot before servers start getting suspicious. Also, conditional GETs. If a feed hasn't changed since last time, the server just says "Not Modified" and we bail immediately.
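To make the conditional GET part concrete, here is a minimal sketch in Python rather than Zig/libcurl. It is not the project's code: `conditional_headers` and `fetch_if_changed` are hypothetical names, and it assumes the ETag and Last-Modified values from the previous fetch were saved somewhere.

```python
import urllib.error
import urllib.request


def conditional_headers(etag=None, last_modified=None):
    """Build request headers from validators saved on the previous fetch."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def fetch_if_changed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None on 304 Not Modified."""
    req = urllib.request.Request(
        url, headers=conditional_headers(etag, last_modified)
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return (
                resp.read(),
                resp.headers.get("ETag"),
                resp.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Feed unchanged: bail without downloading or parsing anything.
            return None, etag, last_modified
        raise
```

With libcurl the same idea would be a matter of setting these two request headers on each easy handle before adding it to the multi handle.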

While curl is downloading feeds, I don't want the CPU sitting idle, so the moment curl finishes downloading a single feed, it fires a callback that immediately hands the XML to a worker thread pool for parsing. The main thread keeps managing all the network stuff while worker threads are chewing through XML in parallel. Zig's memory model is perfect for this. Each feed gets its own ArenaAllocator, which is basically a playground where you can allocate strings during parsing, then when we're done, we just nuke the entire arena in one go.
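The shape of that pipeline, roughly sketched in Python (not the actual Zig code): the loop over `downloads` stands in for the main thread's network loop, `parse_feed` is a placeholder parser that only counts items, and the per-feed `scratch` list is just an analogy for the arena. Python's GIL also means this will not parallelise CPU-bound parsing the way native threads do, so treat it as an illustration of the structure only.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_feed(url, body):
    """Placeholder parser: counts <item> tags instead of really parsing."""
    scratch = []  # per-feed scratch space, loosely analogous to an arena
    scratch.append(body)
    count = body.count(b"<item>")
    # Everything in `scratch` is dropped together when the function returns.
    return url, count


def run_pipeline(downloads, workers=4):
    """Hand each finished download to the pool as soon as it arrives.

    `downloads` is any iterable of (url, body) pairs, yielded in the order
    the transfers complete.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # submit() returns immediately, so this loop never waits on parsing.
        futures = [pool.submit(parse_feed, url, body) for url, body in downloads]
        return dict(f.result() for f in futures)
```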

For parsing itself, I'm using libexpat because it doesn't load the entire XML into memory like a DOM parser would. This matters because some feeds, podcasts especially, are 10MB+ of XML. So with smart truncation we download only the first X MB (configurable), scan backwards to find the last complete item tag, cut it there, and parse just that. Keeps memory usage sane even when feed sizes get massive.
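A rough sketch of the truncation idea using Python's `xml.parsers.expat` (a binding to the same libexpat). The function names are hypothetical, and the appended closing tail assumes an RSS 2.0 layout (`</channel></rss>`); Atom feeds would need `</entry>` and `</feed>` instead. A naive backwards search can also be fooled by a literal `</item>` inside CDATA, which this sketch does not handle.

```python
import xml.parsers.expat


def truncate_to_last_item(data, limit, close_tag=b"</item>",
                          tail=b"</channel></rss>"):
    """Keep at most `limit` bytes, cut after the last complete item."""
    if len(data) <= limit:
        return data
    chunk = data[:limit]
    end = chunk.rfind(close_tag)  # scan backwards for a complete item
    if end == -1:
        return None  # no complete item fits inside the limit
    # Re-close the document so the parser sees well-formed XML.
    return chunk[:end + len(close_tag)] + tail


def item_titles(data):
    """Stream-parse the feed and collect the <title> of each <item>."""
    titles = []
    state = {"in_item": False, "in_title": False, "buf": []}

    def start(name, attrs):
        if name == "item":
            state["in_item"] = True
        elif name == "title" and state["in_item"]:
            state["in_title"] = True
            state["buf"] = []

    def end(name):
        if name == "item":
            state["in_item"] = False
        elif name == "title" and state["in_title"]:
            titles.append("".join(state["buf"]))
            state["in_title"] = False

    def chars(text):
        if state["in_title"]:
            state["buf"].append(text)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(data, True)
    return titles
```

For example, cutting a three-item feed in the middle of its third item leaves only the first two items for the parser to see.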

And for the UI I just pipe everything to the system's "less" command. You get vim navigation, searching, and paging for free. Plus I'm using OSC 8 hyperlinks, so you can actually click links to open them in your browser. Zero TUI framework needed. I've also included OPML import/export and feed groups as additional features.
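For anyone who hasn't seen OSC 8 before, a link is just the visible text wrapped in two escape sequences. A minimal Python sketch (hypothetical helper names, not the project's code), assuming a `less` build that passes OSC 8 sequences through when run with `-R` and a terminal that supports them:

```python
import subprocess


def osc8_link(url, text):
    """Wrap `text` in an OSC 8 hyperlink pointing at `url`.

    Layout: ESC ] 8 ; ; URL ST  text  ESC ] 8 ; ; ST, where ST is ESC \\.
    """
    return f"\x1b]8;;{url}\x1b\\{text}\x1b]8;;\x1b\\"


def page(lines):
    """Pipe lines to the system pager instead of drawing a TUI."""
    proc = subprocess.Popen(["less", "-R"], stdin=subprocess.PIPE, text=True)
    try:
        proc.stdin.write("\n".join(lines) + "\n")
        proc.stdin.close()
    except BrokenPipeError:
        pass  # user quit the pager before all output was written
    proc.wait()
```

Usage would look like `page([osc8_link("https://example.com", "Example post")])`.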

The result: content from hundreds of RSS feeds retrieved in a matter of seconds, and peace of mind for the rest of the day.

The code is open source and MIT licensed. If you have ideas on how to make it even faster or better, comment below. Feature requests and other suggestions are also welcome, here or on GitHub.

ekjhgkejhgk ◴[] No.46297117[source]
Why MIT and not GPL3?
replies(1): >>46297179 #
superstarryeyes ◴[] No.46297179[source]
why not? isn't mit just objectively a better license for open source? i just hope rss would make a comeback to make the internet a little saner again, and if someone wants to use hys source code as a base for their own rss reader, whether commercial or not, great!
replies(2): >>46299550 #>>46311055 #
palata ◴[] No.46311055[source]
Copyleft licences "care about the user" as in "as a user, I want you to be able to patch the code you run so I enforce it in my licence". It's a different philosophy from permissive licences that say "companies can use them in their closed, proprietary product, I just want them to mention somewhere that they use my code". Note that more often than not, those using permissive licences don't even bother to follow that simple rule.

As a user, I'm happier with copyleft. I like to take my Marshall smart speaker as an example: that thing doesn't get any updates, ever. But it connects to the Internet. The app absolutely sucks, the connectivity is passable at best (often frustrating), but the hardware itself is nice (it looks nice in my living room and the sound is good when it works).

If all the open source software running inside that thing was GPLv3, Marshall would have to provide me with a way to patch it. So at the very least I could make security updates myself. But because Marshall used permissively-licenced dependencies, they locked it down in such a way that I can't do that.

The permissive licence helped Marshall, but for me as a user, the code may as well be proprietary.

replies(1): >>46311081 #
1. palata ◴[] No.46311081[source]
It also has an impact on contribution. In my experience with small open source projects, if I licence my library permissively, people will almost never contribute or open source anything. They will gladly ask for bugfixes and features, though.

If I use a copyleft licence (I like EUPL or MPLv2), it doesn't mean that they will open clean PRs, but at least they have to publish their changes in their own fork. It has happened to me that I could go read a fork, find a few things that were interesting and bring them back to my project.

With permissive licences, the risk is that those (typically businesses) who keep their fork open source probably don't see a lot of value in their fork, otherwise they would have made it private, "just in case".