Feedmaker: URL + CSS selectors = RSS feed

(feedmaker.fly.dev)

1. mustaphah ◴[19 Sep 25 21:55 UTC] No.45307165[source]▶

>>45306701 (OP) #

The good news: made it to the front page.

The bad news: so did the 503 page.

replies(1): >>45307455 #

2. ◴[19 Sep 25 21:58 UTC] No.45307200[source]▶

>>45306701 (OP) #

3. kschaul ◴[19 Sep 25 22:20 UTC] No.45307422[source]▶

>>45306701 (OP) #

Glad you’re finding the tool interesting! A short blog post behind it: https://kschaul.com/post/2023/04/16/feedmaker-quickly-genera...

And the GitHub url (hopefully easy to host your own instance): https://github.com/kevinschaul/feedmaker

replies(1): >>45308068 #

4. benbristow ◴[19 Sep 25 22:23 UTC] No.45307455[source]▶

>>45307165 #

In some ways a good thing, no? Shows you've got work to do on optimisation for large audiences. A free stress test (unless you're on a host that charges per hit or for excess bandwidth), if you will.

It did load eventually for me. I thought it was broken because there are no styles, but that looks intentional.

replies(1): >>45307758 #

5. bradbeattie ◴[19 Sep 25 22:31 UTC] No.45307519[source]▶

>>45306701 (OP) #

https://github.com/RSS-Bridge/rss-bridge is what I've been using for the same purpose.

6. zekenie ◴[19 Sep 25 22:33 UTC] No.45307556[source]▶

>>45306701 (OP) #

Not the same, but this gives me an idea… what if there were a map-reduce for DOMs as a web primitive? Imagine if I could make a DOM (or feed) that was some selection and transformation of another DOM.

replies(2): >>45307656 #>>45307745 #

7. onedognight ◴[19 Sep 25 22:46 UTC] No.45307656[source]▶

>>45307556 #

You have just re-invented XLST.

replies(1): >>45310406 #

8. 1-more ◴[19 Sep 25 22:56 UTC] No.45307745[source]▶

>>45307556 #

https://www.w3schools.com/xml/tryxslt.asp?xmlfile=cdcatalog&... give it a whirl!

9. uyzstvqs ◴[19 Sep 25 22:57 UTC] No.45307758{3}[source]▶

>>45307455 #

Seems to be hosted using fly.io

10. int0x29 ◴[19 Sep 25 23:03 UTC] No.45307815[source]▶

>>45306701 (OP) #

I made a CGI program that ran CSS selectors against URLs and returned the output. I debated making it public and then realized I probably didn't want to run an open proxy. I'm curious how long this will last.
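
A rough sketch of the "run a selector against a page" step, using only Python's standard library. This is not the commenter's CGI program and not Feedmaker's code: the class `SimpleSelector` and the function `select_text` are hypothetical names, and the matcher only understands `tag` or `tag.class`, not real CSS selectors (no combinators, ids, or attribute selectors). Fetching the URL is left out on purpose.

```python
from html.parser import HTMLParser


class SimpleSelector(HTMLParser):
    """Collect the text of elements matching a 'tag' or 'tag.class' selector.

    Hypothetical helper: a tiny subset of CSS, for illustration only.
    """

    def __init__(self, selector):
        super().__init__()
        tag, _, cls = selector.partition(".")
        self.tag = tag.lower()
        self.cls = cls
        self.depth = 0      # > 0 while inside a matching element
        self.matches = []   # text collected for each matching element

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track nested elements with the same tag name so the
            # matching end tag closes the right element.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and (not self.cls or self.cls in classes):
            self.depth = 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.matches[-1] += data


def select_text(html, selector):
    """Return the stripped text of every element matching the selector."""
    parser = SimpleSelector(selector)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.matches]


PAGE = (
    '<ul>'
    '<li class="post hot"><a href="/a">First</a></li>'
    '<li class="ad">Buy</li>'
    '<li class="post">Second <b>one</b></li>'
    '</ul>'
)
print(select_text(PAGE, "li.post"))
```

A public service would also have to decide which URLs it is willing to fetch, which is the open-proxy concern raised above.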

11. crazygringo ◴[19 Sep 25 23:07 UTC] No.45307858[source]▶

>>45306701 (OP) #

I love this.

Has anyone tested to see if it works with Blogtrottr which will email you whenever there's a new item in an RSS feed?

I ask because this doesn't seem to include a date field in the RSS, and of course no guid. So I'm wondering how compatible it winds up being.

replies(1): >>45307940 #

12. kevincox ◴[19 Sep 25 23:17 UTC] No.45307940[source]▶

>>45307858 #

Dates shouldn't matter. The feed has ID elements, which are what identify entries. Atom has no guid element. So I would expect this to work with any reader.
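
To illustrate identifying entries by ID rather than by date, here is a minimal sketch of how a reader might detect new items. It assumes the feed is Atom, as the comment implies; `entry_ids`, `unseen_entries`, and `SAMPLE_FEED` are hypothetical names, and the sample feed is made up for the example rather than taken from Feedmaker's output.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def entry_ids(feed_xml):
    """Return the non-empty <id> values of a feed's entries, in document order."""
    root = ET.fromstring(feed_xml)
    ids = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip()
        if entry_id:
            ids.append(entry_id)
    return ids


def unseen_entries(feed_xml, seen):
    """Return the entry IDs that are not in the set of IDs already seen."""
    return [entry_id for entry_id in entry_ids(feed_xml) if entry_id not in seen]


# Made-up sample feed: note that the entries carry no dates at all.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>https://example.com/feed</id>
  <title>Example</title>
  <entry><id>https://example.com/a</id><title>A</title></entry>
  <entry><id>https://example.com/b</id><title>B</title></entry>
</feed>"""

print(unseen_entries(SAMPLE_FEED, {"https://example.com/a"}))
```

Whether a particular reader or email service works this way is not something the sketch can confirm; it only shows that IDs alone are enough to tell old entries from new ones.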

replies(1): >>45310347 #

13. mustaphah ◴[19 Sep 25 23:33 UTC] No.45308068[source]▶

>>45307422 #

Looks like you're hosting this on fly.io - PAYG model. You could probably host this for free on Cloudflare Workers; 100k requests/day on the free tier; static content (the homepage) is free & unlimited.

Edit: The catch is the 10ms CPU cap per request - you'd need a super lean implementation. Django's too heavy for that.

replies(2): >>45308349 #>>45308495 #

14. 0cf8612b2e1e ◴[20 Sep 25 00:09 UTC] No.45308349{3}[source]▶

>>45308068 #

Python alone takes many milliseconds to start, unless they give you some allowance for interpreter overhead.

15. mustaphah ◴[20 Sep 25 00:29 UTC] No.45308495{3}[source]▶

>>45308068 #

Well, someone already did with JS: https://github.com/ProfessorManhattan/rss-worker

16. edoceo ◴[20 Sep 25 04:24 UTC] No.45310347{3}[source]▶

>>45307940 #

I wish they had concrete, accurate id and created_at. IIRC these attributes are fixed in AT.

17. pimlottc ◴[20 Sep 25 04:34 UTC] No.45310406{3}[source]▶

>>45307656 #

*XSLT

18. ZYbCRq22HbJ2y7 ◴[20 Sep 25 05:20 UTC] No.45310623[source]▶

>>45306701 (OP) #

Should be able to achieve this without selectors with HTML to Markdownish (something like Firefox's Reader mode).

19. mg ◴[20 Sep 25 05:52 UTC] No.45310784[source]▶

>>45306701 (OP) #

That is a good idea.

59 requirements, including Django, seems pretty heavy though?

For my own RSS feed, I use this 48 line Python file with no dependencies outside the standard library:

https://github.com/no-gravity/atomfeed.py

It takes an array with the entries as input, not a web page. But I guess the HTML parsing should take no more than another few lines? For HTML parsing, I have good experiences with the lxml module which is in the Debian repos. It is fast and works pretty well.
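
For a sense of how little code the "array of entries in, Atom out" step needs, here is a standard-library sketch. It is not the linked atomfeed.py; `build_atom` and the dictionary keys (`id`, `title`, `link`, `updated`) are assumptions made for this example, and it emits only a bare-bones subset of Atom elements (no author, for instance).

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def build_atom(feed_id, title, updated, entries):
    """Build an Atom document from a list of entry dicts.

    Each entry dict is assumed to have 'id', 'title', 'link', and
    'updated' keys, with 'updated' given as an RFC 3339 timestamp string.
    """
    ET.register_namespace("", ATOM)  # serialize Atom as the default namespace
    feed = ET.Element(f"{{{ATOM}}}feed")
    for tag, text in (("id", feed_id), ("title", title), ("updated", updated)):
        ET.SubElement(feed, f"{{{ATOM}}}{tag}").text = text
    for item in entries:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}id").text = item["id"]
        ET.SubElement(entry, f"{{{ATOM}}}title").text = item["title"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = item["updated"]
        ET.SubElement(entry, f"{{{ATOM}}}link", href=item["link"])
    return ET.tostring(feed, encoding="unicode")


xml_text = build_atom(
    "https://example.com/feed",
    "Example feed",
    "2025-09-20T00:00:00Z",
    [
        {
            "id": "https://example.com/a",
            "title": "A & B",
            "link": "https://example.com/a",
            "updated": "2025-09-19T00:00:00Z",
        }
    ],
)
print(xml_text)
```

Building the tree with ElementTree instead of string formatting means characters such as `&` in titles are escaped automatically.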

20. ulrischa ◴[20 Sep 25 08:23 UTC] No.45311504[source]▶

>>45306701 (OP) #

The same can be done with FreshRSS.

21. gottlobflegel ◴[20 Sep 25 08:51 UTC] No.45311616[source]▶

>>45306701 (OP) #

You can just use an XSLT stylesheet like this: https://wwwcip.cs.fau.de/~oc45ujef/misc/src/atom.xsl

xsltproc includes a handy --html flag that lets you just process the source file directly.

replies(1): >>45311747 #

22. chmod775 ◴[20 Sep 25 09:22 UTC] No.45311747[source]▶

>>45311616 #

It's not "just" when the format has enough visual noise to give Perl a run for its money. I'm getting a migraine trying to figure out what's going on in that .xsl
