
556 points campuscodi | 10 comments
jgrahamc ◴[] No.41867399[source]
My email is jgc@cloudflare.com. I'd like to hear directly from the owners of RSS readers about what they are experiencing. I'm going to ask the team to take a closer look.
replies(7): >>41867476 #>>41867836 #>>41868190 #>>41868888 #>>41869258 #>>41869657 #>>41876633 #
kalib_tweli ◴[] No.41867836[source]
There are email obfuscation and managed challenge script tags being injected into the RSS feed.

You simply shouldn't have any challenges whatsoever on an RSS feed. They're literally meant to be read by a machine.

replies(2): >>41868120 #>>41874073 #
1. kalib_tweli ◴[] No.41868120[source]
I confirmed that if you explicitly set the Content-Type response header to application/rss+xml it seems to work with Cloudflare Proxy enabled.

The issue here is that Cloudflare's content type check is naive. Either the fact that CF checks the Content-Type header directly needs to be documented more explicitly, or they need to do a file type check on the response body.
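
For anyone who wants to try the workaround, here is a minimal sketch of setting the header explicitly at the origin, written as a plain WSGI app using only the Python standard library. The feed body, the names `FEED`, `feed_app` and `serve`, and the port are all made up for illustration; I have only observed that the explicit header seemed to help, so treat this as a sketch rather than a confirmed fix.

```python
from wsgiref.simple_server import make_server

# Hypothetical feed body; a real site would render this from its posts.
FEED = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><title>Example</title>
<link>https://example.com/</link><description>Demo feed</description>
</channel></rss>
"""


def feed_app(environ, start_response):
    """WSGI app that serves the feed with an explicit RSS media type."""
    headers = [
        # Set explicitly instead of relying on a server default such as text/xml.
        ("Content-Type", "application/rss+xml; charset=utf-8"),
        ("Content-Length", str(len(FEED))),
    ]
    start_response("200 OK", headers)
    return [FEED]


def serve(port=8000):
    """Serve the feed locally; the port number is arbitrary."""
    with make_server("127.0.0.1", port, feed_app) as server:
        server.serve_forever()
```

Calling `serve()` and requesting `http://127.0.0.1:8000/` should show the `application/rss+xml` header in the response; the same idea applies to whatever server or framework actually fronts the feed.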

replies(1): >>41868798 #
2. londons_explore ◴[] No.41868798[source]
I wonder if popular software for generating RSS feeds might not be setting the correct Content-Type header? Maybe this whole issue could be mostly fixed by a few GitHub PRs...
replies(4): >>41869066 #>>41869112 #>>41869113 #>>41877322 #
3. kalib_tweli ◴[] No.41869066[source]
It wouldn't. It's the role of the HTTP server to set the correct content type header.
4. onli ◴[] No.41869113[source]
Correct might be debatable here as well. My blog for example sets Content-Type to text/xml, which is not exactly wrong for an RSS feed (after all, it is text and XML) and IIRC was the default back then.

There were compatibility issues with other type headers, at least in the past.

replies(1): >>41869959 #
5. djbusby ◴[] No.41869112[source]
The number of feeds with crap headers and other non-spec stuff going on is staggering, and loads of clients are missing useful headers. Ugh. It seems like it should be simple; maybe that's why there are loads of naive implementations.
6. johneth ◴[] No.41869959{3}[source]
I think the current correct content types are:

'application/rss+xml' (for RSS)

'application/atom+xml' (for Atom)

replies(2): >>41870071 #>>41873080 #
7. londons_explore ◴[] No.41870071{4}[source]
Sounds like a kind Samaritan could write a scanner to find as many feeds as possible that look like RSS/Atom but don't have these content types, then patch the hosting software those feeds use to send the correct content types, or ask the webmasters to fix it if the sites are home-made.

As soon as a majority of sites use the correct types, clients can start requiring it for newly added feeds, which in turn will make webmasters make it right if they want their feed to work.
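
The per-feed check such a scanner would need could look roughly like the sketch below. It assumes the response body and Content-Type header have already been fetched; the function names (`sniff_feed_kind`, `media_type`, `mismatch`) are hypothetical, and it only recognises RSS 2.0-style `<rss>` roots and Atom `<feed>` roots, not RDF-based RSS 1.0.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Media types suggested upthread for each feed kind.
EXPECTED = {"rss": "application/rss+xml", "atom": "application/atom+xml"}


def sniff_feed_kind(body: bytes) -> Optional[str]:
    """Guess the feed kind from the root element, or None if not a feed."""
    try:
        # Note: a real scanner handling untrusted input may want a hardened parser.
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    if root.tag == "rss":
        return "rss"
    if root.tag == ATOM_NS + "feed":
        return "atom"
    return None


def media_type(content_type: str) -> str:
    """Strip parameters such as charset and normalise case."""
    return content_type.split(";", 1)[0].strip().lower()


def mismatch(content_type: str, body: bytes) -> Optional[str]:
    """Return the expected media type if the header disagrees with the body."""
    kind = sniff_feed_kind(body)
    if kind is None:
        return None  # Not something we recognise as a feed; nothing to report.
    expected = EXPECTED[kind]
    return None if media_type(content_type) == expected else expected
```

A feed served as `text/xml` whose body has an `<rss>` root would be reported with `application/rss+xml` as the suggested type, which is the list a scanner could then turn into patches or emails.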

8. onli ◴[] No.41873080{4}[source]
Not even Cloudflare's own blog uses those: https://blog.cloudflare.com/rss/. Or am I getting a wrong content type shown in my dev tools? For me it is `application/xml`. So even if `application/rss+xml` were the correct type by an official spec, it's not something to rely on if it's not commonly used.
replies(1): >>41873190 #
9. johneth ◴[] No.41873190{5}[source]
I just checked Wikipedia and it says Atom's is 'application/atom+xml' (also confirmed in the IANA registry), and RSS's is 'application/rss+xml' (but it's not registered yet, and 'text/xml' is also used widely).

'application/rss+xml' seems to be the best option though in my opinion. The '+xml' in the media type tells (good) parsers to fall back to using an XML parser if they don't understand the 'rss' part, but the 'rss' part provides more accurate information on the content's type for parsers that do understand RSS.

All that said, it's a mess.
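
To make the '+xml' fallback above concrete, here is a rough sketch of how a client might pick a parser from the header. The function name `pick_parser` and the returned labels are invented for illustration; real clients often sniff the body as well.

```python
def pick_parser(content_type: str) -> str:
    """Choose a parser label from a Content-Type header value."""
    # Drop parameters such as charset and normalise case.
    media = content_type.split(";", 1)[0].strip().lower()
    if media == "application/rss+xml":
        return "rss"
    if media == "application/atom+xml":
        return "atom"
    # Generic XML types, or any unknown type with the +xml suffix,
    # fall back to a plain XML parser.
    if media in ("text/xml", "application/xml") or media.endswith("+xml"):
        return "xml"
    return "unknown"
```

With this logic, a feed sent as `text/xml` or `application/xml` still gets parsed as XML, while `text/html` is not treated as a feed at all.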

10. Klonoar ◴[] No.41877322[source]
Quite a few feeds out there use the incorrect type of text/xml, since it works slightly better in browsers by not prompting a download.

Would not surprise me if Cloudflare lumps this in with text/html protections.