CRLF is obsolete and should be abolished

(fossil-scm.org)

422 points km | 2 comments | 13 Oct 24 19:16 UTC

michaelmior ◴[13 Oct 24 20:03 UTC] No.41831072[source]▶

> various protocols (HTTP, SMTP, CSV) still "require" CRLF at the end of each line

What would be the benefit to updating legacy protocols to just use NL? You save a handful of bits at the expense of a lot of potential bugs. HTTP/1(.1) is mostly replaced by HTTP/2 and later by now anyway.

Sure, it makes sense not to require CRLF with any new protocols, but it doesn't seem worth updating legacy things.

> Even if an established protocol (HTTP, SMTP, CSV, FTP) technically requires CRLF as a line ending, do not comply.

I'm hoping this is satire. Why intentionally introduce potential bugs for the sake of making a point?

replies(13): >>41831206 #>>41831210 #>>41831225 #>>41831256 #>>41831322 #>>41831364 #>>41831391 #>>41831706 #>>41832337 #>>41832719 #>>41832751 #>>41834474 #>>41835444 #

FiloSottile ◴[13 Oct 24 20:39 UTC] No.41831391[source]▶

>>41831072 #

Exactly. Please DO NOT mess with protocols, especially legacy critical protocols based on in-band signaling.

HTTP/1.1 was regrettably but irreversibly designed with security-critical parser alignment requirements. If two implementations disagree on whether `A:B\nC:D` contains a value for C, you can build a request smuggling gadget, leading to significant attacks. We live in a post-Postel world: only ever generate and accept CRLF in protocols that specify it, however legacy and nonsensical it might be.
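A minimal sketch of that disagreement, assuming two hypothetical header parsers (`parse_strict` and `parse_lenient` are illustrative names, not taken from any real server). One splits on CRLF only, the other treats any LF as a line ending:

```python
def parse_strict(raw: bytes) -> dict[bytes, bytes]:
    """Hypothetical parser: only CRLF ends a header line.

    A bare LF is left inside the header value (it is not rejected).
    """
    headers = {}
    for line in raw.split(b"\r\n"):
        if not line:
            continue
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers


def parse_lenient(raw: bytes) -> dict[bytes, bytes]:
    """Hypothetical parser: any LF ends a line, a preceding CR is dropped."""
    headers = {}
    for line in raw.split(b"\n"):
        line = line.rstrip(b"\r")
        if not line:
            continue
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers


raw = b"A:B\nC:D\r\n"
strict = parse_strict(raw)    # one header: A, whose value contains "\nC:D"
lenient = parse_lenient(raw)  # two headers: A and C
```

The two parsers see the same bytes but disagree on whether a header named C exists, which is the kind of mismatch a smuggling attack would build on.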

(I am a massive, massive SQLite fan, but this is giving me pause about using other software by the same author, at least when networks are involved.)

replies(7): >>41831450 #>>41831498 #>>41831871 #>>41832546 #>>41832632 #>>41832661 #>>41839309 #

tptacek ◴[13 Oct 24 20:47 UTC] No.41831450[source]▶

>>41831391 #

This would be more persuasive if HTTP servers didn't already widely accept bare 0ah line termination. What's the first major public web site you can find that doesn't?

replies(5): >>41831506 #>>41831717 #>>41832137 #>>41832555 #>>41832731 #

michaelmior ◴[13 Oct 24 20:55 UTC] No.41831506{3}[source]▶

>>41831450 #

We're talking about servers and clients here. The best way to ensure things work is to adhere to an established protocol. Aside from saving a few bytes, there doesn't seem to be any good reason to deviate.

replies(3): >>41831609 #>>41831637 #>>41832929 #

Aeolun ◴[14 Oct 24 00:16 UTC] No.41832929{4}[source]▶

>>41831506 #

Well, you can achieve the desired behavior in all situations by ignoring CR and treating any seen LF as NL.

I just don’t see why you’d not want to do that as the implementer. If there’s some way to exploit that behavior I can’t see it.

replies(1): >>41836805 #

immibis ◴[14 Oct 24 12:15 UTC] No.41836805{5}[source]▶

>>41832929 #

The exploit is that your request went through a proxy that followed the standard (but failed to reject the bare NL). The client sent a header after a bare NL, and you think that header came from the proxy when it actually came from the client. An example is the client's IP address in a fake X-Forwarded-For, which the proxy would have removed if it had parsed it as a header.
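A rough sketch of that scenario, under stated assumptions: `proxy_forward` and `origin_client_ip` are hypothetical stand-ins for a CRLF-only proxy and an LF-tolerant origin, and the origin is assumed to trust the first X-Forwarded-For it finds. The addresses are documentation placeholders.

```python
CRLF = b"\r\n"


def proxy_forward(header_block: bytes, peer_ip: bytes) -> bytes:
    """Hypothetical proxy: splits on CRLF only, drops any client-supplied
    X-Forwarded-For, then appends its own. It does not reject bare LF."""
    kept = []
    for line in header_block.split(CRLF):
        if not line:
            continue
        name = line.partition(b":")[0].strip().lower()
        if name == b"x-forwarded-for":
            continue  # strip the client's claim about its own address
        kept.append(line)
    kept.append(b"X-Forwarded-For: " + peer_ip)
    return CRLF.join(kept) + CRLF


def origin_client_ip(header_block: bytes) -> bytes:
    """Hypothetical origin: treats any LF as a line ending and trusts the
    first X-Forwarded-For header it sees."""
    for line in header_block.split(b"\n"):
        name, _, value = line.rstrip(b"\r").partition(b":")
        if name.strip().lower() == b"x-forwarded-for":
            return value.strip()
    return b""


real_ip = b"203.0.113.7"
honest = b"X-Forwarded-For: 10.0.0.1\r\nHost: example.com\r\n"
smuggled = b"X-Junk: a\nX-Forwarded-For: 10.0.0.1\r\nHost: example.com\r\n"

seen_honest = origin_client_ip(proxy_forward(honest, real_ip))
seen_smuggled = origin_client_ip(proxy_forward(smuggled, real_ip))
```

In the honest case the proxy strips the fake header and the origin sees the real peer address. In the smuggled case the proxy sees only a header named X-Junk, forwards it untouched, and the origin reads the attacker's value first.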

This attack is even worse when applied to SMTP because the attacker can forge emails that pass SPF checking, by inserting the end of one message and start of another. This can also be done in HTTP if your reverse proxy uses a single multiplexed connection to your origin server, and the attacker can make their response go to the next user and desync all responses after that.
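The SMTP variant can be sketched the same way. This is a simplified illustration, not real SMTP handling: it ignores dot-stuffing and the command/response dialogue, and `split_strict` and `split_lenient` are hypothetical names for how a sending and a receiving server might find the end-of-data marker.

```python
def split_strict(stream: bytes) -> list[bytes]:
    """Hypothetical sender-side view: only CRLF '.' CRLF ends a message."""
    return [part for part in stream.split(b"\r\n.\r\n") if part]


def split_lenient(stream: bytes) -> list[bytes]:
    """Hypothetical receiver-side view: any LF ends a line, so LF '.' LF
    is also accepted as the end of a message."""
    normalized = stream.replace(b"\r\n", b"\n")
    return [part for part in normalized.split(b"\n.\n") if part]


# One DATA payload as submitted by the attacker. The bare-LF dot line in the
# middle is ordinary body text to a strict server.
payload = (
    b"Subject: first\r\n\r\nhello\n.\n"
    b"MAIL FROM:<ceo@example.com>\r\n"
    b"RCPT TO:<victim@example.net>\r\n"
    b"DATA\r\n"
    b"Subject: forged\r\n\r\nsend money\r\n.\r\n"
)

as_sent = split_strict(payload)       # one message
as_received = split_lenient(payload)  # a message, then smuggled commands
```

The strict side relays a single message, while the lenient side ends the first message early and interprets the remainder as new commands arriving from the relaying server, which is why the forged mail can inherit that server's SPF standing.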

replies(1): >>41843394 #

1. Aeolun ◴[14 Oct 24 23:39 UTC] No.41843394{6}[source]▶

>>41836805 #

Thanks, that was actually a very clear description of the problem!

The problem here is not using one or the other, but using a mix of both.

replies(1): >>41848471 #

2. immibis ◴[15 Oct 24 13:37 UTC] No.41848471[source]▶

>>41843394 (TP) #

And the standard is CRLF, so you're either following the standard or using a mix.
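One way to enforce that on input, as a small sketch: reject anything containing a bare CR or a bare LF instead of trying to interpret it. `uses_only_crlf` is a hypothetical helper name, not part of any library.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_LINE_ENDING = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def uses_only_crlf(data: bytes) -> bool:
    """Return True when every CR and LF in data is part of a CRLF pair."""
    return _BARE_LINE_ENDING.search(data) is None


ok = uses_only_crlf(b"A:B\r\nC:D\r\n")
bare_lf = uses_only_crlf(b"A:B\nC:D\r\n")
bare_cr = uses_only_crlf(b"A:B\rC:D\r\n")
```

A receiver that fails closed on `False` never has to decide what a mixed message means, so it cannot disagree with a stricter hop upstream.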
