CRLF is obsolete and should be abolished

(fossil-scm.org)

michaelmior ◴[13 Oct 24 20:03 UTC] No.41831072[source]▶

>>41830717 (OP) #

> various protocols (HTTP, SMTP, CSV) still "require" CRLF at the end of each line

What would be the benefit to updating legacy protocols to just use NL? You save a handful of bits at the expense of a lot of potential bugs. HTTP/1(.1) is mostly replaced by HTTP/2 and later by now anyway.

Sure, it makes sense not to require CRLF with any new protocols, but it doesn't seem worth updating legacy things.

> Even if an established protocol (HTTP, SMTP, CSV, FTP) technically requires CRLF as a line ending, do not comply.

I'm hoping this is satire. Why intentionally introduce potential bugs for the sake of making a point?

replies(13): >>41831206 #>>41831210 #>>41831225 #>>41831256 #>>41831322 #>>41831364 #>>41831391 #>>41831706 #>>41832337 #>>41832719 #>>41832751 #>>41834474 #>>41835444 #

FiloSottile ◴[13 Oct 24 20:39 UTC] No.41831391[source]▶

>>41831072 #

Exactly. Please DO NOT mess with protocols, especially legacy critical protocols based on in-band signaling.

HTTP/1.1 was regrettably but irreversibly designed with security-critical parser alignment requirements. If two implementations disagree on whether `A:B\nC:D` contains a value for C, you can build a request smuggling gadget, leading to significant attacks. We live in a post-Postel world: only ever generate and accept CRLF in protocols that specify it, however legacy and nonsensical it might be.
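
To make the `A:B\nC:D` disagreement concrete, here is a minimal sketch in Python. Both parser functions are hypothetical and written only for illustration; they are not taken from any real HTTP implementation. One splits on CRLF only, the other also accepts a bare LF.

```python
def parse_headers_crlf_only(raw: bytes) -> dict[bytes, bytes]:
    """Hypothetical parser: only CRLF ends a line, so a bare LF stays inside the value."""
    headers = {}
    for line in raw.split(b"\r\n"):
        if not line:
            continue
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers


def parse_headers_lf_tolerant(raw: bytes) -> dict[bytes, bytes]:
    """Hypothetical parser: CRLF and bare LF are both treated as line endings."""
    headers = {}
    for line in raw.replace(b"\r\n", b"\n").split(b"\n"):
        if not line:
            continue
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers


raw = b"A:B\nC:D\r\n"
print(parse_headers_crlf_only(raw))    # one header: a, whose value contains the LF
print(parse_headers_lf_tolerant(raw))  # two headers: a and c
```

The same bytes yield one header in the first parser and two in the second, which is the kind of mismatch a smuggling gadget relies on.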

(I am a massive, massive SQLite fan, but this is giving me pause about using other software by the same author, at least when networks are involved.)

replies(7): >>41831450 #>>41831498 #>>41831871 #>>41832546 #>>41832632 #>>41832661 #>>41839309 #

tptacek ◴[13 Oct 24 20:47 UTC] No.41831450[source]▶

>>41831391 #

This would be more persuasive if HTTP servers didn't already widely accept bare 0ah line termination. What's the first major public web site you can find that doesn't?

replies(5): >>41831506 #>>41831717 #>>41832137 #>>41832555 #>>41832731 #

1. michaelmior ◴[13 Oct 24 20:55 UTC] No.41831506[source]▶

>>41831450 #

We're talking about servers and clients here. The best way to ensure things work is to adhere to an established protocol. Aside from saving a few bytes, there doesn't seem to be any good reason to deviate.

replies(3): >>41831609 #>>41831637 #>>41832929 #

2. tptacek ◴[13 Oct 24 21:08 UTC] No.41831609[source]▶

>>41831506 (TP) #

I'm saying the consistency that Filippo says our security depends on doesn't really seem to exist in the world, which hurts the persuasiveness of that particular argument in favor of consistency.

replies(2): >>41831837 #>>41835413 #

3. Ekaros ◴[13 Oct 24 21:11 UTC] No.41831637[source]▶

>>41831506 (TP) #

There are very good reasons not to deviate: a mismatch with other components that may sit on the path, such as reverse proxies and load balancers, can affect things.

4. dwattttt ◴[13 Oct 24 21:35 UTC] No.41831837[source]▶

>>41831609 #

But no one expects 0ah to be sufficient. Change that expectation, and now you have to wonder if your middleware and your backend agree on whether the middleware filtered out internal-only headers.

replies(1): >>41831921 #

5. tptacek ◴[13 Oct 24 21:45 UTC] No.41831921{3}[source]▶

>>41831837 #

Yeah, I'm not certain that this is a real issue. It might be? Certainly, I'm read in to things like TECL desync. I get the concern, that any disagreement in parsing policies is problematic for HTTP because of middleboxes. But I think the ship may have sailed on 0ah, and that it may be the case that you simply have to build HTTP systems to be bare-0ah-tolerant if you want your system to be resilient.

replies(1): >>41832774 #

6. dwattttt ◴[13 Oct 24 23:50 UTC] No.41832774{4}[source]▶

>>41831921 #

But what's bare-0ah-tolerant? Accepting _or_ ignoring bare 0ah's means you need to ensure all your moving parts agree, or you end up in the "one bit thinks this is two headers, others think it's one header" situation.

The only situation where you don't need to know two policies match is when one of the policies rejects one of the combinations outright. Probably. Maybe.

EDIT: maybe it's better phrased as "all parts need to be bare-0ah-strict". But then it's fine if it's bare-0ah-reject; they just need to all be strict, one way or the other.
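
A rough sketch of the "reject outright" policy in Python, assuming the header block has already been separated from the request line and body. The function and exception names are made up for this example.

```python
class BareLineEndingError(ValueError):
    """Raised when a header block contains a CR or LF outside a CRLF pair."""


def split_header_lines_strict(raw: bytes) -> list[bytes]:
    """Split on CRLF and refuse any input with a stray CR or LF left over."""
    lines = raw.split(b"\r\n")
    for line in lines:
        # After splitting on CRLF, any remaining CR or LF must be a bare one.
        if b"\r" in line or b"\n" in line:
            raise BareLineEndingError(f"bare CR or LF in header line: {line!r}")
    return [line for line in lines if line]


print(split_header_lines_strict(b"A:B\r\nC:D\r\n"))
try:
    split_header_lines_strict(b"A:B\nC:D\r\n")
except BareLineEndingError as exc:
    print("rejected:", exc)
```

A component that rejects like this never has to guess how the next hop would have interpreted the ambiguous bytes, because it never forwards them.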

7. Aeolun ◴[14 Oct 24 00:16 UTC] No.41832929[source]▶

>>41831506 (TP) #

Well, you can achieve the desired behavior in all situations by ignoring CR and treating any seen LF as NL.

I just don’t see why you’d not want to do that as the implementer. If there’s some way to exploit that behavior I can’t see it.

replies(1): >>41836805 #

8. immibis ◴[14 Oct 24 08:22 UTC] No.41835413[source]▶

>>41831609 #

Security also doesn't exist as much as we'd like it to, which doesn't excuse making it exist even less.

9. immibis ◴[14 Oct 24 12:15 UTC] No.41836805[source]▶

>>41832929 #

The exploit is that your request went through a proxy which followed the standard (but failed to reject the bare NL) and the client sent a header after a bare NL which you think came from the proxy but actually came from the client - such as the client's IP address in a fake X-Forwarded-For, which the proxy would have removed if it had parsed it as a header.
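
A toy model of that header-hiding trick in Python. Everything here is an assumption made for illustration: the proxy and backend functions are hypothetical, the backend is assumed to trust the first X-Forwarded-For it sees, and the addresses and hostnames are placeholders.

```python
from typing import Optional


def proxy_forward(raw: bytes, client_ip: bytes) -> bytes:
    """Hypothetical proxy: splits on CRLF only, drops any client-supplied
    X-Forwarded-For, appends its own, and does not reject bare LF."""
    kept = []
    for line in raw.split(b"\r\n"):
        if not line:
            continue
        name = line.partition(b":")[0].strip().lower()
        if name == b"x-forwarded-for":
            continue
        kept.append(line)
    kept.append(b"X-Forwarded-For: " + client_ip)
    return b"\r\n".join(kept) + b"\r\n"


def backend_first_forwarded_for(raw: bytes) -> Optional[bytes]:
    """Hypothetical backend: accepts bare LF as a line ending and
    trusts the first X-Forwarded-For header it finds."""
    for line in raw.replace(b"\r\n", b"\n").split(b"\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"x-forwarded-for":
            return value.strip()
    return None


honest = b"Host: example.test\r\nX-Forwarded-For: 10.0.0.1\r\n"
attack = b"Host: example.test\r\nX-Junk: a\nX-Forwarded-For: 10.0.0.1\r\n"

print(backend_first_forwarded_for(proxy_forward(honest, b"203.0.113.7")))
print(backend_first_forwarded_for(proxy_forward(attack, b"203.0.113.7")))
```

In the honest case the proxy strips the client's header and the backend sees only the proxy's value. In the attack case the fake header rides inside the value of `X-Junk` as far as the proxy is concerned, then becomes a header of its own once the backend splits on the bare LF.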

This attack is even worse when applied to SMTP because the attacker can forge emails that pass SPF checking, by inserting the end of one message and start of another. This can also be done in HTTP if your reverse proxy uses a single multiplexed connection to your origin server, and the attacker can make their response go to the next user and desync all responses after that.
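
The SMTP variant can be sketched the same way. In SMTP the end of the DATA section is the sequence CRLF, ".", CRLF. The two splitter functions below are hypothetical and heavily simplified (a real session would also carry envelope commands between messages, and the payload is a placeholder), but they show how one byte stream can be counted as one message by a strict server and two by a lenient one.

```python
import re


def split_messages_strict(stream: bytes) -> list[bytes]:
    """Hypothetical strict server: only CRLF '.' CRLF ends a message."""
    return [part for part in stream.split(b"\r\n.\r\n") if part]


def split_messages_lenient(stream: bytes) -> list[bytes]:
    """Hypothetical lenient server: also accepts a bare LF around the dot."""
    return [part for part in re.split(rb"\r?\n\.\r?\n", stream) if part]


stream = (
    b"Subject: hi\r\n\r\nbody"
    b"\n.\n"                      # bare-LF terminator smuggled by the sender
    b"Subject: forged\r\n\r\nsecond message"
    b"\r\n.\r\n"                  # the real terminator
)

print(len(split_messages_strict(stream)))   # 1
print(len(split_messages_lenient(stream)))  # 2
```

If the strict side is the one that performed the sender checks, the second message is one it never saw as a separate message.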

replies(1): >>41843394 #

10. Aeolun ◴[14 Oct 24 23:39 UTC] No.41843394{3}[source]▶

>>41836805 #

Thanks, that was actually a very clear description of the problem!

The problem here is not to use one or the other, but to use a mix of both.

replies(1): >>41848471 #

11. immibis ◴[15 Oct 24 13:37 UTC] No.41848471{4}[source]▶

>>41843394 #

And the standard is CRLF, so you're either following the standard or using a mix.
