Parse, don't validate (2019)

(lexi-lambda.github.io)

398 points declanhaigh | 2 comments | 07 Mar 23 08:47 UTC

kybernetikos ◴[07 Mar 23 13:41 UTC] No.35055198[source]▶

This is obviously good advice almost all of the time.

However, I have occasionally had to deal with HTTP libraries that tried to parse everything and would not give you access to anything they could not parse. This was incredibly frustrating for corner cases that the library authors hadn't considered.

If you are the one who is going to take action on the data, parse don't validate is the correct approach. If you are writing a library that deals with data that it doesn't fully understand, and you're handing that data to someone else to take action with, then it may not always be the right approach.

replies(2): >>35056991 #>>35059019 #

tizzy ◴[07 Mar 23 16:10 UTC] No.35056991[source]▶

>>35055198 #

This seems like good library design. As annoying as it is, it means the things you can use are well tested and supported.

What was your solution to this? Parse the things the library didn't?

replies(1): >>35057146 #

kybernetikos ◴[07 Mar 23 16:19 UTC] No.35057146[source]▶

>>35056991 #

The library didn't allow you to see the things (e.g. particular headers or options for those headers) that it didn't know to parse. Ultimately we had to migrate to a different library that didn't restrict us to just what the library knew. The decision not to let us even see things that the library didn't know about is particularly egregious where best practices are changing over time.

In my view it's a very bad design for an http library, although it would have been a lot less frustrating if it had at least provided an escape hatch.
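For illustration, an escape hatch of the kind described here might look roughly like the sketch below. It does not reproduce any real library's API: `ParsedHeaders`, `get_raw`, and `parse_headers` are hypothetical names, and the only header parsed into a typed field is `Content-Length`. The point is only that the parser keeps every header it saw, including ones it does not understand.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ParsedHeaders:
    """Hypothetical result type: typed fields plus every raw header."""

    content_length: Optional[int]
    # Escape hatch: all headers exactly as received, in order.
    raw: Tuple[Tuple[str, str], ...]

    def get_raw(self, name: str) -> List[str]:
        """Return the raw values of a header, known to the parser or not."""
        wanted = name.lower()
        return [value for key, value in self.raw if key.lower() == wanted]


def parse_headers(lines: List[str]) -> ParsedHeaders:
    """Parse the headers we understand; keep the rest instead of dropping them."""
    raw = []
    content_length = None
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        name, value = name.strip(), value.strip()
        raw.append((name, value))
        if name.lower() == "content-length":
            # Known header: parse it into a typed field, reject garbage.
            if not (value.isascii() and value.isdigit()):
                raise ValueError(f"bad Content-Length: {value!r}")
            content_length = int(value)
    return ParsedHeaders(content_length, tuple(raw))
```

With this shape, a caller who needs a header the library has never heard of can still reach it through `get_raw("X-New-Thing")`, while everyone else uses the typed `content_length` field.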

replies(2): >>35057263 #>>35063426 #

jimbokun ◴[07 Mar 23 16:28 UTC] No.35057263[source]▶

>>35057146 #

Sounds like its model is not the HTTP RFC, but something more specific to some domain.

Which I agree, is a poor design choice. A type modeling an HTTP request should model the RFC definition as closely as possible.

replies(1): >>35058994 #

aidenn0 ◴[07 Mar 23 18:22 UTC] No.35058994[source]▶

>>35057263 #

> Which I agree, is a poor design choice. A type modeling an HTTP request should model the RFC definition as closely as possible.

I couldn't disagree more. A type modeling an HTTP request should model HTTP requests. Not some theoretical description of an HTTP request.

replies(2): >>35059787 #>>35068654 #

jimbokun ◴[08 Mar 23 13:21 UTC] No.35068654{5}[source]▶

>>35058994 #

I’m confused, if the RFC does not accurately model HTTP requests, what does?

replies(1): >>35071251 #

1. aidenn0 ◴[08 Mar 23 16:44 UTC] No.35071251{6}[source]▶

>>35068654 #

Your customers define what an HTTP request is.

To be less snarky, the RFC defines what a well-formed HTTP request is. In the wild there are a lot of malformed HTTP requests that business cases may require handling.

replies(1): >>35090904 #

2. jimbokun ◴[10 Mar 23 05:55 UTC] No.35090904[source]▶

>>35071251 (TP) #

I suppose the “parse, don’t validate” philosophy would recommend first transforming the ill-formed request into a data structure that only models well-formed requests, before it’s processed by any other part of the program.
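A minimal sketch of that idea, assuming a made-up set of tolerated malformations (lowercase methods, repeated whitespace, a missing version): the lenient parsing happens once at the boundary, and the rest of the program only ever sees a `RequestLine` that is well formed by construction. The names `Method`, `RequestLine`, and `parse_request_line`, and the choice to default a missing version to `HTTP/1.0`, are illustrative assumptions, not anything prescribed by the article or the RFC.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    GET = "GET"
    POST = "POST"


@dataclass(frozen=True)
class RequestLine:
    """Only well-formed request lines can be represented here."""

    method: Method
    target: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Repair the malformations we chose to tolerate; reject everything else."""
    parts = line.split()  # tolerates repeated spaces and tabs
    if len(parts) == 2:
        # Hypothetical business rule: treat a missing version as HTTP/1.0.
        parts.append("HTTP/1.0")
    if len(parts) != 3:
        raise ValueError(f"cannot parse request line: {line!r}")
    method_text, target, version = parts
    try:
        method = Method(method_text.upper())  # tolerate lowercase methods
    except ValueError:
        raise ValueError(f"unsupported method: {method_text!r}") from None
    version = version.upper()
    if version not in ("HTTP/1.0", "HTTP/1.1"):
        raise ValueError(f"unsupported version: {version!r}")
    return RequestLine(method, target, version)
```

Code downstream of `parse_request_line` never has to ask whether the method is valid or the version is present, because the type cannot hold anything else.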