
Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points declanhaigh | 1 comments
kybernetikos ◴[] No.35055198[source]
This is obviously good advice almost all of the time.

However, I have occasionally had to deal with HTTP libraries that tried to parse everything and would not give you access to anything they could not parse. This was incredibly frustrating for corner cases that the library authors hadn't considered.

If you are the one who is going to take action on the data, "parse, don't validate" is the correct approach. If you are writing a library that deals with data it doesn't fully understand, and you're handing that data to someone else to act on, then it may not always be the right approach.

replies(2): >>35056991 #>>35059019 #
tizzy ◴[] No.35056991[source]
This seems like good library design. As annoying as it is, it means the things you can use are well tested and supported.

What was your solution to this? Parse the things the library didn't?

replies(1): >>35057146 #
kybernetikos ◴[] No.35057146[source]
The library didn't allow you to see the things (e.g. particular headers or options for those headers) that it didn't know to parse. Ultimately we had to migrate to a different library that didn't restrict us to just what the library knew. The decision not to let us even see things that the library didn't know about is particularly egregious where best practices are changing over time.

In my view it's a very bad design for an HTTP library, although it would have been a lot less frustrating if it had at least provided an escape hatch.
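
To illustrate the kind of escape hatch meant here, a minimal Python sketch follows. `ParsedHeaders`, `parse_headers`, and `get_raw` are hypothetical names made up for this example, not the API of the library discussed or of any real one. The idea is only that typed fields cover what the parser understands, while everything received stays reachable in raw form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedHeaders:
    # Typed field for a header this hypothetical library knows how to parse.
    content_length: Optional[int]
    # Escape hatch: every "name: value" pair as received, known or not.
    raw: tuple = ()
    # Lines that did not even look like "name: value"; kept, not dropped.
    malformed: tuple = ()

    def get_raw(self, name: str) -> list:
        """Return all raw values for a header name (case-insensitive)."""
        wanted = name.lower()
        return [value for key, value in self.raw if key.lower() == wanted]


def parse_headers(lines: list) -> ParsedHeaders:
    raw = []
    malformed = []
    content_length = None
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            malformed.append(line)
            continue
        name, value = name.strip(), value.strip()
        raw.append((name, value))
        # Only promote to a typed field when the value parses cleanly.
        if name.lower() == "content-length" and value.isascii() and value.isdigit():
            content_length = int(value)
    return ParsedHeaders(content_length, tuple(raw), tuple(malformed))


headers = parse_headers([
    "Content-Length: 42",
    "X-New-Policy: strict; mode=report",  # unknown to the parser
    "garbage line",
])
print(headers.content_length)           # typed access
print(headers.get_raw("x-new-policy"))  # raw access to an unknown header
```

The caller still gets a parsed `content_length`, but a header the library has never heard of is one `get_raw` call away instead of being invisible.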

replies(2): >>35057263 #>>35063426 #
jimbokun ◴[] No.35057263[source]
Sounds like its model is not the HTTP RFC, but something more specific to some domain.

Which, I agree, is a poor design choice. A type modeling an HTTP request should model the RFC definition as closely as possible.

replies(1): >>35058994 #
aidenn0 ◴[] No.35058994[source]
> Which, I agree, is a poor design choice. A type modeling an HTTP request should model the RFC definition as closely as possible.

I couldn't disagree more. A type modeling an HTTP request should model HTTP requests. Not some theoretical description of an HTTP request.

replies(2): >>35059787 #>>35068654 #
recursive ◴[] No.35059787{3}[source]
HTTP requests are not things that humans discovered in nature. They are abstractions, created entirely by specification. In some sense, an HTTP request is exactly that which conforms to the specification.
replies(3): >>35060430 #>>35062064 #>>35063121 #
aidenn0 ◴[] No.35060430{4}[source]
To an extent, that sounds like saying the thing I am sitting in is not a chair since it has 5 legs.
replies(1): >>35061204 #
recursive ◴[] No.35061204{5}[source]
I mean, if chairs were things with formal specifications, and that specification said so, yeah.

But in this universe, no.

replies(1): >>35063081 #
1. girvo ◴[] No.35063081{6}[source]
If you can completely ignore HTTP request data that happens to not perfectly meet the RFC at your work (or, more specifically for us, Modbus RTU responses), I salute you. Sadly, I can’t; we get some wild stuff that we still need to attempt to handle, in both HTTP and Modbus!
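
One way to handle that kind of input, sketched in Python under loud assumptions: the regular expression below is a simplified stand-in for the real request-line grammar, and `RequestLine`, `Unparseable`, and `parse_request_line` are hypothetical names for this example only. The parser tries the strict form first, falls back to a lenient reading, records which path it took, and always keeps the raw input.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union

# Simplified strict form: METHOD SP target SP HTTP/x.y (an assumption,
# not the full grammar from the RFC).
STRICT = re.compile(r"([A-Z]+) (\S+) (HTTP/\d\.\d)")


@dataclass(frozen=True)
class RequestLine:
    method: str
    target: str
    version: Optional[str]  # None when the sender omitted it
    conforming: bool        # False when only the lenient path matched
    raw: str                # the input exactly as received


@dataclass(frozen=True)
class Unparseable:
    raw: str                # nothing recoverable, but the data is not lost


def parse_request_line(raw: str) -> Union[RequestLine, Unparseable]:
    match = STRICT.fullmatch(raw)
    if match:
        return RequestLine(match.group(1), match.group(2), match.group(3), True, raw)
    # Lenient path: tolerate extra whitespace, odd casing, a missing version.
    parts = raw.split()
    if len(parts) in (2, 3):
        version = parts[2] if len(parts) == 3 else None
        return RequestLine(parts[0], parts[1], version, False, raw)
    return Unparseable(raw)


print(parse_request_line("GET /index.html HTTP/1.1"))
print(parse_request_line("get  /index.html"))
print(parse_request_line(""))
```

The result is still a parsed type rather than a validated string, but the `conforming` flag and the `raw` field leave the decision about wild input to the caller.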