Parse, don't validate (2019)

(lexi-lambda.github.io)

398 points declanhaigh | 1 comments | 07 Mar 23 08:47 UTC

kybernetikos ◴[07 Mar 23 13:41 UTC] No.35055198[source]▶

This is obviously good advice almost all of the time.

However, I have had to deal occasionally with http libraries that tried to parse everything and would not give you access to anything that they could not parse. This was incredibly frustrating for corner cases that the library authors hadn't considered.

If you are the one who is going to take action on the data, "parse, don't validate" is the correct approach. If you are writing a library that deals with data it doesn't fully understand, and you're handing that data to someone else to act on, then it may not always be the right approach.

replies(2): >>35056991 #>>35059019 #

tizzy ◴[07 Mar 23 16:10 UTC] No.35056991[source]▶

>>35055198 #

This seems like good library design. As annoying as it is, it means the things you can use are well tested and supported.

What was your solution to this? Parse the things the library didn't?

replies(1): >>35057146 #

kybernetikos ◴[07 Mar 23 16:19 UTC] No.35057146[source]▶

>>35056991 #

The library didn't allow you to see the things (e.g. particular headers or options for those headers) that it didn't know to parse. Ultimately we had to migrate to a different library that didn't restrict us to just what the library knew. The decision not to let us even see things that the library didn't know about is particularly egregious where best practices are changing over time.

In my view it's a very bad design for an http library, although it would have been a lot less frustrating if it had at least provided an escape hatch.
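To make the escape hatch idea concrete, here is a minimal sketch in Python. It is not the API of any real HTTP library: `ParsedRequest`, `parse_request`, `raw_headers` and `raw` are hypothetical names chosen for illustration. The parser turns the headers it understands into typed fields, but keeps every header exactly as received so callers can still reach the ones it did not know about.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ParsedRequest:
    # Typed fields for the headers this (hypothetical) library understands.
    content_length: Optional[int]
    host: Optional[str]
    # Escape hatch: every header as received, in order, including unknown ones.
    raw_headers: tuple[tuple[str, str], ...] = field(default=())

    def raw(self, name: str) -> list[str]:
        """Return all raw values for a header name (case-insensitive)."""
        wanted = name.lower()
        return [v for k, v in self.raw_headers if k.lower() == wanted]


def parse_request(header_lines: list[str]) -> ParsedRequest:
    raw = []
    for line in header_lines:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        raw.append((name.strip(), value.strip()))

    def first(name: str) -> Optional[str]:
        # First value for a lowercase header name, or None if absent.
        for k, v in raw:
            if k.lower() == name:
                return v
        return None

    length = first("content-length")
    return ParsedRequest(
        content_length=int(length) if length is not None else None,
        host=first("host"),
        raw_headers=tuple(raw),
    )
```

With this shape, a caller who needs a header the library has never heard of can still ask for `req.raw("x-new-thing")` instead of switching libraries.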

replies(2): >>35057263 #>>35063426 #

jimbokun ◴[07 Mar 23 16:28 UTC] No.35057263[source]▶

>>35057146 #

Sounds like its model is not the HTTP RFC, but something more specific to some domain.

Which I agree, is a poor design choice. A type modeling an HTTP request should model the RFC definition as closely as possible.

replies(1): >>35058994 #

aidenn0 ◴[07 Mar 23 18:22 UTC] No.35058994[source]▶

>>35057263 #

> Which I agree, is a poor design choice. A type modeling an HTTP request should model the RFC definition as closely as possible.

I couldn't disagree more. A type modeling an HTTP request should model HTTP requests. Not some theoretical description of an HTTP request.

replies(2): >>35059787 #>>35068654 #

recursive ◴[07 Mar 23 19:13 UTC] No.35059787{3}[source]▶

>>35058994 #

HTTP requests are not things that humans discovered in nature. They are abstractions, created entirely by specification. In some sense, an HTTP request is exactly that which conforms to the specification.

replies(3): >>35060430 #>>35062064 #>>35063121 #

aidenn0 ◴[07 Mar 23 20:03 UTC] No.35060430{4}[source]▶

>>35059787 #

To an extent, that sounds like saying the thing I am sitting in is not a chair since it has 5 legs.

replies(1): >>35061204 #

recursive ◴[07 Mar 23 21:05 UTC] No.35061204{5}[source]▶

>>35060430 #

I mean, if chairs were things with formal specifications, and that specification said so, yeah.

But in this universe, no.

replies(1): >>35063081 #

girvo ◴[07 Mar 23 23:46 UTC] No.35063081{6}[source]▶

>>35061204 #

If you can completely ignore HTTP request data that happens to not perfectly meet the RFC at your work (or, more specifically for us, Modbus RTU responses), I salute you. Sadly, I can’t; we get some wild stuff that we still need to attempt to handle. Both HTTP and Modbus!
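One way to square this with "parse, don't validate" is to give the non-conforming case its own type instead of throwing it away. The sketch below is in Python and purely illustrative: `Frame`, `Suspect` and `parse_frame` are made-up names, and it assumes the standard Modbus RTU layout (address byte, function byte, payload, then CRC-16 with the low byte first). A frame that fails the check comes back as a `Suspect` that still carries the raw bytes, so downstream code can decide whether to attempt a recovery.

```python
from dataclasses import dataclass
from typing import Union


def crc16_modbus(data: bytes) -> int:
    """CRC-16 as used by Modbus RTU (reflected polynomial 0xA001, initial value 0xFFFF)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


@dataclass(frozen=True)
class Frame:
    """A frame that passed every check this sketch performs."""
    address: int
    function: int
    payload: bytes


@dataclass(frozen=True)
class Suspect:
    """A frame that failed a check; the raw bytes are kept for the caller."""
    raw: bytes
    reason: str


def parse_frame(raw: bytes) -> Union[Frame, Suspect]:
    # Shortest possible frame: address + function + 2 CRC bytes.
    if len(raw) < 4:
        return Suspect(raw, "too short")
    body, received = raw[:-2], int.from_bytes(raw[-2:], "little")
    if crc16_modbus(body) != received:
        return Suspect(raw, "CRC mismatch")
    return Frame(address=body[0], function=body[1], payload=body[2:])
```

Callers then branch on the result type: a `Frame` can be used directly, while a `Suspect` goes to whatever device-specific workaround applies, with nothing lost along the way.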
