Parse, don't validate (2019)

(lexi-lambda.github.io)

398 points declanhaigh | 1 comments | 07 Mar 23 08:47 UTC

bruce343434 ◴[07 Mar 23 10:54 UTC] No.35053912[source]▶

Note that this basically requires your language to have ergonomic support for sum types, immutable "data classes", pattern matching.

The point is to parse the input into a structure which always upholds the predicates you care about so you don't end up continuously defensively programming in ifs and asserts.

replies(12): >>35054046 #>>35054070 #>>35054386 #>>35054514 #>>35054901 #>>35054993 #>>35055124 #>>35055230 #>>35056047 #>>35057866 #>>35058185 #>>35059271 #

mtlynch ◴[07 Mar 23 11:13 UTC] No.35054046[source]▶

>>35053912 #

I get a lot of value from this rule even without those language features.

I follow "Parse, Don't Validate" consistently in Go. For example, if I need to parse a JSON payload from an end-user for Foo, I define a struct called FooRequest, and I have exactly one function that creates a FooRequest instance, given a JSON stream.

Anywhere else in my application, if I have a FooRequest instance, I know that it's validated and well-formed because it had to have come from my FooRequest parsing function. I don't need sum types or any special language features beyond typing.

replies(1): >>35054157 #

jotaen ◴[07 Mar 23 11:28 UTC] No.35054157[source]▶

>>35054046 #

My main take-away is the same, though I wonder whether “parse, don’t validate” is the right term for it. To me, the phrase suggests that you should parse instead of validating, but the real point is that I still validate (as before) and additionally “capture”/preserve the validation success by means of a type.

replies(8): >>35054350 #>>35054377 #>>35054626 #>>35054751 #>>35055151 #>>35055232 #>>35055382 #>>35056979 #

qsort ◴[07 Mar 23 11:57 UTC] No.35054350[source]▶

>>35054157 #

It's in the same sense as "whitelist, don't blacklist", or "for the love of god it's 2023, do not escape SQL".

Don't enumerate the reasons why the input is invalid; instead, define a target struct/object and parse the input into it.
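As a rough illustration of the allowlist idea in Go (the Role type and the three role names are made up for this sketch): the parser only knows the valid targets, and everything else fails without any list of "bad" inputs to maintain.

```go
package main

import "fmt"

// Role is a closed set of values. The unexported field keeps other
// packages from building arbitrary named roles.
type Role struct{ name string }

// String returns the role's name.
func (r Role) String() string { return r.name }

// roles is the allowlist: every accepted input maps to a known Role.
var roles = map[string]Role{
	"admin":  {name: "admin"},
	"editor": {name: "editor"},
	"viewer": {name: "viewer"},
}

// ParseRole succeeds only for allowlisted input. Anything not in the map
// is rejected, whatever it looks like.
func ParseRole(s string) (Role, error) {
	role, ok := roles[s]
	if !ok {
		return Role{}, fmt.Errorf("unknown role %q", s)
	}
	return role, nil
}
```

Code that receives a Role from ParseRole never has to inspect the original string again.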

replies(1): >>35055225 #

blincoln ◴[07 Mar 23 13:45 UTC] No.35055225[source]▶

>>35054350 #

I like this explanation and approach, but how does it solve the first problem described in the article - the case where there's an array being processed that might be empty?

There are plenty of cases in real-world code where an array that's part of a struct or object may or may not contain any elements. If you're just parsing input into that, it seems like you'd still end up checking whether the array is empty everywhere it might be used later, even if that check looks at an "array has elements" flag in the struct/object, so you're still maintaining a description of the ways the input may be invalid. But I'm not a world-class programmer, so maybe I'm missing something. Maybe you mean that, for branches of the code that require a non-empty array, you have a second struct/object and a stricter parser that errors out if the array is empty?
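The "second, stricter struct and parser" idea from the last sentence could look roughly like this in Go (1.18+ for generics). NonEmpty and ParseNonEmpty are hypothetical names for this sketch, not something from the article's code.

```go
package main

import "errors"

// NonEmpty holds at least one element: the head is stored separately from
// the (possibly empty) tail, so there is no way to represent zero elements.
type NonEmpty[T any] struct {
	head T
	tail []T
}

// ParseNonEmpty is the stricter parser. The emptiness check happens once,
// here at the boundary, and its result is recorded in the return type.
func ParseNonEmpty[T any](xs []T) (NonEmpty[T], error) {
	if len(xs) == 0 {
		return NonEmpty[T]{}, errors.New("expected at least one element")
	}
	// Copy the tail so later changes to xs cannot affect the result.
	tail := append([]T(nil), xs[1:]...)
	return NonEmpty[T]{head: xs[0], tail: tail}, nil
}

// Head needs no emptiness check and cannot fail.
func (n NonEmpty[T]) Head() T { return n.head }

// Len is always at least 1.
func (n NonEmpty[T]) Len() int { return 1 + len(n.tail) }
```

Branches that tolerate an empty array keep taking a plain slice. Only the branches that need an element take a NonEmpty[T], and they never re-check.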

replies(4): >>35055647 #>>35057974 #>>35058114 #>>35063412 #

bcrosby95 ◴[07 Mar 23 17:22 UTC] No.35058114[source]▶

>>35055225 #

Depending on the language and what you're using to hold the array: inheritance.

A 'NotEmpty a' is just a subclass of a potentially empty 'a'. You also get the desirable behavior, in this scenario, of automatic upcasting of a 'NotEmpty a' into a regular old 'a'.

replies(1): >>35060098 #

secdeal ◴[07 Mar 23 19:38 UTC] No.35060098[source]▶

>>35058114 #

Not quite, 'a' is the type of the elements 'NonEmpty a' contains.

Rather, 'NonEmpty a' is a subclass of some kind of 'Iterable a'.
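Go has no subclassing, but the relationship can be approximated with an interface. In this sketch Iterable, List, NonEmpty and Count are all hypothetical names: T plays the role of the element type 'a', and both the possibly-empty and the non-empty container satisfy the same Iterable[T].

```go
package main

// Iterable is a hypothetical interface for anything that can list its
// elements of type T.
type Iterable[T any] interface {
	Items() []T
}

// List may be empty.
type List[T any] []T

// Items returns the elements of the list.
func (l List[T]) Items() []T { return l }

// NonEmpty always has a head element.
type NonEmpty[T any] struct {
	Head T
	Tail []T
}

// Items returns the head followed by the tail, in a new slice.
func (n NonEmpty[T]) Items() []T {
	return append([]T{n.Head}, n.Tail...)
}

// Count accepts any Iterable, so a NonEmpty[T] can be passed wherever a
// general collection is expected, without an explicit conversion.
func Count[T any](it Iterable[T]) int { return len(it.Items()) }
```

A call such as Count[string](NonEmpty[string]{Head: "a"}) then gives the "automatic upcasting" effect mentioned upthread, while functions that need the guarantee keep asking for NonEmpty[T] specifically.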
