Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points declanhaigh | 1 comments
bruce343434 ◴[] No.35053912[source]
Note that this basically requires your language to have ergonomic support for sum types, immutable "data classes", pattern matching.

The point is to parse the input into a structure which always upholds the predicates you care about so you don't end up continuously defensively programming in ifs and asserts.

replies(12): >>35054046 #>>35054070 #>>35054386 #>>35054514 #>>35054901 #>>35054993 #>>35055124 #>>35055230 #>>35056047 #>>35057866 #>>35058185 #>>35059271 #
mtlynch ◴[] No.35054046[source]
I get a lot of value from this rule even without those language features.

I follow "Parse, Don't Validate" consistently in Go. For example, if I need to parse a JSON payload from an end-user for Foo, I define a struct called FooRequest, and I have exactly one function that creates a FooRequest instance, given a JSON stream.

Anywhere else in my application, if I have a FooRequest instance, I know that it's validated and well-formed because it had to have come from my FooRequest parsing function. I don't need sum types or any special language features beyond typing.
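
To make that concrete, here is a minimal sketch of the pattern, not my actual code. The field names (`name`, `count`), the rules applied to them, and the helper `parseFooRequestString` are all made up for illustration; only `FooRequest` and the idea of a single parsing function come from the description above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// FooRequest is a hypothetical request type. Its fields are unexported, so
// code in other packages can only obtain a populated value through
// ParseFooRequest. (The zero value FooRequest{} can still be written down
// elsewhere; Go cannot forbid that.)
type FooRequest struct {
	name  string
	count int
}

func (r FooRequest) Name() string { return r.name }
func (r FooRequest) Count() int   { return r.count }

// ParseFooRequest is the one place where a populated FooRequest is built.
// It decodes the JSON stream and enforces the (assumed) rules in one pass.
func ParseFooRequest(r io.Reader) (FooRequest, error) {
	// Pointers let us tell "field missing" apart from "field is zero".
	var raw struct {
		Name  *string `json:"name"`
		Count *int    `json:"count"`
	}
	dec := json.NewDecoder(r)
	dec.DisallowUnknownFields()
	if err := dec.Decode(&raw); err != nil {
		return FooRequest{}, fmt.Errorf("decoding foo request: %w", err)
	}
	if raw.Name == nil || strings.TrimSpace(*raw.Name) == "" {
		return FooRequest{}, errors.New("name is required")
	}
	if raw.Count == nil || *raw.Count < 1 {
		return FooRequest{}, errors.New("count must be a positive integer")
	}
	return FooRequest{name: *raw.Name, count: *raw.Count}, nil
}

// parseFooRequestString is a convenience wrapper for the example; it only
// delegates to ParseFooRequest.
func parseFooRequestString(s string) (FooRequest, error) {
	return ParseFooRequest(strings.NewReader(s))
}

func main() {
	req, err := parseFooRequestString(`{"name": "widget", "count": 2}`)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	// No re-checking needed here: holding a FooRequest means it was parsed.
	fmt.Println(req.Name(), req.Count())
}
```

Handlers then take a `FooRequest` rather than raw bytes or a map, so the checks live in exactly one function.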

replies(1): >>35054157 #
jotaen ◴[] No.35054157[source]
My main takeaway is the same. I wonder, though, whether “parse, don’t validate” is the right term for it. To me, “parse, don’t validate” somehow suggests that you should do parsing instead of validation, but the real point for me is that I still validate (as before), plus I “capture”/preserve validation success by means of a type.
replies(8): >>35054350 #>>35054377 #>>35054626 #>>35054751 #>>35055151 #>>35055232 #>>35055382 #>>35056979 #
1. friendzis ◴[] No.35055151[source]
> suggests that you should do parsing instead of validation

Kind of, yes, but this discussion depends heavily on the definitions of `parse` and `validate`, which the article does not spell out explicitly. The section "The power of parsing" captures the difference implicitly: "validateNonEmpty always returns (), the type that contains no information". Validation, in this context, can be defined as "checking conformance to a set of rules", while parsing is mostly synonymous with deserialization.
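
A rough Go rendering of that distinction, as a sketch only: the article's code is Haskell, and the `NonEmpty` struct and `First` helper below are my own stand-ins (this assumes Go 1.18+ for generics). In Go the validator returns an `error` rather than `()`, but the point is the same: on success it hands back nothing the compiler can use later.

```go
package main

import (
	"errors"
	"fmt"
)

var errEmpty = errors.New("list cannot be empty")

// validateNonEmpty only checks. On success the caller gets nil back, so the
// fact "this slice is non-empty" is not recorded anywhere in the types.
func validateNonEmpty[T any](xs []T) error {
	if len(xs) == 0 {
		return errEmpty
	}
	return nil
}

// NonEmpty always has a Head, so it holds at least one element by construction.
type NonEmpty[T any] struct {
	Head T
	Tail []T
}

// parseNonEmpty performs the same check, but returns a value whose type
// remembers the result of that check.
func parseNonEmpty[T any](xs []T) (NonEmpty[T], error) {
	if len(xs) == 0 {
		return NonEmpty[T]{}, errEmpty
	}
	return NonEmpty[T]{Head: xs[0], Tail: xs[1:]}, nil
}

// First needs no emptiness check and cannot fail.
func First[T any](ne NonEmpty[T]) T {
	return ne.Head
}

func main() {
	dirs := []string{"/tmp/cache", "/var/cache"}

	// Validate: the check passed, yet dirs[0] is still an unchecked index
	// as far as the compiler is concerned.
	if err := validateNonEmpty(dirs); err == nil {
		fmt.Println(dirs[0])
	}

	// Parse: the check passed and downstream code receives the proof.
	if ne, err := parseNonEmpty(dirs); err == nil {
		fmt.Println(First(ne))
	}
}
```

Both functions contain the same `len(xs) == 0` test; the only difference is what the caller is left holding afterwards.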

In most practical applications you explicitly do not want to only validate inputs, since you have no need to perform any computation on invalid input anyway. Sometimes you do want to analyze invalid inputs, maybe to try to recover some information or do some other magic. Sure, then go ahead and validate the input and do that magic on the invalid input. In most cases, though, you simply want to reject invalid inputs.

However, when you think about it, that is exactly what parsing does. Validation happens implicitly during parsing: a parser will either return a valid object or throw an error, and parsing has the added benefit that the end result is a datum of a known datatype. Of course, this only really works in statically typed languages.

The thing is that it is rather easy to conflate the two. Take, for example, the following JSON: `{"foo": "bar", "baz": 3}`. A "parser" can return 1) a list of `(String, String, String)` 3-tuples of (type, key, value) that downstream code has to process again, 2) a full-blown `FoobarizerParams` object, or something in between.
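
Here is a sketch of the two ends of that spectrum in Go. The `Field` struct, both function names, and the assumption that `foo` is a string and `baz` an integer are mine, inferred only from the sample JSON; the loose version reports Go's decoded type names (`string`, `float64`) as the "type" column.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
)

// Field is the loose (type, key, value) 3-tuple: everything is a string,
// so downstream code has to interpret it all over again.
type Field struct{ Type, Key, Value string }

// parseLoose accepts any JSON object and flattens it into Fields.
// Keys are sorted so the result is deterministic.
func parseLoose(data []byte) ([]Field, error) {
	var raw map[string]any
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(raw))
	for k := range raw {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fields := make([]Field, 0, len(keys))
	for _, k := range keys {
		fields = append(fields, Field{
			Type:  fmt.Sprintf("%T", raw[k]),
			Key:   k,
			Value: fmt.Sprint(raw[k]),
		})
	}
	return fields, nil
}

// FoobarizerParams is the fully parsed form: the types carry the shape.
type FoobarizerParams struct {
	Foo string
	Baz int
}

// parseFoobarizerParams rejects input that is malformed, has a field of the
// wrong JSON type, or is missing a field.
func parseFoobarizerParams(data []byte) (FoobarizerParams, error) {
	var raw struct {
		Foo *string `json:"foo"`
		Baz *int    `json:"baz"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return FoobarizerParams{}, err
	}
	if raw.Foo == nil || raw.Baz == nil {
		return FoobarizerParams{}, errors.New("foo and baz are both required")
	}
	return FoobarizerParams{Foo: *raw.Foo, Baz: *raw.Baz}, nil
}

func main() {
	input := []byte(`{"foo": "bar", "baz": 3}`)

	fields, err := parseLoose(input)
	fmt.Printf("%+v %v\n", fields, err)

	params, err := parseFoobarizerParams(input)
	fmt.Printf("%+v %v\n", params, err)
}
```

Both functions "parse" in the everyday sense, but only the second one lets the rest of the program stop asking whether `baz` is really a number.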