
Parse, don't validate (2019)

(lexi-lambda.github.io)
398 points | declanhaigh | 1 comment
jmull ◴[] No.35055254[source]
I get the point, but I wonder why people find this particular article compelling. To me it's weak...

It's built on a particular technical distinction between parsing and validating that (1) is not all that commonly understood or consistently accepted and (2) is not actually explicitly stated in the article!

(validation: check data assumptions, fail if not met; parse: check data assumptions, fail if not met, and on success return data as a new type reflecting the additional constraints of the data, which can therefore be checked at compile time. Notice parsing includes validation, which makes the title of the article quite poor.)
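
A rough sketch of that distinction, in Python with type hints rather than the article's Haskell. The names here (`validate_non_empty`, `parse_non_empty`, `NonEmpty`, `first`) are made up for illustration, and a static checker such as mypy is assumed to play the role of the compiler:

```python
from dataclasses import dataclass
from typing import Generic, List, Tuple, TypeVar

T = TypeVar("T")


def validate_non_empty(xs: List[T]) -> None:
    """Validation: check the assumption, fail if not met, return nothing.

    The caller still holds a plain list, so the type says nothing about
    the check having happened.
    """
    if not xs:
        raise ValueError("list must not be empty")


@dataclass(frozen=True)
class NonEmpty(Generic[T]):
    """A list that has at least one element by construction."""
    head: T
    tail: Tuple[T, ...]


def parse_non_empty(xs: List[T]) -> NonEmpty[T]:
    """Parsing: the same check, but success returns a more precise type."""
    if not xs:
        raise ValueError("list must not be empty")
    return NonEmpty(xs[0], tuple(xs[1:]))


def first(xs: NonEmpty[T]) -> T:
    """Needs no emptiness check: the argument type already rules it out."""
    return xs.head
```

Passing a plain list to `first` is something the type checker can flag, whereas after `validate_non_empty` nothing stops later code from forgetting the check was needed.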

That's important to know because the distinction is only meaningful in the context of certain language features, which may or may not apply.

Also, this is not great general advice:

> Push the burden of proof upward as far as possible, but no further

For one, it's mostly meaningless, since it really just says put the burden of proof in the right place. But it implies that upward is preferable. You really want to push it upward if it's a high-level concern, and downward if it's a low-level concern. E.g., suppose you're working on an app or service that accesses the database, so the database is lower-level. You'll want to push your database-specific type transformations closer to the code that accesses the database.
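
To make the database case concrete, here is a minimal sketch assuming a hypothetical `users` table with `id`, `email`, and `joined` columns, using Python's standard `sqlite3` module. `User`, `_parse_user_row`, and `fetch_users` are illustrative names, not from the article:

```python
import sqlite3
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class User:
    id: int
    email: str
    joined: date


def _parse_user_row(row: tuple) -> User:
    """Turn a raw database row into a domain value, next to the query."""
    user_id, email, joined = row
    if "@" not in email:
        raise ValueError(f"malformed email for user {user_id}")
    # The (hypothetical) schema stores dates as ISO-8601 text.
    return User(id=user_id, email=email, joined=date.fromisoformat(joined))


def fetch_users(conn: sqlite3.Connection) -> List[User]:
    """Database-specific conversion stays inside the data-access code."""
    rows = conn.execute(
        "SELECT id, email, joined FROM users ORDER BY id"
    ).fetchall()
    return [_parse_user_row(row) for row in rows]
```

Callers of `fetch_users` only ever see `User` values, so the string-to-date conversion and the row layout do not leak into the rest of the app.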

Honestly, I find this whole thing kind of muddled.

(Also, in my experience, the fundamental limit here isn't on validation strategies, but the human ability to break down a problem and logically organize the solution. You can just as easily end up with an unmaintainable mess of spaghetti types as with any other useful abstraction.)

replies(6): >>35055366 #>>35055866 #>>35055895 #>>35056075 #>>35057758 #>>35061557 #
1. lolinder ◴[] No.35057758[source]
> You really want to push it upward if it's a high-level concern, and downward if it's a low-level concern. E.g., suppose you're working on an app or service that accesses the database, so the database is lower-level. You'll want to push your database-specific type transformations closer to the code that accesses the database.

This confusion is, I think, just a question of different conceptions of the system architecture.

Your terminology is drawing from a three-tier architecture [0] with a presentation layer, logic layer, and data layer. Under this model, input (data) is the bottom layer and output (HTTP/GUI) is the top layer, with your application logic in the middle.

On the other hand, she is viewing the system through an inside-outside lens similar to the hexagonal architecture [1]. All input (data) and output (HTTP/GUI) is considered to be up and out of your application logic. Rather than being the middle of a sandwich, the application logic is the kernel of a seed.

This is a common way to view the system when programming in functional languages like Haskell because the goal is usually to push all I/O to the start of the call stack so as to minimize the amount of code that has to account for side effects. The three-tier architecture isn't concerned about isolating effects, so treating the data layer as the bottom layer of the code is reasonable.

In either model, the point is to push validation to the boundaries of your code and rely on the type checker to prove you're using things right within the logic layer.
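
A small sketch of what that looks like, assuming a hypothetical deposit handler that receives an untyped request body. `PositiveAmount`, `parse_amount`, `deposit`, and `handle_deposit` are invented for the example. Note that Python will not stop someone from constructing `PositiveAmount(-1)` directly; a language like Haskell can hide the constructor to close that gap:

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class PositiveAmount:
    """An amount in cents that has already been checked to be > 0."""
    cents: int


def parse_amount(raw: Mapping[str, Any]) -> PositiveAmount:
    """Boundary: the only place that inspects untrusted input."""
    value = raw.get("amount_cents")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
        raise ValueError("amount_cents must be a positive integer")
    return PositiveAmount(value)


def deposit(balance_cents: int, amount: PositiveAmount) -> int:
    """Core logic: pure, and free of re-checks thanks to the parameter type."""
    return balance_cents + amount.cents


def handle_deposit(raw_request: Mapping[str, Any], balance_cents: int) -> int:
    """Outer shell: parse first, then hand typed data to the core."""
    return deposit(balance_cents, parse_amount(raw_request))
```

Whether you call the boundary "up" or "down", the shape is the same: untrusted data is converted once at the edge, and the type checker can reject calls that hand `deposit` a raw `int` or `dict` as the amount.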

[0] https://en.wikipedia.org/wiki/Multitier_architecture

[1] https://en.wikipedia.org/wiki/Hexagonal_architecture_%28soft...