Parse, Don't Validate (2019)

(lexi-lambda.github.io)
389 points melse | 10 comments
seanwilson ◴[] No.27640953[source]
From the Twitter link:

> IME, people in dynamic languages almost never program this way, though—they prefer to use validation and some form of shotgun parsing. My guess as to why? Writing that kind of code in dynamically-typed languages is often a lot more boilerplate than it is in statically-typed ones!

I feel that once you've got experience working in (usually functional) programming languages with strong static type checking, flaky dynamic code that relies on runtime checks and just being careful to avoid runtime errors makes your skin crawl, and you'll intuitively gravitate towards designs that take advantage of strong static type checks.

When all you know is dynamic languages, the design guidance you get from strong static type checking is lost, so there are more bad design paths you can go down. Patching up flaky code with ad-hoc runtime checks and debugging runtime errors becomes the norm because you just don't know any better and the type system isn't going to teach you.

More general advice would be "prefer strong static type checking over runtime checks" as it makes a lot of design and robustness problems go away.
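As a rough illustration of the "parse, don't validate" idea in a dynamic language, here is a minimal Python sketch. The names `Email`, `parse_email` and `send_welcome` are hypothetical, the email check is deliberately simplistic, and Python only enforces the annotations if you run an external type checker such as mypy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    """A string that has already been checked to look like an email address."""
    value: str


def parse_email(raw: str) -> Email:
    """Parse once at the boundary; raise ValueError on bad input."""
    candidate = raw.strip()
    local, sep, domain = candidate.partition("@")
    # Simplistic check, for illustration only.
    if not (sep and local and "." in domain):
        raise ValueError(f"not an email address: {raw!r}")
    return Email(candidate)


def send_welcome(to: Email) -> str:
    # No re-validation here: holding an Email is the evidence that
    # the check already happened (assuming callers go through parse_email).
    return f"Welcome, {to.value}!"
```

The check happens in one place, and downstream functions ask for an `Email` rather than re-checking a bare `str`. Nothing stops someone from constructing `Email("garbage")` directly, which is part of why this style takes more discipline without a compiler backing it.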

Even if you can't use e.g. Haskell or OCaml in your daily work, a few weeks or even just a few days of trying to learn them will open your eyes and make you a better coder elsewhere. Map/filter/reduce, immutable data structures, non-nullable types etc. had been in other languages for over 30 years before these ideas became more mainstream best practices, for example (I'm still waiting for pattern matching + algebraic data types).

It's weird how long it's taking for people to rediscover why strong static types were a good idea.

replies(10): >>27641187 #>>27641516 #>>27641651 #>>27641837 #>>27641858 #>>27641960 #>>27642032 #>>27643060 #>>27644651 #>>27657615 #
1. lukashrb ◴[] No.27643060[source]
For what it's worth: people don't use dynamic languages because they don't know better or have never used a static language. To better understand what dynamic languages bring to the table, here are some disadvantages of static types to consider:

Static types are awesome for local reasoning, but they are not that helpful in the context of the larger system (this already starts at the database, see idempotency mismatch).

Code with static types is sometimes larger and more complex than the problem it's trying to solve.

They tightly couple data to a type system, which can introduce incidental complexity.

> (I'm still waiting for pattern matching + algebraic data types)

This is a good example: if you pattern match on a specific structure (e.g. the position of fields in your algebraic data type), you tightly couple your program to that particular structure. If the structure changes, you may have to change all the code that pattern matches on it.

replies(4): >>27643241 #>>27643284 #>>27646280 #>>27648828 #
2. tikhonj ◴[] No.27643241[source]
The example with pattern matching doesn't have anything to do with static types. You'll have exactly the same problem if you pattern match against positional arguments in Python:

    match event.get():
        case Click((x, y)):
            handle_click_at(x, y)
(Example from PEP 636[1].)

In both Python and statically typed languages you can avoid this by matching against field names rather than positions, or by using some other interface to access data. This is an important design aspect to consider when writing code, but it does not have anything to do with dynamic typing. The only difference static typing makes is that when you do change the type in a way that breaks existing patterns, you find out statically rather than needing failing tests or runtime errors.

The same is true for the rest of the things you've mentioned: none are specific to static typing! My experience with a lot of Haskell, Python, JavaScript and other languages is that Haskell code for the same task tends to be shorter and simpler, albeit by relying on a set of higher-level abstractions you have to learn. I don't think much of that would change for a hypothetical dynamically typed variant of Haskell either!

[1]: https://www.python.org/dev/peps/pep-0636/#matching-sequences

replies(1): >>27644451 #
3. giovannibonetti ◴[] No.27643284[source]
When you said "idempotency mismatch" you were meaning impedance mismatch, right?
replies(2): >>27643321 #>>27643452 #
4. tome ◴[] No.27643321[source]
Strange if so because it's the "Object-relational impedance mismatch" not the "Static type-relational impedance mismatch".

https://en.wikipedia.org/wiki/Object%E2%80%93relational_impe...

5. lukashrb ◴[] No.27643452[source]
You are right! Thank you for correcting me.
6. lukashrb ◴[] No.27644451[source]
You're absolutely right. I guess I mentioned pattern matching in particular because of the cited sentence from OP "I'm still waiting for pattern matching + algebraic data types".

> The same is true for the rest of the things you've mentioned: none are specific to static typing!

Sure, I could be wrong here. I frequently am. But could you point out why you think that?

replies(1): >>27649355 #
7. lolinder ◴[] No.27646280[source]
This argument is common, but I've never understood how a dynamically typed language is supposed to avoid coupling algorithms to data structures.

When using a data structure, I know what set of fields I expect it to have. In TypeScript, I can ask the compiler to check that my function's callers always provide data that meets my expectations. In JavaScript, I can check for these expectations at runtime or just let my function have undefined behavior.

Either way, if my function's assumptions about the data's shape don't turn out to be correct, it will break, whether or not I use a dynamic language.

It seems that most of the people who make this argument against static typing are actually arguing against violations of the Robustness Principle[0]: "be conservative in what you send, be liberal in what you accept".

A statically typed function that is as generous as possible should be no more brittle against outside change than an equally-generous dynamically typed function. The main difference is that the statically typed function is explicit about what inputs it has well-defined behavior for.
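A small sketch of what "as generous as possible" might mean, using Python's structural `typing.Protocol` as a stand-in for the TypeScript interface one would probably write here. `HasName`, `greet`, `User` and `Pet` are all invented for the example, and the annotations are only checked if you run a type checker such as mypy.

```python
from dataclasses import dataclass
from typing import Protocol


class HasName(Protocol):
    """The only thing greet() relies on: a `name` attribute."""
    name: str


def greet(who: HasName) -> str:
    # Accepts any object with a str `name`, regardless of its other fields.
    return f"Hello, {who.name}"


@dataclass
class User:
    name: str
    email: str


@dataclass
class Pet:
    name: str
    species: str
```

`greet` accepts both `User` and `Pet` without either declaring that it implements `HasName`. Adding or removing unrelated fields on the callers' side does not affect it, and the one assumption it does make is written down where a checker can see it.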

[0] https://en.wikipedia.org/wiki/Robustness_principle

replies(1): >>27651575 #
8. healsjnr1 ◴[] No.27648828[source]
In my own anecdotal experience, I think it comes down to what the product you are working on needs, and how your team works.

Recently, I spent 3 years on Scala then switched jobs and spent 3 years in Ruby.

It was a shock to go back to dynamic languages, but after 3 months, I honestly couldn't tell which felt more productive or led to more stable high quality product.

In Ruby, we had all the issues people point out about dynamic languages, but the product didn't lean heavily on complex data structures or algorithms. We embraced complexity and failure and got good processes, designs and practices to deal with this.

In Scala, we had more rigour, but I also know I spent a lot of time on type design. Once things were sorted there was a lot of confidence in it, but generally, it took a lot longer to get there.

For certain systems that is absolutely worth it, for others (and in my case) it did feel like the evolution of the product meant this effort never really paid off.

9. tikhonj ◴[] No.27649355{3}[source]
Seems like static typing is neither necessary nor sufficient to cause the particular problems you mentioned.

Static types can absolutely help with more than local reasoning. One of the main reasons I like static types is that they give me a place in the code to reflect aspects of the architecture that otherwise go implicit. Databases are a great example: in both static and dynamic languages, the behavior of your code is going to depend on the database schemas you're using, and static types give you a mechanism to declare and manage this dependency explicitly. In a general sense, the key in architecture is how a system is split up into parts and what interfaces the parts have, and a static type system is a way to make those interfaces first-class citizens of a language subject to abstraction and automation.

Code with static types might be larger than the problem it's trying to solve, but so might code with dynamic types—that's more a function of the sort of abstractions and complexity that go into the code. I've written some code to do very similar things in Haskell and Python, and my experience is that the Haskell code tends to be noticeably more compact than the Python code, even though I make liberal use of types and write out all the top-level type signatures. While some of this comes down to other differences between the languages, part of Haskell's expressiveness absolutely comes down to static types—static types make it easier to write very general code against high-level abstractions (eg monads) that would need to be narrower in dynamic languages (eg promises).

And sure, you can couple code and data in static languages, but at least the types will help you when it comes time to change. I've worked with dynamically typed programs where functions rely on some mixture of nested tuples, hashmaps, lists... etc, and it's hard to understand exactly how concepts are represented in code, much less change that representation. If you represent some central notion in your code as some nested primitive type (a hashmap of tuples of ...) and you want to change that representation, you'll still have to update the places it's used in the code, but without a static type system you won't get a list of those places, and you won't have assurance that you updated everywhere that mattered. I'm not sure I'm explaining this pattern well, but I've worked in JavaScript codebases where making breaking changes to core data components was basically impossible.
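To illustrate the "hashmap of tuples" pattern, here is a hedged Python sketch with made-up inventory data. `inventory_raw`, `StockLine` and `total_value_cents` are hypothetical names, not taken from any real codebase.

```python
from dataclasses import dataclass

# Nested-primitive version: the meaning of each tuple slot lives only in
# this comment. Maps SKU -> (quantity, unit price in cents).
inventory_raw: dict[str, tuple[int, int]] = {"widget": (3, 250)}


@dataclass(frozen=True)
class StockLine:
    """Named version of the same data."""
    quantity: int
    unit_price_cents: int


inventory: dict[str, StockLine] = {
    sku: StockLine(quantity=qty, unit_price_cents=price)
    for sku, (qty, price) in inventory_raw.items()
}


def total_value_cents(stock: dict[str, StockLine]) -> int:
    # Every use site names the field it depends on.
    return sum(line.quantity * line.unit_price_cents for line in stock.values())
```

With the named type, renaming or removing `unit_price_cents` gives a type checker (or even a plain text search) something to report at each use site. With the bare tuple, a reordered slot still runs and silently computes the wrong number.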

Point being, all the problems you mention come up in both static and dynamic languages. There might be a case that they're more common in one or the other, but it's not obvious either way, and it's going to depend a lot on which specific static type system you're thinking of.

10. throwaway346434 ◴[] No.27651575[source]
If you are mostly working with what are basically strings, as a lot of user input from web forms is, the advantage of having many different strict String types isn't huge. In these scenarios, basic types get you a long way, because there is often very little business logic: route here, render that template, set these attributes.

Even with a lot of JSON or XML parsing, you throw it into a parser and take what you need; if an unrelated field isn't what you expected, you just move on rather than stop everything because the library author forgot about extension or openness possibilities (i.e. an xs:any in a schema).
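A minimal sketch of that "take what you need" style in Python, assuming a hypothetical order payload with `id` and `quantity` keys. `Order` and `parse_order` are invented for the example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    quantity: int


def parse_order(payload: str) -> Order:
    data = json.loads(payload)
    # Read only the fields this code depends on. Any other keys, whatever
    # their shape, are ignored rather than rejected.
    return Order(order_id=str(data["id"]), quantity=int(data["quantity"]))
```

An unexpected extra key such as `"note"` does not stop processing, while a missing `id` or a non-numeric `quantity` still fails loudly with `KeyError` or `ValueError`. The result is also a typed value, so being lenient about unrelated fields and being precise about the ones you use are not mutually exclusive.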

This attitude comes not from an assumption that types are unhelpful, but from the belief that we are unlikely to have modelled every outcome in our view of the world and gotten it right.

When you get to complex systems with state changes to data and strict, well-defined policies, rules engines, etc.? That's where dynamic languages often start adding a validation layer, to assert that, for this part, the code should act more like a type system, because it matters: it probably carries financial, security, or other risks.