Parse, Don't Validate (2019)

(lexi-lambda.github.io)
389 points melse | 27 comments
seanwilson ◴[] No.27640953[source]
From the Twitter link:

> IME, people in dynamic languages almost never program this way, though—they prefer to use validation and some form of shotgun parsing. My guess as to why? Writing that kind of code in dynamically-typed languages is often a lot more boilerplate than it is in statically-typed ones!

I feel that once you've got experience working in (usually functional) programming languages with strong static type checking, flakey dynamic code that relies on runtime checks and just being careful to avoid runtime errors makes your skin crawl, and you'll intuitively gravitate towards designs that take advantage of strong static type checks.

When all you know is dynamic languages, the design guidance you get from strong static type checking is lost so there are more bad design paths you can go down. Patching up flakey code with ad-hoc runtime checks and debugging runtime errors becomes the norm because you just don't know any better and the type system isn't going to teach you.

More general advice would be "prefer strong static type checking over runtime checks" as it makes a lot of design and robustness problems go away.

Even if you can't use e.g. Haskell or OCaml in your daily work, a few weeks or even just a few days of trying to learn them will open your eyes and make you a better coder elsewhere. Map/filter/reduce, immutable data structures, non-nullable types etc., for example, were in other languages for over 30 years before these ideas became mainstream best practices (I'm still waiting for pattern matching + algebraic data types).

It's weird how long it's taking for people to rediscover why strong static types were a good idea.

replies(10): >>27641187 #>>27641516 #>>27641651 #>>27641837 #>>27641858 #>>27641960 #>>27642032 #>>27643060 #>>27644651 #>>27657615 #
ukj ◴[] No.27641651[source]
Every programming paradigm is a good idea if the respective trade-offs are acceptable to you.

For example, one good reason why strong static types are a bad idea... they prevent you from implementing dynamic dispatch.

Routers. You can't have routers.

replies(3): >>27641741 #>>27642043 #>>27642764 #
1. ImprobableTruth ◴[] No.27642043[source]
Huh? Could you give a specific example? Because e.g. C++ and Rust definitely have dynamic dispatch through their vtable mechanisms.
replies(1): >>27642108 #
2. ukj ◴[] No.27642108[source]
Do you understand the difference between compile time and runtime?

Neither C++ nor Rust gives you static type safety AND dynamic dispatch because all of the safety checks for C++ and Rust happen at compile time. Not runtime.

replies(4): >>27642140 #>>27642147 #>>27642150 #>>27642761 #
3. detaro ◴[] No.27642140[source]
> Neither C++ nor Rust have dynamic dispatch

You appear to be using some other definition of dynamic dispatch than the rest of the software industry...

replies(1): >>27642184 #
4. dkersten ◴[] No.27642147[source]
Dynamic languages do it at runtime too, JUST LIKE rust and C++ do. What's the difference?

C++ and Rust let you have compile-time safety, until you choose to give it up and have runtime checks instead. Dynamic languages only allow the latter. Static languages let you choose; dynamic languages choose the latter for you in all cases. Both can have dynamic dispatch.

Besides, static languages can have compile-time type safe dynamic dispatch, if you constrain the dispatch to compile-time-known types (eg std::variant). You only lose that if you want fully unconstrained dynamism, in which case you defer type checking to runtime. Which is what dynamic languages always have.

So both C++ and Rust DO have dynamic dispatch and the programmer gets to choose what level of the dynamism/safety trade off they want. And yes, these features ARE first class features of the languages.

replies(1): >>27642207 #
5. jsnell ◴[] No.27642150[source]
I think you might need to define what you mean by dynamic dispatch, because it is very clearly something totally different than how the term is commonly understood.
replies(1): >>27643462 #
6. ukj ◴[] No.27642184{3}[source]
You appear to be conflating compilers with runtimes.

Dynamic dispatch happens at runtime.

C++ and Rust are compile-time tools, not runtimes.

replies(2): >>27642649 #>>27643030 #
7. ukj ◴[] No.27642207{3}[source]
>until you choose to give it up

PRECISELY

You have to give up the safety to get the feature.

So you "want type-safety". Until you don't.

>static languages can have compile-time type safe dynamic dispatch

"Compile-time dynamic dispatch" is an oxymoron. Dynamic dispatch happens at runtime.

replies(1): >>27642351 #
8. ◴[] No.27642351{4}[source]
9. BoiledCabbage ◴[] No.27642649{4}[source]
You are wrong. C++ supports dynamic dispatch.

Please read about it on Wikipedia

https://en.m.wikipedia.org/wiki/Dynamic_dispatch

And for the future to not litter HN with comments like these, next time 10 different people in thread are all explaining to you why you're mistaken, take a moment to try to listen and think through what they're saying instead of just digging deeper.

Having an open mind to learning something new, not just arguing a point is a great approach to life.

replies(1): >>27642717 #
10. lexi-lambda ◴[] No.27642761[source]
This is sort of a perplexing perspective to me. It seems tantamount to saying “you can’t predict whether a value will be a string or a number AND have static type safety because the value only exists at runtime, and static type safety only happens at compile-time.” Yes, obviously static typechecking happens at compile-time, but type systems are carefully designed so that the compile-time reasoning says something useful about what actually occurs at runtime—that is, after all, the whole point!

Focusing exclusively on what happens at compile-time is to miss the whole reason static type systems are useful in the first place: they allow compile-time reasoning about runtime behavior. Just as we can use a static type system to make predictions about programs that pass around first-class functions via static dispatch, we can also use them to make predictions about programs that use vtables or other constructions to perform dynamic dispatch. (Note that the difference between those two things isn’t even particularly well-defined; a call to a first-class function passed as an argument is a form of unknown call, and it is arguably a form of dynamic dispatch.)
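
To illustrate the parenthetical about first-class functions, here is a minimal Java sketch. The names `UnknownCallDemo` and `applyTwice` are made up for this example; `IntUnaryOperator` is the standard `java.util.function` interface. The compiler checks that `f` takes and returns an `int`, yet which function body runs is only settled at runtime:

```java
import java.util.function.IntUnaryOperator;

class UnknownCallDemo {
    // Statically checked: f must map int to int.
    // Not statically known: which function the caller will pass in.
    static int applyTwice(IntUnaryOperator f, int x) {
        return f.applyAsInt(f.applyAsInt(x));
    }

    public static void main(String[] args) {
        IntUnaryOperator addOne = n -> n + 1;
        IntUnaryOperator triple = n -> n * 3;
        // The choice depends on runtime input (here, the argument count).
        IntUnaryOperator chosen = args.length > 0 ? addOne : triple;
        System.out.println(applyTwice(chosen, 5));
    }
}
```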

Lots of statically typed languages provide dynamic dispatch. In fact, essentially all mainstream ones do: C++, Java, C#, Rust, TypeScript, even modern Fortran. None of these implementations require sacrificing static type safety in any way; rather, type systems are designed to ensure such dispatch sites are well-formed in other ways, without restricting their dynamic nature. And this is entirely in line with the OP, as there is no tension whatsoever between the techniques it describes and dynamic dispatch.

replies(1): >>27642990 #
11. ukj ◴[] No.27642990{3}[source]
You must be strawmanning my position to make this comment.

Obviously static type systems are useful. I don't even think my point is contrary to anything you are saying. This is not being said as a way of undermining any particular paradigm because computation is universal - the models of computation on the other hand (programming languages) are not all “the same”. There are qualitative differences.

Every single programming paradigm is a self-imposed restriction of some sort. It is precisely these restrictions that we deem useful, because they prevent us from shooting off our feet with shotguns. They also prevent us from expressing certain patterns (of course, we can deliberately/explicitly turn off the self-imposed restriction!).

For instance, the restriction you are imposing on yourself is explicit in "type systems are carefully designed so that the compile-time reasoning says something useful about what actually occurs at runtime"

If you could completely determine everything that happens at runtime you wouldn't need exception/error handling!

All software would be 100% deterministic.

And it isn't.

I can say nothing of the structure of random bitstreams from unknown sources. I only know what I EXPECT them to be. Not what they actually are.

In this context parsing untrusted data IS runtime type-checking.
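
A rough sketch of what that looks like in a statically typed language, in the spirit of the linked article: the runtime check happens once, at the boundary, and its result is recorded in a type. `Port`, `parse`, and `ParseDemo` are hypothetical names for illustration, not taken from the article.

```java
import java.util.Optional;

class ParseDemo {
    public static void main(String[] args) {
        System.out.println(Port.parse("8080").isPresent());
        System.out.println(Port.parse("not a port").isPresent());
    }
}

// Hypothetical type: a Port can only be obtained through parse(),
// so any code holding a Port knows the range check already passed.
final class Port {
    private final int value;

    private Port(int value) {
        this.value = value;
    }

    int value() {
        return value;
    }

    // The runtime check on untrusted input; failure is an empty Optional.
    static Optional<Port> parse(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            int n = Integer.parseInt(raw.trim());
            return (n >= 1 && n <= 65535) ? Optional.of(new Port(n)) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }
}
```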

12. detaro ◴[] No.27643030{4}[source]
And the compiler generates the code necessary for dynamic dispatch to happen at runtime.
replies(1): >>27643438 #
13. ukj ◴[] No.27643438{5}[source]
But it doesn’t static-type-check that particular code-path.

Because it can’t.

14. ukj ◴[] No.27643462{3}[source]
Deciding which implementation of a function handles any given piece of data at runtime.

Trivially, because you don’t have this knowledge (and therefore you can’t encode it into your type system) at compile time.

replies(1): >>27644291 #
15. Jtsummers ◴[] No.27643603{6}[source]
C++ is not a compiler. C++ is a language with a specification from which people derive compilers and standard libraries and runtimes.

C++ the language very much does tell you what to expect at runtime, though perhaps not everything you could ever want. I mean, it's not Haskell or Idris with their much richer type systems.

replies(2): >>27643772 #>>27647623 #
16. ukj ◴[] No.27643772{7}[source]
Perfect!

Please produce a piece of code (in a language such as Coq or Agda) which proves whether any given piece of random data has the type “C++ compiler” or “C++ program”.

That is the epitome of static type-checking, right?

17. justinpombrio ◴[] No.27644291{4}[source]
Aha! I think I have debugged your thinking. Wow you made that hard by arguing so much.

Apparently you do know what dynamic dispatch is, you're just wrong that it can't be type checked.

In Java, say you have an interface called `Foo` with a method `String foo()`, and two classes A and B that implement that method. Then you can write this code (apologies if the syntax isn't quite right, it's been a while since I've written Java):

    Foo foo = null;
    if (random_boolean()) {
        foo = new A();
    } else {
        foo = new B();
    }
    // This uses dynamic dispatch
    System.out.println(foo.foo());

This uses dynamic dispatch, but it is statically type checked. If you change A's `foo()` method to return an integer instead of a String, while still declaring that A implements the Foo interface, you will get a type error, at compile time.
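
For anyone who wants to run it, here is a self-contained version of the same idea as a sketch. The method bodies of `A` and `B`, the `pick` helper, and the `DispatchDemo` class are assumptions added for illustration; `java.util.Random` stands in for `random_boolean()`.

```java
import java.util.Random;

class DispatchDemo {
    // Returns one of the two implementations; the caller only sees Foo.
    static Foo pick(boolean useA) {
        return useA ? new A() : new B();
    }

    public static void main(String[] args) {
        Foo foo = pick(new Random().nextBoolean());
        // Dynamic dispatch: the method body is selected from the runtime class of foo.
        System.out.println(foo.foo());
    }
}

interface Foo {
    String foo();
}

class A implements Foo {
    public String foo() {
        return "A";
    }
}

class B implements Foo {
    public String foo() {
        return "B";
    }
}
```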
replies(1): >>27644679 #
18. ukj ◴[] No.27644679{5}[source]
So, there is nothing dynamic about that dispatch.

Because the implementation details of Foo are actually known at compile time. Which is why you are able to type-check it.

You have literally declared all allowed (but not all possible) implementations of Foo.

What happens when Foo() is a remote/network call?

replies(2): >>27644720 #>>27644901 #
19. detaro ◴[] No.27644720{6}[source]
so you are using a different definition of dynamic dispatch than the rest of the software industry.
replies(1): >>27644755 #
20. ukj ◴[] No.27644755{7}[source]
I am using a conception (NOT a definition) that is actually dynamic.

If you can type-check your dispatcher at compile time then there is nothing dynamic about it.

Decidable (ahead of time) means your function is fully determined. Something that is fully determined is not dynamic.

It is the conception computer scientists use.

21. justinpombrio ◴[] No.27644901{6}[source]
That is not what dynamic dispatch means! It is an extremely well established term, with a very clear meaning, and that is not what it means.

I thought you were just mistaken about something, but no, instead you've redefined a well understood term without telling anyone, then aggressively refused to clarify what you meant by it and argued for hours with people, while saying they were all wrong when they used the well established term to mean its well established meaning.

The thing you're talking about is an interesting concept, but it's not called dynamic dispatch, and you will confuse everyone you talk to if you call it that. I don't know if there's a term for it.

replies(2): >>27644974 #>>27648858 #
22. ukj ◴[] No.27644974{7}[source]
“Well established” doesn’t mean anything.

According to who?

Computer scientists talk about “well formed” not “well established”.

Those are categorical definitions.

replies(1): >>27645075 #
23. justinpombrio ◴[] No.27645075{8}[source]
> According to who?

Wikipedia, every textbook you can find, the top dozen search results for "dynamic dispatch", me who has a PhD in computer science plus all the other CS PhD people I know, everyone in my office who knows the term (who are industry people, not academia people), every blog post I have ever read that uses the term, and all the other HN commenters except you. I'm really not exaggerating; a lot of CS terms have disputed meanings but not this one.

EDIT: Sorry all for engaging the troll. I thought there might have been some legitimate confusion. My bad.

replies(2): >>27645217 #>>27647108 #
24. ukj ◴[] No.27645217{9}[source]
So which textbook contains the meaning of "meaning"?

Oh, that's recursive! Which is Computer Science's domain of expertise, not the public domain.

We are talking about formal semantics here. What do programs (and computer languages are themselves programs) mean?

Point 0 of Wadler's law.

https://en.wikipedia.org/wiki/Semantics_(computer_science)

If you can type-check it at compile time then it is NOT dynamic dispatch. It's a contextual confusion.

25. ukj ◴[] No.27647108{9}[source]
There is a legitimate confusion indeed and it seems to be on your part!

What I am talking about when I say "dynamic dispatch" is the sort of dispatching done by R, LISP and Julia (and not by C++ or Java). Now, we can bicker about labels and you can insist that it's actually called "multiple dispatch" and not "dynamic dispatch", but you can't bicker about the semantic fact that "multiple dispatch" is actually more dynamic than "dynamic dispatch".
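
For what it's worth, part of this difference can be seen inside Java itself. In the sketch below (class and method names are invented for illustration), Java picks among the `describe` overloads using the static type of the argument, at compile time, so the runtime type of an argument does not take part in the choice the way it would under multiple dispatch:

```java
class OverloadDemo {
    static String describe(Object o) {
        return "Object";
    }

    static String describe(String s) {
        return "String";
    }

    // The parameter's static type is Object, so the compiler binds this
    // call to describe(Object), even when a String is passed at runtime.
    static String viaStaticType(Object o) {
        return describe(o);
    }

    public static void main(String[] args) {
        System.out.println(viaStaticType("hello")); // prints "Object"
        System.out.println(describe("hello"));      // prints "String"
    }
}
```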

And this semantic point would've been trivial to unpack if you weren't trying to win an argument.

Indeed, sorry for engaging the trolls. 20 or 30 of you. Lost count.

26. ukj ◴[] No.27647623{7}[source]
Hah! No wonder you don't grok my perspective.

If you derive a C++ compiler that accepts file A as valid C++, and I derive a C++ compiler that rejects file A as valid C++, then from the lens of type theory the two compilers have different type-signatures!

They are not the same formal language. One or both compilers implement a language that is not C++.

27. ukj ◴[] No.27648858{7}[source]
It turns out "dynamic dispatch" is not as well established or as clear as you insist. It means at least one of five things:

0. Dynamic dispatch (as you are using it)
1. Double dispatch
2. Multiple dispatch
3. Predicate dispatch
4. All of the above collectively.

What you thought was a confusing re-definition on my part might have been an ignorance of alternative uses on yours ;)

Java doesn't do 1, 2 or 3.
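
As a sketch of item 1: Java has no built-in double dispatch, but it is commonly emulated by chaining two single dispatches, and the result still type-checks at compile time. The `Shape`, `Circle`, and `Square` types below are hypothetical, made up for this example:

```java
class DoubleDispatchDemo {
    public static void main(String[] args) {
        Shape a = new Circle();
        Shape b = new Square();
        System.out.println(a.collideWith(b)); // prints "circle-square"
    }
}

interface Shape {
    String collideWith(Shape other);     // first dispatch: on the receiver
    String collideWithCircle(Circle c);  // second dispatch: on the other shape
    String collideWithSquare(Square s);
}

class Circle implements Shape {
    public String collideWith(Shape other) {
        return other.collideWithCircle(this);
    }

    public String collideWithCircle(Circle c) {
        return "circle-circle";
    }

    public String collideWithSquare(Square s) {
        return "square-circle";
    }
}

class Square implements Shape {
    public String collideWith(Shape other) {
        return other.collideWithSquare(this);
    }

    public String collideWithCircle(Circle c) {
        return "circle-square";
    }

    public String collideWithSquare(Square s) {
        return "square-square";
    }
}
```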