I’d argue that the more experience you get the more you write code for other people which involves adding lots of tooling, tests, etc. Even if the code works the first time, a more senior dev will make sure others have a “pit of success” they can fall into. This involves a lot more than just some “unit tests as an afterthought to keep the coverage up.”
Keeping the code simple, finding the right abstractions, and untangling coupling get the most bang for the buck. See the “beyond pep8” talk for an enlightened perspective.
That said, lightweight testing and tools like pyflakes to prevent egregious errors help an experienced dev write very productively. Typing helps the most with large, venerable projects with numerous devs of differing experience levels.
I hated working with those coders because they weren't really very good and their code was always the worst to maintain. They are the equivalent of a carpenter who brags about how quickly they can bang nails but can't build a stable structure to save their life.
It immediately tells me that they've never worked on large software projects, and if they have they haven't worked on ones that lasted more than a few months.
I apologize to folks reading this for my rather aggressive tone, but I've been writing software for a long time in numerous languages, and people with the unit-tests-as-an-afterthought attitude are typically rather arrogant and foolhardy.
The most recent incarnation I've encountered is the hotshot data scientist who did okay in a few Kaggle competitions using Jupyter notebooks, and thinks they can just write software the way they did for the competitions with no test of any kind.
I had one of these on my team recently and naturally I had to do 95% of the work to turn anything he produced into a remotely decent product. I couldn't even get the guy to use nbdev, which would have allowed him to use Jupyter to write tested, documented, maintainable code.
The best argument I've heard for doing type annotation is for documentation purposes to help future devs. But I don't completely buy this either. I touch new codebases all the time and I rarely spend much time thinking about what types will be passed. I can only assume it comes with experience.
Type annotation actually ends up taking a hell of a long time to do and is of questionable benefit if some of the codebase is not annotated. People sometimes spend hours just trying to get the type checker to say OK for code that actually works just fine!
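For what it's worth, the documentation argument being debated usually amounts to something like this minimal sketch (the `group_by` helper is hypothetical, just for illustration): the annotated signature tells a future dev what to pass and what comes back, without reading the body or hunting through call sites.

```python
from collections import defaultdict

# Hypothetical helper: the signature alone documents the contract —
# a list of string-keyed records in, a mapping from key value to records out.
def group_by(records: list[dict[str, str]], key: str) -> dict[str, list[dict[str, str]]]:
    groups: dict[str, list[dict[str, str]]] = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)
```

Whether that signature is worth the time spent appeasing the checker is exactly the trade-off in question.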
The development process is totally different when you write structured types first and then write your logic. 10/10 would recommend.
Usual caveat: this is what makes sense to me and my brain. Your experience may be different based on neurotype.
I disagree. I've started using types from the ground up and it helps almost equally at every stage of the game. Also I aggressively rely on autocomplete for methods. It's faster this way than usual "dynamic" or "pythonic" python.
Part of it might be exactly because writing my datatypes first helps me think about the right abstractions.
The big win with python is maybe 2-10% of functions, I just want to punt and use a dict. But I have shifted >80% of what used to be dicts to Models/dataclasses and it's so much faster to write and easier to debug.
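The dict-to-dataclass shift described above can be sketched in a few lines (the `User` record here is a made-up example): with a raw dict a misspelled key silently creates a new entry, while a dataclass gives you attribute autocomplete and a field set that a type checker can verify.

```python
from dataclasses import dataclass

# Raw dict: a typo in the key silently creates a new entry — the bug
# hides until something reads the "real" key later.
user = {"name": "Ada", "active": True}
user["actve"] = False  # oops: new key, original "active" untouched

# Dataclass: fields are declared once, autocomplete works, and a
# misspelled attribute is flagged by mypy/pyright instead of hiding.
@dataclass
class User:
    name: str
    active: bool

u = User(name="Ada", active=True)
u.active = False       # real field, checkable and discoverable
# u.actve = False      # a type checker rejects this misspelling
```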
On certain backend code where I'm able to write unit tests, they do catch the occasional edge-case logic error, but not at a rate that makes me concerned about only checking them in some time after the original code, which I'll already have tested myself in real use as I went along.
Unless you were writing very small throwaway scripts, in what world were you writing your logic first and thinking about your data structures later?
Also, what makes you think I’m not aware of datatypes? Currently working eight hours a day on Django models.
In short, there are choices besides, “I alone have to do all the hard work.”
I was more experienced with predictive algorithms and deep learning than any of the data scientists at the company. But because they were brought in from an acquisition of a company with an undeserved reputation, thanks to a loose affiliation with MIT, they were treated like magicians while the rest of us were treated like blacksmiths.
I had the choice and I made the choice to leave. And of course I raised hell with the bosses about them not writing remotely production quality code that required extensive refactoring.
And yes I was paid to do the work but the work occupied time that I could have spent working on the other projects I had that were more commercially successful but less sexy to Silicon Valley VCs who look at valuations based on other companies' newest hottest product.
Tests should prove a desired behaviour. Sometimes it's not possible to fully run code until late in some staging environment, just because there are a lot of dependencies and a lot of complexity. That's what tests are for (at various levels of abstraction).
I was specifically responding to the commenter I replied to, who said they didn't need tests because their code just worked the first time they wrote it.
You write automated tests so that you can keep running the tests later such that the behaviour is maintained through refactor and non-breaking changes.
Defining a data structure up front doesn't require a lot of boilerplate, as Java has incorrectly taught all of us. Writing a statically typed typing.NamedTuple or @dataclass is literally a one-liner.
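Concretely, both forms mentioned above look like this (the `Point` and `Span` types are arbitrary examples): a class header plus one annotated line per field, with no constructors, getters, or equality methods to write by hand.

```python
from dataclasses import dataclass
from typing import NamedTuple

# Immutable record with tuple behaviour — one annotated line per field.
class Point(NamedTuple):
    x: float
    y: float

# Mutable record with generated __init__, __repr__, and __eq__.
@dataclass
class Span:
    start: int
    end: int

p = Point(1.0, 2.0)
s = Span(start=3, end=7)
```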
I don't deny it. Join the cult of Static Python. We have cookies! And lower stress levels!
I usually wrap that spiel with my caveat "this depends greatly on your neurotype, style, environment, and other things." I have ADHD and my brain struggles with keeping bits of state in memory, so having to remember the type of every variable without my IDE tracking it for me is a huge performance drain.
However, I would contend even if your neurotype supported that mental workflow... it isn't actually better. Humans on average can handle 7 +/- 2 "pieces" of information in focus. Why spend any of your precious half-dozen pieces of salient consciousness on something a machine is really good at doing?
It's not zero consideration of data structures, it's mostly a focus on the main data type (arrays and data frames) and not really thinking about typed records, data models and such. The majority of types are float, str, dict, np.ndarray, pd.DataFrame. No dataclasses, minimal classes, and when classes are used, it's Java101 style "all the bad parts of OOP" programming. Sadly, I've spent years in this space before learning better.
Opinion: This is actually a symptom of what is (imho) a pervasive problem lodged deep in the collective consciousness of software dev: OOP with fine-grained objects. I blame (early) Java in large part for exacerbating this mentality, encapsulation of state with mutator methods in particular. It sprays state all over the application, encourages mutation in place over immutability, coupling, and validating-not-parsing, and makes it well-nigh impossible to write good tests.
It's really hard to write objects that enforce all invariants under every mutation. And when you have state strewn everywhere, it's impossible to test every nook and cranny. The combinatorial space explodes.
Objects are helpful for encapsulating state when they are coarse-grained, mutations are atomic, coupling occurs in one place, state changes are auditable, and the entire state can be replayed/set at once, to enable mock tests and subsystem integration tests. AKA, things like databases, reactors, and persistent data structures.
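One minimal sketch of the atomic, auditable style described above, using a frozen dataclass (the `Account` type and its invariant are invented for illustration): the invariant is checked once at construction, and "mutation" returns a new value, so every state change is an explicit, replayable step rather than hidden in-place churn.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Account:
    owner: str
    balance: int  # cents; invariant: never negative

    def __post_init__(self):
        # Invariant enforced at the single point where state is created.
        if self.balance < 0:
            raise ValueError("balance must be non-negative")

    def deposit(self, amount: int) -> "Account":
        # Atomic "mutation": produce a new, re-validated value.
        return replace(self, balance=self.balance + amount)

a = Account("Ada", 100)
b = a.deposit(50)   # a is untouched; the change is an auditable new value
```

Because each state is an immutable value, a test can construct any state directly and replay a sequence of changes without mocking internals.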
Also since the tools are immature and bolted on afterward in Python, I think it's even a bit worse than it would be in something decent like C#.