1455 points nromiun | 107 comments
1. exclipy ◴[] No.45077894[source]
This was my main takeaway from A Philosophy Of Software Design by John Ousterhout. It is the best book on this subject and I recommend it to every software developer.

Basically, you should aim to minimise complexity in software design, but importantly, complexity is defined as "how difficult is it to make changes to it". "How difficult" is largely determined by the amount of cognitive load necessary to understand it.

replies(11): >>45077906 #>>45077954 #>>45078135 #>>45078497 #>>45078728 #>>45078760 #>>45078826 #>>45078970 #>>45079961 #>>45080019 #>>45082718 #
2. zakirullin ◴[] No.45077906[source]
That's the best book on the topic! The article was inspired by this exact book. And John is a very good person; we discussed a thing or two about the article.
replies(1): >>45077958 #
3. bsenftner ◴[] No.45077954[source]
Which is why I consider DRY (Don't Repeat Yourself) to be an anti-rule until an application is fairly well understood and multiple versions exist. DO repeat yourself, and do not create some smart version of what you think the problem is before you're attempting the 3rd version. Version 1 is how you figure out the problem space, version 2 is how you figure out your solution as a maintainable dynamic thing within a changing tech landscape, and version 3 is when DRY is looked at for the first time for that application.
replies(5): >>45078178 #>>45078299 #>>45078606 #>>45078696 #>>45079410 #
4. exclipy ◴[] No.45077958[source]
Oh! I was surprised you didn't link or mention the book
replies(1): >>45077969 #
5. zakirullin ◴[] No.45077969{3}[source]
It is mentioned/quoted in Deep Modules section: https://github.com/zakirullin/cognitive-load?tab=readme-ov-f...

Maybe I should make it more visible.

6. hinkley ◴[] No.45078135[source]
It’s a pain in the ass to source a copy of this book without giving Jeff Bezos all the money. If anyone reading this thread knows John, could you bring this to his attention?

I even tried calling the bookstore on his campus and they said try back at the beginning of a semester, they didn’t have any copies.

My local book store could not source me a copy, and neither IIRC could Powell’s.

replies(2): >>45078758 #>>45081283 #
7. hinkley ◴[] No.45078178[source]
Some people use a gardening metaphor for code, and I think that since code is from and for humans, that’s not a terrible analogy. It’s organic by origin if not by nature.

When you’re dealing with perennial plants, there’s only so much control you actually have, and there’s a list of things you know you have to do with them but you cannot do them all at once. There is what you need to do now, what you need to do next year, and a theory of what you’ll do over the next five years. And two years into any five year plan, the five year plan has completely changed. You’re hedging your bets.

Traditional Formal English and French gardens try to “master” the plants. Force them to behave to an exacting standard. It’s only possible with both a high degree of skill and a vast pool of labor. They aren’t really about nature, or food. They’re displays of opulence. They are conspicuous consumption. They are keeping up with the Joneses. Some people love that about them. More practical people see it as pretentious bullshit.

I think we all know a few companies that make a bad idea work by sheer force of will and overwhelming resources.

replies(1): >>45081213 #
8. zahlman ◴[] No.45078299[source]
DRY isn't about not reimplementing things; it's about not literally copying and pasting code. Which I have seen all the time, and which some might find easier now but will definitely make the system harder to change (correctly) at some point later on.
replies(10): >>45078465 #>>45078493 #>>45078525 #>>45078789 #>>45078797 #>>45078961 #>>45079164 #>>45079325 #>>45079628 #>>45079966 #
9. nicoburns ◴[] No.45078465{3}[source]
Yeah, I've seen codebases where you have several hundred line components copy-pasted multiple times with say 10-20 lines changed, and you literally have to diff the files to find out why there are several.

This is unhelpful even if the design is a complete mess.

10. ryeats ◴[] No.45078493{3}[source]
This is a trap junior devs fall into. DRY isn't free; it can be premature optimization, since in order to avoid copying code you often add both an abstraction AND couple together components that are logically separate. The issue is that at some point they may have slightly different requirements, and if this is done repeatedly you can get to a point where you have all these small layers of abstraction that are cross-cutting concerns, and making changes has a bigger blast radius than you can easily intuit.
replies(5): >>45078671 #>>45078700 #>>45079551 #>>45080482 #>>45081244 #
11. ◴[] No.45078525{3}[source]
12. martinpw ◴[] No.45078606[source]
Closely related to the Rule of Three - ok to duplicate once, but if it is needed a third time, consider refactoring: https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...

I think it's a pretty good compromise. I have tried in the past not to duplicate code at all, and it often ends up more pain than gain. Allowing copy/paste when code is needed in two different places, but refactoring once it is needed in three or more, is a pretty good rule of thumb.

replies(4): >>45078751 #>>45078801 #>>45080238 #>>45080568 #
13. rkomorn ◴[] No.45078671{4}[source]
The reverse of that is people introducing bugs because code that wasn't DRY enough was only changed in some of the places that needed to be changed instead of all the places.

To me, it's the things that are specifically intended to behave the same that should be kept DRY.

replies(2): >>45079743 #>>45081298 #
14. tialaramex ◴[] No.45078696[source]
I think more than a few people have recommended waiting until the 3rd or 4th X before you say OK, Don't Repeat Yourself we need to factor this out. That's where my rule of thumb is too.

Deliberately going earlier makes sense if experience teaches you there will be 3+ of this eventually, but the point where I'm going to pick "Decline" and write that you need to fix this first is when I see you've repeated something 4-5 times, that's too many, we have machines to do repetition for us, have the machine do it.

An EnableEditor function? OK, meaningful name. EnablePublisher? Hmm, yes I understand the name scheme but I get a bad feeling. EnableCoAuthor? Approved with a stern note to reconsider, are we really never adding more of these, is there really some reason you can't factor this out? EnableAuditor. No. Stop, this function is named Enable and it takes a Role, do not copy-paste and change the names.
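
The end state of that review could be sketched roughly as below. This is only an illustration: the `Role` enum, the `User` class, and the `enable` function are hypothetical names, not taken from any codebase mentioned in the thread.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    EDITOR = "editor"
    PUBLISHER = "publisher"
    CO_AUTHOR = "co_author"
    AUDITOR = "auditor"


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def enable(user: User, role: Role) -> None:
    """Grant `role` to `user`.

    Stands in for enable_editor, enable_publisher, enable_co_author, ...
    """
    user.roles.add(role)


user = User("alice")
enable(user, Role.EDITOR)
enable(user, Role.AUDITOR)
```

Adding a fifth role then means adding one enum member rather than copying a function and changing the names.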

15. zahlman ◴[] No.45078700{4}[source]
If you notice that two parts of the code look similar, but have a good reason not to merge or refactor, that deserves a signpost comment.

If you're copying and pasting something, there probably isn't a good reason for that. (The best common reason I can think of is "the language / framework demands so much boilerplate to reuse this little bit of code that it's a net loss" — which is still a bad feeling.)

If you rewrite something without noticing that you're doing so, something has definitely gone wrong.

If a client's requirements change to the point where you can't accommodate them in the nicely refactored function (or to the point where doing so would create an abomination) — then you can make the separate, similar looking version.

replies(2): >>45078878 #>>45079785 #
16. tverbeure ◴[] No.45078728[source]
Only half joking: I don’t think I trust a book from an author who has inflicted decades of TCL pain on me (and on the entire community of EDA tool users.)
replies(1): >>45079711 #
17. rekrsiv ◴[] No.45078751{3}[source]
On the other hand, just because you know you're going to have to refactor, doesn't mean you should start refactoring once you reach three; you might not yet know the ideal shape for this code until many more duplications.
18. tialaramex ◴[] No.45078758[source]
That sucks. Ordinarily a bookshop should be able to get anything in print, although a weird volume there's no demand for won't arrive fast. Is there some reason it's specific to this book, do you think?
19. YZF ◴[] No.45078760[source]
The problem is no set of rules can replace taste, judgement, experience and intuition. Every rule can be used to argue anything.

You can't win architecture arguments.

I like the article but the people who need it won't understand it and the people who don't need it already know this. As we say, it's not a technical problem, it's always a people and culture problem. Architecture just follows people and culture. If you have Rob Pike and Google you'll get Go. You can't read some book and make Go. (whether you like it or not is a different question).

replies(9): >>45078947 #>>45079055 #>>45079393 #>>45079903 #>>45079931 #>>45079994 #>>45080208 #>>45080993 #>>45083102 #
20. stevage ◴[] No.45078789{3}[source]
Copying and pasting code is often fine, particularly when you make a change to one of the copies.

Over time I have come to prefer having two near copies that are each more concretely expressive of their task than a more abstract version that caters to both.

21. YZF ◴[] No.45078797{3}[source]
But sometimes you should copy and paste code because those different pieces of code can evolve independently. Knowing when to do this and when not to do this is what we do and no rule can blindly say one way or the other.

Even the most obvious of functions like sin() and cos() may in some circumstances warrant a specialized implementation. Sure, for most stuff you should not have 10 copies of those all over the place. But sometimes you might.

DRY is a bad rule. The more appropriate rule is: avoid duplicating code when avoiding it results in something better. I.e. judgement always trumps rules.

replies(1): >>45091977 #
22. stevage ◴[] No.45078801{3}[source]
Agreed, it works pretty well for me.

The hard edge case is when you have a thing that needs to be duplicated along two axes. So now you have two pairs of things, four total. Four simple things or one complex thing.

23. onlinehost ◴[] No.45078826[source]
I bought his book after seeing this talk of his https://youtu.be/bmSAYlu0NcY
24. smallnamespace ◴[] No.45078878{5}[source]
> If you're copying and pasting something, there probably isn't a good reason for that.

I would embrace copying and pasting for functionality that I want to be identical in two places right now, but I’m not sure ought to be identical in the future.

replies(1): >>45082725 #
25. braebo ◴[] No.45078947[source]
I’m accustomed to this principle as a musician, so it’s been interesting to see it withstand my journey into software.
replies(1): >>45079180 #
26. drbojingle ◴[] No.45078961{3}[source]
I completely disagree. Sometimes it makes things harder but not 100% of the time.

Sometimes things are only the same temporarily and shouldn't be brought together.

27. swat535 ◴[] No.45078970[source]
I've long given up on trying to find the perfect solution for Software. I don't think anyone has really "cracked the code" per se. The best we have is people's wisdom and experiences.

Ultimately, context, industries and teams vary so greatly that it doesn't make sense to quantify it.

What I've settled on instead is aiming for a balance between "mess" and "beauty" in my design. The hardest thing for me personally to grasp was that businesses are indeterministic whereas software is not, thus requirements always shift and fitting this into the rigidity of computer systems is _difficult_.

These days, I only attempt to refactor when I start to feel the pain when I'm about to change the code... and even then, I perform the bare minimum to clean up the code. Eventually multiple refactorings shape a new pattern which can be pulled into an abstraction.

replies(1): >>45082332 #
28. lokar ◴[] No.45079055[source]
I found the book helpful as a way to organize and express what I already knew
29. AstroBen ◴[] No.45079164{3}[source]
DRY is about concepts, not characters. Don't have multiple implementations of a concept

If you choose to not copy paste the code you better be damn sure the two places that use it are relying on the same concept, not just superficially similar code that's yet to diverge

30. dlivingston ◴[] No.45079180{3}[source]
Can you expand on this?
replies(1): >>45079886 #
31. ori_b ◴[] No.45079325{3}[source]
Not copy pasting code also makes it harder to change the system correctly at some point later on, because you transformed a local decision ("does this code do what the caller needs?") into a global one ("does this code do what any possible caller needs, including across code maintained by other teams?")

There's no one rule. It takes experience and taste to make good guesses, and you'll often be wrong even so.

replies(1): >>45081240 #
32. zakirullin ◴[] No.45079393[source]
> I like the article but the people who need it won't understand it

That's true. One doesn't change his mindset just after reading. Even after some mentorship the results are far from satisfying. Engineers can completely agree with you on the topic, only to go and do just the opposite.

It seems like the hardest thing to do is to build a feedback loop - "what decisions I made in the past -> what they led to". Usually that loop takes a few years to complete, and most people forget that their architecture decisions led to a disaster. Or they just disassociate themselves.

replies(2): >>45080131 #>>45082180 #
33. cyberax ◴[] No.45079410[source]
DRY means something completely different. It means that there should be just one source of truth.

Example: you have a config defined as Java/Go classes/structures. You want to check that the config file has the correct syntax. Non-DRY strategy is to describe its structure in an XSD schema (ok, ok JSON schema) and then validate the config. So you end up with two sources of truth: the schema and Java/Go classes, they can drift apart and cause problems.

The DRY way is to generate the classes/structures that define the config from that schema.
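
A rough Python analogue of that idea, for illustration only: the schema is declared once, and both the config class and the validation are derived from it. `CONFIG_SCHEMA`, `Config`, and `load_config` are hypothetical names, and a plain dict of types stands in for a real schema language such as JSON Schema.

```python
from dataclasses import make_dataclass

# Hypothetical schema: the only place where the config shape is declared.
CONFIG_SCHEMA = {
    "host": str,
    "port": int,
    "debug": bool,
}

# The config class is generated from the schema instead of being hand-written,
# so the two cannot drift apart.
Config = make_dataclass("Config", list(CONFIG_SCHEMA.items()))


def load_config(raw: dict):
    """Validate `raw` against CONFIG_SCHEMA and build a Config instance."""
    unknown = set(raw) - set(CONFIG_SCHEMA)
    missing = set(CONFIG_SCHEMA) - set(raw)
    if unknown or missing:
        raise ValueError(
            f"unknown keys: {sorted(unknown)}, missing keys: {sorted(missing)}"
        )
    for key, expected in CONFIG_SCHEMA.items():
        # Simplification: bool is a subclass of int in Python, so this check
        # alone would accept True as a port.
        if not isinstance(raw[key], expected):
            raise TypeError(f"{key!r} should be of type {expected.__name__}")
    return Config(**raw)


config = load_config({"host": "localhost", "port": 8080, "debug": False})
```

Adding a field to `CONFIG_SCHEMA` changes the generated class and the validation together, which is the "one source of truth" property being described.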

34. fenomas ◴[] No.45079551{4}[source]
All my younger colleagues have heard my catchphrase:

Copy-paste is free; abstractions are expensive.

replies(1): >>45079892 #
35. MrDarcy ◴[] No.45079628{3}[source]
I’ll bite. We’re expanding into Europe. I literally copied and pasted our entire infrastructure into a new folder named “Europe”

Now there’s a new requirement that only applies to Europe and nowhere else and it’s super easy and straight forward to change the infrastructure.

I don’t see how it was a poor choice to literally copy and paste configs that result in hundreds of thousands of lines of yaml and I have 25 yoe.

replies(2): >>45079739 #>>45081418 #
36. RossBencina ◴[] No.45079711[source]
I know you're only half joking, but I don't think you can pin the blame on John or TCL. Ousterhout's thesis, as I recall, was that there is real benefit to having multiple programming languages working at different levels of the domain (e.g. a scriptable system with the core written in a lower level language). Of course now this is a widespread practice in many domains (e.g. web browsers, numerical computing: matlab, numpy). It's an idea that has stood the test of time. TCL is just one way of achieving that aim, but at the time it was one of few open-source options available. I think scheme/lisp would have been the obvious alternative. AutoDesk went in that direction.

I remember using TCL in the 90s for my own projects as an embeddable command language. The main selling point was that it was a relatively widely understood scripting language with an easily embeddable off-the-shelf open source code base, perhaps one of the first of its kind (ignoring lisps.) Of course the limitations soon became clear. Only a few years later I had high hopes that Python would become a successor, but it went in a different direction and became significantly more difficult to embed in other applications than was TCL -- it just wasn't a primary use case for the core Python project. The modern TCL-equivalent is Lua, definitely a step up from TCL, but I think if EDA tools used Lua there would be plenty of hand-wringing too.

Just guessing, but I imagine that at the time TCL was adopted within EDA tools there were few alternatives. And once TCL was established it was going to be very hard to replace. Even if you ignore inertia at the EDA vendors, I can't imagine hardware engineers (or anyone with a job to do) wanting to switch languages every two to five years like some developers seem happy to do. It's a hard sell all around.

I reckon the best you can do is blame the vendors for (a) not choosing a more fit-for-purpose language at the outset, which probably means Scheme, or inventing their own, or (b) not ripping the bandaid off at some point and switching to a more fit-for-purpose language. Blaming (b) is tough though; even today selecting an embedded application language is vexed: you want something that has good affordances as a language, is widely used and documented, easily embedded, and long-term stable. Almost everything I can think of fails the long-term stability test (Python, JavaScript, even Lua, which does not maintain backward compatibility between releases).

37. chipsrafferty ◴[] No.45079739{4}[source]
I think the most common way to approach that problem would be to have a "default config", and overrides. Could you go into more detail about why you didn't do this instead?

The downsides of your approach are:

1. Now whenever you want to change something both in Europe and (assuming) USA you have to do it in 2 places. If the change is the same for both, in my system, you could just update the default/shared config. If the change is different for both it's equally easy, but faster, since the overrides are smaller files.

2. It's not clear what the difference is between Europe and USA if there is 1 line different amongst thousands. If there are more differences in the future, it becomes increasingly difficult to tell the difference easily.

3. If in the future you also need to add Africa, you just compounded the problems of 1. and 2.
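
For concreteness, the "default config plus overrides" approach might look something like the sketch below. The keys, values, and region names are made up for illustration, and a real setup would more likely live in YAML files than in Python dicts.

```python
from copy import deepcopy

# Hypothetical shared defaults and small per-region override sets.
DEFAULT_CONFIG = {
    "replicas": 3,
    "database": {"engine": "postgres", "backups": "daily"},
}

REGION_OVERRIDES = {
    "usa": {},
    "europe": {"database": {"backups": "hourly"}},
}


def merge(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` applied recursively."""
    result = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


def config_for(region: str) -> dict:
    """Build the effective config for one region."""
    return merge(DEFAULT_CONFIG, REGION_OVERRIDES[region])


europe = config_for("europe")
```

Here the entire difference between regions is visible in `REGION_OVERRIDES`; the cost is that a reader has to apply the merge mentally to know the effective config for a region.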

replies(1): >>45080014 #
38. sroerick ◴[] No.45079743{5}[source]
This is the correct take - if you're getting this type of bug, it's now past time for DRY
39. chipsrafferty ◴[] No.45079785{5}[source]
I don't think it's as cut and dry as that. In my team we require 100% test coverage. Every file requires an accompanying test file, and every test file is set up with a bunch of mocks.

Sure, we could take the Foo, Bar, and Baz tables that share 80-90% of common logic and have them inherit from a common, shared, abstract component. We've discussed it in the past. Maybe it's the better solution, maybe not. But it would mean that instead of maintaining 3 component files and 3 test files, which are very similar and where a change is often a copy-paste job, we'd have to maintain 2 additional files for the shared component, and when that has to change, it would require more work as we'd then have to add more to the other 3 files.

Such setups can often cause a cascade of tests that need updating and PRs with dozens of files changed.

Also, there are many parts of our project where things could be done much better if we were making them from scratch. But, 6 years of changing requirements and new features and this is what we have - and at this point, I'm not sure that having a shared component would actually make things easier unless we rewrite a huge amount of the codebase, for which there is no business reason.

replies(1): >>45080638 #
40. hakunin ◴[] No.45079886{4}[source]
Not the commenter, but also had experience with making music and writing software. I think the same applies to any creative endeavor. It’s super hard to consume what you produce as “someone else” (I.e. read what you write, listen to what you compose with fresh perspective). Usually it takes time to forget and disassociate from your work, because you get too used to it while producing it. Coming back to it another day can work sometimes, but very quickly you’ll get used to it again. I think this is one of the most effective ways to achieve quality tasteful results in anything. If you can train yourself to read your own code with fresh eyes almost as soon as you write it, you’d be unlocking a powerful shortcut, a cheat code to life. It’ll make the biggest impact on your code’s (and any other creative work’s) quality. This is also why sometimes you can spend hours painstakingly trying to design something, and it comes out terrible, nobody likes it. And you can do something in 20 minutes just improvising your way through, and it comes out an elegant masterpiece. That’s because you never gave yourself time to “get used to” your work such that you couldn’t perceive the problems with it anymore. You maintained that fresh impatient perspective the entire time.
replies(1): >>45080797 #
41. wilkystyle ◴[] No.45079892{5}[source]
One of the many great takeaways from Sandi Metz's talk at Railsconf 2014: "Duplication is far cheaper than the wrong abstraction."

https://www.youtube.com/watch?v=8bZh5LMaSmE

Worth watching in its entirety, but the quote is from ~13:59 in that video.

replies(1): >>45080260 #
42. bb88 ◴[] No.45079903[source]
> Every rule can be used to argue anything.

Unless it's a rule prohibiting complexity by removing technologies. Here's a set of rules I have in my head.

1. No multithreading. (See Mozilla's "You must be this high" sign)

2. No visitor pattern. (See grug oriented development)

3. No observer pattern. (See django when signals need to run in a particular order)

4. No custom DSL's. (I need to add a new operator, damnit, and I can't parse your badly written LALR(1) schema).

5. No XML. (Fight me, I have battle scars.)

replies(2): >>45079983 #>>45080703 #
43. bogdanoff_2 ◴[] No.45079931[source]
>the people who need it won't understand it

That's not true. There's plenty of beginner programmers who will benefit from this.

44. ferguess_k ◴[] No.45079961[source]
I'm struggling with the amount of complexity. As an inexperienced SWE, I found it difficult to put everything into my head when the # of function calls (A) + # of source code files (B) to navigate reach N. In particular, if B >= 3 or A >= 3 -- because, B equals the number of screens I need to view all source code files without Command+Tab/Alt+Tab, and cognitive load increases when A increases, especially when some "patterns" are involved.

But I'm not experienced enough to tell whether it is my inexperience that causes the difficulty, or whether it is indeed unnecessary complexity that causes it.

replies(3): >>45080190 #>>45081396 #>>45081449 #
45. hansvm ◴[] No.45079966{3}[source]
A subtlety still exists there. Copy-pasting is fine. What you're trying to prevent with DRY is two physical locations in your codebase referring to the same semantic context (i.e., when you should change "the thing" you have to remember to change "all the places").

Somewhat off-topic, that's one usual failure mode of "DRY" code. Code is de-duplicated at a visual level rather than in terms of relevant semantics, so that changes which should only affect one path either affect both or are very complicated to reason about because of the unnecessary coupling.
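
A small sketch of the distinction, with made-up names and numbers: the first two functions look identical but encode different concepts, while the constant further down is one fact used in two places.

```python
# Visually identical today, semantically unrelated: merging these would couple
# a (hypothetical) tax rule to a (hypothetical) marketing rule.
def sales_tax(amount_cents: int) -> int:
    """Tax rule: 10% of the amount."""
    return amount_cents * 10 // 100


def loyalty_discount(amount_cents: int) -> int:
    """Marketing rule: 10% off, free to change independently of tax."""
    return amount_cents * 10 // 100


# One semantic fact referenced from two places: this is the case DRY targets.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


def validate_upload(size_bytes: int) -> bool:
    """Check a file size against the single shared limit."""
    return size_bytes <= MAX_UPLOAD_BYTES


def upload_error_message() -> str:
    """Describe the limit using the same constant the validator uses."""
    return f"Files must be at most {MAX_UPLOAD_BYTES // (1024 * 1024)} MB"
```

If the upload limit were written out separately in the validator and in the message, changing "the thing" would mean remembering both places; the tax and discount functions have no such link.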

46. ferguess_k ◴[] No.45079983{3}[source]
> 2. No visitor pattern. (See grug oriented development)

This one is my particular pet-peeve. But I often think that the reason is because I suck. I'm going to read "grug".

I also hate one-liner functions.

replies(1): >>45080047 #
47. ruraljuror ◴[] No.45079994[source]
Software developers don’t arrive fully formed. Rob Pike benefitted from reading a book or two.
replies(1): >>45080589 #
48. MrDarcy ◴[] No.45080014{5}[source]
I don’t do this because with a complete copy I get progressive rollouts across regions without the complexity of if statements and feature flags. That is to say, making the change twice is a feature not a bug when the changes are staggered in time.

From an operational perspective it’s much more important to ensure the code is clear and readable during an incident.

Overrides are like inheritance. They are themselves complex and add unnecessary cognitive load.

Composition is better for the common pieces that never change across regions. Think of an import statement of a common package into both the Europe and North America folders.

I easily see the one line diff among hundreds of thousands using… diff.

Regarding Africa, we’ve established 1 is a feature and 2 is a non issue, so I’d copy it again.

This approach scales both as the team scales and as the infrastructure scales. Teammates can read and comprehend much more easily than hierarchies of overrides, and changes are naturally scoped to pieces of the whole.

replies(1): >>45081345 #
49. Shorn ◴[] No.45080019[source]
Unsurprisingly, minimising the amount of cognitive complexity is how you get the most out of LLM coding agents. So now we have a theoretically repeatable way to measure cognitive load as contextualised to software engineering.
50. bb88 ◴[] No.45080047{4}[source]
The real geniuses of our times can convert complexity into simplicity. The subgeniuses use complexity to flex over the common developer.

Sometimes things need to be complex -- well that's okay. The real trick is to not put complexity into places it doesn't belong.

replies(1): >>45088932 #
51. lll-o-lll ◴[] No.45080131{3}[source]
One of the big troubles is that if you join a big org you won’t get to do any architecture until you are at least “senior” or “lead”. Maybe that’s not true everywhere, but I have seen a fair bit of it. You need several iterations of “I built a thing” “oh, the thing evolved in horrible ways”, before the instincts for good architecture are developed.

I think Big Orgs need to develop younger promising talent by letting them build small green fields projects. Essentially fostering startups inside the organisation proper. Let them build and learn from mistakes (while providing the necessary knowledge; you can actually learn most of this from books, but experience is the ultimate teacher). Otherwise you end up with 5 year experienced people who cannot design themselves out of a paper bag.

replies(1): >>45082834 #
52. lll-o-lll ◴[] No.45080190[source]
Humans have a very limited amount of working memory: 3-5 items on average. A savant might be at something like 12. It is trivially easy to blow that with code. OO with code inheritance is a prime example of combinatorial explosion that can lead to more possibilities than atoms in the universe, let alone one person's ability to reason.

Watch ‘Simple made Easy’ by Rich Hickey; a classic from our industry. The battle against complexity is ever ongoing. https://youtu.be/SxdOUGdseq4?feature=shared

53. safety1st ◴[] No.45080208[source]
The approach that I am trialing with my team now, so far with good results, is as follows.

* Our coding standards require that functions have a fairly low cyclomatic complexity. The goal is to ensure that we never have a function which is really hard to understand.

* We also require a properly descriptive header comment for each function and one of the main emphases in our code reviews is to evaluate the legibility and sensibility of each function signature very carefully. My thinking is the comment sort of describes "developer's intent" whereas the naming of everything in the signature should give you a strong indication of what the function really does.

Now is this going to buy you good architecture for free, of course not.

But what it does seem to do is keep the cognitive load manageable, pretty much all of the time these rules are followed. Understanding a particular bit of the codebase means reading one simple function, and perhaps 1-2 that are related to it.

Granted we are building websites and web applications which are at most medium fancy, not solving NASA problems, but I can say from working with certain parts of the codebase before and after these standards, it's like night and day.

One "sin" this set of rules encourages is that when the logic is unavoidably complex, people are forced to write a function which calls several other functions that are not used anywhere else; it's basically do_thing_a(); do_thing_b(); do_thing_c();. I actually find this to be great because it's easy to notice and tells us what parts of the code are sufficiently complex or awkward as to merit more careful review. Plus, I don't really care that people will say "that's not the right purpose for functions," the reality is that with proper signatures it reads like an easy "cliffs notes" in fairly plain English of exactly what's about to happen, making the code even easier to understand.
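
As a rough illustration of the shape being described (not the team's actual code; every name here is invented), a top-level function can read like a summary while each helper stays trivially simple:

```python
def parse_order(raw: str) -> list:
    """Turn 'apple:2, pear:1' into [('apple', 2), ('pear', 1)]."""
    items = []
    for part in raw.split(","):
        name, quantity = part.split(":")
        items.append((name.strip(), int(quantity)))
    return items


def price_order(items: list, prices: dict) -> int:
    """Return the order total in cents, given per-item prices in cents."""
    return sum(prices[name] * quantity for name, quantity in items)


def format_receipt(total_cents: int) -> str:
    """Render the total as a one-line receipt."""
    return f"Total: {total_cents / 100:.2f}"


def process_order(raw: str, prices: dict) -> str:
    """Parse an order string, price it, and return a receipt line."""
    items = parse_order(raw)
    total = price_order(items, prices)
    return format_receipt(total)


receipt = process_order("apple:2, pear:1", {"apple": 150, "pear": 200})
```

The helpers are used nowhere else, but `process_order` reads as a three-step outline and each step has a cyclomatic complexity close to 1.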

replies(6): >>45080627 #>>45080677 #>>45080764 #>>45080785 #>>45081796 #>>45088387 #
54. boredtofears ◴[] No.45080238{3}[source]
And just like the rule that it replaced, the rule of three is now often interpreted as the "correct" approach always, while I still find reality to be more nuanced.

Sometimes you do have the domain expertise to make the judgment call.

A recent example that comes to mind is a payment calculation. You can go ahead and tie that up in a nice reusable function from the get go - if you've ever dealt with a bug where payment calculations appeared different in some places and it somehow made it in front of a customer you're well aware of how painful this can be. For some things having a single source of truth outweighs any negatives associated with refactoring.
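
A minimal sketch of what such a single source of truth could look like, assuming a simple subtotal, discount, and tax-rate model; the function names and the rounding rule are illustrative assumptions, not a description of any real payment system.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def payment_due(subtotal: Decimal, tax_rate: Decimal,
                discount: Decimal = Decimal("0")) -> Decimal:
    """The one place where the amount owed is computed and rounded."""
    taxed = (subtotal - discount) * (Decimal("1") + tax_rate)
    return taxed.quantize(CENT, rounding=ROUND_HALF_UP)


def checkout_label(subtotal: Decimal, tax_rate: Decimal) -> str:
    """Text shown on a (hypothetical) checkout button."""
    return f"Pay {payment_due(subtotal, tax_rate)}"


def invoice_line(subtotal: Decimal, tax_rate: Decimal) -> str:
    """Text shown on a (hypothetical) invoice."""
    return f"Amount due: {payment_due(subtotal, tax_rate)}"


label = checkout_label(Decimal("19.99"), Decimal("0.0825"))
line = invoice_line(Decimal("19.99"), Decimal("0.0825"))
```

Because both call sites go through `payment_due`, the checkout screen and the invoice cannot disagree about rounding.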

55. drivers99 ◴[] No.45080260{6}[source]
The related blog post (I just found thanks to watching that and then searching for her site) is great too: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction

It explains so much of what has been bothering me about what I work on at work, and now I understand why and some of what to do about it.

56. kragen ◴[] No.45080482{4}[source]
DRY isn't an optimization of any kind, so it can't be a premature optimization. "Premature optimization" is a specific failure mode of programmers, not just a meaningless term you can use to attack anything you don't like. "Optimization" is refactoring to reduce the use of resources (which are specifically cycles and bytes) and it's "premature" when you don't yet know that you're doing it where it matters.

Otherwise I mostly agree.

57. jaredsohn ◴[] No.45080568{3}[source]
also called WET (write everything twice or write everything thrice)
58. YZF ◴[] No.45080589{3}[source]
Fair enough. But most of the forming is by doing. Someone gave an analogy to music. You can't become a great musician by reading books. Some great musicians have never read a book about music. But yes, reading can be (a great!) part of the learning process. My point was more about rules. The article says things like replacing complex conditionals with intermediate variables. The idea that a certain construct always has higher cognitive load and should be replaced with another is too simplistic IMO.

In order to get a sense of what code is harder to understand you will do better to read code and have others read your code. A good takeaway is to keep this in mind (amongst many other factors) and to understand code needs to be maintained, extended, adapted etc.

The ideas are still useful. The danger is blindly applying rules. As long as the reader knows not to apply any of the suggestions if they don't understand why and have relevant experience ;)
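
For reference, the kind of rewrite under discussion (pulling parts of a complex conditional into named intermediate variables) might look like the sketch below. The functions and parameters are made up for illustration; whether the second form is actually easier to read in a given codebase is exactly the judgement call in question.

```python
def can_edit_inline(user_role: str, is_owner: bool, is_locked: bool) -> bool:
    """Everything in a single condition."""
    return (user_role == "admin" or (user_role == "editor" and is_owner)) and not is_locked


def can_edit_named(user_role: str, is_owner: bool, is_locked: bool) -> bool:
    """Same logic, with each part of the condition given a name."""
    is_admin = user_role == "admin"
    is_owning_editor = user_role == "editor" and is_owner
    has_permission = is_admin or is_owning_editor
    return has_permission and not is_locked
```

Both functions return the same result for every input; they differ only in how much the reader has to hold in their head at once versus how many names they have to read.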

59. awesome_dude ◴[] No.45080627{3}[source]
> Our coding standards require that functions have a fairly low cyclomatic complexity. The goal is to ensure that we never have a function which is really hard to understand.

https://github.com/fzipp/gocyclo

> * We also require a properly descriptive header comment for each function and one of the main emphases in our code reviews is to evaluate the legibility and sensibility of each function signature very carefully. My thinking is the comment sort of describes "developer's intent" whereas the naming of everything in the signature should give you a strong indication of what the function really does.

https://github.com/mgechev/revive

> Now is this going to buy you good architecture for free, of course not.

It's not architecture to tell people to comment on their functions.

Also FTR, people mistake cyclomatic complexity for something that automagically makes code confusing, to the point of the weirdest example I have ever had to deal with - a team had unilaterally decided that the 'else' keyword could never be used in code.

replies(2): >>45080668 #>>45080713 #
60. matijsvzuijlen ◴[] No.45080638{6}[source]
I can understand requiring 100% test coverage, but it seems to me that requiring a test file for every file is preventing your team from doing useful refactoring.

What made your team decide on that rule? Could your team decide to drop it since it hinders improving the design of your code?

61. jonahx ◴[] No.45080668{4}[source]
> the weirdest example I have ever had to deal with - a team had unilaterally decided that the 'else' keyword could never be used in code.

Not weird at all:

https://medium.com/@matryer/line-of-sight-in-code-186dd7cdea...

replies(1): >>45080730 #
62. cncjchsue7 ◴[] No.45080677{3}[source]
This sounds like hell to me.

Not everything is complicated, most functions don't need comments, why require it? Just fix complexity when it arises. Don't mandate that you can't make any complexity.

replies(2): >>45080763 #>>45080788 #
63. cyberax ◴[] No.45080703{3}[source]
Visitor pattern is extremely useful in some areas, such as compiler development.
replies(1): >>45081398 #
64. arbol ◴[] No.45080713{4}[source]
I can understand why else is sometimes not needed. JS linters will remove unnecessary else statements by default.

https://eslint.org/docs/latest/rules/no-else-return#rule-det...

But never using it is crazy.
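
A minimal sketch of the distinction, in Python rather than JS, with hypothetical function names and made-up prices. The first two functions behave identically; the third shows a case where an else still seems like the natural choice.

```python
def shipping_cost_nested(weight_kg: float) -> float:
    # The else branches add nesting without adding information,
    # because every if-branch already leaves the function.
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    else:
        if weight_kg < 1:
            return 5.0
        else:
            return 5.0 + 2.0 * (weight_kg - 1)


def shipping_cost_flat(weight_kg: float) -> float:
    # Same behaviour with early exits; the main path stays unindented.
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg < 1:
        return 5.0
    return 5.0 + 2.0 * (weight_kg - 1)


def parcel_label(weight_kg: float) -> str:
    # Two symmetric outcomes and no early exit: an else reads fine here.
    if weight_kg < 1:
        size = "small"
    else:
        size = "large"
    return size + " parcel"
```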

replies(1): >>45080740 #
65. awesome_dude ◴[] No.45080730{5}[source]
Well, I found it weird - the else keyword has been a stalwart of programming for... several decades now.

Maybe one day we will abstract it away like the goto keyword (goto is a keyword in Go, and other languages still, but I have only seen it used in the wild once or twice in my 7 or 8 years of writing Go)

Goto is still used in almost every language, but it's abstracted away, hidden in loops, and conditionals (which Dijkstra said was a perfectly acceptable use of goto), presumably to discourage its direct use to jump to arbitrary points in the code

replies(1): >>45084683 #
66. awesome_dude ◴[] No.45080740{5}[source]
In a similar vein to how I just responded to the other person, maybe eventually we'll abstract `else` away so that its use is hidden, and the abstraction ensures that it's only being used where we all collectively decide it can/should be used.
67. cco ◴[] No.45080763{4}[source]
What is a function supposed to do and why?
68. hakunin ◴[] No.45080785{3}[source]
I found this type of approach (where you try to meet subjective readability goals with objective/statistical metrics) to not produce clear code in practice. Instead, I suggest this one weird trick: if your colleagues are confused in code review, then rewrite and comment the code until they aren't confused anymore. Don't just explain it to them ad-hoc, make the code+comments become the explanation. There is no better linter than subjective reading by your colleagues. Nothing else works nearly as well. Optimize to your team's understanding, that's it. Somehow, this tends to keep working great even as the team changes.
replies(1): >>45080960 #
69. bornfreddy ◴[] No.45080788{4}[source]
Agreed. If you need a comment to tell you what the function does, you should think deep about naming, and if this fails, consider if this is the correct abstraction. Comments are a way to kick the can down the road - "I was unable to make this code clear enough, so here is the hint to help you".

Edit: sometimes the comments are the best of all evils, and you should use them to explain the constraints that led to this code - they just shouldn't be mandatory.

70. teiferer ◴[] No.45080797{5}[source]
> If you can train yourself to read your own code with fresh eyes almost as soon as you write it, you’d be unlocking a powerful shortcut, a cheat code to life.

This is really a key takeaway here: Always keep your audience in mind. When programming, you have two audiences: the machine executing the code, and fellow programmers maintaining the code. Both are important, but the latter is often neglected and is what the article is about. Optimize for your human audience. What will make it easier for the next person to understand this? Do that.

Like public speaking or writing an article. A great talk or article happens when the speaker/author knows exactly how the audience will perceive them.

replies(1): >>45080840 #
71. hakunin ◴[] No.45080840{6}[source]
Agreed, I wrote more in depth about it a few years ago: https://max.engineer/maintainable-code
72. necovek ◴[] No.45080960{4}[source]
It's one message I struggle to convey to people I do code reviews for: don't make me understand it, make it more self explanatory so every reader does. (And, yes, I ask for it explicitly too)

(I sometimes "ask" questions for something it took me a few back and forths through code to get so they'd think about how it could be made clearer)

Unfortunately, most people focus on explaining their frame of mind (insecurity?) instead of thinking how can they be the best "teacher".

replies(1): >>45081016 #
73. mnsc ◴[] No.45080993[source]
> You can't win architecture arguments.

I feel this in my soul. But I'm starting to understand this and accept it. Acceptance seems to lessen my frustration on discussing with architects that seemingly always take the opposite stance to me. There is no right or wrong, just always different trade offs depending on what rule or constraint you are prioritizing in your mind.

replies(3): >>45081113 #>>45081624 #>>45083244 #
74. hakunin ◴[] No.45081016{5}[source]
Yeah, not easy, but it helps to build some rapport first, so people learn what you’re after. The way I tend to do that is by leaving a review comment with an example code snippet that makes me understand it better, and a question “what do you think about this version? I tried to clarify a few things here.”. + Explain what was clarified. I find the effort usually pays off.
replies(2): >>45081330 #>>45081735 #
75. berkes ◴[] No.45081113{3}[source]
I've found that listening and asking questions is the key to accepting other people's architectural choices.

Why do they insist on A over B? What trade offs were considered? Why are these trade offs less threatening than other trade offs? What previous failures or difficulties led them to put such weight on this problem over others?

Sometimes it's just ego or stubbornness or routine¹. That can and should be dismissed IMO. Even if through these misguided reasons they choose the "right" architecture, even if the outcome turns out good, that way of working is toxic and bad for any long term project.

More often, there are good, solid reasons behind choices, though. Backed with data or science even. Things I didn't know, or see different, or have data and scientific papers for that "prove" the exact opposite. But it doesn't matter that much, as long as we all understand what we are prioritizing, what the trade offs are and how we mitigate the risks of those trade offs, it's fine.

¹ The worst, IMO, is the "we've always done it like this" trench. An ego can be softened or taken off the team. But unwillingness to learn and change, instilled in team culture is an almost guaranteed recipe for disaster

76. ahartmetz ◴[] No.45081213{3}[source]
What you say seems much more true about traditional French than English gardens tbh. The French style is a very simplistic demonstration of bending nature to the human will. The English style seems to be more about reproducing an overly quaint image of "natural" landscapes (there are very few of these in Europe), which I find much more pleasant in idea and result.
77. n4r9 ◴[] No.45081240{4}[source]
It depends greatly on the situation. If you have five different methods for fetching WidgetInfo from the database and a requirement comes in to add TextProperty to Widget in all views, you're more likely to accidentally miss one of the places that needed a change.

Likewise if someone notices a bug in the method, you then have to go through and figure out which copies have the same bug, and fix each one, and QA test each one separately.

The proper approach is to make a judgement call based on how naturally generic the method is, and whether or not the existing use cases require custom behaviours of it (now or in the near future).

78. seadan83 ◴[] No.45081244{4}[source]
Indeed a trap. I'd say DRY is all about not duplicating logical components. Just because two pieces of code look similar, does not mean they need to be combined.

As an analogy, when writing a book, it's the difference between not repeating the opening plot of the story multiple times and replacing every instance of the word "the" with a new symbol.

79. ashurov ◴[] No.45081283[source]
did you try here?: https://www.abebooks.com/book-search/publisher/yaknyam-press...
replies(1): >>45085783 #
80. pkolaczk ◴[] No.45081298{5}[source]
An obvious example of that is defining named constants and referring them by name instead of repeating the same value in N places. This is also DRY and good kind of DRY.
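
A tiny sketch of what that looks like, with a hypothetical MAX_RETRIES constant and made-up helper names:

```python
# Hypothetical limit, defined once and referred to by name everywhere else.
MAX_RETRIES = 3


def should_retry(attempt: int) -> bool:
    # attempt is 0-based, so attempts 0, 1 and 2 are allowed.
    return attempt < MAX_RETRIES


def describe_policy() -> str:
    # The message cannot drift out of sync with the actual limit.
    return f"gives up after {MAX_RETRIES} attempts"
```

Changing the limit is then a one-line edit instead of a search for every literal 3 that happens to mean "retries".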
replies(1): >>45082660 #
81. radiator ◴[] No.45081330{6}[source]
But this might require too much effort from the reviewer
82. seadan83 ◴[] No.45081345{6}[source]
The "rules" for config are different. Code, test code, and config are different, their complexity scales in different ways of course.

By way of analogy for why the two configs are different: two beaches are not the same just because they both have very similar sand.

You really have two different configs. You also have one set of configs. You didn't set up an application that also fetches some config that is already provided. It would be like having a test flag in both config and database: same flag, two places.

Where config duplication goes bad is when repeatedly the same change is made across all N, local variations have to be reconciled each time and it is N sets of testing you need to do. Something like that in code is potentially more complex, more obviously a duplication of a module, just more likely to be a problem overall.

83. seadan83 ◴[] No.45081396[source]
Experience helps to recognize intent sooner. That reduces cognitive load. Getting lost 5 levels deep seemingly never stops being a thing, not just you.
84. brabel ◴[] No.45081398{4}[source]
That’s only true in languages that do not have Algebraic Data Types and pattern matching, which nowadays is a minority of languages (even Java has it).
replies(1): >>45081544 #
85. tasuki ◴[] No.45081418{4}[source]
> I don’t see how it was a poor choice to literally copy and paste configs that result in hundreds of thousands of lines of yaml

Perhaps one day you will. I'm a dev who worked with infra people who had your philosophy: many copy pasted config files.

Sometimes I needed to add an env var to a service. Expressing "default to false and only set it to true in these three environments" took changing about 30 files. I always made mistakes (usually of omission), and the infra people only ever caught them at deployment time. It was hell.
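
A rough sketch of the alternative I would have liked, written as plain Python for brevity. The flag name and environment names are hypothetical, not the real ones: the default lives in one place and only the exceptions are listed.

```python
# Hypothetical flag and environments, for illustration only.
DEFAULTS = {"new_checkout_enabled": False, "log_level": "info"}

# Only the environments that differ from the default are mentioned.
OVERRIDES = {
    "staging": {"new_checkout_enabled": True},
    "canary": {"new_checkout_enabled": True},
    "prod-eu": {"new_checkout_enabled": True},
}


def config_for(environment: str) -> dict:
    # Later keys win, so an override replaces the default value.
    return {**DEFAULTS, **OVERRIDES.get(environment, {})}
```

With this shape, "default to false and only set it to true in these three environments" is one default plus three override entries, and an environment that is not listed cannot be missed by accident.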

86. brabel ◴[] No.45081449[source]
You should not need to read every line of code in every file and function to understand what’s going on to the level you need to solve a particular problem. You must make a decision to NOT look deeper at some point on any non-trivial code base. A good program with good names and comments in the appropriate places is what allows you to do exactly that more easily. When you see sort(usernames) in the middle of a function do you need to dive into sort to be able to understand the code in that function? Probably not, unless you are fixing a bug in how usernames are sorted!

With that said, get good at jumping into definitions, finding all implementations, then jumping back where you were. With a good IDE you can do that at the speed of thought (in IntelliJ that’s Cmd+b, Cmd+Alt+b, Cmd+[ on Mac). I only open more than one file at the same time when comparing them. Otherwise it’s much easier to jump around back and forth (you can even open another function inline if you just want to take a Quick Look, it’s Alt+Space). Don’t use the mouse to do that, things you do all the time can be made an order of magnitude faster via shortcuts. Too many developers I see struggle with that and are embarrassingly slow moving around the code base!

replies(1): >>45082643 #
87. cyberax ◴[] No.45081544{5}[source]
Visitors additionally allow you to decouple graph traversal from the processing. It is still needed even in the languages with pattern matching.

There's also the question of exhaustiveness checking. With visitors, you can typically opt-in to either checking that you handle everything. Or use the default no-ops for anything that you're not interested in.

So if you look at compilers for languages with pattern matching (e.g. Rust), you still see... visitors! E.g.: https://github.com/rust-lang/rust/blob/64a99db105f45ea330473...
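
A minimal Python sketch of that decoupling. This is not how rustc does it; the node classes and visitor names are invented for illustration. The base class owns the traversal and defaults to no-ops, and a concrete visitor overrides only the hook it cares about.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Neg:
    operand: object


class Visitor:
    """Owns the traversal; leaf hooks do nothing by default."""

    def visit(self, node):
        # Dispatch on the node's class name, e.g. Add -> visit_Add.
        # An unknown node type raises AttributeError at runtime.
        getattr(self, "visit_" + type(node).__name__)(node)

    def visit_Num(self, node):
        pass

    def visit_Add(self, node):
        self.visit(node.left)
        self.visit(node.right)

    def visit_Neg(self, node):
        self.visit(node.operand)


class NumCollector(Visitor):
    """Processing only: records literals, knows nothing about tree shape."""

    def __init__(self):
        self.values = []

    def visit_Num(self, node):
        self.values.append(node.value)


tree = Add(Num(1), Neg(Add(Num(2), Num(3))))
collector = NumCollector()
collector.visit(tree)
```

Adding a new node type means touching the base traversal once; every existing visitor keeps working without changes.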

replies(1): >>45082244 #
88. KronisLV ◴[] No.45081624{3}[source]
> Acceptance seems to lessen my frustration on discussing with architects that seemingly always take the opposite stance to me. There is no right or wrong, just always different trade offs depending on what rule or constraint you are prioritizing in your mind.

That’s a stance of acceptance, however I’d say that there are people who are absolutely wrong by most metrics sometimes and also stubborn to the point that you’ll never convince them. Ergo, the frustration is inevitable when faced with them.

89. necovek ◴[] No.45081735{6}[source]
I found that to be a double edged sword: some copy and paste it verbatim without thinking it through and adjusting at all.

It's a delicate balance we need to keep in mind between many of:

- maintainable code

- getting things done

- feeling of accomplishment

- feedback loop speed

- coaching

- motivation and emotional state ("why are they pestering me, the code works, I just want to feel productive and useful: this was hard enough to get right as it is")

...and more!

At the same time, some do get the point, but getting readable code is really an art/craft in itself, and nothing but experience and learning to look at it from outside is the main driver to learning.

replies(1): >>45085144 #
90. serpix ◴[] No.45081796{3}[source]
These points are about organising code and workflow. Even if you have organised your functions to the lowest possible unit of work you can still have a mess of async queue microservice hell which is the actual architecture.

Architecture is another topic entirely and the scope is higher abstractions across multiple systems.

91. wreath ◴[] No.45082180{3}[source]
In an industry where most people stay for around 2 years (at least pre 2022), people aren't even there to see the results of their decisions.
92. brabel ◴[] No.45082244{6}[source]
The example you posted is very interesting as it used both a visitor and ADTs. It seems the need for the Visitor comes from the generics in this case? Probably a Rust specific limitation. I don’t understand why you mention exhaustiveness though, it’s obviously easy to have comprehensive or partial matching with ADT.
replies(1): >>45084559 #
93. epolanski ◴[] No.45082332[source]
There is no objective "perfect", because the "perfect" is in the eyes of the reader.

Also, people confuse familiar with simple, they tend to find things simple if they are familiar, even if they are complex (intertwine a lot of different things).

94. ferguess_k ◴[] No.45082643{3}[source]
The problem is, most of us are working in business logic, and I have never been in a company (have been to 5) where you don't have to look into the details. Not only did I need to look at the details, I also needed to read the comments to understand why there is a +0.024 in the code.

This is why I want to get into system programming. At least less business logic. They are still going to be complicated, but I feel it's a lot nicer to, e.g. go from the bottom of the process struct than the bottom of some half-ass business stakeholder!

95. ryeats ◴[] No.45082660{6}[source]
This is actually a particular pet peeve of mine because I worked with the Camel framework which has a lot of boilerplate in strings but if you start using constants for the common parts you now have an unreadable mess of constants concatenated together that buys you nothing.
96. leke ◴[] No.45082718[source]
Yep pretty much. This could literally be notes taken from the book including the phrase itself.
97. fauigerzigerk ◴[] No.45082725{6}[source]
I agree completely. DRY shouldn't be a compression algorithm.

If two countries happen to calculate some tax in the same way at a particular time, I'm still going to keep those functions separate, because the rules are made by two different parliaments independently of each other.

Referring to the same function would simply be an incorrect abstraction. It would suggest that one tax calculation should change whenever the other changes.

If, on the other hand, both countries were referring to a common international standard then I would use a shared function to mirror the reference/dependency that they decided to put into their respective laws.
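
A small sketch of what I mean, with made-up rates and hypothetical function names. The two tax functions look identical today but stay separate; only the rule that both laws reference in common is shared.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_per_shared_standard(amount: Decimal) -> Decimal:
    # Stand-in for a rule both countries adopt by reference
    # (hypothetical: round half up to the nearest cent).
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def tax_country_a(net: Decimal) -> Decimal:
    # Country A's rate, set by country A's parliament (made-up value).
    return round_per_shared_standard(net * Decimal("0.20"))


def tax_country_b(net: Decimal) -> Decimal:
    # Happens to match country A today, but can change independently.
    return round_per_shared_standard(net * Decimal("0.20"))
```

If country B changes its rate next year, only tax_country_b changes, and nobody has to untangle a shared function first.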

98. fireflash38 ◴[] No.45082834{4}[source]
It's not helped by the "jump every 2 years for 20-50% pay bump". They don't have to deal with their own architectural decisions.
99. heresie-dabord ◴[] No.45083102[source]
> Every rule can be used to argue anything.

This is true. However, very few people can clearly explain all the rules.

If they can, they have understood the system and are qualified.

100. vitaflo ◴[] No.45083244{3}[source]
The problem with a lot of devs is trying to win arguments instead of coming to a consensus. What’s best for the team matters more than what’s best for the individual and every team is different.
101. cyberax ◴[] No.45084559{7}[source]
No. The code can be rewritten without visitors (using iterators for traversal, for example). But it'll look bad.

Visitors in the linked example are real classic visitors. The code _within_ the visitor methods, of course, uses pattern matching, but the pattern itself is not materially different from C++.

Exhaustiveness checking for pattern matching is also "best effort" for complex matching.

102. jonahx ◴[] No.45084683{6}[source]
In a sense, all of these coding practices -- whether restricting goto to loops and conditionals, which has broad acceptance these days, or avoiding else to "keep the happy left", or anything else in a coding style guide -- are just doing one thing: restricting the language to a smaller subset of itself.

And in general the primary benefit of such restriction is to reduce cognitive load. Scheme is easier than C++. The downside of such restriction is loss of expressiveness. Whether the net benefit is good depends on how these two things trade off. Experience and developer preference are inputs to that equation, which is why devs fight over coding guidelines. But I think it's helpful to boil it down in this way at a high level.

The ideal is a smaller language where the expressiveness you've cut away is only rarely useful, and often error-prone.

103. hakunin ◴[] No.45085144{7}[source]
Yeah, this does require a certain team culture building effort. Just starting cold without any expectation-setting might not be received well.

One "rule" I try to meta-promote is — working code is the first step, and a great foundation to then proceed to clear and maintainable code.

Another, is that code reviews are first-class citizens deserving mindfulness.

104. wrs ◴[] No.45085783{3}[source]
AbeBooks has been a subsidiary of Amazon since 2008. (Sure, this wouldn’t give Jeff B. “all the money”, but neither would that seller’s listing on Amazon.)
105. exclipy ◴[] No.45088387{3}[source]
I actually think this is antithetical to the philosophy. Cyclomatic complexity is very much not the same as "is this code difficult to understand".

Arbitrary structure rules like "do_thing_a(); do_thing_b(); do_thing_c();" also is not unless you can explain how this helps make it easier to understand compared to say, one big function with "// DO THING A" comments.

106. chanux ◴[] No.45088932{5}[source]
Complexity has to live somewhere. The genius is in putting it in places that make things manageable, I guess.

https://ferd.ca/complexity-has-to-live-somewhere.html

107. Pannoniae ◴[] No.45091977{4}[source]
Oh yeah, 100% this, I made a homemade sincos implementation which roughly returns the right result roughly all the time. It's nice because I don't care about the exact answer (it's for randomly rotating angles for generating caves, terrain generation) and it's like 5x as fast as doing it properly!
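
To give an idea of the kind of trade-off, here is a sketch of one common way to do it, a lookup table. This is an illustration of the general technique and not my actual implementation; the table size is an arbitrary assumption, and in Python it will not show the speedup (the point is the accuracy trade, not the timing).

```python
import math

# Assumed table size; more entries means less error and more memory.
TABLE_SIZE = 256
_SIN_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def fast_sincos(angle: float) -> tuple:
    """Approximate (sin, cos) of an angle in radians via table lookup."""
    # Snap the angle to a table slot; wraps around for any input.
    index = int(angle * TABLE_SIZE / (2 * math.pi)) % TABLE_SIZE
    # cos(x) == sin(x + pi/2), i.e. a quarter of the table further on.
    quarter = TABLE_SIZE // 4
    return _SIN_TABLE[index], _SIN_TABLE[(index + quarter) % TABLE_SIZE]
```

With 256 entries the angle is off by less than one table step (about 0.025 radians), so the results are off by at most about that much, which is plenty for randomly rotating things.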