    1455 points nromiun | 41 comments
    exclipy ◴[] No.45077894[source]
    This was my main takeaway from A Philosophy Of Software Design by John Ousterhout. It is the best book on this subject and I recommend it to every software developer.

    Basically, you should aim to minimise complexity in software design, but importantly, complexity is defined as "how difficult is it to make changes to it". "How difficult" is largely determined by the amount of cognitive load necessary to understand it.

    replies(11): >>45077906 #>>45077954 #>>45078135 #>>45078497 #>>45078728 #>>45078760 #>>45078826 #>>45078970 #>>45079961 #>>45080019 #>>45082718 #
    1. bsenftner ◴[] No.45077954[source]
    Which is why I consider DRY (Don't Repeat Yourself) to be an anti-rule until an application is fairly well understood and multiple versions exist. DO repeat yourself, and do not create some smart version of what you think the problem is before you're attempting the 3rd version. Version 1 is how you figure out the problem space, version 2 is how you figure out your solution as a maintainable dynamic thing within a changing tech landscape, and version 3 is when DRY is looked at for the first time for that application.
    replies(5): >>45078178 #>>45078299 #>>45078606 #>>45078696 #>>45079410 #
    2. hinkley ◴[] No.45078178[source]
    Some people use a gardening metaphor for code, and I think that since code is from and for humans, that’s not a terrible analogy. It’s organic by origin if not by nature.

    When you’re dealing with perennial plants, there’s only so much control you actually have, and there’s a list of things you know you have to do with them but you cannot do them all at once. There is what you need to do now, what you need to do next year, and a theory of what you’ll do over the next five years. And two years into any five year plan, the five year plan has completely changed. You’re hedging your bets.

    Traditional Formal English and French gardens try to “master” the plants. Force them to behave to an exacting standard. It’s only possible with both a high degree of skill and a vast pool of labor. They aren’t really about nature, or food. They’re displays of opulence. They are conspicuous consumption. They are keeping up with the Joneses. Some people love that about them. More practical people see it as pretentious bullshit.

    I think we all know a few companies that make a bad idea work by sheer force of will and overwhelming resources.

    replies(1): >>45081213 #
    3. zahlman ◴[] No.45078299[source]
    DRY isn't about not reimplementing things; it's about not literally copying and pasting code. Which I have seen all the time, and which some might find easier now but will definitely make the system harder to change (correctly) at some point later on.
    replies(10): >>45078465 #>>45078493 #>>45078525 #>>45078789 #>>45078797 #>>45078961 #>>45079164 #>>45079325 #>>45079628 #>>45079966 #
    4. nicoburns ◴[] No.45078465[source]
    Yeah, I've seen codebases where you have several hundred line components copy-pasted multiple times with say 10-20 lines changed, and you literally have to diff the files to find out why there are several.

    This is unhelpful even if the design is a complete mess.

    5. ryeats ◴[] No.45078493[source]
    This is a trap junior devs fall into. DRY isn't free, and it can be premature optimization: in order to avoid copying code you often add an abstraction AND couple together components that are logically separate. The issue is that at some point they may have slightly different requirements, and if this is done repeatedly you can end up with all these small layers of abstraction that are cross-cutting concerns, where making changes has a bigger blast radius than you can easily intuit.
    replies(5): >>45078671 #>>45078700 #>>45079551 #>>45080482 #>>45081244 #
    6. ◴[] No.45078525[source]
    7. martinpw ◴[] No.45078606[source]
    Closely related to the Rule of Three - ok to duplicate once, but if it is needed a third time, consider refactoring: https://en.wikipedia.org/wiki/Rule_of_three_(computer_progra...

    I think it's a pretty good compromise. I have tried in the past not to duplicate code at all, and it often ends up more pain than gain. Allow copy/paste if code is needed in two different places, but refactor if needed in three or more, is a pretty good rule of thumb.

    replies(4): >>45078751 #>>45078801 #>>45080238 #>>45080568 #
    8. rkomorn ◴[] No.45078671{3}[source]
    The reverse of that is people introducing bugs because code that wasn't DRY enough was only changed in some of the places that needed to be changed instead of all the places.

    To me, it's the things that are specifically intended to behave the same that should be kept DRY.

    replies(2): >>45079743 #>>45081298 #
    9. tialaramex ◴[] No.45078696[source]
    I think more than a few people have recommended waiting until the 3rd or 4th X before you say OK, Don't Repeat Yourself we need to factor this out. That's where my rule of thumb is too.

    Deliberately going earlier makes sense if experience teaches you there will be 3+ of this eventually, but the point where I'm going to pick "Decline" and write that you need to fix this first is when I see you've repeated something 4-5 times, that's too many, we have machines to do repetition for us, have the machine do it.

    An EnableEditor function? OK, meaningful name. EnablePublisher? Hmm, yes I understand the name scheme but I get a bad feeling. EnableCoAuthor? Approved with a stern note to reconsider, are we really never adding more of these, is there really some reason you can't factor this out? EnableAuditor. No. Stop, this function is named Enable and it takes a Role, do not copy-paste and change the names.
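A minimal sketch of the refactoring described above, assuming a hypothetical `Role` enum and a hypothetical `enable` function (neither name comes from a real codebase). It shows one function that takes the role as data instead of four copy-pasted functions that differ only in name.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical roles, mirroring the function names in the comment above.
    EDITOR = "editor"
    PUBLISHER = "publisher"
    CO_AUTHOR = "co_author"
    AUDITOR = "auditor"


def enable(enabled_roles: set[Role], role: Role) -> set[Role]:
    """Return a new set of enabled roles that also contains `role`."""
    return enabled_roles | {role}


# Adding a fifth role means adding one enum member, not a fifth function.
roles = enable(set(), Role.EDITOR)
roles = enable(roles, Role.AUDITOR)
```

Whether this is the right shape depends on whether the per-role functions really are identical apart from the name, which is the judgement call the thread is arguing about.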

    10. zahlman ◴[] No.45078700{3}[source]
    If you notice that two parts of the code look similar, but have a good reason not to merge or refactor, that deserves a signpost comment.

    If you're copying and pasting something, there probably isn't a good reason for that. (The best common reason I can think of is "the language / framework demands so much boilerplate to reuse this little bit of code that it's a net loss" — which is still a bad feeling.)

    If you rewrite something without noticing that you're doing so, something has definitely gone wrong.

    If a client's requirements change to the point where you can't accommodate them in the nicely refactored function (or to the point where doing so would create an abomination) — then you can make the separate, similar looking version.

    replies(2): >>45078878 #>>45079785 #
    11. rekrsiv ◴[] No.45078751[source]
    On the other hand, just because you know you're going to have to refactor, doesn't mean you should start refactoring once you reach three; you might not yet know the ideal shape for this code until many more duplications.
    12. stevage ◴[] No.45078789[source]
    Copying and pasting code is often fine, particularly when you make a change to one of the copies.

    Over time I have come to prefer having two near copies that are each more concretely expressive of their task than a more abstract version that caters to both.

    13. YZF ◴[] No.45078797[source]
    But sometimes you should copy and paste code because those different pieces of code can evolve independently. Knowing when to do this and when not to do this is what we do and no rule can blindly say one way or the other.

    Even the most obvious of functions like sin() and cos() may in some circumstances warrant a specialized implementation. Sure, for most stuff you should not have 10 copies of those all over the place. But sometimes you might.

    DRY is a bad rule. The more appropriate rule is avoid duplicating code when not doing so results in something better. I.e. judgement always trumps rules.

    replies(1): >>45091977 #
    14. stevage ◴[] No.45078801[source]
    Agreed, it works pretty well for me.

    The hard edge case is when you have a thing that needs to be duplicated along two axes. So now you have two pairs of things, four total. Four simple things or one complex thing.

    15. smallnamespace ◴[] No.45078878{4}[source]
    > If you're copying and pasting something, there probably isn't a good reason for that.

    I would embrace copying and pasting for functionality that I want to be identical in two places right now, but I’m not sure ought to be identical in the future.

    replies(1): >>45082725 #
    16. drbojingle ◴[] No.45078961[source]
    I completely disagree. Sometimes it makes things harder but not 100% of the time.

    Sometimes things are only the same temporarily and shouldn't be brought together.

    17. AstroBen ◴[] No.45079164[source]
    DRY is about concepts, not characters. Don't have multiple implementations of a concept

    If you choose to not copy paste the code you better be damn sure the two places that use it are relying on the same concept, not just superficially similar code that's yet to diverge

    18. ori_b ◴[] No.45079325[source]
    Not copy pasting code also makes it harder to change the system correctly at some point later on, because you transformed a local decision ("does this code do what the caller needs?") onto a global one ("does this code do what any possible caller needs, including across code maintained by other teams?")

    There's no one rule. It takes experience and taste to make good guesses, and you'll often be wrong even so.

    replies(1): >>45081240 #
    19. cyberax ◴[] No.45079410[source]
    DRY means something completely different. It means that there should be just one source of truth.

    Example: you have a config defined as Java/Go classes/structures. You want to check that the config file has the correct syntax. Non-DRY strategy is to describe its structure in an XSD schema (ok, ok JSON schema) and then validate the config. So you end up with two sources of truth: the schema and Java/Go classes, they can drift apart and cause problems.

    The DRY way is to generate the classes/structures that define the config from that schema.
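A minimal sketch of the single-source-of-truth idea, translated to Python rather than Java/Go. The schema contents and the names `CONFIG_SCHEMA`, `Config` and `load_config` are assumptions made up for illustration. The class is generated from the schema with the standard library's `dataclasses.make_dataclass`, so the structure and the validation cannot drift apart.

```python
from dataclasses import make_dataclass

# The one source of truth: field names and their expected types (hypothetical).
CONFIG_SCHEMA = {"host": str, "port": int, "debug": bool}

# The config class is generated from the schema instead of being written by hand.
Config = make_dataclass("Config", list(CONFIG_SCHEMA.items()))


def load_config(raw: dict) -> "Config":
    """Validate `raw` against CONFIG_SCHEMA and build a Config from it."""
    unknown = set(raw) - set(CONFIG_SCHEMA)
    missing = set(CONFIG_SCHEMA) - set(raw)
    if unknown or missing:
        raise ValueError(
            f"unknown keys: {sorted(unknown)}, missing keys: {sorted(missing)}"
        )
    for key, expected in CONFIG_SCHEMA.items():
        # Exact type match, so that True is not accepted where an int is expected.
        if type(raw[key]) is not expected:
            raise TypeError(f"{key} must be of type {expected.__name__}")
    return Config(**raw)
```

Adding a field to `CONFIG_SCHEMA` changes both the generated class and the validation in one place, which is the property the comment is after.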

    20. fenomas ◴[] No.45079551{3}[source]
    All my younger colleagues have heard my catchphrase:

    Copy-paste is free; abstractions are expensive.

    replies(1): >>45079892 #
    21. MrDarcy ◴[] No.45079628[source]
    I’ll bite. We’re expanding into Europe. I literally copied and pasted our entire infrastructure into a new folder named “Europe”

    Now there’s a new requirement that only applies to Europe and nowhere else and it’s super easy and straight forward to change the infrastructure.

    I don’t see how it was a poor choice to literally copy and paste configs that result in hundreds of thousands of lines of yaml and I have 25 yoe.

    replies(2): >>45079739 #>>45081418 #
    22. chipsrafferty ◴[] No.45079739{3}[source]
    I think the most common way to approach that problem would be to have a "default config", and overrides. Could you go into more detail about why you didn't do this instead?

    The downsides of your approach are:

    1. Now whenever you want to change something both in Europe and (assuming) USA you have to do it in 2 places. If the change is the same for both, in my system, you could just update the default/shared config. If the change is different for both it's equally easy, but faster, since the overrides are smaller files.

    2. It's not clear what the difference is between Europe and USA if there is 1 line different amongst thousands. If there are more differences in the future, it becomes increasingly difficult to tell the difference easily.

    3. If in the future you also need to add Africa, you just compounded the problems of 1. and 2.
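A minimal sketch of the "default config plus overrides" approach proposed above, not a recommendation of it. All keys, values and region names are hypothetical. Each region stores only what differs from the shared default, and a recursive merge produces the effective config.

```python
import copy

# Shared settings (hypothetical values).
DEFAULT_CONFIG = {
    "replicas": 3,
    "log_level": "info",
    "storage": {"encrypted": True, "retention_days": 30},
}

# Each region lists only its differences from the default.
REGION_OVERRIDES = {
    "usa": {},
    "europe": {"storage": {"retention_days": 90}},
}


def merge(base: dict, override: dict) -> dict:
    """Return a copy of `base` with `override` applied, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def config_for(region: str) -> dict:
    """Effective config for a region: defaults plus that region's overrides."""
    return merge(DEFAULT_CONFIG, REGION_OVERRIDES[region])
```

The override file makes the regional difference visible at a glance, at the cost of a merge step that a reader has to run in their head.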

    replies(1): >>45080014 #
    23. sroerick ◴[] No.45079743{4}[source]
    This is the correct take - if you're getting this type of bug, it's now past time for DRY
    24. chipsrafferty ◴[] No.45079785{4}[source]
    I don't think it's as cut and dry as that. In my team we require 100% test coverage. Every file requires an accompanying test file, and every test file is set up with a bunch of mocks.

    Sure, we could take the Foo, Bar, and Baz tables that share 80-90% of common logic and have them inherit from a common, shared, abstract component. We've discussed it in the past. Maybe it's the better solution, maybe not. But it would mean that instead of maintaining 3 component files and 3 test files, which are very similar, and when we need to change something it is often a copy-paste job, instead we'd have to maintain 2 additional files for the shared component, and when that has to change, it would require more work as we then have to add more to the other 3 files.

    Such setups can often cause a cascade of tests that need updating and PRs with dozens of files changed.

    Also, there are many parts of our project where things could be done much better if we were making them from scratch. But, 6 years of changing requirements and new features and this is what we have - and at this point, I'm not sure that having a shared component would actually make things easier unless we rewrite a huge amount of the codebase, for which there is no business reason.

    replies(1): >>45080638 #
    25. wilkystyle ◴[] No.45079892{4}[source]
    One of the many great takeaways from Sandi Metz's talk at Railsconf 2014: "Duplication is far cheaper than the wrong abstraction."

    https://www.youtube.com/watch?v=8bZh5LMaSmE

    Worth watching in its entirety, but the quote is from ~13:59 in that video.

    replies(1): >>45080260 #
    26. hansvm ◴[] No.45079966[source]
    A subtlety still exists there. Copy-pasting is fine. What you're trying to prevent with DRY is two physical locations in your codebase referring to the same semantic context (i.e., when you should change "the thing" you have to remember to change "all the places").

    Somewhat off-topic, that's one usual failure mode of "DRY" code. Code is de-duplicated at a visual level rather than in terms of relevant semantics, so that changes which should only affect one path either affect both or are very complicated to reason about because of the unnecessary coupling.

    27. MrDarcy ◴[] No.45080014{4}[source]
    I don’t do this because with a complete copy I get progressive rollouts across regions without the complexity of if statements and feature flags. That is to say, making the change twice is a feature not a bug when the changes are staggered in time.

    From an operational perspective it’s much more important to ensure the code is clear and readable during an incident.

    Overrides are like inheritance. They are themselves complex and add unnecessary cognitive load.

    Composition is better for the common pieces that never change across regions. Think of an import statement of a common package into both the Europe and North America folders.

    I easily see the one line diff among hundreds of thousands using… diff.

    Regarding Africa, we’ve established 1 is a feature and 2 is a non issue, so I’d copy it again.

    This approach scales both as the team scales and as the infrastructure scales. Teammates can read and comprehend much more easily than hierarchies of overrides, and changes are naturally scoped to pieces of the whole.

    replies(1): >>45081345 #
    28. boredtofears ◴[] No.45080238[source]
    And just like the rule that it replaced, the rule of three is now often interpreted as the "correct" approach always, while I still find reality to be more nuanced.

    Sometimes you do have the domain expertise to make the judgment call.

    A recent example that comes to mind is a payment calculation. You can go ahead and tie that up in a nice reusable function from the get go - if you've ever dealt with a bug where payment calculations appeared different in some places and it somehow made it in front of a customer you're well aware of how painful this can be. For some things having a single source of truth outweighs any negatives associated with refactoring.

    29. drivers99 ◴[] No.45080260{5}[source]
    The related blog post (I just found thanks to watching that and then searching for her site) is great too: https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction

    It explains so much of what has been bothering me about what I work on at work, and now I understand why and some of what to do about it.

    30. kragen ◴[] No.45080482{3}[source]
    DRY isn't an optimization of any kind, so it can't be a premature optimization. "Premature optimization" is a specific failure mode of programmers, not just a meaningless term you can use to attack anything you don't like. "Optimization" is refactoring to reduce the use of resources (which are specifically cycles and bytes) and it's "premature" when you don't yet know that you're doing it where it matters.

    Otherwise I mostly agree.

    31. jaredsohn ◴[] No.45080568[source]
    also called WET (write everything twice or write everything thrice)
    32. matijsvzuijlen ◴[] No.45080638{5}[source]
    I can understand requiring 100% test coverage, but it seems to me that requiring a test file for every file is preventing your team from doing useful refactoring.

    What made your team decide on that rule? Could your team decide to drop it since it hinders improving the design of your code?

    33. ahartmetz ◴[] No.45081213[source]
    What you say seems much more true about traditional French than English gardens tbh. The French style is a very simplistic demonstration of bending nature to the human will. The English style seems to be more about reproducing an overly quaint image of "natural" landscapes (there are very few of these in Europe), which I find much more pleasant in idea and result.
    34. n4r9 ◴[] No.45081240{3}[source]
    It depends greatly on the situation. If you have five different methods for fetching WidgetInfo from the database and a requirement comes in to add TextProperty to Widget in all views, you're more likely to accidentally miss one of the places that needed a change.

    Likewise if someone notices a bug in the method, you then have to go through and figure out which copies have the same bug, and fix each one, and QA test each one separately.

    The proper approach is to make a judgement call based on how naturally generic the method is, and whether or not the existing use cases require custom behaviours of it (now or in the near future).

    35. seadan83 ◴[] No.45081244{3}[source]
    Indeed a trap. I'd say DRY is all about not duplicating logical components. Just because two pieces of code look similar, does not mean they need to be combined.

    As an analogy, when writing a book, it's the difference between not repeating the opening plot of the story multiple times and replacing every instance of the word "the" with a new symbol.

    36. pkolaczk ◴[] No.45081298{4}[source]
    An obvious example of that is defining named constants and referring to them by name instead of repeating the same value in N places. This is also DRY, and the good kind of DRY.
    replies(1): >>45082660 #
    37. seadan83 ◴[] No.45081345{5}[source]
    The "rules" for config are different. Code, test code, and config are different, their complexity scales in different ways of course.

    By way of analogy for why the two configs are different: two beaches are not the same just because they both have very similar sand.

    You really have two different configs. You also have one set of configs. You didn't set up an application that also fetches some config that is already provided. It would be like having a test flag in both config and database: same flag, two places.

    Where config duplication goes bad is when repeatedly the same change is made across all N, local variations have to be reconciled each time and it is N sets of testing you need to do. Something like that in code is potentially more complex, more obviously a duplication of a module, just more likely to be a problem overall.

    38. tasuki ◴[] No.45081418{3}[source]
    > I don’t see how it was a poor choice to literally copy and paste configs that result in hundreds of thousands of lines of yaml

    Perhaps one day you will. I'm a dev who worked with infra people who had your philosophy: many copy pasted config files.

    Sometimes I needed to add an env var to a service. Expressing "default to false and only set it to true in these three environments" took changing about 30 files. I always made mistakes (usually of omission), and the infra people only ever caught them at deployment time. It was hell.

    39. ryeats ◴[] No.45082660{5}[source]
    This is actually a particular pet peeve of mine, because I worked with the Camel framework, which has a lot of boilerplate in strings, but if you start using constants for the common parts you now have an unreadable mess of constants concatenated together that buys you nothing.
    40. fauigerzigerk ◴[] No.45082725{5}[source]
    I agree completely. DRY shouldn't be a compression algorithm.

    If two countries happen to calculate some tax in the same way at a particular time, I'm still going to keep those functions separate, because the rules are made by two different parliaments independently of each other.

    Referring to the same function would simply be an incorrect abstraction. It would suggest that one tax calculation should change whenever the other changes.

    If, on the other hand, both countries were referring to a common international standard then I would use a shared function to mirror the reference/dependency that they decided to put into their respective laws.
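A minimal sketch of that distinction. The countries, rates and function names are all hypothetical. The two tax functions have identical bodies today but stay separate because they can change independently, while the two levy functions share one implementation because both refer to the same standard.

```python
from decimal import Decimal


def tax_country_a(amount: Decimal) -> Decimal:
    # Set by country A's parliament; 20% is a made-up rate.
    return amount * Decimal("0.20")


def tax_country_b(amount: Decimal) -> Decimal:
    # Set by country B's parliament; the same made-up rate today, by coincidence.
    return amount * Decimal("0.20")


def standard_levy(amount: Decimal) -> Decimal:
    # A hypothetical international standard that both laws refer to.
    return amount * Decimal("0.05")


def levy_country_a(amount: Decimal) -> Decimal:
    # Country A's law defers to the standard, so the code does too.
    return standard_levy(amount)


def levy_country_b(amount: Decimal) -> Decimal:
    # Country B's law defers to the same standard.
    return standard_levy(amount)
```

If country B changes its rate, only `tax_country_b` changes. If the standard changes, both levies follow automatically, mirroring the dependency written into the laws.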

    41. Pannoniae ◴[] No.45091977{3}[source]
    Oh yeah, 100% this, I made a homemade sincos implementation which roughly returns the right result roughly all the time. It's nice because I don't care about the exact answer (it's for randomly rotating angles for generating caves, terrain generation) and it's like 5x as fast as doing it properly!
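A minimal sketch of one common way to trade accuracy for speed, a precomputed lookup table. This is not the commenter's implementation, which the thread does not show, and `TABLE_SIZE` and `fast_sincos` are made-up names. In Python this only illustrates the accuracy trade-off; any speedup would come from a similar approach in a compiled language.

```python
import math

TABLE_SIZE = 4096  # More entries give a smaller error and a larger table.
SIN_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def fast_sincos(angle: float) -> tuple[float, float]:
    """Approximate (sin, cos) of `angle` in radians via table lookup."""
    # Map the angle onto a table slot; % keeps negative angles in range.
    index = math.floor(angle * TABLE_SIZE / (2 * math.pi)) % TABLE_SIZE
    sin_value = SIN_TABLE[index]
    # cos(x) = sin(x + pi/2), which is a quarter of the table further on.
    cos_value = SIN_TABLE[(index + TABLE_SIZE // 4) % TABLE_SIZE]
    return sin_value, cos_value
```

With 4096 entries the error stays within roughly one table step (2π/4096, about 0.0015), which is plenty for randomly rotating angles and far too coarse for anything that needs exact results.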