361 points mmphosis | 43 comments
leetrout ◴[] No.42165704[source]
> It's better to have some wonky parameterization than it is to have multiple implementations of nearly the same thing. Improving the parameters will be easier than to consolidate four different implementations if this situation comes up again.

Hard disagree. If you can't decompose to avoid "wonky parameters" then keep them separate. A big smell is boolean flags (avoid them altogether when you can) and more than one enum parameter.

IME, "heavy" function signatures always make things harder to maintain.
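
A minimal sketch of the smell described above, in Python. All names (`export_report`, `export_csv`, `export_summary`) are hypothetical illustrations, not taken from the article. The first function folds several behaviours behind boolean flags; the functions after it keep the behaviours separate and share only the small helper they truly have in common.

```python
from typing import Iterable


def export_report(rows: Iterable[dict], as_csv: bool = False, summary_only: bool = False) -> str:
    """Flag-heavy version: every caller must know which flag combinations make sense."""
    rows = list(rows)
    if summary_only:
        total = sum(r["amount"] for r in rows)
        return f"total,{total}" if as_csv else f"Total: {total}"
    if as_csv:
        return "\n".join(f"{r['name']},{r['amount']}" for r in rows)
    return "\n".join(f"{r['name']}: {r['amount']}" for r in rows)


def _total(rows: Iterable[dict]) -> int:
    """The one piece of logic the variants really share."""
    return sum(r["amount"] for r in rows)


def export_csv(rows: Iterable[dict]) -> str:
    """Decomposed version: one function per behaviour, no flags."""
    return "\n".join(f"{r['name']},{r['amount']}" for r in rows)


def export_summary(rows: Iterable[dict]) -> str:
    """Decomposed version of the summary output."""
    return f"Total: {_total(rows)}"
```

With two flags there are already four code paths to test in `export_report`; each decomposed function has exactly one.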

replies(17): >>42165868 #>>42165902 #>>42166004 #>>42166217 #>>42166363 #>>42166370 #>>42166579 #>>42166774 #>>42167282 #>>42167534 #>>42167823 #>>42168263 #>>42168489 #>>42168888 #>>42169453 #>>42169755 #>>42171152 #
1. thfuran ◴[] No.42165868[source]
I think it's especially bad advice with the "copy paste once is okay". You absolutely do not want multiple (even just two) copies of what's meant to be exactly the same functionality, since now they can accidentally evolve separately. But coupling together things that only happen to be mostly similar even at the expense of complicating their implementation and interface just makes things harder to reason about and work with.
replies(7): >>42166007 #>>42166141 #>>42166159 #>>42166278 #>>42166385 #>>42166712 #>>42187622 #
2. atoav ◴[] No.42166007[source]
My experience is totally different. Sure, the popular beginner's advice is to never repeat yourself, but in many cases repeating yourself can actually be a viable option, especially when you are okay with functions drifting apart or the cases they handle are allowed to differ.

And that happens.

The beginner's problem lies in the reasons why that happens. Very often the reason is that someone didn't really think about their argument and return data types, how functions access the context data they need, how to return when functions can error in multiple ways, etc. If you find yourself reimplementing the same thing twice because of that, then sure, you shouldn't. What you should do is go back and think harder about how data is supposed to flow.

But if you have a data flow that you are very confident in and you need to do two things that differ only slightly, just copy and paste the code into two distinct functions, as this is what you want in some cases.

Dogmatism gets you only so far in programming.

replies(2): >>42167672 #>>42167872 #
3. jajko ◴[] No.42166141[source]
The problem is that such decisions are made at the beginning of the project, when you are far from having the full picture. Then comes the rest of the app lifecycle: decade(s) of changes, bugfixes, replatformings, data/OS/cluster migrations and so on.

I've seen, and even currently work on, stuff that has beautiful but hard-to-grok abstractions all over the place (the typical result of unsupervised brilliant juniors' work: technical debt in gigatons down the line, but it's almost always other people's problem). The thing is, that code has seen 10 major projects and absorbed other stuff, the meaning and structure of the data changed a few times, other systems kept evolving, etc.

Now all those abstractions are proper hell to navigate when making any meaningful change. Of course, another typical result of the brilliant junior with a 5-second attention span is a complete lack of documentation. So you see stuff happening, but have no idea why or why not, what it means down the line in other systems, why such choices were made and so on.

These days I've had enough of the any-design-patterns-at-all-costs Kool-Aid and over-engineered cathedrals for rather trivial stuff (I think it's mostly down to the anxious ego issue, but that's for another discussion). I am more than happy to copy&paste stuff even 20x if it makes sense at that place. And it does surprisingly often. Yes, it's very uncool and I won't brag about it at my next job interview, but it keeps things refreshingly and boringly stable, and surprisingly also easier to change and to test the consequences of, and somehow that's priority #1 for most companies.

4. ninkendo ◴[] No.42166159[source]
Every time you consider copy pasting, you should be asking yourself “if the stuff I’m pasting needs to change, will I want both of these places to change?” It requires some guessing the future, but usually it’s not hard to answer the question.

IME if something should be an independent function or module, I rarely get to the point of considering copy/pasting it in the first place. If I want to copy/paste, it’s usually because the two places only incidentally need the same code right now, and my gut usually tells me that it will no longer be the case if I have to make any sort of change.

replies(2): >>42166595 #>>42167550 #
5. charles_f ◴[] No.42166278[source]
That's not entirely true. The difference between intentional and accidental repetition is that the first occurs because the rule is the same in both places, and should stay the same, whereas the second merely happens to be the same for now. By not repeating yourself in the second case, you actually risk changing an operation that should remain the same, as a side effect of changing the common function to alter the behaviour of the other one.

That's why DRY is a smell (indicates that something might be wrong) and not a rule.
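
To make the accidental case concrete, here is a small Python sketch. The rules and names are hypothetical: two limits that happen to be 20 today for unrelated reasons.

```python
# Hypothetical rules: both limits are 20 today, but for unrelated reasons.
MAX_USERNAME_LEN = 20  # product decision, may change with the sign-up form
MAX_SKU_LEN = 20       # fixed by an (assumed) warehouse label format


def is_valid_username(name: str) -> bool:
    """Username rule; free to change without touching SKUs."""
    return 0 < len(name) <= MAX_USERNAME_LEN


def is_valid_sku(sku: str) -> bool:
    """SKU rule; looks identical today, but is a different piece of knowledge."""
    return 0 < len(sku) <= MAX_SKU_LEN
```

The two bodies are textually duplicated, yet merging them into one shared helper would mean that raising the username limit silently changes SKU validation as well.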

6. chipdart ◴[] No.42166385[source]
> I think it's especially bad advice with the "copy paste once is okay". You absolutely do not want multiple (even just two) copies of what's meant to be exactly the same functionality, since now they can accidentally evolve separately.

Hard disagree. Your type of misconception is the root cause of most broken and unmaintainable projects, and the root of most technical debt and accidental complexity.

People who follow that simplistic logic of "code can accidentally evolve separately" are completely oblivious to the fact that there is seemingly duplicate code which is only incidentally duplicate, but at its core should clearly be and remain completely decoupled.

More to the point, refactoring two member functions that are mostly the same is far simpler than refactoring N classes and interfaces registered in dependency injection systems required to DRY up code.

I've lost count of the times I had to stop shortsighted junior developers who had completely lost track of what they were doing and, with a straight face, were citing DRY to justify adding three classes and an interface to implement a strategy pattern, because that way they would avoid adding a duplicate method. Absurd.

People would be far better off if, instead of mindlessly parroting DRY, they looked at what they are doing and understood that premature abstractions cause far more problems than they solve (if they solve any).

Newbie, inexperienced developers write complex code. Experienced, seasoned developers write simple code. Knowing the importance of having duplicate code is a key factor.

replies(5): >>42166615 #>>42167259 #>>42167267 #>>42168379 #>>42169272 #
7. mewpmewp2 ◴[] No.42166595[source]
Early in my career I started out really DRY. In my experience, and not just with the code I wrote, it led to various issues down the line with unmaintainable edge cases, especially if many teams are working on those things. It becomes really hard to support at some point. Now I feel much better making things DRY only when it is really obvious that they should be.
replies(1): >>42167806 #
8. l33t7332273 ◴[] No.42166615[source]
> Newbie, inexperienced developers write complex code. Experienced, seasoned developers write simple code

This is a really inaccurate generalization. Maybe you could say something about excess complexity, but all problems have some level of irreducible complexity that code fundamentally has to reflect.

replies(2): >>42167156 #>>42167460 #
9. ikrenji ◴[] No.42166712[source]
DRY fanaticism is just as bad as not thinking about DRY at all
10. necovek ◴[] No.42167156{3}[source]
Nope, it is not inaccurate — but you are not wrong either.

Obviously, code will reflect the complexity of the problem.

But incidentally, most problems we solve with code are not that hard, yet most code is extremely complex — a lot more complex than the complexity inherent to the problem. And that's where you can tell an experienced, seasoned (and smart) developer who'd write code that's only complex where it needs to be, from an inexperienced one where code will be complex so it appears "smart".

replies(1): >>42174599 #
11. stouset ◴[] No.42167259[source]
Developers of all stripes write overly complex code because they don’t know how to abstract, so they either overdo it, under-do it, or just do it badly.

Writing good abstractions is hard and takes practice. Unfortunately the current zeitgeist has (IMO) swung too hard the wrong way, with guiding mantras like “explicitness”, which is misinterpreted to mean inline all the logic and expose all the details everywhere all the time, and “worse is better”, which is misinterpreted to justify straight up bad designs / implementations in the name of not overthinking things, instead of good-but-imperfect ones.

The knee-jerk response against abstraction has led the majority of even seasoned, experienced developers to write overly complex code because they’ve spent a career failing to learn how to abstract. I’d rather we as an industry figure out what makes a quality abstraction and give guidance to junior developers so they learn how to do so responsibly, instead of throwing up our hands and acting like it’s impossible. This despite literally all of computing having been built upon a tower of countless abstractions that let us conveniently forget the fact that we’re actually juggling electrons around on rocks.

12. twic ◴[] No.42167267[source]
What thfuran said was:

> You absolutely do not want multiple (even just two) copies of what's meant to be exactly the same functionality, since now they can accidentally evolve separately. But coupling together things that only happen to be mostly similar even at the expense of complicating their implementation and interface just makes things harder to reason about and work with.

So, if things are fundamentally the same, do not duplicate, but if they are fundamentally different, do not unify. This is absolutely correct.

To which you replied:

> People who follow that simplistic logic of "code can accidentally evolve separately" are completely oblivious to the fact that there is seemingly duplicate code which is only incidentally duplicate, but at its core should clearly be and remain completely decoupled.

Despite the fact that this is exactly what the comment you replied to says.

Then you go on a clearly very deeply felt rant about overcomplication via dependency injection and architecture astronautics and so on. Preach it! But this is also nothing to do with what thfuran wrote.

> Newbie, inexperienced developers write complex code. Experienced, seasoned developers write simple code.

Sounds like the kind of overgeneralisation that overconfident mid-career developers make to me.

replies(2): >>42167782 #>>42168986 #
13. ChrisMarshallNY ◴[] No.42167460{3}[source]
Don't look at the code I just wrote (populating a user list with avatars, downloaded via background threads). It might cause trauma.

The last couple of days have been annoying, but I got it to work; just not as easily as I wanted. The platform, itself, has limitations, and I needed to find these, by banging into them, and coding around them, which is ugly.

14. hinkley ◴[] No.42167550[source]
And usually the answer stops becoming a guess at 3. I’ve certainly had enough experiences where we had 2 and 3 in the backlog and no matter how we tried, #3 always required as much or more work than #2 because we guessed wrong and it would have been faster to slam out #2 and let #3 be the expensive one.
15. wruza ◴[] No.42167672[source]
I think it’s our tooling that sucks, not us. We only have functions and duplicated code, but there’s no named-common-block idea, something one could insert and edit, and then:

1) see how it differs from the original immediately next time

2) other devs would see that it’s not just code, but a part of a common block, and follow ideas from it

3) changes to the original block would be merge-compatible downwards (and actually pending)

4) can eject code from this hierarchy in case it completely diverges and cannot be maintained as a part of it anymore

Instead we generate this thread over and over again, but no one can define “good {structure,design,circumstances}” etc. It’s all at the “feeling” level, and doing it one way or the other in the clueless beginning makes it hard to change later.

replies(2): >>42170174 #>>42171430 #
16. deely3 ◴[] No.42167782{3}[source]
The issue is that you never really know if things are fundamentally the same. To know that, you would have to know the future.
replies(4): >>42168392 #>>42168533 #>>42168831 #>>42169889 #
17. dllthomas ◴[] No.42167806{3}[source]
> I started out really DRY

When you say "DRY" here, would you say you had familiarity with the original definition, or merely what you (quite understandably) inferred from the acronym? Because I think the formulation in The Pragmatic Programmer is pretty spot on in speaking about not repeating "pieces of information", whereas I find in practice most people are reacting to superficial similarity (which may or may not reflect a deeper connection).

replies(1): >>42168241 #
18. dllthomas ◴[] No.42167872[source]
I think a part of the problem is that in addition to being a well regarded principle with a good pedigree, "DRY" is both catchy and (unlike SOLID or similar) seems self explanatory. The natural interpretation, however, doesn't really match what was written in The Pragmatic Programmer, where it doesn't speak of duplicate code but rather duplicate "pieces of information". If "you are okay with functions drifting apart or the cases they handle are allowed to differ" then the two functions really don't represent the same piece of information, and collapsing them may be better or worse but it is no more DRY by that definition.

I've tried to counter-meme with the joke that collapsing superficially similar code isn't improving it, but compressing it, and that we should refer to such activity as "Huffman coding".

It's also worth noting that the focus on syntax can miss cases where DRY would recommend a change; if you are saying "there is a button here" in HTML and also in CSS and also in JS, your code isn't DRY even if those three look nothing alike (though whether the steps necessary to collapse those are worth taking will very much depend on context).
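
One way to picture the "piece of information" reading of DRY is the Python sketch below. The `Button` model and the three renderers are hypothetical; real projects would more likely reach for a component framework or codegen. The point is only that the fact "there is a button here" is stated once and the HTML, CSS and JS views are derived from it.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Button:
    """The single place that says "there is a button here" (hypothetical model)."""
    element_id: str
    label: str


def to_html(b: Button) -> str:
    """Markup view of the button."""
    return f'<button id="{escape(b.element_id)}">{escape(b.label)}</button>'


def to_css_selector(b: Button) -> str:
    """Selector a stylesheet would use for the same button."""
    return f"#{b.element_id}"


def to_js_lookup(b: Button) -> str:
    """Expression a script would use to find the same button."""
    return f'document.getElementById("{b.element_id}")'


SAVE = Button(element_id="save-btn", label="Save")
```

Renaming the id now means editing one line, even though the three outputs look nothing alike.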

replies(2): >>42170038 #>>42171545 #
19. mewpmewp2 ◴[] No.42168241{4}[source]
Looking at the definition, I do believe I wasn't referring to the original one. I didn't actually know that the original definition was specifically limited to the information/knowledge part. I have to assume there's industry wide misunderstanding on this term?

To avoid the confusion, it seems like DRY would be better named something like "Single source of truth". Because I do agree with that.

replies(1): >>42169323 #
20. dustingetz ◴[] No.42168379[source]
root cause of dysfunction is executive management, or really customer and market structure (e.g. govt procurement as an extreme example). Full stop

fwiw i agree that copy paste is fine

replies(1): >>42171273 #
21. dustingetz ◴[] No.42168392{4}[source]
or study abstract algebra (but you’re now a researcher, because programming isn’t yet solved)
22. Aeolun ◴[] No.42168533{4}[source]
I think this is what the original post that people took issue with said? By the time you write the same thing for the third time you are not predicting the future any more, you have practical evidence.
replies(1): >>42188887 #
23. Ma8ee ◴[] No.42168831{4}[source]
Not the future, but the domain.
24. djmips ◴[] No.42168986{3}[source]
To be fair thfuran was hard to decipher and should be refactored to be more clear.
25. brigandish ◴[] No.42169272[source]
If someone writes a strategy pattern to fix duplication, all power to them, it's a well understood, easy to use pattern that fixes several problems.

> adding three classes and an interface to implement a strategy pattern

Sounds like the language used is the problem here, not the intent. Hasn't Java (et al) made this easier yet?
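
For comparison, in a language with first-class functions the pattern needs no extra classes at all. A minimal Python sketch with made-up shipping rules (Java has also had lambdas and functional interfaces since Java 8, which can shrink the ceremony in a similar way):

```python
from typing import Callable

# A "strategy" here is just a function from order total to shipping cost.
ShippingStrategy = Callable[[float], float]


def flat_rate(total: float) -> float:
    """Hypothetical rule: always charge 5."""
    return 5.0


def free_over_fifty(total: float) -> float:
    """Hypothetical rule: free shipping from 50 upwards."""
    return 0.0 if total >= 50 else 5.0


def checkout_total(total: float, shipping: ShippingStrategy) -> float:
    """Shared checkout logic; the varying part is passed in as a plain function."""
    return total + shipping(total)
```

Usage is `checkout_total(20.0, flat_rate)`; adding a new rule is one more function, not a new class hierarchy.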

26. dllthomas ◴[] No.42169323{5}[source]
> I have to assume there's industry wide misunderstanding on this term?

The "misunderstanding" is at least as prevalent as the original, yes. I wasn't trying to say the original is "correct" - language is determined by usage - just wondering which you were discussing.

> To avoid the confusion, it seems like DRY would be better named something like "Single source of truth".

It could probably do with a better name, but "single source of truth" is usually about the information operated on by the program, rather than information embodied in the program.

replies(1): >>42170685 #
27. strken ◴[] No.42169889{4}[source]
"Know the future" is part of a software engineer's job description, at least insofar as "know" means "make informed predictions about".

Consider the case of making API calls to a third party. You, today, are writing a function that calls the remote API with some credentials, reauthenticates on auth failure, handles backoff when rate limited, and generates structured logs for outgoing calls.

You need to add a second API call. You're not sure whether to copy the existing code or create an abstraction. What do you do?

Well, in this case, you have a crystal ball! This is a common abstraction that can be identified in other code as well as your own. You don't know the future with 100% confidence, but it's your job to be able to make a pretty good guess using partial information.
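
A rough sketch of what that abstraction might look like in Python. Everything here is an assumption made for illustration: the exception types `AuthError` and `RateLimited`, the `call_api` signature and the log field names do not come from any real client library.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("outgoing_api")


class AuthError(Exception):
    """Hypothetical: raised by a client call when credentials have expired."""


class RateLimited(Exception):
    """Hypothetical: raised by a client call when the remote API throttles us."""


def call_api(
    name: str,
    request: Callable[[], T],
    reauthenticate: Callable[[], None],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `request`, re-authenticating on AuthError and backing off on RateLimited."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = request()
            log.info("api_call ok", extra={"api": name, "attempt": attempt})
            return result
        except AuthError:
            log.warning("api_call reauth", extra={"api": name, "attempt": attempt})
            reauthenticate()
        except RateLimited:
            delay = base_delay * 2 ** (attempt - 1)  # exponential backoff
            log.warning("api_call backoff", extra={"api": name, "delay": delay})
            sleep(delay)
    raise RuntimeError(f"{name}: gave up after {max_attempts} attempts")
```

The second API call then becomes another small `request` callable passed to `call_api`, rather than a second copy of the retry, reauth and logging code.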

28. wruza ◴[] No.42170038{3}[source]
The book assumes that you should know better, that’s the problem. You may understand it correctly and do your best, but remain unsure whether this “piece of information” is the same as that one or not, because it’s open to interpretation.
replies(1): >>42170125 #
29. dllthomas ◴[] No.42170125{4}[source]
Uncertainty as to the line between "one piece of information" and "two pieces of information" may be a problem. I don't think it makes sense to say it's "the problem" when most people don't know that DRY is formulated in those terms in the first place.

Personally, I don't think the ambiguity is actually much of a problem; often it's not ambiguous, and when it is it's usually the case that multiple ways of organizing things are reasonably appropriate and other concerns should dominate (they may need to anyway).

replies(1): >>42170513 #
30. skydhash ◴[] No.42170174{3}[source]
Smalltalk?
replies(1): >>42170551 #
31. wruza ◴[] No.42170513{5}[source]
I read your second paragraph as "vagueness is fine", which sort of makes DRY not a helpful principle but a handwavy problem statement with nothing clearly defined.

As with most vague problems, the two extreme solutions (join vs dup) are the wrong way to think about it. I have some ideas on how to turn this into a spectrum in a nearby comment.

I think it is important because the DRY-flavored problem is basically the thing you meet most in code. At least that is my experience, as a guy who hates typing out and rediscovering knowledge from slightly different code blocks or tangled multi-path procedures and refactoring these, hoping either that nothing breaks in multiple places or that you won’t forget to update that one semi-copy.

I’ve been programming for a very long time and seemingly no one has ever even tried to address this in any sensible way.

32. wruza ◴[] No.42170551{4}[source]
Sadly I can’t just go and develop systems in the Smalltalk ecosystem, those are too different boots to wear. So there’s no reason to even go and learn how it does that or a similar thing, because I’m not going to switch or implement it myself in my editor. I’m sure (and confidently so) that I’d like to see exactly what I described in editors/IDEs, and that it would make my coding life much easier.
33. mewpmewp2 ◴[] No.42170685{6}[source]
You mean it's databases rather than what is in code?

If so, then that's also news to me. I'd have thought that something like input validation code that can be reused in both backend and client would fall under single source of truth. I would always prefer that not to be repeated, but it is frequently hard to do unless you have the same language in backend and frontend, or codegen.

34. karmakurtisaani ◴[] No.42171273{3}[source]
It's, however, unhelpful to point this out, since developers cannot fix it. We need to find ways to live with this dysfunction.
replies(1): >>42177787 #
35. rileymat2 ◴[] No.42171430{3}[source]
Without the encapsulation of a function, won’t the code around the common block depend on the details of the block in ways that cause coupling, making the common block hard to change without detailed analysis of all usages?

I like what you are saying, I think, but am stuck on this internal coupling.

replies(1): >>42191382 #
36. atoav ◴[] No.42171545{3}[source]
Now this is a principle I can totally get behind. If the same information lives in multiple places in your codebase, you are definitely doing it wrong, unless that same information is just coincidentally the same and used for different purposes in different places.
37. TheCoelacanth ◴[] No.42174599{4}[source]
I think inexperienced developers write complex code because it's difficult to write simple code and they don't know how yet, not because they're trying to make it complex.
replies(2): >>42180359 #>>42180503 #
38. dustingetz ◴[] No.42177787{4}[source]
it is in fact helpful because it reveals that the problem cannot in fact be fixed at the developer layer, and having that knowledge is the first step down a road towards an actual solution rather than endless bike shedding about whether it is okay to copy paste a function body.
39. necovek ◴[] No.42180359{5}[source]
Yes, I was not trying to imply they do it on purpose, but I can see how it could be read that way.
40. chipdart ◴[] No.42180503{5}[source]
> I think inexperienced developers write complex code because it's difficult to write simple code and they don't know how yet, not because they're trying to make it complex.

From what I've been seeing, inexperienced developers write complex code because they are trained with a bias towards accidentally complex code (i.e., how else would you show off design patterns), they have no experience in dealing with the tradeoffs of writing accidentally complex code, and they do not understand the problems they create for themselves and others by adding complexity where they do not need it.

I'd frame accidental complexity in the same class as dead code: inexperienced developers might be oblivious to the risk presented by code that serves no purpose, but experienced developers know very well the ticking time bomb nature of it.

41. somethingsome ◴[] No.42187622[source]
I write research code, and doing that feels very different from writing web code, for example.

In research it is absolutely OK to copy paste any number of times, because you don't know a priori what will work the way you want.

Usually, I write an algorithm to solve my problem, then I copy paste the function and change it a bit with another idea, and set a switch to choose between them. Then I copy paste another time as the ideas are flowing, and add one more switch, etc.

At some point, when I feel that there is too much duplicated code, I abstract the parts of the functions that are similar and never change, so that I can focus only on the changes of ideas, and no longer on the mechanics of the methods.

As the code converges toward something I like, I PRUNE the code and remove all unused functions.
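
The workflow above might look roughly like this in Python. The smoothing functions are invented placeholders for "ideas"; only the shape (copy-pasted variants, a switch, a factored-out common part) is the point.

```python
from typing import Callable, Dict, List


def _normalize(xs: List[float]) -> List[float]:
    """The part that never changed between experiments, factored out once."""
    peak = max(abs(x) for x in xs) or 1.0
    return [x / peak for x in xs]


def smooth_v1(xs: List[float]) -> List[float]:
    """First idea: average each sample with its predecessor."""
    xs = _normalize(xs)
    return [xs[0]] + [(a + b) / 2 for a, b in zip(xs, xs[1:])]


def smooth_v2(xs: List[float]) -> List[float]:
    """Copy-pasted variant with a different idea: weight the current sample more."""
    xs = _normalize(xs)
    return [xs[0]] + [0.25 * a + 0.75 * b for a, b in zip(xs, xs[1:])]


# The "switch": pick a variant by name; pruning later means deleting an entry.
VARIANTS: Dict[str, Callable[[List[float]], List[float]]] = {
    "v1": smooth_v1,
    "v2": smooth_v2,
}


def run(xs: List[float], variant: str = "v2") -> List[float]:
    """Run the chosen variant on a non-empty list of samples."""
    return VARIANTS[variant](xs)
```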

But this process can take weeks, and I can move on to another issue in the meantime, because I don't know in advance what the right thing to do is. So I end up with code that has several parts duplicated, and when I come back to them, I can choose which version I want to use. If something starts to feel smelly, I prune it, and so on, iteratively.

What I wanted to say is that duplication of code really depends on the kind of code I'm writing.

If I'm doing an app, it's way easier to determine which code to keep, which code to remove and which code to duplicate. But not all fields are the same.

At one period of my life, I always wrote clean code for research. You lose too many ideas that way: they end up hidden behind the abstractions, and you are no longer able to work with your code. When you get a new idea, you have to go through all the abstractions, which is insane in very rapidly evolving code.

42. thfuran ◴[] No.42188887{5}[source]
But a thing that you wrote the same a few times isn't something that's definitively required to be the same, it's something that happens to be the same right now. You can often clean things up by factoring out that duplication, but needing to add a bunch of parameters to the resulting function is probably a sign that you're trying to combine things that aren't the same and shouldn't be coupled together.

Where I'm saying you absolutely shouldn't copy paste is where there's a business or technical requirement for something to be calculated/processed/displayed exactly a certain way in several contexts. You don't want to let those drift apart accidentally, though you certainly might decouple them later if that requirement changes.

43. wruza ◴[] No.42191382{4}[source]
It will share nuance with non-hygienic macros, yes. The difference here is that (1) unlike macros which hide what’s going on, the code is always expanded and can be patched locally with the visual indication of an edit, and (2) the changes to the origin block aren’t automatically propagated, you simply see +-patch clutter everywhere, which is actionable but not mandatory.

If you want to patch the origin without cluttering other locations, just move it away from there and put another copy into where it was, and edit.

The key idea is to still have the same copied blocks of code. The code will be physically repeated at each location. You can erase the “block <name> {“ parts from the code and nothing will change.

But instead of being lost in the trees, these blocks get tagged, so you can track their state, analyze them and make decisions in a convenient, systematic way. It’s an analysis tool, not a footgun. No change propagates automatically, so the coupling problem is no bigger than the one you would already have with the duplicated-code approach.

You can even gradually block-ize existing code. See a common snippet again? Wrap it into “block <myname> {…}” and start devtime-tracking it together with similar snippets. Don’t change anything, just take it into real account.
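
A toy Python sketch of what such devtime tracking could look like. The comment-based marker syntax (`# block <name> {` ... `# }`) and the choice of the first copy as the origin are assumptions made for this example; nested blocks are not handled.

```python
import difflib
import re
from typing import Dict, List, Optional

# Hypothetical marker syntax: "# block <name> {" opens a tagged copy, "# }" closes it.
_OPEN = re.compile(r"^\s*# block (\w+) \{\s*$")
_CLOSE = re.compile(r"^\s*# \}\s*$")


def find_blocks(source: str) -> Dict[str, List[List[str]]]:
    """Collect every tagged copy of each named block, in order of appearance."""
    blocks: Dict[str, List[List[str]]] = {}
    current: Optional[List[str]] = None
    for line in source.splitlines():
        opened = _OPEN.match(line)
        if opened:
            current = []
            blocks.setdefault(opened.group(1), []).append(current)
        elif _CLOSE.match(line):
            current = None
        elif current is not None:
            current.append(line.strip())
    return blocks


def drift_report(source: str) -> Dict[str, List[str]]:
    """Diff each later copy against the first copy (treated as the origin)."""
    report: Dict[str, List[str]] = {}
    for name, copies in find_blocks(source).items():
        origin = copies[0]
        for index, copy in enumerate(copies[1:], start=2):
            diff = list(
                difflib.unified_diff(origin, copy, "origin", f"copy {index}", lineterm="")
            )
            if diff:
                report.setdefault(name, []).extend(diff)
    return report
```

Nothing is rewritten or propagated; `drift_report` only shows where tagged copies have diverged, which matches the "actionable but not mandatory" clutter described above.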