Duplication Isn't Always an Anti-Pattern

1. ninkendo ◴[07 Dec 25 15:39 UTC] No.46182466[source]▶

>>46119117 (OP) #

I've had coworkers in the past that treat code like it needs to be compressed. Like, in the huffman coding sense. Find code that exists in two places, put it in one place, then call it from the original places. Repeat until there's no more duplication.

It results in a brittle nightmare because you can no longer change any of it: the responsibility of the refactored functions is simply "whatever the original code was doing before it was de-duplicated", and they don't represent anything logical.

Then, if two places that had "duplicated" code before the refactoring need to start doing different things, the common functions get new options/parameters to cover the different use cases, until those get so huge that they start needing to get broken up too, and then the process repeats until you have a zillion functions called "process_foo" and "execute_bar", and nothing makes sense any more.
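A minimal sketch of that drift, assuming two made-up call sites (a badge printer and an invoice mailer) that once shared a few lines. Every name here (`process_foo`, `badge_label`, `invoice_contact`, the flags) is hypothetical and only illustrates the pattern; it is not taken from any real codebase.

```python
# Hypothetical example: all names and rules are invented for illustration.

def process_foo(record, *, uppercase_name=False, skip_email=False, strip_plus_tag=False):
    """A 'compressed' helper that exists only because two call sites once looked alike."""
    name = record["name"].strip()
    if uppercase_name:  # flag added when the invoice path diverged
        name = name.upper()
    result = {"name": name}
    if not skip_email:  # flag added when the badge path stopped needing email
        email = record["email"].strip().lower()
        if strip_plus_tag:  # flag added later, for the invoice path only
            local, _, domain = email.partition("@")
            email = local.split("+", 1)[0] + "@" + domain
        result["email"] = email
    return result


# The same behaviour written as two functions, each named for what it means.
def badge_label(record):
    """Fields printed on a name badge."""
    return {"name": record["name"].strip()}


def invoice_contact(record):
    """Contact details for an invoice, with any '+tag' removed from the address."""
    email = record["email"].strip().lower()
    local, _, domain = email.partition("@")
    return {
        "name": record["name"].strip().upper(),
        "email": local.split("+", 1)[0] + "@" + domain,
    }
```

The two named functions repeat a couple of lines, but each can change without the other's callers noticing, and neither needs a flag to explain who it is for.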

I've since become allergic to any sort of refactoring that feels like this kind of compression. All code needs to justify its existence, and it has to have an obvious name. It can't just be "do this common subset of what these 2 other places need to do". It's common sense, obviously, but I still have to explain it to people in code review. The tendency to want to "compress" your code seems to be strong, especially in more junior engineers.

replies(7): >>46182596 #>>46182680 #>>46182731 #>>46182888 #>>46183584 #>>46185399 #>>46191013 #

2. gardenhedge ◴[07 Dec 25 15:53 UTC] No.46182596[source]▶

>>46182466 (TP) #

Yeah I have seen that too. And it's easily sold to non-technical managers

3. gaigalas ◴[07 Dec 25 16:03 UTC] No.46182680[source]▶

>>46182466 (TP) #

Is there any code (yours, open source, doesn't matter) that you would recommend as non "huffman compressed"? Give us an example of what you like.

4. swatcoder ◴[07 Dec 25 16:10 UTC] No.46182731[source]▶

>>46182466 (TP) #

Yup. People are taught DRY very early on, as an introductory "engineering" practice above the nuts and bolts of writing code.

But nobody really teaches the distinction between two passages that happen to have an identical implementation vs two passages that represent an identical concept, so they start aggressively DRY'ing up the former even though the practice is only really suited for the latter subset of them.

As you note, when you blindly de-duplicate code that's only identical by happenstance (which is a lot), it's only a matter of time before the concepts making them distinct in the first place start applying pressure for differentiation again and you end up with that nasty spaghetti splatter.
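To make the distinction concrete, here is a small sketch under invented assumptions: two validation rules that happen to share a body today but belong to different concepts. The function names and the length and uppercase rules are hypothetical.

```python
# Hypothetical example: the rules and names are invented for illustration.

def is_valid_username(value: str) -> bool:
    # Product rule: usernames are 3 to 20 characters long.
    return 3 <= len(value) <= 20


def is_valid_coupon_code(value: str) -> bool:
    # Marketing rule: coupon codes are 3 to 20 characters long.
    # Same body as above today, but a different concept with a different owner.
    return 3 <= len(value) <= 20


def is_valid_coupon_code_v2(value: str) -> bool:
    # Later, marketing wants uppercase-only codes. Only the coupon rule
    # changes; nothing that validates usernames has to be touched.
    return 3 <= len(value) <= 20 and value.isupper()
```

Had the two rules been merged into one shared length check, the uppercase requirement would have needed a new parameter or a split, which is the pressure for differentiation described above.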

replies(1): >>46186639 #

5. hinkley ◴[07 Dec 25 16:28 UTC] No.46182888[source]▶

>>46182466 (TP) #

I would probably still be working with one of these assholes if I hadn’t gotten laid off. Dude was 40. How tf have you not learned better by now?

6. ShipEveryWeek ◴[07 Dec 25 17:56 UTC] No.46183584[source]▶

>>46182466 (TP) #

I like doing this for data models - but it’s easy for people to go overboard

7. clickety_clack ◴[07 Dec 25 21:35 UTC] No.46185399[source]▶

>>46182466 (TP) #

I think grug has the best refactoring advice: https://grugbrain.dev/

8. RaftPeople ◴[07 Dec 25 23:50 UTC] No.46186639[source]▶

>>46182731 #

> But nobody really teaches the distinction between two passages that happen to have an identical implementation vs two passages that represent an identical concept, so they start aggressively DRY'ing up the former even though the practice is only really suited for the latter subset of them.

Even identical implementations might be better left duplicated once you factor in the organizational coupling between different business groups and their change management cycles and requirements.

9. redhale ◴[08 Dec 25 11:19 UTC] No.46191013[source]▶

>>46182466 (TP) #

100% agree. I see this kind of refactoring as a form of bike-shedding. It's so _easy_ to do this, anyone can do it. It's much harder to think about and design for long-term change and maintainability. Much easier to just deduplicate and declare victory.