1070 points dondraper36 | 140 comments
1. codingwagie ◴[] No.45069135[source]
I think this works in simple domains. After working in big tech for a while, I am still shocked by the required complexity. Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.

Anyone proclaiming simplicity just hasn't worked at scale. Even rewrites that have a decade-old code base to draw inspiration from often fail due to the sheer number of things to consider.

A classic, Chesterton's Fence:

"There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”"

replies(44): >>45069141 #>>45069264 #>>45069348 #>>45069467 #>>45069470 #>>45069871 #>>45069911 #>>45069939 #>>45069969 #>>45070101 #>>45070127 #>>45070134 #>>45070480 #>>45070530 #>>45070586 #>>45070809 #>>45070968 #>>45070992 #>>45071431 #>>45071743 #>>45071971 #>>45072367 #>>45072414 #>>45072570 #>>45072634 #>>45072779 #>>45072875 #>>45072899 #>>45073114 #>>45073174 #>>45073183 #>>45073201 #>>45073291 #>>45073317 #>>45073516 #>>45073758 #>>45073768 #>>45073810 #>>45073812 #>>45073942 #>>45073964 #>>45074264 #>>45074642 #>>45080346 #
2. dondraper36 ◴[] No.45069141[source]
The author is a staff engineer at GitHub. I don't think they haven't worked at scale.
replies(2): >>45069275 #>>45069653 #
3. MangoToupe ◴[] No.45069264[source]
This could also point to the solution of cutting down the complexity of "big tech". So much of that complexity isn't necessary because it solves problems, it just keeps people employed.
replies(1): >>45069428 #
4. ◴[] No.45069275[source]
5. ricardobeat ◴[] No.45069348[source]
At least half the time, the complexity comes from the system itself (echoes of the organizational structure and infrastructure), not from the requirements or problem domain; so this advice will/should be valid more often than not.
replies(3): >>45069424 #>>45069454 #>>45070465 #
6. codingwagie ◴[] No.45069424[source]
Right, but you can't expect a perfect implementation; as the complexity of the business needs grows, so does the accidental complexity.
7. mdaniel ◴[] No.45069428[source]
This is a horrifically cynical take and I wish it would stop. I doubt very seriously there is any meaningfully sized collection of engineers who introduce things "just to keep themselves employed," to say nothing of having to now advance that perspective into a full-blown conspiracy because code review is also a thing

What is far more likely is the proverbial "JS framework problem:" gah, this technology that I read about (or encounter) is too complex, I just want 1/10th that I understand from casually reading about it, so we should replace it with this simple thing. Oh, right, plus this one other thing that solves a problem. Oh, plus this other thing that solves this other problem. Gah, this thing is too complex!

replies(4): >>45069532 #>>45069875 #>>45072807 #>>45084431 #
8. malux85 ◴[] No.45069454[source]
I was one of the original engineers of DFP at Google and we built the systems that send billions of ads to billions of users a day.

The complexity comes from the fact that at scale, the state space of any problem domain is thoroughly (maybe totally) explored very rapidly.

That’s a way bigger problem than system complexity, and pretty much any system complexity is usually the result of edge cases that need to be solved, rather than bad architecture, infrastructure or organisational issues - those problems are only significant at smaller, inexperienced companies. By the time you are at post-scale (if the company survives that long), state-space exploration in implementation (features, security, non-stop operations) is where the complexity is.

replies(2): >>45069514 #>>45070057 #
9. prerok ◴[] No.45069467[source]
You are not wrong, but the source of the problem may not be the domain but poor software design.

If the software base is full of gotchas and unintended side-effects, then the source of the problem is unclean separation of concerns and tight coupling. Of course, at some point refactoring just becomes an almost insurmountable task, and if the culture of the company does not change, more crap will be added before even one of your refactorings lands.

Believe me, it's possible to solve complex problems by clean separation of concerns and composability of simple components. It's very hard to do well, though, so lots of programmers don't even try. That's where you need strict ownership of seniors (who must also subscribe to this point of view).

replies(2): >>45069676 #>>45072434 #
10. jajko ◴[] No.45069470[source]
I am deep in one such corporate complexity, yet I constantly see an ocean of items that could have been done in a much simpler and more robust way.

Simple stuff has tons of long-term advantages and benefits - it's easy to ramp up new folks on it compared to some over-abstracted hypercomplex system built because some lead dev wanted to try new shiny stuff for their CV or out of boredom. It's easy to debug, migrate, evolve and just generally maintain, something pure devs often don't care much about unless they become more senior.

Complex optimizations are for sure required for extreme performance or massive public web but that's not the bulk of global IT work done out there.

11. dondraper36 ◴[] No.45069514{3}[source]
Not directly related to the article we're discussing here, but, based on your experience, you might be the ideal kind of person to answer this.

At the scale you are mentioning, even "simple" solutions must be very sophisticated and nuanced. How does this transformation happen naturally from an engineer at a startup where any mainstream language + Postgres covers all your needs, to someone who can build something at Google scale?

Let's disregard the grokking of system design interview books and assume that system design interviews do look at real skills instead of learning common buzzwords.

replies(2): >>45069659 #>>45073682 #
12. elliotto ◴[] No.45069532{3}[source]
I'd recommend reading Bullshit Jobs by David Graeber. Most jobs in most organisations have an incentive structure for an individual to keep themselves employed rather than to actually solve problems.
replies(3): >>45070225 #>>45070329 #>>45072787 #
13. malux85 ◴[] No.45069659{4}[source]
Demonstration of capability will get you hired, capability comes only through practice.

I built a hobby system for anonymously monitoring BitTorrent by scraping the DHT. In doing this, I learned how to build a little cluster and how to handle 30,000 writes a second (which I used Cassandra for - this was new to me at the time), then built simple analytics on it to measure demand for different media.

Then my interview was just talking about this system: how the data flowed, where it could be improved, how redundancy was handled. The system consisted of about 10 different microservices, so I pulled up the code for each one and showed them.

Interested in astronomy? Build a system to track every star/comet. Interested in weather? Do SOTA predictions. Interested in geography? Process the open-source global gravity maps. Interested in trading? Build a data aggregator for a niche.

It doesn’t really matter whether whatever you build “is the best in the world” or not - the fact that you built something, practiced scaling it with whatever limited resources you had, were disciplined enough to take it to completion, and didn’t get stuck down some rabbit hole endlessly re-architecting stuff that doesn’t matter: this is what they’re looking for - good judgement, discipline, experience.

Also attitude is important, like really, really important - some cynical ranter is not going to get hired over the “that’s cool I can do that!” person, even if the cynical ranter has greater engineering skills; genuine enthusiasm and genuine curiosity are infectious.

14. thwarted ◴[] No.45069676[source]
> then the source of the problem is in unclean separation of concerns and tight coupling

Sometimes the problem is in the edges—the way the separate concerns interact—not in the nodes. This may arise, for example, where an operation/interaction between components isn't idempotent simply because the need for it to be never came up.
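
To make that concrete, here's a rough sketch of making a cross-component call idempotent so a retried request can't apply twice. The charge() function, the in-memory store and the key handling are all invented for illustration, not taken from any particular system:

    import uuid

    # Stand-in for a shared store (in a real system: a DB table or cache, not a dict).
    processed = {}

    def charge(account_id, amount, idempotency_key):
        # A retried request carries the same key, so duplicates are detected
        # and the original result is returned instead of charging again.
        if idempotency_key in processed:
            return processed[idempotency_key]
        result = {"account": account_id, "charged": amount, "txn": str(uuid.uuid4())}
        processed[idempotency_key] = result
        return result

    key = str(uuid.uuid4())                # caller generates one key per logical operation
    first = charge("acct-42", 100, key)
    retry = charge("acct-42", 100, key)    # e.g. the caller timed out and retried
    assert first == retry                  # no double charge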

replies(1): >>45069961 #
15. daxfohl ◴[] No.45069871[source]
Though in my previous job, a huge amount of complexity was due to failed, abandoned, or incomplete attempts to refactor/improve systems, and I frequently wondered, if such things had been disallowed, how much simpler the systems we inherited would have been.

This isn't to say you should never try to refactor or improve things, but make sure that it's going to work for 100% of your use cases, that you're budgeted to finish what you start, and that it can be done iteratively with the result of each step being an improvement on the previous.

replies(2): >>45070239 #>>45071003 #
16. fcarraldo ◴[] No.45069875{3}[source]
I don’t agree with the phrasing, but there is certainly a ton of complexity introduced because of engineers who are trying to be promoted or otherwise maintain their image of being capable of solving complex problems (through complex solutions).

It’s not the same as introducing complexity to keep yourself employed, but the result is the same and so is the cause - incentive structures aren’t aligned at most companies to solve problems simply and move on.

replies(1): >>45070209 #
17. patmcc ◴[] No.45069911[source]
The problem with this is no one can agree about what "at scale" means.

Like yes, everyone knows that if you want to index the whole internet and have tens of thousands of searches a second there are unique challenges and you need some crazy complexity. But if you have a system that has 10 transactions a second...you probably don't. The simple thing will probably work just fine. And the vast majority of systems will never get that busy.

Computers are fast now! One powerful server (with a second powerful server, just in case) can do a lot.

replies(2): >>45070278 #>>45070592 #
18. mikeryan ◴[] No.45069939[source]
I had an engineering boss who used this as a mantra (he is now an SVP of engineering at Spotify and we worked together at Comcast)

I think the unspoken part here is “let’s start with…”

It doesn’t mean you won’t have to “do all the things” so much as let’s start with too little so we don’t waste time doing things we end up not needing.

Once you aggregate all the simple things you may end up with a complex behemoth but hopefully you didn’t spend too much time on fruitless paths getting there.

19. prerok ◴[] No.45069961{3}[source]
What, you mean like creating a transaction where, if one component does something and then the second component fails, the first one should revert?

Again, wrong design. Like I said, it's very difficult to do well. Consider an alternate architecture: one component adds the bulk data to the request, the second component modifies it and adds other data, then the data is sent to a transaction manager that commits or fails the operation, notifying both components of the result.

Now, if the first component is one k8s container already writing to the database and the second is then trying to modify the database, rearchitecting that could be a major pain. So, I understand that it's difficult to do after the fact. Yet, if it's not done that way, the problem will just become bigger and bigger. In the long run, it would make more sense to rearchitect as soon as you see such a situation.
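
Very roughly, and only to illustrate the shape of that flow (component names, data and the trivial commit function are all invented), the single-commit-point version might look like:

    def component_a(request):
        # First component only contributes its bulk data; it does not write anywhere.
        request["order"] = {"items": ["a", "b"], "total": 100}
        return request

    def component_b(request):
        # Second component enriches the same request instead of touching the DB itself.
        request["shipping"] = {"carrier": "x", "cost": 10}
        return request

    def transaction_manager(request, commit_fn, listeners):
        # The only place that commits or fails, then tells every component the outcome.
        try:
            commit_fn(request)
            outcome = "committed"
        except Exception:
            outcome = "failed"
        for notify in listeners:
            notify(outcome)
        return outcome

    request = component_b(component_a({}))
    transaction_manager(request, commit_fn=lambda r: None, listeners=[print, print])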

20. breadwinner ◴[] No.45069969[source]
The point is to not overengineer. This is not about ignoring scale, or not considering edge cases. Don't engineer for scale that you don't even know is necessary if that complicates the code. Do the simplest thing that meets the current requirements, but write the code in such a way that more features, scale etc. can be added without disrupting dependencies.

See also: Google engineering practices: https://google.github.io/eng-practices/review/reviewer/looki...

And also: https://goomics.net/316

21. wrs ◴[] No.45070057{3}[source]
My rule on edge cases is: It's OK to not handle an edge case if you know what's going to happen in that case and you've decided to accept that behavior because it's not worth doing something different. It's not OK to fail to handle an edge case because you just didn't want to think about it, which quite often is what the argument for not handling it boils down to. (Then there are the edge cases you didn't handle because you didn't know they existed, which are a whole other tragicomedy.)
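
As an illustration only (the function and the reason given are hypothetical), the difference is roughly between an accepted edge case that is named and documented versus one that is silently ignored:

    def average_latency_ms(samples):
        # Edge case consciously accepted: an empty window returns 0.0 instead of raising,
        # because downstream dashboards already treat 0 as "no traffic".
        # Revisit if "no data" ever needs to be distinguished from "zero latency".
        if not samples:
            return 0.0
        return sum(samples) / len(samples)
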
22. mhitza ◴[] No.45070101[source]
> Anyone proclaiming simplicity just hasnt worked at scale.

Most projects don't operate at scale. And before "at scale", simple, rewritable code will always evolve better, because it's less dense, and less spread out.

There is indeed a balance between the simplest code, and the gradual abstractions needed to maintain code.

I've worked with startups, small and medium-sized businesses, and a larger US airline. Engineering complexity is through the roof when it doesn't have to be - and it didn't have to be on any of the projects I've seen and worked on.

Now if you're an engineer in some mega corp, things could be very different, but you're talking about the 1% there. If not less.

23. sodapopcan ◴[] No.45070127[source]
This is the classic misunderstanding where software engineers can't seem to communicate well with each other.

We can even just look at the title here: Do the simplest thing POSSIBLE.

You can't escape complexity when a problem is complex. You could certainly still complicate it even more than necessary, though. Nowhere does this article say you can avoid complexity altogether; it says that many of us tend to over-complicate problems for no good reason.

replies(7): >>45070394 #>>45070713 #>>45072375 #>>45072947 #>>45073130 #>>45074955 #>>45079503 #
24. rufus_foreman ◴[] No.45070134[source]
>> Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider

A rewrite of a decade old code base is not the simplest thing that could possibly work.

25. rednafi ◴[] No.45070187{3}[source]
Man, who hurt you?

I certainly don’t agree with everything Sean says and admit that “picking the most important work” is a naive thing to say in most scenarios.

But writing Python in production is trivial. Why would anyone lie about that? C is different OTOH. But just because you do a single config change and get paid for that doesn’t mean it’s true for everyone.

Also, staff at GitHub requires a certain bar of excellence. So I wouldn’t blindly dismiss everything just out of spite.

26. mdaniel ◴[] No.45070209{4}[source]
I realized that I should have asked for an example of "too complex", because I may not be following the arguments: my definition of a thing that is "too complex" almost certainly doesn't align with someone else's. In fact, I'd bet that if you rounded up 10 users from this site and polled them for something they thought was "too complex", the intersection would be a very, very small set of things
27. xerxes901 ◴[] No.45070215{3}[source]
I personally know and have (tangentially) worked with the guy and none of what you’ve said is true.

> Look at his CV. Tiny (but impactful) features ///building on existing infrastructure which has already provably scaled to millions and likely has never seen beneath what is a rest api and a react front end///

Off the top of my head he wrote the socket monitoring infrastructure for Zendesk’s unicorn workers, for example.

replies(1): >>45070457 #
28. mdaniel ◴[] No.45070225{4}[source]
I'm with you that the world in general is filled with bullshit jobs, but I do not subscribe to the perspective of wholesale bullshit jobs in the cited "big tech," since in general I do not think that jobs which have meaningful ways to measure them easily fall into the bullshit category. Maybe middle managers?
replies(1): >>45070907 #
29. rednafi ◴[] No.45070239[source]
Every refactor attempt starts with the intention of 100% coverage.

No one can predict how efficacious that attempt will be from the get-go. Eventually, people often find out that their assumptions were too naive or that they don't have enough budget to push the refactor to completion.

Successful refactoring attempts start small and don’t try to change the universe in a single pass.

replies(1): >>45070675 #
30. rednafi ◴[] No.45070278[source]
Yep, vertical scaling goes a long way. But the bottleneck for scale isn't compute; it's resiliency and availability.

So although a single server goes a long way, to hit that sweet 99.999 SLA people scale horizontally way before hitting the maximum compute capacity of a single machine. HA makes everything way more difficult to operate and reason about.
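
Rough back-of-the-envelope math behind that trade-off (assuming failures are independent, which is the part that makes HA hard in practice):

    minutes_per_year = 365 * 24 * 60
    print(minutes_per_year * (1 - 0.99999))   # ~5.3 minutes of downtime allowed per year at five nines

    single = 0.999                            # one machine at three nines
    for n in (1, 2, 3):
        combined = 1 - (1 - single) ** n      # n independent replicas, any one can serve
        print(n, f"{combined:.9f}")           # 0.999, 0.999999, 0.999999999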

31. ants_everywhere ◴[] No.45070329{4}[source]
He's an anarchist so it's not a surprise that he's grinding out the same old tropes about organizations
32. lll-o-lll ◴[] No.45070394[source]
> We can even just look at the title here: Do the simplest thing POSSIBLE.

I think the nuance here is that “the simplest thing possible” is not always the “best solution”. As an example, it is possible to solve very many business or operational problems with a simple service sitting in front of a database. At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?

Complexity is more than just the code or the infrastructure; it needs to run the entire gamut of the solution. That includes looking at the incidental complexity that goes into scaling, operating, maintaining, and migrating (if a temporary ‘too simple but fast to get going’ stack was chosen).

Measure twice, cut once. Understand what you are trying to build, and work out a way to get there in stages that provide business value at each step. Easier said than done.

Edit: Replies seem to be getting hung up over the “DB” reference. This is meant to be a hypothetical where the reader infers a scenario of a technology that “can solve all problems, but is not necessarily the best solution”. Substitute for “writing files to the file system” if you prefer.

replies(8): >>45070526 #>>45070559 #>>45070597 #>>45070639 #>>45070889 #>>45070897 #>>45072898 #>>45073722 #
33. makeitdouble ◴[] No.45070465[source]
> the organizational structure, infrastructure

Those are things that matter and can't be brushed away though.

What Conway's law describes is also optimization of the software to match the shape in which it can be developed and maintained with the least friction.

Same for infra: complexity induced by it shouldn't be simplified unless you also simplify/abstract the infra first.

replies(1): >>45071014 #
34. analog31 ◴[] No.45070480[source]
If it's a legacy system, then it lives at the edges. The edges are everything.

I wish I could remember or find the proof, but in a multi-dimensional space, as the number of dimensions rises, the highest probability is for points to be located near the edges of the system -- with the limit being that they can be treated as if they all live at the edges. This is true for real systems too -- the users have found all of the limits but avoid working past them.

The system that optimally accommodates all of the edges at once is the old system.

replies(1): >>45070523 #
35. CuriouslyC ◴[] No.45070523[source]
You don't need a complicated proof, just assume a distribution in some very high number of dimensions, with samples from that distribution having randomly generated values for each dimension. If you have ~300 dimensions then statistically at least one dimension will be ~3 SD from the mean, i.e. "on the edge," and as long as any one dimension is close to an edge, we define a point as being "near the edge."

It's not really meaningful though, at high dimensions you want to consider centrality metrics.
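
Back-of-the-envelope check on that claim: a standard normal lands within 3 SD about 99.73% of the time, so the chance that all 300 independent dimensions stay inside is 0.9973^300 ≈ 0.44, i.e. a better-than-even chance that at least one coordinate is already "on the edge". A quick simulation (parameters chosen only for illustration):

    import random

    dims, trials, threshold = 300, 10_000, 3.0
    hits = sum(
        any(abs(random.gauss(0, 1)) > threshold for _ in range(dims))
        for _ in range(trials)
    )
    print(0.9973 ** dims)    # ~0.44: probability that no dimension exceeds 3 SD
    print(hits / trials)     # simulated share of points with at least one "edge" dimension, ~0.56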

36. tbrownaw ◴[] No.45070526{3}[source]
Consider for example, computerizing a currently-manual process. And the 80/20 rule.

Do you handle one "everything is perfect" happy path, and use a manual exception process for odd things?

Do you handle "most" cases, which is more tech work but shrinks the number of people you need handling one-off things?

Or do you try to computerize everything no matter how rare?

replies(1): >>45070928 #
37. javier2 ◴[] No.45070530[source]
First of all, I don't disagree. Just wanted to add that "the simple thing" is often not the obvious thing to do, and only becomes apparent after working on it for a while. Oftentimes, when you dive into a set of adjacent functionality, you discover that it barely even works, and does not actually do nearly all the things you thought it did.
replies(1): >>45070739 #
38. qaq ◴[] No.45070559{3}[source]
Is the simplest thing possible still the DB? Yes, that's why Google spent a decent amount of resources building out Spanner: for many biz domains, even at hyperscale, it's still the DB.
39. bdangubic ◴[] No.45070586[source]
every complex domain and “at scale” is just a bunch of simple things in disguise… our industry is just terrible in general about breaking things down. we sort of know this so we came up with shit things like “microservices” but you spend sufficient time in the industry (almost three decades for me) and you won’t find a single place that has a microservices architecture that you haven’t wished was a monolith :) we are just terrible at this… there is no complex domain, it is just a good excuse we use to justify things
replies(1): >>45070637 #
40. hansvm ◴[] No.45070592[source]
Yeah, we do 100k ML inferences per second. It's not a single server, but the architecture isn't much more complicated than that.

With today's computers, indexing the entire internet and serving 100k QPS also isn't really that demanding architecturally. The vast majority of current implementation complexity exists for reasons other than necessity.

41. XorNot ◴[] No.45070597{3}[source]
I have worked at too many companies where the effort spent not using a simple database was an exponential drag on everything.

Hell, I just spent a week doing something which should've taken 5 minutes because, rather than a settings database, someone has just been maintaining a giant ball of copy+pasted Terraform code instead.

replies(1): >>45073504 #
42. zhouzhao ◴[] No.45070637[source]
Oh boy, this is the best example of "I have been doing it the same way for 30 years" I have ever seen on the world wide web
replies(1): >>45071035 #
43. sodapopcan ◴[] No.45070639{3}[source]
Right, and again this is reading too much into it. The simplest thing possible does not mean the best solution. If your solution that worked really well yesterday no longer scales today, it's no longer the correct solution and will require a more complex one.
replies(2): >>45071988 #>>45072897 #
44. daxfohl ◴[] No.45070675{3}[source]
Sure, but do some due diligence. I just say that because I've seen a couple cases where someone does a hack week project that introduces some new approach that "makes things so much cleaner". But then after spending a couple months productionizing it and rolling out the first couple iterations to prod amid much fanfare, it becomes evident that while it makes some things easier (oftentimes things that weren't all that hard to begin with), it makes other things a lot harder. So then you're stuck: do you keep pushing even though it's a net negative, do you roll back and lose all that work, or do you stall and leave a two-headed system?

In most of these cases, a few days up front exploring edge cases would have identified the problems and likely would have red lighted the project before it started. It can make you feel like a party pooper when everyone is excited about the new approach, but I think it's important that a few people on the team are tasked with identifying these edge cases before greenlighting the project. Also, maybe productionize your easiest case first, just to get things going, but then do your hardest case second, to really see if the benefits are there, and designate a go/rollback decision point in your schedule.

Of course, such problems can come up in any project, but from what I've seen they tend to be more catastrophic in refactoring/rearchitecting projects. If nothing else, because while unforeseen difficulties can be hacked around for new feature launches, hacking around problems completely defeats the purpose of a refactoring project.

45. hammock ◴[] No.45070713[source]
Yes. I like to distinguish between “complex” (by nature) and “complicated” (by design)
replies(1): >>45070895 #
46. hammock ◴[] No.45070739[source]
Yes. The simple thing is not necessarily the obvious thing or the most immediately salient thing. First explore the problem-solution space thoroughly, THEN choose the simple thing
47. isaacremuant ◴[] No.45070809[source]
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.

You're doing it wrong. More likely than not.

> Anyone proclaiming simplicity just hasnt worked at scale. Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.

Or, you're just used to excusing complexity because your environment rewards complexity and "big things".

Simple is not necessarily easy. Actually simple can be way harder to think of and push for, because people are so used to complexity.

Yes. Massive scale and operations may make things harder, but seeking simplicity is still the right choice, and "working in big tech" is not a particularly hard or rare credential on HN. Try an actual argument instead of an appeal to self-authority.

48. quietbritishjim ◴[] No.45070889{3}[source]
> At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?

Don't worry, the second half of the title has this covered:

> ... that could possibly work

In the scenario you've described, the technology is not working, in the complete sense including business requirements of reasonable operating costs.

Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc. I'd tend to agree, but with the caveat that you should feel free to break the rule so long as you're doing it consciously. But none of that implies that you should end up in the situation you described.

replies(2): >>45071232 #>>45073767 #
49. zdragnar ◴[] No.45070895{3}[source]
The distinction you make is known to me as natural complexity (the base level due to the nature of the domain) and accidental complexity (that which is added unnecessarily on top of it).

Your definition rubs up against what a UX designer taught me years ago, which is that simple and complex are one spectrum, similar to but different from easy and hard.

Often, simple is confused for easy, and complex for hard. However, simple interfaces can hide a lot of information in unintuitive ways, while complex interfaces can present more information and options up front.

replies(1): >>45072993 #
50. neonrider ◴[] No.45070897{3}[source]
> I think the nuance here is that “the simplest thing possible” is not always the “best solution”.

The programmer's mind is the faithful ally of the perfect in its war waged against the good enough.

The "best" solution for most people that have a problem is the one they can use right now.

replies(1): >>45073305 #
51. elliotto ◴[] No.45070907{5}[source]
Do you reckon the KPIs and performance indicators used in big tech count as meaningful ways to measure performance? Wouldn't someone implementing a complex resume-driven project score highly on these measurements, despite a simpler solution being correct? I am not sure that job-hopping every 18 months to maximise TC (i.e. optimise against your incentives) is a great way to learn about long-term design and organisational implications.

I'm not saying that these jobs are bullshit in the same way that a VP of box-ticking is, just that it's not a conspiracy that a cathedral based on 'design-doc culture' might produce incentives that result in people who focus on maximising their performance on these fiscally rewarding dot points, rather than actualising their innate belief in performant and maintainable systems.

I work at a start-up so if my code doesn't run we don't get paid. This motivates me to write it well.

52. tonyarkles ◴[] No.45070928{4}[source]
My favourite example of this from my own career... automating timesheet -> payroll processing in a unionized environment. As we're converting the collective bargaining agreement into code, we discover that there are a pair of rules that seem contradictory. Go talk to someone in the payroll department to try to figure out how it's handled. Get an answer that makes decent sense, but have a bit of a lingering doubt about the interpretation. Talk to someone else in the same department... they tell us the alternative interpretation.

Bring the problem back to our primary contact and they've got no clue what to do. They're on like year 2 of a 7 year contract and they've just discovered that their payroll department has been interpreting the ambiguous rules somewhat randomly. No one wants to commit to an interpretation without a memorandum of understanding from the union, and no one wants to start the process of negotiating that MoU because it's going to mean backdating 2 years of payroll for an unknown number of employees, who may have been affected by it one month but not the next, depending on who processed their paystub that month.

That was fun :D

replies(2): >>45076449 #>>45076856 #
53. mattmcknight ◴[] No.45070992[source]
This is where John Gall's Systemantics comes into play, “A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system."

Obviously a bit hyperbolic, but matches my experience.

replies(1): >>45073744 #
54. fijiaarone ◴[] No.45071003[source]
The problem isn’t refactoring, it’s that the refactoring failed, was abandoned, or was left incomplete.

And that’s usually because the person or small group that began the refactor wasn’t given the time and resources to do it: uninterested or unknowledgeable people hijacked and over-complicated the process, and others blocked it from happening. What would have taken the initial team a few weeks to complete successfully - with a little help and cooperation from others, and had they not been pulled in 10 different directions to fight other fires - instead dragged on for months and months, with tons of time and money expended on people mucking it up rather than fixing it. The refactor got abandoned, a million dollars was wasted, and the system as a whole was worse than it was before.

55. fijiaarone ◴[] No.45071014{3}[source]
Conway wasn’t prescribing a goal, he was describing a problem.
56. fijiaarone ◴[] No.45071035{3}[source]
Google and Amazon were doing things at roughly the same scale* 20 years ago on slower hardware and less of it.

* They might be serving twice as much (but definitely not ten times as much) as they were in 2005 but mostly that scales horizontally very easily.

57. lll-o-lll ◴[] No.45071232{4}[source]
> Perhaps it really did work at first, in the complete sense, when the number of users was quite small. That's where the actual content of the article kicks in: it suggests you really do use that simple solution, because maybe you'll never need to scale after all, or you'll need to rewrite everything by then anyway, or you'll have access to more engineering talent by then, etc.

This is where I am arguing nuance. These decisions are contextual, and the superficially more complicated solution may be solving inherent complexity in the problem space whose benefit only shows up over a longer time period.

As an example, some team might decide to forgo a database and read/write directly to the file system. This may enable a release in less time and that might be the right decision in certain contexts. Or it could be a terrible decision as the externalised costs begin to manifest and the business fails because of loss of customer trust.

My point is that you cannot only look at what is right in front of you, you also need to tactically plan ahead. In the big org context, you also need to strategically plan ahead.

58. ◴[] No.45071431[source]
59. jimbokun ◴[] No.45071743[source]
When the domain is complex, it's even MORE important that the individual components be simple with clean interfaces between them. If everything is too intertwined, you lose the ability to make changes or add new functionality without accidentally breaking something else.

As for Chesterton's Fence, you have the causality backwards. You should not build a fence or gate before you have a need for it. However, when you encounter an existing fence or gate, assume there must have been a very good reason for building it in the first place.

60. ehnto ◴[] No.45071971[source]
> Even rewrites that have a decade old code base to be inspired from, often fail due to the sheer amount of things to consider.

The amount of knowledge required to first generate the codebase, that is now missing for the rewrite, is the elephant in the room for rewrites. That's a decade of decision making, business rules changing, knowledge leaving when people depart etc.

Much like your example, if you think all the information is in the codebase then you should go away and start talking to the business stakeholders until you understand the scope of what you don't currently know.

61. achierius ◴[] No.45071988{4}[source]
But sometimes it IS better to think a few steps ahead, rather than building a new system from scratch every time things scale up. It's not always easy to upgrade things incrementally: just look at IPv4 vs IPv6
replies(8): >>45072124 #>>45072267 #>>45072373 #>>45072515 #>>45072559 #>>45072870 #>>45074205 #>>45078662 #
62. oivey ◴[] No.45072124{5}[source]
It can be hard enough to fix things when some surprise happens. Unwinding complicated “future proof” things on top of that is even worse. The simpler something is, the less you hopefully have to throw away when you inevitably have to.
63. fruitplants ◴[] No.45072267{5}[source]
I agree with thinking a few steps ahead. It is particularly useful in case of complex problems or foundational systems.

Also maybe simplicity is sometimes achieved AFTER complexity, anyway. I think the article means a solution that works now... target good enough rather than perfect. And the C2 wiki (1) has a subtitle '(if you're not sure what to do yet)'. In a related C2 wiki entry (2) Ward Cunningham says: Do the easiest thing that could possibly work, and then pound it into the simplest thing that could possibly work.

IME a lot of complexity is due to integration (in addition to things like scalability, availability, ease of operations, etc.) If I can keep interfaces and data exchange formats simple (independent, minimal, etc.) then I can refactor individual systems separately.

1. https://wiki.c2.com/?DoTheSimplestThingThatCouldPossiblyWork

2. https://wiki.c2.com/?SimplestOrEasiest

64. monkeyelite ◴[] No.45072367[source]
I have worked at scale - I have found countless examples of people not believing in simple solutions which eventually prevail and replace the big-complex thing.

Complexity is a learned engineering approach - it takes practice to learn to do it another way. So if all you see are complex solutions, how would you learn otherwise?

replies(1): >>45072402 #
65. baxtr ◴[] No.45072373{5}[source]
Yes sometimes. But how can you know beforehand? It’s clear in hindsight, for sure.

The most fundamental issue I have witnessed with these things is that people have a very hard time taking a balanced view.

For this specific problem, should we invest in a more robust solution which takes longer to build or should we just build a scrappy version and then scale later?

There is no right or wrong. It depends heavily on the context.

But, some people, especially developers I am afraid, only have one answer for every situation.

66. motorest ◴[] No.45072375[source]
> We can even just look at the title here: Do the simplest thing POSSIBLE.

I think you're focusing on weasel words to avoid addressing the actual problem raised by OP, which is the elephant in the room.

Your limited understanding of the problem domain doesn't mean the problem has a simple or even simpler solution. It just means you failed to understand the needs and tradeoffs that led to the complexity. Unwittingly, this misunderstanding creates even more complexity.

Listen, there are many types of complexity. Among which there is complexity intrinsic to the problem domain, but there is also accidental complexity that's needlessly created by tradeoffs and failures in analysis and even execution.

If you replace an existing solution with a solution which you believe is simpler, odds are you will have to scramble to address the impacts of all tradeoffs and oversights in your analysis. Addressing those represents complexity as well, complexity created by your solution.

Imagine a web service that has autoscaling rules based on request rates and computational limits. You might look at request patterns and say that this is far too complex, you can just manually scale the system with enough room to handle your average load, and when required you can just click a button and rescale it to meet demand. Awesome work, you simplified your system. Except your system, like all web services, experiences seasonal request patterns. Now you have schedules and meetings and even incidents that wake up your team in the middle of the night. Your pager fires because a feature was released and you didn't quite scale the service to accommodate the new peak load. So now your simple system requires a fair degree of hand-holding to work with any semblance of reliability. Is this not a form of complexity as well? Yes, yes it is. You didn't eliminate complexity; you only shifted it to another place. You saw complexity in autoscaling rules and believed you eliminated that complexity by replacing it with manual scaling, but you only ended up shifting that complexity somewhere else. Why? Because it's intrinsic to the problem domain, and requiring more manual work to tackle that complexity introduces more accidental complexity than what is required to address the issue.
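
For concreteness, the kind of rule being contrasted with manual scaling looks roughly like this. The capacity numbers and limits are invented, and a real autoscaler adds smoothing, cooldowns and multiple signals on top:

    import math

    def desired_replicas(current_rps, rps_per_replica=500, min_replicas=2, max_replicas=50):
        # Scale on the observed request rate, clamped so a bad metric can't
        # scale the service to zero or to something unaffordable.
        wanted = math.ceil(current_rps / rps_per_replica)
        return max(min_replicas, min(max_replicas, wanted))

    print(desired_replicas(800))      # 2  (quiet period)
    print(desired_replicas(12_000))   # 24 (seasonal peak, no human in the loop)
    print(desired_replicas(90_000))   # 50 (capped at the configured maximum)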

replies(1): >>45074818 #
67. motorest ◴[] No.45072402[source]
> I have worked at scale - I have found countless examples of people not believing in simple solutions which eventually prevail and replace the big-complex thing.

I have worked at scale. I have found examples where simple solutions prevail due to inertia and an inability or unwillingness to acknowledge that the simple solution failed to adequately address the requirements. The accidental complexity created by those simple solutions is downplayed because acknowledging it would require reevaluating the simple solution, and thus runbooks and manual operations and maintenance become part of your daily routine because that's just how the system is. And changing it would be too costly.

Let's not fool ourselves.

replies(1): >>45072472 #
68. naasking ◴[] No.45072414[source]
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.

Is this really because the single problem is inherently difficult, or because you're trying to solve more than one problem (scope creep) due to a fear of losing revenue? I think a lot of complexity stems from trying to group disparate problems as if they can have a single solution. If you're willing to live with a smaller customer base, then simple solutions are everywhere.

If you want simple solutions and a large customer base, that probably requires R&D.

replies(1): >>45072861 #
69. motorest ◴[] No.45072434[source]
> If the software base is full of gotchas and unintended side-effects then the source of the problem is in unclean separation of concerns and tight coupling.

Do you know how you get such a system? When you start with a simple system and, instead of redesigning it to reflect the complexity, you just keep the simple system working while shoehorning in the features it needs to meet the requirements.

We get this all the time, especially when junior developers join a team. Inexperienced developers are the first ones complaining about how things are too complex for what they do. More often than not, that just reflects opinionated approaches to problem domains they are yet to understand. Because all problems are simple once you ignore all constraints and requirements.

replies(1): >>45077277 #
70. monkeyelite ◴[] No.45072472{3}[source]
> I have worked at scale

Yep- this is why it’s a silly comment to make. Now we are where we are if we didn’t qualify the conversation as being for “big scale engineers” only.

How did those replacements go? Or were you just hoping for the opportunity?

71. twbarr ◴[] No.45072515{5}[source]
IPv6 is arguably a good example of what happens when you don't do the simplest thing possible. What we really needed was a bigger IP address space. What we got was a whole bunch of other crap. If we had literally expanded IPv4 by a couple of octets at the end (with compatible routing), would we be there now?
replies(2): >>45073247 #>>45073948 #
72. tonyedgecombe ◴[] No.45072559{5}[source]
>But sometimes it IS better to think a few steps ahead

The trouble is by the time you get there you will discover the problem isn't what you expected and it will all have been wasted effort.

https://en.wikipedia.org/wiki/You_aren't_gonna_need_it

73. PaulRobinson ◴[] No.45072570[source]
I remember reviewing some code of an engineer I was managing at a FAANG. Noticed an edge case. Pointed out I thought if/when that hit, it was going to cause an alarm that would page on-call. He suggested it might be OK to ship because it was "about a one in a million chance of being hit". The service involved did 500,000 TPS. "So, just 30 times a minute, then?"

And you're right about the amount of engineering that goes into solving problems. One service adjacent to my patch was more than a decade old. Was on a low TPS but critical path for a key business problem. Had not been touched in years. Hadn't caused a single page in that decade, just trudged along, really solidly well engineered service. Somebody suggested we re-write it in a modern architecture and language (it was a kind of mini-monolith in a now unfashionable language). Engineering managers and principals all vetoed that, thank goodness - would have been 5+ years of pain for zero upside.

74. zaphirplane ◴[] No.45072634[source]
Accidental complexity is a thing, YAGNI is a thing, tech-debt-caused complexity is a thing, "I'm a foo programmer, let me write bar code like it's foo" is a thing. I don't know that it's all high-quality, needed complexity.
75. flohofwoe ◴[] No.45072779[source]
The key is 'required complexity'.

This is different from adding pointless complexity that doesn't help solve the problem but exists only because it is established 'best practice' or 'because Google does it that way' and I've seen this many more times than complex software where the complexity is actually required. And such needlessly complex software is also usually a source of countless day-to-day problems (if it makes its way out the door in the first place) while the 'simplistic' counterpart usually just hums along in the background without anybody noticing - and if there's a problem it's easy to fix because the codebase is simple and easy to understand by anybody looking into the problem. Of course after 20 years of such changes, the originally simple code base may also grow into a messy hairball, but at least it's still doing its thing.

replies(1): >>45073300 #
76. rcxdude ◴[] No.45072787{4}[source]
I think Graeber misses the mark quite substantially, in that I think to the extent that BS jobs exist, they are rarely perceived as such by the people who are doing them (in fact, the data suggests the opposite correlation: people doing important but 'shit' jobs are more likely to report that their work is bullshit than people doing work that Graeber would view as 'bullshit', like management consulting and marketing).
replies(1): >>45073389 #
77. rcxdude ◴[] No.45072807{3}[source]
I think engineers like to create things. And they will tend, on the whole, to create new things when they have a chance, not because they want to justify their employment, but because they like to do it. And so, if you employ a lot of software engineers, you're going to have a lot of code. Combine that with an incentive structure (likely also created by engineers that like to make new things) which rewards making new things but doesn't particularly reward maintaining old things, and you'll have a lot of new things made, whether it's useful on the scale of the whole organization (something which is very hard to get a good perspective on as an individual contributer, anyway), or not.
78. 3036e4 ◴[] No.45072861[source]
Much of this begins with the customers. If they were better at identifying their real needs and specifying the simplest possible tools for those needs, we would not have to deliver bizarrely complex does-everything bloated monster solutions, and they could have much more stable, and cheaper, software.

Of course marketing and sales working hard to convince customers that they need more of everything, all the time, doesn't help.

79. lelanthran ◴[] No.45072870{5}[source]
> But sometimes it IS better to think a few steps ahead, rather than building a new system from scratch every time things scale up.

The problem is knowing when to do it and when not to do it.

If you're even the slightest bit unsure, err on the side of not thinking a few steps ahead because it is highly unlikely that you can see what complexities and hurdles lie in the future.

In short, it's easier to unfuck an under engineered system than an over engineered one.

replies(2): >>45076667 #>>45078684 #
80. jlg23 ◴[] No.45072875[source]
> Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.

edge case (n): Requirement discovered after the requirements gathering phase.

81. fauigerzigerk ◴[] No.45072897{4}[source]
The slogan is unhelpful because the cost of failure cannot be factored into the meaning of "working".

"could possibly work" is clearly hyperbole as it would only exclude solutions that are guaranteed to fail.

But even under a more plausible interpretation, this slogan ignores the cost of failure as an independent justification for adding complexity.

It's bad advice.

82. mattlutze ◴[] No.45072898{3}[source]
No pop psychology maxim is universally true. However in your example we're presented with the outdated understanding of "tech debt."

> As an example, it is possible to solve very many business or operational problems with a simple service sitting in front of a database.

If this is the simplest approach within the problem space or business's constraints, and meets the understood needs, it may indeed be the right choice.

> At scale, you can continue to operate, but the amount of man-hours going into keeping the lights on can grow exponentially. Is the simplest thing possible still the DB?

No problem in a dynamic human system can be solved statically and left alone. If the demands on a solution grows, and the problem space or business's needs changes, then the solution should be reassessed and the new conditions solved for.

Think of it alternatively as resource-constrained work allocation, or agile problem solving. If we don't have enough labor available (and we rarely do) to solve everything "best," then we need to draw a line. Decades of practice now have shown that it's a crap shoot to guess at the shape of levels of complexity down the road.

Best case, you spend time that could have gone into something else valuable today to solve a problem for a year from now; worst case, you get the assumptions wrong, fail to solve that second "today" problem, and still need to spend future time on refactoring.

83. mytailorisrich ◴[] No.45072899[source]
Sometimes, often even, complexity and edge cases are symptoms that the problem is not fully understood and that the solution is not optimal.
84. imgabe ◴[] No.45072947[source]
A complex system that works is always found to have evolved from a simple system that worked.

You can keep on doing the simplest thing possible and arrive at something very complex, but the key is that each step should be simple. Then you are solving a real problem that you are currently experiencing, not introducing unnecessary complexity to solve a hypothetical problem you imagine you might experience.

replies(1): >>45089116 #
85. saghm ◴[] No.45072993{4}[source]
To me, the benefit of simplicity is that it can help avoid the need to try to guess what future requirements will be by leaving room for iteration. Making something complex up front often increases the burden when trying to change things in the future. Crucially, this requires being flexible to making those changes in the future though rather than letting the status quo remain indefinitely.

The main argument I've seen against this strategy of design is concern over potentially needing to make breaking changes, but in my experience, it tends to be a lot easier to come up with a simple design that solves most of the common cases and leaves design space for future work on more niche cases, without breaking the existing functionality, than to try to anticipate every possible case up front. After a certain point, our confidence in our predictions dips low enough that I think it's smarter to bet on your ability to avoid locking yourself into a choice that would break things to change later than to bet on making the correct choice based on those predictions.

86. afro88 ◴[] No.45073114[source]
> I think this works in simple domains

You're not wrong. So many engineers operating in simple domains, on MVPs that don't have scale yet, on internal tools even. They introduce so much complexity thinking they're making smart moves.

Product people can be similar, in their own way. Spending lots of time making onboarding perfect when the feature could do less, cater for 95% of use cases, and need no onboarding at all.

87. jbreckmckye ◴[] No.45073130[source]
I think you're accidentally committing a motte-and-bailey fallacy here.

It's making an ambitious risky claim (make things simpler than you think they need to be) then retreating on pushback to a much safer claim (the all-encompassing "simplest thing possible")

The statement ultimately becomes meaningless because any interrogation can get waved away with "well I didn't mean as simple as that."

But nobody ever thinks their solution is more complex than necessary. The hard part is deciding what is necessary, not whether we should be complex.

replies(1): >>45073666 #
88. vasco ◴[] No.45073174[source]
Consider this: everyone, at whatever skill level they're at, benefits from applying simplicity to their designs. Also, everyone at any skill level will have a tendency to think the work they are doing is actually deep enough to require the complexity once it reaches the border of their intelligence.

I don't know if you only have genius friends, but I can tell you many stories of things people thought warranted complexity that I thought didn't. So whatever you consider hard enough to warrant complexity, just know there's another guy, smarter than you, thinking you're spinning your wheels.

Also, it's an impossible conversation to have without specific examples. Anyone can come and make a handwavy case about always simplifying, and someone can make a case about necessary complexity, but without specific examples neither can be proven wrong.

89. ◴[] No.45073183[source]
90. xelxebar ◴[] No.45073201[source]
> I think this works in simple domains.

Business incentives are aligned around incremental delivery, not around efficient encoding of the target domain. The latter generally requires deep architectural iteration, meaning multiple complete system overhauls and/or rewrites, which by now are even vilified as a trope.

Mostly, though, I think there is just availability bias here. The simple, solid systems operating at scale and handled by a 3-person team are hard to notice over the noise that naturally arises from a 1,000-person suborganization churning on the same problem. Naturally, more devs will only experience the latter, and due to network effects, funding is also easier to come by.

91. Sesse__ ◴[] No.45073247{6}[source]
That “with compatible routing” thing pulls a lot of weight… I mean, if you have literal magic, then sure.

Apart from that, IPv6 _is_ IPv4 with a bigger address space. It's so similar it's remarkable.

92. mpweiher ◴[] No.45073291[source]
Yet, many times a lot of that scale and complexity is accidental.

Case in point: when I joined the BBC I was tasked with "fixing" the sports statistics platform. The existing system consisted of several dozen distinct programs instantiated into well over a hundred processes and running on around a dozen machines.

I DTSSTCPW / YAGNIed the heck out of that thing and the result was a single JAR running on a single machine that ran around 100-1000 times faster and was more than 100 times more reliable. Also about an order of magnitude less code while having more features and being easier to maintain and expand.

https://link.springer.com/chapter/10.1007/978-1-4614-9299-3_...

And yeah, I was also extremely wary of tearing that thing down, because I couldn't actually understand the existing system. Nobody could. Took me over half a year to overcome that hesitancy.

Eschew Clever Rules -- Joe Condon, Bell Labs (via "Bumper Sticker Computer Science", in Programming Pearls)

https://tildesites.bowdoin.edu/~ltoma/teaching/cs340/spring0...

93. fuzzfactor ◴[] No.45073300[source]
Yes, I would say do the simplest thing and it could possibly work.

If it doesn't, go from there whether you need to find an alternative or add another layer of complexity.

I think when complexity does build, it can snowball when a crew comes along and finds more than could be addressed in a year or two. People have to be realistic that there's more than one way to address it. For one, it could be a project to identify and curtail existing excess complexity; another approach is to reduce the rate of additional complexity; or take it to the next level and completely inhibit any further excess, or any additional complexity of any kind at all. Ideally all of the above.

Things are so seldom ideal, and these are professionals ;)

No matter what, the most pressing requirement is to earnestly begin mastering the existing complexity to a pretty good extent before it could possibly be addressed by a neophyte. That's addressing it right there. Bull's-eye in fact.

Once a good amount of familiarity is established this could take months to a year(s) in such a situation. By this point a fairly accurate awareness of the actual degree of complexity/debt can be determined, but you can't be doing nothing this whole time. So you have naturally added at least some debt most likely, hopefully not a lot but now you've got a better handle on how it compares to what is already there. If you're really sharp the only complexity you may add is not excess at all but completely essential and you make sure of it.

Now if you keep carrying on you may find you have added some complexity that may be excess yourself, and by this time you may be in a situation where that excess is "insignificant" compared to what is already there, or even compared to what one misguided colleague or team might be routinely erecting on their own. You may even conclude that the only possible eventual outcome is to topple completely.

What you do about it is your own decision, as far as it can be. In most organizations complexity is bound to keep increasing and never come down, so in practice that is the most common way it ends up being "addressed", as can be seen.

94. mpweiher ◴[] No.45073305{4}[source]
And in the context of XP, which is where DTSTTCPW comes from:

The one you can use right now in order to get feedback from real world use, which will be much better at guiding you in improving the solution than what you thought was "best" before you had that feedback.

Real world feedback is the key. Get there as quickly as feasible, then iterate with that.

95. tim333 ◴[] No.45073317[source]
A very complex domain is medical records. The UK has managed to blow billions on custom systems that didn't work. The simplest thing that could have worked was maybe just to download an open source version of VistA (https://en.wikipedia.org/wiki/VistA). Probably would have worked better.
96. elliotto ◴[] No.45073389{5}[source]
Possibly. I mean, in the book there's a hard-numbers survey that says 37% of their sample described their job as not making a meaningful difference. It's a great book.

https://inthesetimes.com/article/capitalism-job-bullshit-dav...

97. fmbb ◴[] No.45073504{4}[source]
A giant ball of copypasted Terraform is not the simplest thing that could possibly work.

Adding the runtime complexity and maintenance work for a new database server is not a small decision.

replies(1): >>45074359 #
98. aeonik ◴[] No.45073516[source]
Simple does not mean easy.

This is still one of my favorite software presentations.

https://youtu.be/SxdOUGdseq4?si=OqfLAilgB2ERk_8H

99. PickledJesus ◴[] No.45073666{3}[source]
Thank you, I was trying to put a finer point on what I disagreed with in that comment, but that's better than I'd have done. It's like saying "just pick the best option".
100. sethammons ◴[] No.45073682{4}[source]
Systems begin to slow. You measure and figure out a way to get performance acceptable again. You gain stakeholder alignment and push towards delivering results.

There are steps that most take. Start with caching. Then you learn about caching strategies because the cache gets slow. Then you shard the database and start managing multiple database connections and readers and writers. Then you run into memory, cpu, or i/o pressure. Maybe you start horizontally scaling. Connections and file descriptors have limits you learn about. Proxies might enter your lexicon. Monitoring, alerting, and testing all need improvement. And recently teams are getting harder to manage and projects are getting slower. Maybe deploying takes forever. So now we break up into different domains. Core backend, control panel, compliance, event processing, etc.
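
(To make the first couple of those steps concrete, here is a minimal sketch: cache-aside reads plus read/write splitting. It assumes a Redis-style cache and a primary/replica database pair; the client objects, table, and TTL are placeholders, not any particular stack.)

```python
# Rough sketch of cache-aside plus read/write splitting. `cache`, `replica`,
# and `primary` stand in for a Redis-style client and two DB connections.
import json

CACHE_TTL_SECONDS = 60

def get_user(user_id, cache, replica):
    # Cache-aside: consult the cache before touching the database.
    cached = cache.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)

    # Reads go to a replica to take load off the primary.
    row = replica.execute(
        "SELECT id, name FROM users WHERE id = %s", (user_id,)
    ).fetchone()
    user = {"id": row[0], "name": row[1]}
    cache.set(f"user:{user_id}", json.dumps(user), ex=CACHE_TTL_SECONDS)
    return user

def rename_user(user_id, new_name, cache, primary):
    # Writes go to the primary; invalidate rather than update the cache
    # to avoid racing a concurrent read with stale data.
    primary.execute("UPDATE users SET name = %s WHERE id = %s", (new_name, user_id))
    cache.delete(f"user:{user_id}")
```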

As the org grows and continues to change, more and more stakeholders appear. Security, API design, different cost modeling, product and design, and this web of stakeholders all have competing needs.

Go back to my opening stanza. Rinse and repeat.

Doing this exposes patterns and erroneous solutions. You work to find the least complex solution necessary to solve the known constraints. Simple is not easy (great talk, look it up). The learnings from these battle scars are what make a staff-level engineer, methinks. You gain stories and tools for delivering solutions that serve increasingly larger systems and organizations. I recently was the technical lead for a 40-team software project. I gained some more scars and learnings.

An expert is someone who has made and learned from many mistakes in a narrow field. Those learnings and lessons get passed down in good system design interview books, like Designing Data Intensive Applications.

101. strogonoff ◴[] No.45073722{3}[source]
The apparent discrepancy between “the simplest thing possible” and “the best solution” only exists if we forget that a product exists in time. If the goal is not just an app that works today but something that won’t break tomorrow, that changes what the simplest thing is. If what seems like the simplest thing makes the product difficult to maintain, pulls in many poorly vetted dependencies, etc., then it is not really the simplest thing anymore.

When this is accounted for, “the simplest thing” approaches “the best solution”.

102. thinkharderdev ◴[] No.45073744[source]
I agree with the saying as such, but I think it's actually a counterpoint to the "do the simplest thing that could possibly work" idea. When building a system initially you want to do the simplest thing that can possibly work, given some appropriate definition of "working". Ideally, as the system's requirements evolve, you should refactor to address the complexity by adding abstractions, making things horizontally scalable, etc. But for any given change the "simplest thing that can possibly work" is usually something along the lines of "we'll just add another if-statement" or "we'll just add another parameter to the API call". Before you know it you have an incomprehensible API with 250 parameters which interact in complex ways, and a rat's nest of spaghetti code serving it.

I prefer the way Einstein said it (or at least I've heard it attributed to him, not sure if he actually said it): "Make things as simple as possible, but no simpler".

replies(1): >>45089198 #
103. jaynate ◴[] No.45073758[source]
Wow, Chesterton’s fence parable could apply in so many places (not the least of which, politics).
104. devnullbrain ◴[] No.45073767{4}[source]
Then the title just means 'do the right thing' and has no value.
replies(1): >>45075349 #
105. greymalik ◴[] No.45073768[source]
> Anyone proclaiming simplicity just hasnt worked at scale.

The author of the article is a staff engineer at GitHub.

106. anymouse123456 ◴[] No.45073810[source]
Okay, I'll bite.

> Anyone proclaiming simplicity just hasnt [sic] worked at scale

I've worked in startups and large tech organizations over decades and indeed, there are definitely some problems in those places that are hard.

That said, in my opinion, the majority of technical solutions were over-engineered and mostly waste.

Much simpler, more reliable, more efficient solutions were available, but inappropriately dismissed.

My team was able to demonstrate this by producing a much simpler system, deploying it and delivering it to many millions of people, every day.

Chesterton's fence is great in some contexts, especially politics, but the vast majority of software is so poorly made, it rarely applies IMO.

replies(4): >>45073848 #>>45073850 #>>45073855 #>>45074024 #
107. pickdig ◴[] No.45073812[source]
Staying kinda anonymous saying this... oftentimes, for most programmers, the road is a pretty simple one, yet the fence or gate is a tolling station for some private interest. So yeah, if possible, just quit arguing and try to destroy it.
108. whstl ◴[] No.45073848[source]
Hard agree.

I also worked at some quite large organizations with quite large services, where shipping easily took 10x to 50x the amount of time it would take in a smaller org.

Most of the time people were mistaking complexity caused by bad decisions (tech or otherwise) with "domain complexity" and "edge cases" and refusing to acknowledge that things are now harder because of those decisions. Just changing the point of view makes it simple again, but then you run into internal politics.

With microservices especially, the irony was that it was mostly the decisions justified as "saving time in the future" that ended up generating the most future work, and in a few cases even problems around compliance and data sovereignty.

109. ◴[] No.45073850[source]
110. gozzoo ◴[] No.45073855[source]
So, case by case then?
111. bryanrasmussen ◴[] No.45073942[source]
>Anyone proclaiming simplicity just hasnt worked at scale.

or they haven't worked in fields that are heavily regulated, or internationally.

This is why the DOGE guys were all like, "hey, there are a bunch of people over 100 years old getting social security!! WTF!?" Where someone with a wider range of experience would think, "hmm, I bet there is some reason for this that we need to figure out," they jumped right to "this must be fraud!!"

112. xorcist ◴[] No.45073948{6}[source]
In a place with even less IPv6 adoption, probably. It's not like there weren't similar proposals discussed, and there's no need to rehash the exact same discussion again.

The problem quickly becomes "how do you route it", and that's where we end up with something like today's IPv6. Route aggregation and PI addresses are impractical with IPv4 + extra bits.

The main change from v4 to v6, besides the extra bits, is mostly that some unnecessary complexity was dropped, which in the end is a net positive for adoption.

113. bytefish ◴[] No.45073964[source]
For a lot of problems it’s a good idea to talk to customers and stakeholders, and make the complexity very transparent.

Maybe some of the edge cases only apply to 2% of the customers? Could these customers move to a standard process? And what’s the cost of implementing, testing, integrating and maintaining these customer-specific solutions?

This has actually been the best solution for me to reduce complexity in my software, by talking to customers and business analysts… and making the complexity very transparent by assigning figures to it.
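
(As an illustration of "assigning figures to it": a back-of-the-envelope calculation like the sketch below, with entirely made-up numbers, is often enough to make the cost of a customer-specific edge case visible to stakeholders.)

```python
# Entirely made-up numbers; the point is putting a figure on an edge case
# that only a handful of customers actually use.
customers_total = 500
share_needing_edge_case = 0.02        # the "2% of customers"
customers_needing_edge_case = round(customers_total * share_needing_edge_case)  # 10
maintenance_days_per_year = 15        # implementing, testing, integrating, supporting
loaded_cost_per_day = 800             # currency units per engineer-day

yearly_cost = maintenance_days_per_year * loaded_cost_per_day        # 12000
per_affected_customer = yearly_cost / customers_needing_edge_case    # 1200.0
print(f"~{yearly_cost} per year, ~{per_affected_customer:.0f} per affected customer")
```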

114. ozim ◴[] No.45074024[source]
Problem is that you can’t create a system in vacuum.

Mostly it is not like a movie where you hand pick the team for the job.

Usually you have to play the cards you’re dealt, so you take whatever your team is comfortable building.

Which in the end means dealing with emotions, people's ambitions, wishes.

I have seen stuff gold-plated just because one vocal person was making a fuss. I have seen good ideas blocked just because someone wanted to feel important. I have seen teams who wanted to „do proper engineering” but thought over-engineering was the proper way, and that anything less than gold plating makes them look like amateurs.

115. kmacdough ◴[] No.45074205{5}[source]
IPv4 vs IPv6 seems like a great example of why to keep it simple. Even given decades to learn from the success of IPv4 and almost a decade of design and refinement, IPv6 has flopped hard, not so much because of limitations of IPv4, but because IPv6 isn't backwards compatible and created hardware requirements that basically force an entirely parallel IPv6 routing infrastructure to be maintained alongside the IPv4 infrastructure, which isn't going away soon. It solved problems too far ahead, problems we aren't actually having.

As is, IPv4's simplicity got us incredibly far, and it turns out NAT and CIDR have been quite effective at alleviating address exhaustion. With some address reallocation and future protocol extensions, it's looking entirely possible that a successor was never needed.

116. kloop ◴[] No.45074264[source]
You're not wrong, but I'm also constantly surprised at places where devs will inject complexity.

A former project that had a codec system for serializing objects, built on Scala implicits, comes to mind. It involved a significant amount of internal machinery, just to avoid writing 5 toString methods. And it made it so that changing imports could break significant parts of the project in crazy ways.

It's possible nobody at the beginning of the project knew they would only have 5 of these objects (if they had 5 at the beginning, how many would they have later?), but I think that comes back to the article's point. There are often significantly simpler solutions that have fewer layers of indirection, and will work better. You shouldn't reach for complexity until you need it.
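
(For contrast, the boring alternative to a generic codec framework is usually just a handful of explicit functions. A sketch in Python rather than Scala, with invented type names:)

```python
# The "dumb" alternative to a generic codec framework: one explicit,
# obvious serializer per type that needs it. Verbose, but trivial to read,
# and a changed import can't silently alter its behaviour.
def order_to_string(order) -> str:
    return f"Order(id={order.id}, total={order.total})"

def customer_to_string(customer) -> str:
    return f"Customer(id={customer.id}, name={customer.name})"

# ...and three more like these, one per type.
```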

117. ◴[] No.45074359{5}[source]
118. trey-jones ◴[] No.45074642[source]
The classic comeback - every time I mention simplicity to a particular team member of mine, this is what he says. Complexity is unavoidable. Yes. But if you don't fight it tooth and nail, spend more time than you want trying to simplify the solution, and get second opinions (more minds on difficult problems are better!), then you will increase complexity more than you needed to. This is just a different form of technical debt: you will pay the price in maintenance later.
replies(1): >>45074977 #
119. sigseg1v ◴[] No.45074818{3}[source]
Well said.

An example I encountered was someone taking the "KISS" approach to enterprise reporting and ETL requirements. No need to make a layer between their data model and what data is given to the customers, and no need to make a separate replica of the server or db to serve these requests, as those would be complex.

This failed in more ways than I can count. The system instantly became deeply ingrained in all customer workflows, but it was consumed through PowerBI by hundreds of non-technical users with bespoke reports. If an internal column name or the structure of the data model changed so that devs could evolve the platform, users just got a generic "Query Failed" error and lit up the support team. Technical explanations about needing to modify their query were totally lost on the end users; they just wanted the dev team to fix it. Pagination, request complexity limiting, indexes, request rate limiting, etc. were never considered in any way, because those were not considered simple. But they cannot be added later without breaking changes, because a non-technical user will not understand what to do when their Excel report gets rate-limited on 29 of the 70 queries it launches per second. There was also no concern about taking prod OLTP databases down with OLAP workloads overloading them.

All in all, that system was simple and took about 2 weeks to build, was rapidly adopted into critical processes, and the team responsible left. It took the remaining team members a bit over 2 years to fix it by redesigning it and hand-holding non-technical users all the way down to fixing their own Excel sheets. It was a total nightmare caused by wanting to keep things simple, when really this needed: heavy abstraction models, database replicas, infrastructure scaling, caching, rewriting lots of application logic to make data presentable where needed, index tuning, automated generation of large datasets for testing, automated load testing, release process management, versioning strategies, documentation and communication processes, and deprecation policies. They thought we could avoid months of work and keep it simple, and instead caused years of mess, because making breaking changes is extremely difficult once you have wide adoption.
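
(A minimal sketch of the kind of stable reporting contract that was missing: the published schema is decoupled from internal column names, so internals can change without breaking hundreds of PowerBI reports. All table and column names here are invented for illustration.)

```python
# The published reporting schema is a contract; internal column names are not.
REPORTING_SCHEMA_V1 = {
    # published name -> internal column
    "invoice_id": "inv_pk",
    "invoice_total": "amt_gross_cents",
    "customer_name": "cust_display_nm",
}

def to_reporting_row(internal_row: dict) -> dict:
    """Translate an internal row into the published v1 shape."""
    return {public: internal_row[internal]
            for public, internal in REPORTING_SCHEMA_V1.items()}
```

With a mapping like this, renaming an internal column touches one dictionary, and a breaking change can ship as a v2 schema alongside v1 instead of surprising every report at once.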

replies(1): >>45075434 #
120. MetaWhirledPeas ◴[] No.45074955[source]
Exactly.

And to address something the GP said:

> I am still shocked by the required complexity

Some of this complexity becomes required through earlier bad decisions, where the simplest thing that could possibly work wasn't chosen. Simplicity up front can reduce complexity down the line.

121. Maksadbek ◴[] No.45074977[source]
Exactly! If you don't try to keep it simple, especially in big tech, things get way too complex. I think choosing the simplest solution in big tech is orders of magnitude more important than in simple domains.
122. quietbritishjim ◴[] No.45075349{5}[source]
No, it means don't (usually) over engineer a solution for a larger scale than you can be sure you'll need. If you don't see the value in that then you haven't worked with enough junior developers!
replies(1): >>45078665 #
123. gr4vityWall ◴[] No.45075434{4}[source]
While I tend to agree with your position, it sounds like they built a system in less than 2 weeks that was immediately useful to the organization. That sounds like a win to me, and makes me wonder if there were other ways in hindsight that such a system could evolve.

>They thought that we could avoid months of work and keep it simple and instead caused years of mess because making breaking changes is extremely difficult once you have wide adoption.

Right. Do you think a middle ground was possible? Say, a system that took 1 month to build instead of two weeks, but with a few more abstractions to help with breaking changes in the future.

Thanks for sharing your experience btw, always good to read about real world cases like this from other people.

replies(1): >>45076245 #
124. motorest ◴[] No.45076245{5}[source]
> While I tend to agree with your position, it sounds like they built a system in less than 2 weeks that was immediately useful to the organization. That sounds like a win to me, and makes me wonder if there were other ways in hindsight that such a system could evolve.

I don't think this is an adequate interpretation. Quick time to market doesn't mean the half-baked MVP is the end result.

An adequate approach would be to include work on introducing the missing abstraction layer as technical debt to be paid right after launch. You deliver something that works in 2 weeks and then execute the remaining design as follow-up work. This is what technical debt represents, and why the "debt" analogy fits so well. Quick time to market doesn't force anyone to put together half-assed designs.

125. mcny ◴[] No.45076449{5}[source]
So effectively the company was stealing people's pay
replies(1): >>45087012 #
126. sevensor ◴[] No.45076667{6}[source]
The best way to think a few steps ahead is to make as much of your solution disposable as possible. I optimize for ease of replacement over performance or scalability. This means that my operating assumption is that everything I’m doing is a mistake, so it’s best to work from a position of being able to throw it out and start over. The result is that I spend a lot of time thinking about where the seams are and making them as simple as possible to cut.
127. scarface_74 ◴[] No.45076856{5}[source]
Wouldn’t the simplest thing possible in that case probably just use one of the many SaaS payroll services? If the second largest employer in the US can use ADP, I’m almost sure your company could.
replies(1): >>45086972 #
128. prerok ◴[] No.45077277{3}[source]
Indeed and I have seen it happen many times in my career.

Shoehorning things into working systems is something I have seen juniors do. I have also seen "seniors" do this, but in my view, they are still juniors with more years working on the same code base.

I have once heard it described as "n years of 1-year experiences". In other words, such a person never learns that the program design space must continuously be explored, and that recurrence of bugs in the same part of the code usually means that a different design is required. They never learn that the cause of the bug was not the particular change that triggered the unintended side effect, but that the very existence of a side effect is a design bug of its own.

I do agree, though, that TFA may be proposing sticking with the simpler design for longer than is advisable.

129. yazantapuz ◴[] No.45078662{5}[source]
But a 128-bit identifier maybe was not the best choice when IPv4 was in the works... maybe 64?
130. devnullbrain ◴[] No.45078665{6}[source]
Well hold on, we're going in circles here

>In the scenario you've described, the technology is not working, in the complete sense including business requirements of reasonable operating costs.

In the parent comment's reasonable premise, they wouldn't be sure of what they would need.

131. devnullbrain ◴[] No.45078684{6}[source]
Intel followed this strategy with the mobile market to what is apparently terminal fucking.
replies(1): >>45080862 #
132. sgjohnson ◴[] No.45079503[source]
The title is not “Do the simplest thing POSSIBLE”. It’s do the “Simplest thing that could POSSIBLY work”.

There’s a HUGE difference between the simplest thing possible, and the simplest thing that could possibly work.

The simplest thing that could possibly work conveniently lets you forget about the scale. The simplest thing possible does not.

133. etse ◴[] No.45080346[source]
There is a lot of sentiment in these comments about still needing to scale. I wonder how many need to do this at the pre-PMF stage vs the growth stage? The trade-off is faster growth if your PMF bet wins, and lost time if your bet goes south.
134. lelanthran ◴[] No.45080862{7}[source]
> Intel followed this strategy with the mobile market to what is apparently terminal fucking.

And they followed the alternative with Itanium, and look how that turned out.

135. MangoToupe ◴[] No.45084431{3}[source]
I'm not saying it's a conscious impulse! But I've seen this happen more times than I can count.
136. tonyarkles ◴[] No.45086972{6}[source]
I left out some details; it wasn’t only payroll, there were some other staff management aspects to it. But overall the answer about using ADP for this particular situation is: no.

Not strictly for technical reasons, but definitely for political ones. The client was potentially the largest organization in my province (state-run healthcare). Outsourcing payroll and scheduling, with the potential of breaking the rules in the contracts with the multiple union stakeholders, was a complete non-starter. Plus the idea of needing to do layoffs within the payroll department was pretty unpalatable.

137. tonyarkles ◴[] No.45087012{6}[source]
Heh, to make it more fun… it wasn’t actually clear if they were overpaying or underpaying. Underpaying is actually a lot easier to deal with than overpaying. If you underpay someone, the easy solution is to write a cheque and include interest and/or some other form of compensation for the error.

If you overpay someone… getting that money back is a challenge.

To make it more complicated still, there was an element of “we’re not sure if we overpaid or underpaid” but there was also an element of “we gave person X an overtime shift but person Y was entitled to accept or deny that shift before person X would have even had an opportunity to take it”. That’s even harder to compensate for.

replies(1): >>45087401 #
138. mcny ◴[] No.45087401{7}[source]
Thank you for the reply. I was only commenting that wage theft is still wage theft even when there is no malicious intent. Clearly, reality is much more nuanced.
139. codethief ◴[] No.45089116{3}[source]
> the key is that each step should be simple

In other words, every time you optimize only locally and in a single dimension, and potentially walk very far away from a global optimum. I have worked on such systems before. Every single step in and of itself was simpler (and also faster, less work) than doing a refactoring (to keep the overall resulting system simple), so we never dared do the latter. Unfortunately, over time this meant that every new step incurred additional costs due to all the accidental complexity we had accumulated. Time to finally refactor and do things the right way, right? No. Because the cost of refactoring had also kept increasing with every additional step we took and every feature we patched on. At some point no one really understood the whole system anymore. So we just kept piling things on top of each other and prayed they would never come crashing down on us.

Then one day, business decided the database layer needed to be replaced for licensing reasons. Guess which component had permeated our entire code base, because we never got around to doing that refactoring and never implemented proper boundaries and interfaces between the database, business, and view layers. So what could have been a couple of months of migration work ended up being more than four years of work (rewriting the entire application from scratch).
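
(A rough sketch of the boundary that was missing, using a hypothetical account domain: business code depends on a small repository interface rather than on a specific database client, so swapping the vendor means writing one new implementation instead of rewriting the application.)

```python
# Business logic depends on a narrow interface, not on a vendor client.
# The account domain and method names here are hypothetical.
from typing import Optional, Protocol

class AccountRepository(Protocol):
    def find_by_id(self, account_id: str) -> Optional[dict]: ...
    def save(self, account: dict) -> None: ...

def close_account(repo: AccountRepository, account_id: str) -> None:
    # Only sees the interface; replacing the database means providing a new
    # AccountRepository implementation, not touching this code.
    account = repo.find_by_id(account_id)
    if account is None:
        raise ValueError(f"unknown account {account_id}")
    account["status"] = "closed"
    repo.save(account)
```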

140. codethief ◴[] No.45089198{3}[source]
> But for any given change the "simplest thing that can possibly work" is usually something along the lines of "we'll just add another if-statement" or "we'll just add another parameter to the API call".

Sounds to me like we need to distinguish between simplicity of the individual diff, and simplicity of the end result (i.e. the overall code base after applying the diff). The former is a very one-dimensional and local way of optimization, which over time can lead you far away from a global optimum.