
1070 points dondraper36 | 9 comments
codingwagie ◴[] No.45069135[source]
I think this works in simple domains. After working in big tech for a while, I am still shocked by the required complexity. Even the simplest business problem may take a year to solve, and constantly break due to the astounding number of edge cases and scale.

Anyone proclaiming simplicity just hasn't worked at scale. Even rewrites that have a decade-old code base to draw on often fail due to the sheer number of things to consider.

A classic, Chesterton's Fence:

"There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”"

replies(44): >>45069141 #>>45069264 #>>45069348 #>>45069467 #>>45069470 #>>45069871 #>>45069911 #>>45069939 #>>45069969 #>>45070101 #>>45070127 #>>45070134 #>>45070480 #>>45070530 #>>45070586 #>>45070809 #>>45070968 #>>45070992 #>>45071431 #>>45071743 #>>45071971 #>>45072367 #>>45072414 #>>45072570 #>>45072634 #>>45072779 #>>45072875 #>>45072899 #>>45073114 #>>45073174 #>>45073183 #>>45073201 #>>45073291 #>>45073317 #>>45073516 #>>45073758 #>>45073768 #>>45073810 #>>45073812 #>>45073942 #>>45073964 #>>45074264 #>>45074642 #>>45080346 #
1. ricardobeat ◴[] No.45069348[source]
At least half the time, the complexity comes from the system itself, echoes of the organizational structure, infrastructure, and not the requirements or problem domain; so this advice will/should be valid more often than not.
replies(3): >>45069424 #>>45069454 #>>45070465 #
2. codingwagie ◴[] No.45069424[source]
Right, but you can't expect a perfect implementation: as the complexity of the business needs grows, so does the accidental complexity.
3. malux85 ◴[] No.45069454[source]
I was one of the original engineers of DFP at Google and we built the systems that send billions of ads to billions of users a day.

The complexity comes from the fact that at scale, the state space of any problem domain is thoroughly (maybe totally) explored very rapidly.

That’s a way bigger problem than system complexity, and pretty much any system complexity is usually the result of edge cases that need to be solved rather than bad architecture, infrastructure or organisational issues. Those problems are only significant at smaller, inexperienced companies. By the time you are post-scale (if the company survives that long), state space exploration in implementation (features, security, non-stop operations) is where the complexity is.
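
A rough back-of-the-envelope sketch of the state-space point, with made-up numbers: the request volume and the edge-case probability below are illustrative assumptions, not DFP figures, and requests are assumed to be independent.

```python
import math


def expected_hits_per_day(requests_per_day, p_edge_case):
    """Expected number of times a rare edge case fires in one day."""
    return requests_per_day * p_edge_case


def prob_at_least_one_hit(requests_per_day, p_edge_case):
    """P(edge case fires at least once in a day), assuming independent requests."""
    # 1 - (1 - p)^N, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(requests_per_day * math.log1p(-p_edge_case))


# Illustrative numbers only: a billion requests a day, a one-in-a-billion case.
REQUESTS_PER_DAY = 1_000_000_000
P_EDGE_CASE = 1e-9

print(expected_hits_per_day(REQUESTS_PER_DAY, P_EDGE_CASE))
print(prob_at_least_one_hit(REQUESTS_PER_DAY, P_EDGE_CASE))
```

Under these assumptions a one-in-a-billion case is expected about once a day, and has roughly a 63% chance of showing up on any given day, so "it will basically never happen" stops being a usable argument.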

replies(2): >>45069514 #>>45070057 #
4. dondraper36 ◴[] No.45069514[source]
Not directly related to the article we're discussing here, but, based on your experience, you might be the ideal kind of person to answer this.

At the scale you are mentioning, even "simple" solutions must be very sophisticated and nuanced. How does this transformation happen naturally from an engineer at a startup where any mainstream language + Postgres covers all your needs, to someone who can build something at Google scale?

Let's disregard the grokking of system design interview books and assume that system design interviews do look at real skills instead of learning common buzzwords.

replies(2): >>45069659 #>>45073682 #
5. malux85 ◴[] No.45069659{3}[source]
Demonstration of capability will get you hired; capability comes only through practice.

I built a hobby system for anonymously monitoring BitTorrent by scraping the DHT. In doing this, I learned how to build a little cluster, how to handle 30,000 writes a second (which I used Cassandra for; this was new to me at the time), and then how to build simple analytics on it to measure demand for different media.

Then my interview was just talking about this system: how the data flowed, where it could be improved, how redundancy was handled. The system consisted of about 10 different microservices, so I pulled the code up for each one and showed them.

Interested in astronomy? Build a system to track every star/comet. Interested in weather? Do SOTA predictions. Interested in geography? Process the open source global gravity maps. Interested in trading? Build a data aggregator for a niche.

It doesn’t really matter whether whatever you build is “the best in the world” or not. The fact that you built something, practiced scaling it with whatever limited resources you had, were disciplined enough to take it to completion, and didn’t get stuck down some rabbit hole endlessly re-architecting stuff that doesn’t matter: this is what they’re looking for. Good judgement, discipline, experience.

Also, attitude is important, like really, really important. Some cynical ranter is not going to get hired over the “that’s cool, I can do that!” person, even if the cynical ranter has greater engineering skills. Genuine enthusiasm and genuine curiosity are infectious.

6. wrs ◴[] No.45070057[source]
My rule on edge cases is: It's OK to not handle an edge case if you know what's going to happen in that case and you've decided to accept that behavior because it's not worth doing something different. It's not OK to fail to handle an edge case because you just didn't want to think about it, which quite often is what the argument for not handling it boils down to. (Then there are the edge cases you didn't handle because you didn't know they existed, which are a whole other tragicomedy.)
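
A minimal sketch of what that rule can look like in code. The function and its inputs are hypothetical, chosen only to show an edge case that is knowingly accepted next to one that is knowingly rejected:

```python
def average_latency_ms(samples):
    """Mean of latency samples in milliseconds (hypothetical example)."""
    # Edge case considered and ACCEPTED: no samples means "no data".
    # We return None on purpose and callers are expected to handle it.
    if not samples:
        return None

    # Edge case considered and REJECTED: a negative latency can only come
    # from a broken clock or a bug upstream, so fail loudly instead of
    # silently averaging it in.
    if any(sample < 0 for sample in samples):
        raise ValueError("negative latency sample")

    return sum(samples) / len(samples)
```

Both branches record a decision. The version the rule warns against is the one where neither check exists and nobody knows what an empty or negative input does.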
7. makeitdouble ◴[] No.45070465[source]
> the organizational structure, infrastructure

Those are things that matter and can't be brushed away though.

What Conway's law describes is also an optimization of the software to match the shape in which it can be developed and maintained with less friction.

Same for infra: complexity induced by it shouldn't be simplified unless you also simplify/abstract the infra first.

replies(1): >>45071014 #
8. fijiaarone ◴[] No.45071014[source]
Conway wasn’t prescribing a goal; he was describing a problem.
9. sethammons ◴[] No.45073682{3}[source]
Systems begin to slow. You measure and figure out a way to get performance acceptable again. You gain stakeholder alignment and push towards delivering results.

There are steps that most take. Start with caching. Then you learn about caching strategies because the cache gets slow. Then you shard the database and start managing multiple database connections and readers and writers. Then you run into memory, cpu, or i/o pressure. Maybe you start horizontally scaling. Connections and file descriptors have limits you learn about. Proxies might enter your lexicon. Monitoring, alerting, and testing all need improvement. And recently teams are getting harder to manage and projects are getting slower. Maybe deploying takes forever. So now we break up into different domains. Core backend, control panel, compliance, event processing, etc.
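
The first two steps above (put a cache in front of reads, then shard the database) can be sketched roughly as follows. This is a toy under stated assumptions: `TTLCache`, `shard_for`, and `read_through` are hypothetical names, and plain dicts stand in for the database shards.

```python
import hashlib
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)


def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash (not the salted hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def read_through(key, cache, shards):
    """Cache-aside read: try the cache, fall back to the owning shard."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    # Assumption: each shard is a dict standing in for a real database.
    value = shards[shard_for(key, len(shards))].get(key)
    if value is not None:
        cache.set(key, value)  # populate so the next read skips the shard
    return value
```

Even this toy already carries the next round of problems: cached values go stale until the TTL expires, and modulo sharding remaps most keys whenever the shard count changes.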

As the org grows and continues to change, more and more stakeholders appear. Security, API design, different cost modeling, product and design, and this web of stakeholders all have competing needs.

Go back to my opening stanza. Rinse and repeat.

Doing this exposes patterns and erroneous solutions. You work to find the least complex solution necessary to solve the known constraints. Simple is not easy (great talk, look it up). The learnings from these battle scars are what make a staff-level engineer, methinks. You gain stories and tools for delivering solutions for increasingly larger systems and organizations. I recently was the technical lead for a 40-team software project. I gained some more scars and learnings.

An expert is someone who has made and learned from many mistakes in a narrow field. Those learnings and lessons get passed down in good system design interview books, like Designing Data-Intensive Applications.