1070 points dondraper36 | 31 comments
1. GMoromisato ◴[] No.45069016[source]
One of the ironies of this kind of advice is that it's best for people who already have a lot of experience and have the judgement to apply it. For instance, how do you know what the "simplest thing" is? And how can you be sure that it "could possibly work"?

Yesterday I had a problem with my XLSX importer (which I wrote myself--don't ask why). It turned out that I had neglected to handle XML namespaces properly because Excel always exported files with a default namespace.

Then I got a file that added a namespace to all elements and my importer instantly broke.

For example, Excel always outputs <cell ...> whereas this file has <x:cell ...>.

The "simplest thing that could possibly work" was to remove the namespace prefix and just assume that we don't have conflicting names.
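The difference between the two files, and the risk in the shortcut, can be sketched in Python with the standard library's ElementTree. This is only an illustration, not the importer discussed here: the element names "row" and "cell" follow the example above rather than real SpreadsheetML, and the "urn:example:other" namespace is made up to show a name conflict.

```python
import xml.etree.ElementTree as ET

# Assumption: the SpreadsheetML main namespace. "row" and "cell" are the
# illustrative names from the comment, not the real SpreadsheetML element names.
NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

DEFAULT_NS_DOC = f'<row xmlns="{NS}"><cell r="A1"/></row>'
PREFIXED_DOC = f'<x:row xmlns:x="{NS}"><x:cell r="A1"/></x:row>'
# Hypothetical file where a second namespace reuses the local name "cell".
CONFLICT_DOC = (
    f'<row xmlns="{NS}" xmlns:o="urn:example:other">'
    '<cell r="A1"/><o:cell r="Z9"/></row>'
)


def local_name(tag):
    """The shortcut: drop the namespace part and keep only the local name."""
    return tag.rsplit("}", 1)[-1]


def cells_ignoring_namespace(xml_text):
    root = ET.fromstring(xml_text)
    return [el.get("r") for el in root.iter() if local_name(el.tag) == "cell"]


def cells_in_namespace(xml_text):
    """The namespace-aware lookup: match on the full {uri}local name."""
    root = ET.fromstring(xml_text)
    return [el.get("r") for el in root.iter(f"{{{NS}}}cell")]
```

Both approaches agree on the first two documents. They only diverge on the conflicting one, which is the "landmine" case: the shortcut also picks up the cell from the other namespace.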

But I didn't feel right about doing that. Yes, it probably would have worked fine, but I worried that I was leaving a landmine for future me.

So instead I spent 4 hours re-writing all the parsing code to handle namespaces correctly.

Whether or not you agree with my choice here, my point is that doing "the simplest thing that could possibly work" is not that easy. But it does get easier the more experience you have. Of course, by then, you probably don't need this advice.

replies(11): >>45069191 #>>45069245 #>>45069268 #>>45069600 #>>45070183 #>>45070459 #>>45072910 #>>45073086 #>>45075511 #>>45076327 #>>45077197 #
2. bvirb ◴[] No.45069191[source]
We attempt to address this problem at work with an extra caveat to never add code "in the wrong direction" -- so it's fine (usually preferable) to have a partial implementation, as long as it's heading in the direction we'd like the more complete implementation to go in. Basically "KISS, but no hacks".
replies(3): >>45069426 #>>45070547 #>>45073815 #
3. nibalizer ◴[] No.45069245[source]
It’s the same for AI vibecoding. The more experience you have, the easier it is to keep the agent on the right path. Same for identifying which tasks to use an agent for vs doing yourself.
4. taffer ◴[] No.45069268[source]
> One of the ironies of this kind of advice is that it's best for people who already have a lot of experience and have the judgement to apply it. For instance, how do you know what the "simplest thing" is?

I think the author kind of mentions this: "Figuring out the simplest solution requires considering many different approaches. In other words, it requires doing engineering."

replies(3): >>45069350 #>>45069490 #>>45074169 #
5. tuatoru ◴[] No.45069350[source]
I like this. I had a rule of three: figure out three qualitatively different ways to solve the problem - different in kind, not just in choice of tools. Once you have three you start to understand the trade-offs. And you can come up with others quite easily.
replies(1): >>45069505 #
6. GMoromisato ◴[] No.45069426[source]
I really like this as a guideline.
7. GMoromisato ◴[] No.45069490[source]
Agreed! The author is clearly an experienced and talented software engineer.

But the irony, in my opinion, is that experienced engineers don't need this advice (they are already "doing engineering"), but junior engineers can't use this advice because they don't have the experience to know what the "simplest thing" is.

Still, the advice is useful as a mantra: to remind us of things we already know but, in the heat of the moment, sometimes forget.

8. GMoromisato ◴[] No.45069505{3}[source]
I like that as a process. Seeing the trade-offs is the key. I argue that engineering is all about trade-offs.
9. thefourthchime ◴[] No.45069600[source]
I think most commentators here are missing the point that doing the "simplest" thing doesn't mean doing the hackiest, quickest thing.

The simplest thing can be very difficult to do. It requires thought and an understanding of the system, which is what he says at the very beginning. But I think most people read the headline and just started spewing personal grievances.

replies(3): >>45069665 #>>45071047 #>>45076015 #
10. GMoromisato ◴[] No.45069665[source]
My point is exactly that "the simplest thing can be very difficult to do". You need to be an experienced engineer to apply this advice.

But an experienced engineer already knows this!

I just think it's ironic that this advice is useless to junior engineers but unneeded by senior engineers.

replies(1): >>45071205 #
11. jiggawatts ◴[] No.45070183[source]
Don't confuse sloppy with simple. Parsing XML with regex[1] (or a non-namespace-compliant XML parser) is not simple. It's messy, verbose, error-prone, and not in any way idiomatic or simple.

If you had just used a compliant XML parser as intended, you might not even have noticed that different encodings of namespaces were occurring in the files! It just "doesn't register" when you let the parser handle this for you, in the same sense that if you parse HTML (or XML) properly, you won't notice all of the &amp; and &lt; encodings either. Or CDATA. Or Unicode escapes. Or anything else, for that matter, that you may not even be aware of.
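As a small illustration of that point, here is a sketch using Python's standard-library ElementTree (a stand-in chosen for this example, not a parser named in the thread). The element name "v" and the sample text are made up. Three different encodings of the same text come out identical once the parser has done its work.

```python
import xml.etree.ElementTree as ET

# Three ways a producer might encode the same character data.
ENTITY_DOC = "<v>Tom &amp; Jerry &lt;3</v>"
CDATA_DOC = "<v><![CDATA[Tom & Jerry <3]]></v>"
NUMERIC_DOC = "<v>Tom &#38; Jerry &#x3C;3</v>"


def text_of(xml_text):
    """Parse a one-element document and return its decoded text."""
    return ET.fromstring(xml_text).text


decoded = [text_of(doc) for doc in (ENTITY_DOC, CDATA_DOC, NUMERIC_DOC)]
```

Code that consumes `decoded` never has to know which encoding the file used.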

You may be a few more steps away from making an XLSX importer work robustly. Did you read the spec? The container format supports splitting single documents into multiple (internal) files to support incremental saves of huge files. That can trip developers in the worst way, because you test with tiny files, but XLSX-handling custom code tends to be used to bulk import large files, which will occasionally use this splitting. You'll lose huge blocks of data in production, silently! That's not fun (or simple) to troubleshoot.

The fast, happy path is to start with something like System.IO.Packaging [2], which is the built-in .NET library for the Open Packaging Conventions (OPC) container format, the underlying container format of all Office Open XML (OOXML) formats. Use the built-in XML parser, which handles namespaces very well. Then the only annoyance is that OOXML formats have two groups of namespaces that they can use, the Microsoft ones and the Open "standardised" ones.
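For readers outside .NET, a rough Python sketch of the same idea: treat the file as a package and ask it which parts it contains instead of hard-coding a path such as xl/worksheets/sheet1.xml. This uses the standard-library zipfile module as a stand-in for System.IO.Packaging, which is an assumption of this example. The helper names and the demo package are hypothetical, the demo package is not a valid workbook, and a real importer would also follow the package relationships.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of [Content_Types].xml and the content type used for worksheet parts.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
WORKSHEET_TYPE = (
    "application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"
)


def worksheet_parts(xlsx_bytes):
    """Return the part names that [Content_Types].xml declares as worksheets."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as package:
        types = ET.fromstring(package.read("[Content_Types].xml"))
    return [
        override.get("PartName")
        for override in types.iter(f"{{{CT_NS}}}Override")
        if override.get("ContentType") == WORKSHEET_TYPE
    ]


def make_demo_package():
    """Build a tiny in-memory package holding only a content-types part."""
    content_types = (
        f'<Types xmlns="{CT_NS}">'
        f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{WORKSHEET_TYPE}"/>'
        f'<Override PartName="/xl/worksheets/sheet2.xml" ContentType="{WORKSHEET_TYPE}"/>'
        "</Types>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", content_types)
    return buffer.getvalue()
```

Calling `worksheet_parts(make_demo_package())` lists both declared sheets, so nothing downstream depends on how many parts a producer chose to write.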

[1] Famously! https://stackoverflow.com/questions/8577060/why-is-it-such-a...

[2] https://learn.microsoft.com/en-us/dotnet/api/system.io.packa...

replies(1): >>45070764 #
12. daxfohl ◴[] No.45070459[source]
I gauge it as "the simplest thing to transition". Most of the time, it's easier to transition a single service that doesn't rely on a big number of complex abstractions or extra infrastructure, even if it's at the expense of some clutter or a bit of redundancy. The new owner can step through the code and see what's going on without having to work backward to understand the abstractions or coordination of services or whatever else.

Of course plenty of times there'll be some abstractions that make the code easier to follow, even at the expense of logic locality. And other times where extra infrastructure is really necessary to improve reliability, or when your in-memory counter hack gets more requirements and replacing it with a dedicated rate limiter lets you delete all that complexity. And in those cases, by all means, add the abstractions or infrastructural pieces as needed.

But in all such cases, I try to ask myself, if I need to hand off this project afterward, which approach is going to make things easiest to explain?

Note that my perception of this has changed over time. Long ago, I was very much in the camp of "simple" meaning: make everything as terse as possible, put everything in its own service, never write code when a piece of infrastructure could do it, decouple everything to the maximum extent, make everything config-based. Ironically, I remember imagining how delighted the new owners would be to receive such a well-factored thing that was almost no code at all; just abstraction upon abstraction upon event upon abstraction that fit together perfectly via some config file. Of course, the transition was a complete fail, as they didn't care enough to grok how all the pieces were designed to fit together, and within a month, they'd broken just about every abstraction I'd built into it, and it was a pain for anybody to work with.

Since then, I've kept things simpler, only using abstractions and extra infra where it'd be weird not to, and always thinking what's going to be the easiest thing to transition. And even though I'm not necessarily transitioning a ton of stuff, it's generally easier to ramp up teams or onboard new hires or debug problems when the code just does what it says. And it's nice because when a need for a new abstraction becomes apparent, you don't have to go back and undo the old one first.

13. ehansdais ◴[] No.45070547[source]
Just curious, how would that be applied to the XLSX namespace problem example given? If the full fix is to implement namespacing, what would the KISS approach be in the right direction?
replies(1): >>45072204 #
14. GMoromisato ◴[] No.45070764[source]
Parsing XML is relatively trivial--I'd never use regex, of course, but a basic recursive descent parser can do it pretty easily. I mean, the whole point of XML is that it's supposed to be easy to parse and generate!

Namespaces add a wrinkle, but it wasn't that hard to add. And I was able to add namespace aliasing in my API to handle the two separate "standard" namespaces that you're talking about.
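One way such aliasing could look, sketched in Python with ElementTree. This is a hypothetical illustration and not the API described above: it folds the Strict SpreadsheetML namespace URI onto the Transitional one so lookups only need a single name. The function names and sample documents are made up; "c" is the SpreadsheetML cell element.

```python
import xml.etree.ElementTree as ET

TRANSITIONAL = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
STRICT = "http://purl.oclc.org/ooxml/spreadsheetml/main"

# Alias table: every URI on the left is treated as the URI on the right.
ALIASES = {STRICT: TRANSITIONAL}


def canonical_tag(tag):
    """Rewrite a {uri}local tag so that aliased URIs share one canonical URI."""
    if isinstance(tag, str) and tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return f"{{{ALIASES.get(uri, uri)}}}{local}"
    return tag


def find_all(xml_text, local_name):
    """Return elements named local_name in either spreadsheet namespace."""
    wanted = f"{{{TRANSITIONAL}}}{local_name}"
    root = ET.fromstring(xml_text)
    return [el for el in root.iter() if canonical_tag(el.tag) == wanted]


TRANSITIONAL_DOC = f'<row xmlns="{TRANSITIONAL}"><c r="B2"/></row>'
STRICT_DOC = f'<x:row xmlns:x="{STRICT}"><x:c r="B2"/></x:row>'
```

With the alias table in one place, supporting another equivalent namespace is a one-line change rather than a second code path.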

But you're right about OPC/OOXML--those are massive specs and even the tiny slice that I'm handling has been error-prone. I haven't dealt with multiple internal files, so that's a future bug waiting for me. The good news is I'm building a nice library of test files for my regression tests!

replies(1): >>45070940 #
15. jiggawatts ◴[] No.45070940{3}[source]
> Parsing XML is relatively trivial

It really isn't, and rolling your own parser is the diametric opposite of the "do the simplest thing" philosophy.

The XML v1.1 spec is 126 KB of text, and that doesn't even include XML Namespaces, which is a separate spec with 25 KB of text.

XML is only "simple" in the sense of being well-defined, which makes interoperability simple, in some sense. Contrast this with ill-defined or implementation-defined text formats, where it's decidedly not simple to write an interoperable parser.

As an end-user of XML, the simplest thing is to use an off-the-shelf XML parser, one that's had the bugs beaten out of it by millions of users.

There are very few programming languages out there that don't have a convenient, full-featured XML parser library ready to use.

replies(1): >>45071486 #
16. noodletheworld ◴[] No.45071047[source]
Yes, but this is meaningless advice.

The best solution is the simplest.

The quickest? No, the simplest; sometimes that's longer.

So definitely not a complex solution? No, sometimes complexity is required; it's the simplest solution possible given your constraints.

Soo… basically, the advice is “pick the right solution”.

Sometimes that will be quick. Sometimes slow. Sometimes complex. Sometimes config. Sometimes distributed.

It depends.

But the correct solution will be the simplest one.

It's just: “solve your problems using good solutions not bad ones”

…and that is indeed both good and totally useless advice.

replies(1): >>45072705 #
17. bibabaloo ◴[] No.45071205{3}[source]
> I just think it's ironic that this advice is useless to junior engineers but unneeded by senior engineers.

That's a good way of putting it. The advice essentially boils down to "do the right thing, don't do the wrong thing". Which is good (if common sense) advice, but doesn't really help with making decisions in practice.

18. GMoromisato ◴[] No.45071486{4}[source]
Well we can agree that most people shouldn't implement their own XML parser.
19. jiggawatts ◴[] No.45072204{3}[source]
Use an off-the-shelf parser that handles namespaces. And escapes. And CData. And everything else you haven't thought of: https://stackoverflow.com/questions/701166/can-you-provide-s...

This avoids the endless whack-a-mole that you get with a partial solution such as "assume namespaces are superfluous", which you almost certainly will eventually discover weren't optional.

Or some other hapless person using your terrible code will discover it at 2am, sitting alone in the office building while desperately trying to do something mission critical, such as using a "simple" XML export tool to cut over ten thousand users from one Novell system to another so that the citizens of the state have a functioning government in the morning.

Ask me how I know that kind of "probably won't happen" thing will, actually, happen.

20. Paracompact ◴[] No.45072705{3}[source]
The article responds to this.
replies(1): >>45074687 #
21. b_e_n_t_o_n ◴[] No.45072910[source]
The simplest thing would have been not to rely on something as complex as XLSX, but that ship had sailed long ago.
22. pnt12 ◴[] No.45073086[source]
I don't think it's ironic: maybe it's not intuitive / not easy to do!

But there's utility in talking about it. If you teach people that good engineers prepare for Google scale, they will lean towards that. If you teach that unnecessary complexity is painful and slows you down, they will lean towards that.

Maybe we need a Rosetta stone of different simple and complex ways to do common engineering stuff!

23. thinkharderdev ◴[] No.45073815[source]
Yes, this is an excellent rule. I read an essay years ago (which I can't find now) about technical debt, in which the author separated tech debt into two flavors, which he analogized to a mortgage (good) and credit card debt (bad). Basically, getting the right design but only partially implementing it is like a mortgage: you're making a down payment on the full implementation and you can pay down the debt over time. But doing terrible hacks to "get something working" is like credit card debt. You're buying some time but will have to pay that back later (with a lot of interest).
24. anon6362 ◴[] No.45074169[source]
Yep. Engineering almost always involves experimenting for suitability of multiple approaches, configurations, and other concerns. Measure, measure, and measure some more while considering nonfunctional requirements/concerns... something no LLM can (yet) do. (I don't hold out hope that there won't soon be some fully-autonomous coding/systems management LLMs that can create a tight Prompt/REPL/Test loop to take requirements and feedback directly from users.)
25. noodletheworld ◴[] No.45074687{4}[source]
Really?

We both read the article; you know as well as I do that the advice in it is to build simple, reliable systems that focus on actual problems, not imagined ones.

…but does not say how to do that; and offers no meaningful value for someone trying to pick the “right” thing in the entire solution space that is both sufficiently complex and scalable to solve the requirements, but not too scalable, or too complex.

There’s just some vague hand waving about over engineering things at Big Corp, where, ironically, scale is an issue that mandates a certain degree of complexity in many cases.

Here’s something that works better than meaningless generic advice: specific detailed examples.

You will note the total lack of them in this article, and others like it.

Real articles with real advice are a mix of practical examples that illustrate the generic advice they’re giving.

You know why?

…because you can argue with a specific example. Generic advice with no examples is not falsifiable.

You can agree with the examples, or disagree with them; you can argue that examples support or do not support the generic advice. People can take the specific examples and adapt them as appropriate.

…but, generic advice on its own is just an opinion.

I can arbitrarily assert “100% code coverage is meaningless; there are hot paths that need heavy testing and irrelevant paths that do not require code coverage. 100% code coverage is a fool’s game that masks a lack of a deeper understanding of what you should be testing”; it may sound reasonable, it may not. That’s your opinion vs mine.

…but with some specific examples of where it is true, and perhaps, not true, you could specifically respond to it, and challenge it with counter examples.

(And indeed, you’ll see that specific examples turn up here in this comment thread as arguments against it; notably, they were not picked up and addressed by the OP in their Hacker News feedback section.)

26. ozgrakkurt ◴[] No.45075511[source]
Advice is useless; learning and experience work.
27. tetha ◴[] No.45076015[source]
I currently have one concept stuck in my mind, which I would call "Complexity distribution".

For example, at work, the simplest solution across the whole organization was to adopt the most complex PostgreSQL deployment structure and backup solutions.

This sounds counter-intuitive at first. But this way, the company can invest ~3 full-time employees in having an HA, PITR-capable PostgreSQL cluster with properly archived backups that ~25 other development teams can rely on. This stack solves so many B2B problems of business continuity, security, backups, availability.

And on the other hand, for the dev-teams, the PostgreSQL is suddenly very simple. Inject ~8 variables into a container and you can claim all of these good things for your application without ever thinking about those.
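From the application side, that "inject a few variables" contract might look roughly like the Python sketch below. The comment does not name the variables, so the libpq-style names (PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD, PGSSLMODE), the `PgSettings` type, and the "verify-full" default are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PgSettings:
    """Everything the application needs to know about the shared cluster."""
    host: str
    port: int
    dbname: str
    user: str
    password: str
    sslmode: str


# Assumed variable names, borrowed from libpq's environment variables.
REQUIRED = ("PGHOST", "PGDATABASE", "PGUSER", "PGPASSWORD")


def load_pg_settings(env=os.environ):
    """Read the injected variables and fail loudly if any required one is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing variables: " + ", ".join(missing))
    return PgSettings(
        host=env["PGHOST"],
        port=int(env.get("PGPORT", "5432")),
        dbname=env["PGDATABASE"],
        user=env["PGUSER"],
        password=env["PGPASSWORD"],
        # Assumed default for this sketch; the platform team would set the real policy.
        sslmode=env.get("PGSSLMODE", "verify-full"),
    )
```

Failover, point-in-time recovery, and backups do not appear anywhere in this code, which is the point: that complexity lives with the platform team rather than in each application.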

28. me-vs-cat ◴[] No.45076327[source]
> I spent 4 hours re-writing all the parsing code to handle namespaces correctly.

Wouldn't it have been less effort and simpler to replace the custom code with an existing XML parser? It appears that in your case the simplest thing would have been easy, though the aphorism doesn't promise "easy".

If using a library wasn't possible for you due to NIH-related business requirements and given the wide proliferation of XML libraries under a multitude of licenses, then your pain appears to have been organizationally self-inflicted. That's going to be hard to generalize to others in different organizations.

replies(1): >>45077245 #
29. sjducb ◴[] No.45077197[source]
I think the author would say you did the simplest thing that could work.

Ignoring the namespace creates ongoing complexity that you have to be aware of. Your solution now just works and users can use namespaces if they want.

The author deals with this in the hacks section.

30. GMoromisato ◴[] No.45077245[source]
Honestly, four hours spent programming is my idea of heaven.

I totally agree with you that most people should not implement their own XML parser, much less an Excel importer. But I'm grateful to have the luxury of being allowed/able to do both.

The specific choice I made doesn't matter. What matters is the process of deciding trade-offs between one approach and another.

My point is that the OP advice of "do the simplest thing that could possibly work" doesn't help a junior engineer (who doesn't have the experience to evaluate the trade-off) but it's superfluous for a senior engineer (who already has well-developed instincts).

replies(1): >>45077590 #
31. me-vs-cat ◴[] No.45077590{3}[source]
That's fair, and I also especially enjoy exploring "solved" problems when I don't have a backlog demanding priority.

Still, your experience with those holding "senior" job titles involves greater median expertise than I have found in my experience.