←back to thread

3883 points kuroguro | 3 comments | | HN request time: 1.061s | source
Show context
segmondy ◴[] No.26297060[source]
This is why I come to HN, I was going to skip this because I thought it was about video games, but really glad to have read it, and loved every line of the article.

So much to get from this.

Even if you don't have the source, you can make a change if you are annoyed enough.

If you don't like something, and the source code is out there, really go contribute.

Performance matters, know how to profile and if using an external dependency, then figure out their implementation details.

Algorithms & Data structures matter, often I see devs talking about how it doesn't matter much but the difference between using a hashmap vs array is evident.

Attentive code reviews matter, chances are they gave this to a junior dev/intern, and it worked with a small dataset and no one noticed.

replies(4): >>26297140 #>>26297183 #>>26297384 #>>26301592 #
closeparen ◴[] No.26297183[source]
I think this is a perfect example of “algorithms and data structures emphasis is overblown.” Real world performance problems don’t look like LeetCode Hard, they look like doing obviously stupid, wasteful work in tight loops.
replies(6): >>26297360 #>>26297559 #>>26297571 #>>26297622 #>>26298428 #>>26299214 #
baby ◴[] No.26299214[source]
And here what matters is not your programming skills, it’s your profiling skills. Every dev writes code that’s not the most optimized piece from the start, hell we even say “don’t optimise prematurely”. But good devs know how to profile and flamegraph their app, not leetcode their app.
replies(1): >>26299471 #
segmondy ◴[] No.26299471[source]
actually, "don't optimize prematurely" is a poor advice. just recently I was doing a good review that had the same issue where they were counting the size of an array in a loop, when stuff was being added to the array in the loop too. obvious solution was to track the length and

   arr = []
   while ...:
      if something:
         arr.append(foo)
      ...
      if count(arr) == x:
        stuff
      ...
changed to

   arr=[]
   arr_size = 0
   while ...:
      if something:
         arr.append(foo)
         arr_size++;
      ...
      if arr_size == x:
        stuff
      ...
This is clearly optimization, but it's not premature. The original might just pass code review, but when it wrecks havoc, the amount of time it will cost will not be worth it, jira tickets, figuring out why the damn thing is slower, then having to recreate it in dev, fixing it, reopening another pull request, review, deploy, etc. Sometimes "optimizing prematurely" is the right thing to do if it doesn't cost much time to do or overly completely the initial solution. Of course, this depends on the language, some languages will track the length of the array so checking the size is o(1), but not all languages do, so checking the length can be expensive, knowing the implementation detail matters.
replies(2): >>26300921 #>>26301513 #
dthul ◴[] No.26301513[source]
I'm not sure I would prefer the second version in a code review. I find the first version is conceptually nicer because it's easy to see that you will always get the correct count. In the second version you have to enforce that invariant yourself and future code changes could break it. If this is premature optimization or not depends on the size of the array, number of loop iterations and how often that procedure is called. If that's an optimization you decide to do, I think it would be nice to extract this into an "ArrayWithLength" data structure that encapsulates the invariant.
replies(1): >>26301900 #
thaumasiotes ◴[] No.26301900[source]
> In the second version you have to enforce that invariant yourself and future code changes could break it.

Yes, that's a real issue. But we've been given two options:

- Does the correct thing, and will continue to do the correct thing regardless of future changes to the code. Will break if the use case changes, even if the code never does.

- Does the correct thing, but will probably break if changes are made to the code. Will work on any input.

It actually seems a lot more likely to me that the input given to the code might change than that the code itself might change. (That's particularly the case for the original post, where the code serves to read a configuration file, but it's true in general.)

replies(1): >>26301991 #
1. dthul ◴[] No.26301991[source]
Yes, I absolutely see the reasoning and I think if one does go the route of encapsulating the more efficient array logic one can have the best of both options.
replies(1): >>26302954 #
2. thaumasiotes ◴[] No.26302954[source]
> if one does go the route of encapsulating the more efficient array logic one can have the best of both options.

Do you see a way to do this that doesn't involve rolling your own array-like or list-like data type and replacing all uses of ordinary types with the new one? (This is actually already the implementation of standard Python types, but if you're encountering the problem, it isn't the implementation of your types.)

replies(1): >>26305375 #
3. dthul ◴[] No.26305375[source]
I guess it depends on the language and library you are using but I have the feeling that in most cases one would probably need to replace the usage of the old data type with the new one.