
3883 points kuroguro | 30 comments
1. segmondy ◴[] No.26297060[source]
This is why I come to HN. I was going to skip this because I thought it was about video games, but I'm really glad to have read it, and I loved every line of the article.

So much to get from this.

Even if you don't have the source, you can make a change if you are annoyed enough.

If you don't like something, and the source code is out there, really go contribute.

Performance matters: know how to profile, and if you're using an external dependency, figure out its implementation details.

Algorithms & data structures matter. I often see devs saying they don't matter much, but the difference between using a hashmap vs. an array is evident.

Attentive code reviews matter. Chances are they gave this to a junior dev/intern, it worked with a small dataset, and no one noticed.
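The hashmap-vs-array point is easy to demonstrate. A small illustrative sketch (not from the article; names are made up): both functions give the same answer, but the list version rescans the array on every lookup while the set version pays for one hash-set build up front.

```python
def contains_all_list(haystack, needles):
    # each `in` test scans the whole list: O(len(haystack)) per lookup
    return all(n in haystack for n in needles)

def contains_all_set(haystack, needles):
    # build a hash set once; each lookup is then O(1) on average
    lookup = set(haystack)
    return all(n in lookup for n in needles)
```

With 100 items the difference is invisible; with 100,000 it's the difference between milliseconds and minutes.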

replies(4): >>26297140 #>>26297183 #>>26297384 #>>26301592 #
2. bluedino ◴[] No.26297140[source]
I always tell a story about an application we ran: it generated its own interface based on whatever was in inventory. Someone did something really stupid and duplicated each inventory item for each main unit we sold, so you had a recursive mess — 100,000 items to assign where previously it was 100-ish.

Anyway, everyone just rolled their eyes and blamed the fact that the app was written in Java.

It ended up generating an XML file during that minute-long startup, so we just saved the file to the network and loaded it on startup. If inventory changed, we'd re-generate the file once and be done with it.

replies(1): >>26298345 #
3. closeparen ◴[] No.26297183[source]
I think this is a perfect example of “algorithms and data structures emphasis is overblown.” Real world performance problems don’t look like LeetCode Hard, they look like doing obviously stupid, wasteful work in tight loops.
replies(6): >>26297360 #>>26297559 #>>26297571 #>>26297622 #>>26298428 #>>26299214 #
4. nikanj ◴[] No.26297360[source]
And trying to optimize them gets you the stink eye at code review time. Someone quotes Knuth, replaces your fast 200 lines with slow-as-molasses 10 lines, and heads to the bar.
replies(1): >>26298047 #
5. madeofpalk ◴[] No.26297384[source]
> Even if you don't have the source, you can make a change if you are annoyed enough.

Well, until you get flagged by the anti cheat and get your account and motherboard banned...

replies(1): >>26297540 #
6. zionic ◴[] No.26297540[source]
Imagine getting banned for fixing their insane load times lol
replies(2): >>26300933 #>>26301212 #
7. johnfn ◴[] No.26297559[source]
Exactly - though to add a little nuance to your post, it’s about having a million loops in a 10M line code base and exactly one of them is operating maximally slowly. So preventing the loop from entering the code base is tough - finding it is key.
8. segmondy ◴[] No.26297571[source]
Leetcode-style thinking will allow you to spot obviously stupid, wasteful work in tight loops.
9. rictic ◴[] No.26297622[source]
True that it's rare that you need to pull out obscure algorithms or data structures, but in many projects you'll be _constantly_ constructing, composing, and walking data structures, and it only takes one or two places that are accidentally quadratic to make something that should take milliseconds take minutes.

The mindset of constantly considering the big-O category of the code you're writing and reviewing pays off big. And neglecting it costs big as well.
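The "accidentally quadratic" trap often hides in code that reads as perfectly innocent. A hedged sketch (illustrative, not from the thread): both functions deduplicate a list and return identical results, but one does linear work per item because its membership test scans a list.

```python
def dedup_quadratic(items):
    # looks innocent, but `x not in seen` scans a list:
    # O(n) per item, O(n^2) total
    seen = []
    for x in items:
        if x not in seen:
            seen.append(x)
    return seen

def dedup_linear(items):
    # same result, but the membership test hits a hash set: O(1) per item
    seen, out = set(), []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out
```

This is the big-O habit paying off: the fix is two extra lines, but only if someone thinks to ask what `not in` costs.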

replies(2): >>26297739 #>>26313778 #
10. ilaksh ◴[] No.26297739{3}[source]
Except that you need to test your software and if you see performance problems, profile them to identify the cause. It's not like you have one single chance to get everything right.
replies(1): >>26325876 #
11. gridspy ◴[] No.26298047{3}[source]
Unfortunately this. Or they will say "don't optimize it until it proves to be slow in production" - at which point it is too dangerous to change it.
12. indeedmug ◴[] No.26298345[source]
It's a lot easier to blame a language for being slow because it's obvious. Blaming algorithms requires putting in the time to figure things out.
replies(2): >>26298646 #>>26313825 #
13. wnoise ◴[] No.26298428[source]
... that's the exact opposite of what I took from this.

The obviously stupid, wasteful work is at heart an algorithmic problem. And it cropped up even in the simplest of data structures. A constant amount of wasteful work often isn't a problem even in tight loops. A linear amount of wasted work, per loop, absolutely is.
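That "linear wasted work per loop" shape is exactly the article's strlen-in-sscanf bug. A hedged Python sketch of the same shape (illustrative names, not the actual game code): the first tokenizer re-copies the remaining buffer on every iteration, the second just advances a cursor.

```python
def tokens_quadratic(buf):
    # partition() copies the remaining buffer each iteration:
    # linear wasted work per token, quadratic overall
    out = []
    while True:
        head, sep, buf = buf.partition(",")
        out.append(head)
        if not sep:
            return out

def tokens_linear(buf):
    # track a cursor instead: constant extra work per token
    out, start = [], 0
    while True:
        end = buf.find(",", start)
        if end == -1:
            out.append(buf[start:])
            return out
        out.append(buf[start:end])
        start = end + 1
```

Both return the same tokens; only the hidden per-iteration copy differs, and that's what turns a 63,000-entry JSON file into minutes of load time.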

replies(1): >>26300665 #
14. acdha ◴[] No.26298646{3}[source]
There’s also a confound between a language and its communities. I’ve seen so many cases where a “slow” language like Python or (older) Perl smoked Java or C++, because the latter developers were trying to follow cultural norms which said that Real Developers™ don’t write simple code. They had huge memory churn from dense object hierarchies and indirection, so performance ended up being limited by O(n) XML property lookups for a config setting nobody had ever changed — whereas the “slow”-language developer had just implemented a simple algorithm directly, and most of the runtime was in highly-optimized stdlib native code: a fast regex instead of a naive textbook parser which devolved into an object churn benchmark, etc.

Languages like Java get a lot of bad reputation for that because of popularity: not just that many people are hired into broken-by-design environments (or ones where they’re using some framework from a big consultancy or a vendor who makes most of their revenue from consulting services) but also because many people learn the language as their first language and often are deeply influenced by framework code without realizing the difference between widely used long-term reusable code and what most projects actually need and are staffed for. It’s easy to see the style of the Java standard frameworks or one of the major Apache projects and think that everyone is supposed to write code like that, forgetting that they have to support a greater number of far more diverse projects over a longer timeframe than your in-house business app nobody else works on. Broader experience helps moderate this but many places choose poor metrics and neglect career development.

replies(1): >>26312091 #
15. baby ◴[] No.26299214[source]
And here what matters is not your programming skills, it’s your profiling skills. Every dev writes code that’s not the most optimized piece from the start, hell we even say “don’t optimise prematurely”. But good devs know how to profile and flamegraph their app, not leetcode their app.
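Profiling doesn't need exotic tooling; Python's stdlib `cProfile`/`pstats` already does the job. A minimal sketch (the `busy` function is a made-up stand-in for your hot path):

```python
import cProfile
import io
import pstats

def busy():
    # a deliberately wasteful loop for the profiler to catch
    total = 0
    for i in range(1000):
        total += sum(range(i))  # redundantly re-sums each prefix
    return total

profiler = cProfile.Profile()
profiler.enable()
result = busy()
profiler.disable()

# dump the top 5 entries by cumulative time into a string
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
```

The report immediately shows `busy` and the builtin `sum` dominating — the same "where is the time actually going?" question the article's author answered with a disassembler and a stack sampler.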
replies(1): >>26299471 #
16. segmondy ◴[] No.26299471{3}[source]
Actually, "don't optimize prematurely" is poor advice. Just recently I was doing a code review that had the same issue: they were counting the size of an array in a loop while stuff was being added to the array in that same loop. The obvious solution was to track the length:

   arr = []
   while ...:
      if something:
         arr.append(foo)
      ...
      if count(arr) == x:
         stuff
      ...
changed to

   arr = []
   arr_size = 0
   while ...:
      if something:
         arr.append(foo)
         arr_size += 1
      ...
      if arr_size == x:
         stuff
      ...
This is clearly optimization, but it's not premature. The original might just pass code review, but when it wreaks havoc, the time it costs won't be worth it: Jira tickets, figuring out why the damn thing is slow, recreating it in dev, fixing it, opening another pull request, review, deploy, etc. Sometimes "optimizing prematurely" is the right thing to do if it doesn't cost much time or overly complicate the initial solution. Of course, this depends on the language: some languages track the length of the array, so checking the size is O(1), but not all do, and then checking the length can be expensive. Knowing the implementation details matters.
replies(2): >>26300921 #>>26301513 #
17. lmm ◴[] No.26300665{3}[source]
It's not something that requires deep algorithms/data structures knowledge, is the point. Knowing how to invert a binary tree won't move the needle on whether you can spot this kind of problem. Knowing how to operate a profiler is a lot more useful.
18. rocqua ◴[] No.26300921{4}[source]
With these things, I have always had the hope that an optimizing compiler would catch this. I think it is an allowed optimization if the count function is considered `const` in C or C++ at least.
19. rocqua ◴[] No.26300933{3}[source]
Getting banned for DLL injection seems very likely to me. It certainly is a risk.

Heck, it might be against the EULA, which probably doesn't hold up legally, but is decent grounds for a ban.

20. madeofpalk ◴[] No.26301212{3}[source]
Getting banned for modifying the game process seems very commonplace and likely? It'll be part of any anti-cheat system; it's basically table stakes.
21. dthul ◴[] No.26301513{4}[source]
I'm not sure I would prefer the second version in a code review. The first version is conceptually nicer because it's easy to see that you will always get the correct count; in the second, you have to enforce that invariant yourself, and future code changes could break it. Whether this is premature optimization depends on the size of the array, the number of loop iterations, and how often that procedure is called. If it's an optimization you decide to do, I think it would be nice to extract it into an "ArrayWithLength" data structure that encapsulates the invariant.
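That "ArrayWithLength" idea might look like this sketch (Python purely for illustration — CPython's `list` already stores its length, so this only matters in a hypothetical language/runtime where counting elements is O(n)):

```python
class CountedList:
    # wrapper that keeps the counter and the list in sync,
    # so callers can't break the invariant by hand
    def __init__(self):
        self._items = []
        self._size = 0

    def append(self, item):
        self._items.append(item)
        self._size += 1  # the only place the counter changes

    def __len__(self):
        return self._size  # O(1), no rescan of the items
```

The invariant lives in one place, so future edits to the loop body can't silently desynchronize the count from the contents.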
replies(1): >>26301900 #
22. fctorial ◴[] No.26301592[source]
This was probably a compiler bug. I don't think the programmers coding the business logic were using 'strlen' and 'sscanf' directly.
23. thaumasiotes ◴[] No.26301900{5}[source]
> In the second version you have to enforce that invariant yourself and future code changes could break it.

Yes, that's a real issue. But we've been given two options:

- Does the correct thing, and will continue to do the correct thing regardless of future changes to the code. Will break if the use case changes, even if the code never does.

- Does the correct thing, but will probably break if changes are made to the code. Will work on any input.

It actually seems a lot more likely to me that the input given to the code might change than that the code itself might change. (That's particularly the case for the original post, where the code serves to read a configuration file, but it's true in general.)

replies(1): >>26301991 #
24. dthul ◴[] No.26301991{6}[source]
Yes, I absolutely see the reasoning and I think if one does go the route of encapsulating the more efficient array logic one can have the best of both options.
replies(1): >>26302954 #
25. thaumasiotes ◴[] No.26302954{7}[source]
> if one does go the route of encapsulating the more efficient array logic one can have the best of both options.

Do you see a way to do this that doesn't involve rolling your own array-like or list-like data type and replacing all uses of ordinary types with the new one? (This is actually already the implementation of standard Python types, but if you're encountering the problem, it isn't the implementation of your types.)

replies(1): >>26305375 #
26. dthul ◴[] No.26305375{8}[source]
I guess it depends on the language and library you are using but I have the feeling that in most cases one would probably need to replace the usage of the old data type with the new one.
27. closeparen ◴[] No.26312091{4}[source]
> devolved into an object churn benchmark

I'm stealing this phrase.

28. imtringued ◴[] No.26313778{3}[source]
People complain about Big-O once they reach the end of its usefulness. Your algorithm is O(n) or O(n log n) but it is still too slow.
29. imtringued ◴[] No.26313825{3}[source]
Java is a RAM guzzler with one small handicap: the inability to optimize with value types. In its class (managed programming languages without value types) it is pretty much as fast as it gets.

The two performance flaws that exist are:

1. Old Java frameworks were not written with performance in mind

2. Your entire app is written in Java so you don't benefit from C++ libraries

30. rictic ◴[] No.26325876{4}[source]
The later in development a problem is caught, the more expensive it is. The farther it gets along the pipeline of concept -> prototype -> testing -> commit -> production, the longer it's going to take to notice, repro, identify the responsible code, and fix.

It's true that you don't just have one shot to get it right, but you can't afford to be littering the codebase with accidentally quadratic algorithms.

I fairly regularly encounter code that performed all right when it was written, then something went from X0 cases to X000 cases and now this bit of N^2 code is taking minutes when it should take milliseconds.