Why is Windows so slow?

(games.greggman.com)

evmar ◴[19 Dec 11 07:38 UTC] No.3368867[source]▶

I don't know this poster, but I am pretty familiar with the problem he's encountering, as I am the person most responsible for the Chrome build for Linux.

I (and others) have put a lot of effort into making the Linux Chrome build fast. Some examples are multiple new implementations of the build system ( http://neugierig.org/software/chromium/notes/2011/02/ninja.h... ), experimentation with the gold linker (e.g. measuring and adjusting the still off-by-default thread flags https://groups.google.com/a/chromium.org/group/chromium-dev/... ) as well as digging into bugs in it, and other underdocumented things like 'thin' ar archives.

But it's also true that people who are more of Windows wizards than I am a Linux apprentice have worked on Chrome's Windows build. If you asked me the original question, I'd say the underlying problem is that on Windows all you have is what Microsoft gives you and you can't typically do better than that. For example, migrating the Chrome build off of Visual Studio would be a large undertaking, large enough that it's rarely considered. (Another way of phrasing this is it's the IDE problem: you get all of the IDE or you get nothing.)

When addressing the poor Windows performance, people first bought SSDs, something that never even occurred to me ("your system has enough RAM that the kernel cache of the file system should be in memory anyway!"). But for whatever reason on the Linux side some Googlers saw fit to rewrite the Linux linker to make it twice as fast (this effort predated Chrome), and all Linux developers now get to benefit from that. Perhaps the difference is that when people write awesome tools for Windows or Mac they try to sell them rather than give them away.

Including new versions of Visual Studio, for that matter. I know that Chrome (and Firefox) use older versions of the Visual Studio suite (for technical reasons I don't quite understand, though I know people on the Chrome side have talked with Microsoft about the problems we've had with newer versions), and perhaps newer versions are better in some of these metrics.

But with all of that said, as best as I can tell Windows really is just really slow for file system operations, which especially kills file-system-heavy operations like recursive directory listings and git, even when you turn off all the AV crap. I don't know why; every time I look deeply into Windows I get more afraid ( http://neugierig.org/software/chromium/notes/2011/08/windows... ).
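One rough way to put a number on the recursive-listing cost mentioned above is to time the same directory walk on each OS. This is a minimal Python sketch, not anything taken from the Chrome build; the names `count_entries` and `time_listing` are made up for illustration, and the walk is repeated because the first pass may include cold-cache effects.

```python
import os
import sys
import time


def count_entries(root):
    """Recursively count (files, directories) under root using os.scandir."""
    files = 0
    dirs = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dirs += 1
                        stack.append(entry.path)
                    else:
                        files += 1
        except OSError:
            # Skip directories that cannot be read (permissions, races).
            continue
    return files, dirs


def time_listing(root, repeats=3):
    """Walk root several times; return the counts and the per-run timings."""
    counts = (0, 0)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        counts = count_entries(root)
        timings.append(time.perf_counter() - start)
    return counts, timings


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    (files, dirs), timings = time_listing(root)
    print(f"{files} files, {dirs} directories")
    print(f"first run: {timings[0]:.3f}s, fastest run: {min(timings):.3f}s")
```

Running it against the same source checkout on Windows and on Linux gives a crude comparison; it says nothing about where the time goes, only how long the walk takes.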

replies(10): >>3368892 #>>3368926 #>>3369043 #>>3369059 #>>3369102 #>>3369181 #>>3369566 #>>3369907 #>>3370579 #>>3372438 #

vog ◴[19 Dec 11 09:19 UTC] No.3369059[source]▶

>>3368867 #

What is preventing you from using MinGW? That way, you could use the GNU toolchain (Make, GCC, Binutils etc.) and still have full access to the Win32 API. You could reuse almost all of your Unix build scripts, and the rest usually boils down to making your scripts aware of different file extensions (.exe/.dll instead of no extension/.so).

Even better, you can do cross compiling with MinGW. So if your toolchain doesn't perform well on Windows, just use GCC as a cross compiler and build your stuff on a Linux or BSD machine. Then use Windows for testing the executable. (On smaller projects, you usually don't even need Windows for that, since Wine does the job as well.)

(Full disclosure: I'm the maintainer of a Free Software project that makes cross compiling via MinGW very handy: http://mingw-cross-env.nongnu.org/)

replies(1): >>3369129 #

TwoBit ◴[19 Dec 11 09:58 UTC] No.3369129[source]▶

>>3369059 #

VC++ generates significantly better code than GCC. Enough so that performance-minded projects usually wouldn't consider MinGW/GCC for Windows code.

replies(2): >>3369388 #>>3369389 #

1. gillianseed ◴[19 Dec 11 12:10 UTC] No.3369389[source]▶

>>3369129 #

Please show me something to back this nonsense up.

replies(1): >>3369431 #

2. hemancuso ◴[19 Dec 11 12:32 UTC] No.3369431[source]▶

>>3369389 (TP) #

How about you back up calling it nonsense?

replies(2): >>3369570 #>>3370003 #

3. gillianseed ◴[19 Dec 11 13:49 UTC] No.3369570[source]▶

>>3369431 #

We've done compiler tests between VC express 2010 and GCC (MinGW gcc 4.6.x branch) at work, with GCC beating VC express at '-O3 -ffast-math -march=corei7' vs '/O2 /arch:SSE2' for our code on Intel Core i7 Nehalem. GCC even beat ICC on certain tests on that same Intel hardware.

What we weren't able to compare between the compilers was link-time optimization and profile-guided optimization since Microsoft crippled VC express by removing these optimizations.

So when someone makes claims that 'VC++ generates significantly better code than GCC' I want to see something backing that up. Had I made a blanket statement that 'GCC generates significantly better code than VC++' someone would call me on backing that up as well, and rightly so.

replies(1): >>3369717 #

4. CurtHagenlocher ◴[19 Dec 11 14:43 UTC] No.3369717{3}[source]▶

>>3369570 #

So when you didn't use the two most important perf features in MSC, its performance was underwhelming. This is no surprise.

Also, if you were doing anything heavily floating-point, MSC 2010 would be a bad choice because it doesn't vectorize. Even internally at Microsoft, we were using ICC for building some math stuff. The next release of MSC fixes this.

replies(2): >>3369816 #>>3373122 #

5. gillianseed ◴[19 Dec 11 15:14 UTC] No.3369816{4}[source]▶

>>3369717 #

Well we obviously didn't enable PGO/LTO for GCC either when doing these tests as that would have been pointless.

It would have been interesting to compare the quality of the respective compiler's PGO/LTO optimizations (particularly PGO given that for GCC and ICC code is sometimes up to 20% faster with that optimization) but not interesting enough for us to purchase a Visual Studio licence.

And yes, we use floating point math in most of our code, and if MSC doesn't vectorize then that would certainly explain worse performance. However, this further contradicts the blanket statement of 'VC++ generates significantly better code than GCC' which I was responding to.

6. hahaiamatwork ◴[19 Dec 11 16:06 UTC] No.3370003[source]▶

>>3369431 #

With respect, this is not how claims work. You can't make a claim and then expect your opponents to have the burden of proof to refute it.

If you make a claim such as 'GCC produces significantly worse code than alternate compiler A' then it's completely reasonable to ask for something to support it. Tone wise perhaps the post could have been improved, but the principle stands.

7. makomk ◴[20 Dec 11 12:10 UTC] No.3373122{4}[source]▶

>>3369717 #

I believe that at least one of the projects the original blog post mentioned - Chromium - can't be compiled with LTO or PGO enabled. Apparently the linker just runs out of memory with it and most large projects.

replies(1): >>3373445 #

8. gillianseed ◴[20 Dec 11 14:07 UTC] No.3373445{5}[source]▶

>>3373122 #

Well it makes sense that LTO would have high memory requirements given that the point of the optimization is to look at the entire program as one entity rather than on a file by file scope and I have no doubt this can cause problems with very large projects.

PGO, on the other hand, seems very unlikely to fail due to memory constraints; at least I've never come across that happening. The resulting code for the profiling stage will of course be bigger since it contains profiling code, but I doubt the compilation stage requires a lot more memory, even though it examines the generated profiling data when optimizing the final binary.

It seems weird that PGO would not work with Chromium given that it's used in Firefox (which is not exactly a small project) to give a very noticeable speed boost. (Remember the 'Firefox runs faster with the Windows Firefox binary under Wine than the native Linux binary' debacle? That was back when Linux Firefox builds didn't use PGO while the Windows builds did.)
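For readers unfamiliar with the stages described above (instrumented build, profiling run, final optimized build), here is a minimal Python sketch that assembles the corresponding GCC command lines. It is only an illustration, not the Firefox or Chromium build: `pgo_commands`, `run_pgo`, and the file names are hypothetical, while `-fprofile-generate` and `-fprofile-use` are GCC's documented PGO flags. Both builds write to the same output path to keep the example simple.

```python
import os
import subprocess


def pgo_commands(source, output, training_args=()):
    """Return the three steps of a GCC profile-guided build as argument lists."""
    binary = os.path.join(".", output)
    return [
        # 1. Instrumented build: the binary records execution counts when run.
        ["gcc", "-O2", "-fprofile-generate", source, "-o", output],
        # 2. Training run on a representative workload; writes profile data.
        [binary, *training_args],
        # 3. Final build: the compiler reads the profile data while optimizing.
        ["gcc", "-O2", "-fprofile-use", source, "-o", output],
    ]


def run_pgo(source, output, training_args=()):
    """Run the three steps in order, stopping at the first failure."""
    for command in pgo_commands(source, output, training_args):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    for command in pgo_commands("app.c", "app", ["--benchmark"]):
        print(" ".join(command))
```

The extra cost of the instrumented stage shows up in step 1's larger binary and step 2's slower run; step 3 is an ordinary compile that additionally reads the recorded profile.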
