
150 points shaunpud | 2 comments
fh973 ◴[] No.45060597[source]
Swap on servers somewhat defeats the purpose of ECC memory: your program state is now subject to a complex I/O path that is not end-to-end checksum protected. You also get unpredictable performance.

So typically: swap off on servers. Do they have a server story?
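
For anyone who wants to check whether a given server actually has swap enabled, here is a minimal Python sketch, assuming a Linux host that exposes `/proc/swaps`. The function names (`parse_swaps`, `swap_is_off`, `read_swaps`) are made up for illustration and are not part of any tool mentioned in this thread.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of swap areas."""
    areas = []
    lines = text.splitlines()
    for line in lines[1:]:  # the first line is the column header
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        # Take the last four columns from the right so the parse does not
        # depend on what the filename column looks like.
        kind, size, used, priority = parts[-4:]
        areas.append({
            "name": " ".join(parts[:-4]),
            "type": kind,
            "size_kib": int(size),
            "used_kib": int(used),
            "priority": int(priority),
        })
    return areas


def swap_is_off(text):
    """True if the given /proc/swaps content lists no active swap areas."""
    return len(parse_swaps(text)) == 0


def read_swaps(path="/proc/swaps"):
    """Read the live swap table (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a Linux box, `swap_is_off(read_swaps())` should tell you whether the "swap off" policy is really in effect.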

replies(6): >>45060665 #>>45060768 #>>45062143 #>>45062478 #>>45062741 #>>45110791 #
blueflow ◴[] No.45062478[source]
First, having no swap means anonymous pages cannot be evicted, so named (file-backed) pages must be evicted instead.

Second, the binaries of your processes are mapped in as named pages (because they come from the ELF file).

Named pages are generally not counted as "used" memory because they can be evicted and reclaimed, but if you have a service with a 150MB binary running, those 150MB of seemingly "free" memory are absolutely crucial for performance.

Running out of this 150MB of disk cache will result in the machine using up all its I/O capacity to re-fetch the ELF from disk, and it will likely become unresponsive. Having swap significantly delays this lock-up by allowing anonymous pages to be evicted, so the same memory pressure causes fewer stalls.

So until the OOM management on Linux gets fixed, you need swap.
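
To see the anonymous vs. named split on a running machine, something like the following Python sketch might help. It assumes a Linux `/proc/meminfo` with the usual `Active(anon)`, `Inactive(anon)`, `Active(file)` and `Inactive(file)` fields; the helper names are hypothetical and only for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kiB; a few (e.g. HugePages_Total)
    are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_breakdown(fields):
    """Split LRU memory into anonymous and file-backed (named) pages."""
    anon = fields.get("Active(anon)", 0) + fields.get("Inactive(anon)", 0)
    named = fields.get("Active(file)", 0) + fields.get("Inactive(file)", 0)
    return {
        "anon_kib": anon,    # can only be evicted to swap
        "file_kib": named,   # can be dropped and re-read from disk
        "swap_total_kib": fields.get("SwapTotal", 0),
    }


def read_meminfo(path="/proc/meminfo"):
    """Read the live counters (Linux only)."""
    with open(path) as f:
        return f.read()
```

If `swap_total_kib` is 0, everything counted under `anon_kib` is pinned in RAM, and reclaim can only come out of `file_kib`, which includes the mapped binaries.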

replies(1): >>45063510 #
Scaevolus ◴[] No.45063510[source]
Swapping anonymous pages can bring the system to a crawl too. High memory pressure makes things very slow with swap, while with swap off, high memory pressure is likely to invoke the OOM killer and let the system violently repair itself.
replies(1): >>45063942 #
1. blueflow ◴[] No.45063942[source]
The "bug" with the OOM killer that I implied is that what you describe does not happen. That is not surprising, because disk cache thrashing is the normal mode of operation when serving big files to the network. An OOM killer acting on that alone would be problematic, but without swap, that's where the slowdown will happen for other workloads, too.

It's less a bug than an understood problem, and there aren't any good solutions around yet.

replies(1): >>45066248 #
2. Trixter ◴[] No.45066248[source]
earlyoom is what we use to address this. We can't tolerate any kind of swapping at all in our workloads; it is better for the system to kill one process to save the others than to slow down or lock up.
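
The general idea behind a userspace early-OOM check can be sketched in Python roughly like this. This is not earlyoom's actual code: the 10% threshold, the function names, and the victim selection by highest `/proc/<pid>/oom_score` are assumptions chosen for illustration, and the sketch deliberately stops short of sending any signal.

```python
import os


def percent_available(meminfo):
    """MemAvailable as a percentage of MemTotal (both in kiB)."""
    return 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]


def should_act(meminfo, min_available_pct=10.0):
    """True when available memory has dropped below the threshold.

    The 10% default is an illustrative assumption.
    """
    return percent_available(meminfo) < min_available_pct


def pick_victim(oom_scores):
    """Return the pid with the highest oom_score, or None if empty."""
    if not oom_scores:
        return None
    return max(oom_scores, key=oom_scores.get)


def read_oom_scores(proc="/proc"):
    """Collect {pid: oom_score} from procfs (Linux only)."""
    scores = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "oom_score")) as f:
                scores[int(entry)] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # process exited or file unreadable
    return scores
```

A monitoring loop would poll `should_act(...)` every so often and, when it returns true, terminate `pick_victim(read_oom_scores())` before the kernel starts thrashing.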