804 points jryio | 14 comments

speedgoose ◴[] No.45661785[source]
Looking at the htop screenshot, I notice the lack of swap. You may want to enable earlyoom, so your whole server doesn't go down when a service goes bananas. The Linux Kernel OOM killer is often a bit too late to trigger.

You can also enable zram to compress RAM, so you can over-provision like the pros. A lot of long-running software leaks memory that compresses pretty well.

Here is how I do it on my Hetzner bare-metal servers using Ansible: https://gist.github.com/fungiboletus/794a265cc186e79cd5eb2fe... It also works on VMs.

replies(15): >>45661833 #>>45662183 #>>45662569 #>>45662628 #>>45662841 #>>45662895 #>>45663091 #>>45664508 #>>45665044 #>>45665086 #>>45665226 #>>45666389 #>>45666833 #>>45673327 #>>45677907 #
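For readers who haven't run it: earlyoom simply polls available memory and, when it drops below a threshold, kills the process the kernel considers most expendable, before the kernel's own OOM killer (and the thrashing that precedes it) kicks in. Here is a minimal Python sketch of that idea on Linux - a model, not earlyoom itself; the 10% threshold, the polling interval, and stopping at SIGTERM are illustrative assumptions (the real tool also watches swap and escalates to SIGKILL).

    # Sketch of an earlyoom-style watchdog (illustrative; not the real earlyoom).
    # Assumes a Linux /proc filesystem and privileges to signal other processes.
    import os
    import signal
    import time

    THRESHOLD_PERCENT = 10   # assumed: act when MemAvailable drops below 10% of MemTotal
    POLL_SECONDS = 1

    def meminfo():
        """Parse /proc/meminfo into a dict of kB values."""
        info = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, value = line.split(":", 1)
                info[key] = int(value.strip().split()[0])
        return info

    def pick_victim():
        """Return (pid, oom_score) of the process the kernel considers most killable."""
        victim, best = None, -1
        for pid in filter(str.isdigit, os.listdir("/proc")):
            try:
                with open(f"/proc/{pid}/oom_score") as f:
                    score = int(f.read())
            except OSError:
                continue  # process exited or is unreadable
            if score > best:
                victim, best = int(pid), score
        return victim, best

    while True:
        info = meminfo()
        available_pct = 100 * info["MemAvailable"] / info["MemTotal"]
        if available_pct < THRESHOLD_PERCENT:
            pid, score = pick_victim()
            if pid is not None:
                print(f"low memory ({available_pct:.1f}%), SIGTERM to {pid} (oom_score {score})")
                os.kill(pid, signal.SIGTERM)
        time.sleep(POLL_SECONDS)
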
levkk ◴[] No.45662183[source]
Yeah, no way. As soon as you hit swap, _most_ apps are going to have a bad, bad time. This is well known, so much so that all EC2 instances in AWS disable it by default. Sure, they want to sell you more RAM, but it's also just true that swap doesn't work for today's expectations.

Maybe back in the 90s, it was okay to wait 2-3 seconds for a button click, but today we just assume the thing is dead and reboot.

replies(16): >>45662314 #>>45662349 #>>45662398 #>>45662411 #>>45662419 #>>45662472 #>>45662588 #>>45663055 #>>45663460 #>>45664054 #>>45664170 #>>45664389 #>>45664461 #>>45666199 #>>45667250 #>>45668533 #
gchamonlive ◴[] No.45662314[source]
How programs use ram also changed from the 90s. Back then they were written targeting machines that they knew would have a hard time fitting all their data in memory, so hitting swap wouldn't hurt perceived performance too drastically since many operations were already optimized to balance data load between memory and disk.

Nowadays when a program hits swap it's not going to fall back to a different memory usage profile that prioritises disk access. It's going to use swap as if it were actual RAM, so you get to see the program choking the entire system.

replies(2): >>45662410 #>>45662768 #
1. winrid ◴[] No.45662410[source]
Exactly. Nowadays, most web services are run in a GC'ed runtime. That VM will walk pointers all over the place and reach into swap all the time.
replies(1): >>45662595 #
2. cogman10 ◴[] No.45662595[source]
Depends entirely on the runtime.

If your GC is a moving collector, then absolutely this is something to watch out for.

There are, however, a number of runtimes that will leave memory in place. They are effectively just calling `malloc` for the objects and `free` when the GC algorithm detects an object is dead.

Go, the CLR, Ruby, Python, Swift, and I think node(?) all fit in this category. The JVM has a moving collector.

replies(4): >>45662942 #>>45663386 #>>45664264 #>>45665210 #
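To make "non-moving" concrete, here is a toy mark-sweep collector in Python - a model of the approach, not any of those runtimes' actual implementations. Each object keeps its identity (its address, in a real heap) for its whole life; a collection just marks what is reachable from the roots and frees everything else, which is the "malloc for objects, free for dead ones" behavior described above.

    # Toy non-moving mark-sweep collector (a model, not a real runtime's GC).
    # Survivors never move; collection marks what the roots reach and drops the rest.

    class Obj:
        def __init__(self, name):
            self.name = name
            self.refs = []      # outgoing references to other Objs
            self.marked = False

    class ToyHeap:
        def __init__(self):
            self.objects = []   # the allocation list; entries never move
            self.roots = []     # stands in for stacks and globals

        def alloc(self, name):
            obj = Obj(name)     # stands in for malloc
            self.objects.append(obj)
            return obj

        def collect(self):
            # Mark phase: trace everything reachable from the roots.
            stack = list(self.roots)
            while stack:
                obj = stack.pop()
                if not obj.marked:
                    obj.marked = True
                    stack.extend(obj.refs)
            # Sweep phase: unmarked objects get "freed"; survivors stay in place.
            self.objects = [o for o in self.objects if o.marked]
            for o in self.objects:
                o.marked = False

    heap = ToyHeap()
    a = heap.alloc("a")
    b = heap.alloc("b")
    heap.alloc("garbage")        # never reachable from a root
    a.refs.append(b)
    heap.roots.append(a)
    heap.collect()
    print([o.name for o in heap.objects])   # ['a', 'b'] - survivors keep their identity
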
3. zozbot234 ◴[] No.45662942[source]
Every garbage collector has to constantly sift through the entire reference graph of the running program to figure out which objects have become garbage. Generational GCs can trace through the oldest generations less often, but that's about it.

Tracing garbage collectors solve a single problem really well - managing a complex, possibly cyclic reference graph, which is inherent to some problems and makes GC irreplaceable for them - and are just about terrible with respect to any other system-level or performance-related factor.

replies(2): >>45663131 #>>45663383 #
4. cogman10 ◴[] No.45663131{3}[source]
> Every garbage collector has to constantly sift through the entire reference graph of the running program to figure out what objects have become garbage.

There's a lot of "it depends" here.

For example, an RC garbage collector (like Swift and Python?) doesn't ever trace through the graph.

The reason I brought up moving collectors is that, by their nature, they take up a lot more heap space, at least 2x what they need. The advantage of the non-moving collectors is that they are much more prompt at returning memory to the OS. The JVM in particular has issues here because it has pretty chunky objects.

replies(1): >>45664560 #
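For the reference-counting case, the sketch is even simpler - again a toy model, not CPython's or Swift's actual machinery. The count is stored with the object, every new or dropped reference writes to it, and memory comes back the instant the count hits zero, with no tracing pass at all (CPython additionally runs a cycle collector for reference cycles, which this omits).

    # Toy reference counting (a model, not CPython's or Swift's implementation).
    # The count lives inline with the object; freeing happens as soon as it hits 0.

    class RCObj:
        def __init__(self, name):
            self.name = name
            self.refcount = 0
            self.refs = []              # strong references this object holds

    def incref(obj):
        obj.refcount += 1               # note: a write into the object's own memory

    def decref(obj):
        obj.refcount -= 1
        if obj.refcount == 0:
            print(f"freeing {obj.name}")
            children, obj.refs = obj.refs, []
            for child in children:      # dropping our references may free children too
                decref(child)

    parent = RCObj("parent")
    child = RCObj("child")
    incref(parent)                      # a local variable holds parent
    parent.refs.append(child)
    incref(child)                       # parent holds child
    decref(parent)                      # last reference to parent goes away:
                                        # prints "freeing parent" then "freeing child"
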
5. eru ◴[] No.45663383{3}[source]
Modern garbage collectors have come a long way.

Even not so modern ones: have you heard of generational garbage collection?

But even in e.g. Python they introduced 'immortal objects', which the GC knows not to bother with.

replies(1): >>45665528 #
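Concretely, CPython exposes its generational cycle collector through the gc module, and since 3.7 there is also gc.freeze(), which moves everything currently tracked into a "permanent generation" that ordinary collections skip; the immortal objects mentioned above (PEP 683, Python 3.12) take the idea further for core singletons. A small illustration - the exact thresholds and counts vary by interpreter version and workload:

    # Poking at CPython's generational cycle collector.
    # Numbers vary across Python versions; this is illustrative, not a spec.
    import gc

    print(gc.get_threshold())   # per-generation collection thresholds, e.g. (700, 10, 10)
    print(gc.get_count())       # objects currently tracked in each generation

    gc.collect(0)               # collect only the youngest generation
    gc.collect()                # full collection across all generations

    # Since 3.7: push everything tracked so far into the permanent generation,
    # which normal collections never scan (handy for long-lived start-up data).
    gc.freeze()
    print(gc.get_freeze_count())
    gc.unfreeze()
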
6. eru ◴[] No.45663386[source]
A moving GC should be better at this, because it can compact your memory.
replies(1): >>45663579 #
7. cogman10 ◴[] No.45663579{3}[source]
A moving collector has to move data somewhere and, generally by its nature, it's constantly moving data all across the heap. That's what makes it end up touching a lot more memory while also requiring more memory. On minor collections it'll move memory between two different locations, and on major collections it'll end up moving the entire old gen.

It's that "touching" of all the pages controlled by the GC that ultimately wrecks swap performance. There's also the fact that moving collectors like to hold onto memory, as downsizing is pretty hard to do efficiently.

Non-moving collectors generally end up using C allocators, which are fairly good at avoiding fragmentation. Not perfect, and not as fast as a moving collector, but fast enough for most use cases.

Java's G1 collector would be the worst example of this. It's constantly moving blocks of memory all over the place.

replies(1): >>45664965 #
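The simplest moving design, a semispace (copying) collector, shows why: every collection copies each live object into a fresh space and abandons the old one, so the heap needs roughly twice the live set and every survivor gets touched. A toy Python model follows - G1 and other production collectors are far more sophisticated (regions, generations, concurrent phases), but the copying core is the same.

    # Toy semispace copying collector (a model; real collectors like G1 are
    # region-based and far more sophisticated). Each collection copies live
    # objects into a fresh space, so every survivor is touched and moved.

    class CopyObj:
        def __init__(self, name):
            self.name = name
            self.refs = []
            self.forward = None           # forwarding pointer once copied

    class SemiSpaceHeap:
        def __init__(self):
            self.from_space = []          # current allocation space
            self.roots = []

        def alloc(self, name):
            obj = CopyObj(name)
            self.from_space.append(obj)
            return obj

        def _copy(self, obj, to_space):
            if obj.forward is None:       # not copied yet: clone it, leave a forwarding pointer
                clone = CopyObj(obj.name)
                clone.refs = list(obj.refs)
                obj.forward = clone
                to_space.append(clone)
            return obj.forward

        def collect(self):
            to_space = []
            self.roots = [self._copy(r, to_space) for r in self.roots]
            scan = 0
            while scan < len(to_space):   # Cheney-style scan: fix up references as we go
                obj = to_space[scan]
                obj.refs = [self._copy(r, to_space) for r in obj.refs]
                scan += 1
            self.from_space = to_space    # the old space is discarded wholesale

    heap = SemiSpaceHeap()
    a = heap.alloc("a")
    b = heap.alloc("b")
    heap.alloc("garbage")
    a.refs.append(b)
    heap.roots.append(a)
    heap.collect()
    print([o.name for o in heap.from_space])   # ['a', 'b'], but both are new copies
    print(heap.roots[0] is a)                  # False: the survivors moved
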
8. manwe150 ◴[] No.45664264[source]
MemBalancer is a relatively new analysis paper that argues having swap allows maximum performance by absorbing small excesses, which avoids needing to over-provision RAM instead. The kind of GC does not matter, since data spends very little time in that state; on the flip side, most of the time the application has access to twice as much memory to use.
9. Dylan16807 ◴[] No.45664560{4}[source]
> The reason I brought up moving collectors is by their nature, they take up a lot more heap space, at least 2x what they need.

If the implementer cares about memory use it won't. There are ways to compact objects that are a lot less memory-intensive than copying the whole graph from A to B and then deleting A.

10. eru ◴[] No.45664965{4}[source]
> It's that "touching" of all the pages controlled by the GC that ultimately wrecks swap performance. But also the fact that moving collector like to hold onto memory as downsizing is pretty hard to do efficiently.

The memory that's now not in use, but still held onto, can be swapped out.

11. masklinn ◴[] No.45665210[source]
Python’s not a mover but the cycle breaker will walk through every object in the VM.

Also since the refcounts are inline, adding a reference to a cold object will update that object. IIRC Swift has the latter issue as well (unless the heap object’s RC was moved to the side table).
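
The inline-refcount point is easy to see from Python itself: the count sits in the object header (ob_refcnt), so merely binding more names to an object writes into that object's memory - which is exactly what would fault in and dirty a swapped-out page. A small illustration (exact counts vary slightly by interpreter version):

    # CPython stores the reference count inside the object itself, so creating
    # another reference writes to the object's header even if we never read it.
    import sys

    obj = object()
    before = sys.getrefcount(obj)      # getrefcount itself adds a temporary reference
    aliases = [obj, obj, obj]          # three more references to the same object
    after = sys.getrefcount(obj)
    print(after - before)              # 3: the count stored with obj was updated in place
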

12. winrid ◴[] No.45665528{4}[source]
It doesn't matter. The GC does not know which heap allocations are in memory vs. swap, and since you don't write applications thinking about that, running a VM with a moving GC on swap is a bad idea.
replies(1): >>45666328 #
13. eru ◴[] No.45666328{5}[source]
A moving GC can make sure to separate hot and cold data, and then rely on the kernel to keep hot data in RAM.
replies(1): >>45674983 #
14. winrid ◴[] No.45674983{6}[source]
Yeah, but in practice I'm not sure that really works well with any GCs today? I've tried this with modern JVM and Node VMs, and it always ended up with random multi-second lockups. Not worth the time.