804 points jryio | 15 comments
speedgoose ◴[] No.45661785[source]
Looking at the htop screenshot, I notice the lack of swap. You may want to enable earlyoom, so your whole server doesn't go down when a service goes bananas. The Linux kernel OOM killer often triggers a bit too late.

You can also enable zram to compress RAM, so you can over-provision like the pros. A lot of long-running software leaks memory that compresses pretty well.

Here is how I do it on my Hetzner bare-metal servers using Ansible: https://gist.github.com/fungiboletus/794a265cc186e79cd5eb2fe... It also works on VMs.

replies(15): >>45661833 #>>45662183 #>>45662569 #>>45662628 #>>45662841 #>>45662895 #>>45663091 #>>45664508 #>>45665044 #>>45665086 #>>45665226 #>>45666389 #>>45666833 #>>45673327 #>>45677907 #
1. TheDong ◴[] No.45664508[source]
Even better than earlyoom is systemd-oomd[0] or oomd[1].

systemd-oomd and oomd use the kernel's PSI[2] information, which makes them more efficient and responsive, while earlyoom is just polling.

earlyoom keeps getting suggested, even though we have PSI now, just because people are used to using it and recommending it from back before the kernel had cgroups v2.

[0]: https://www.freedesktop.org/software/systemd/man/latest/syst...

[1]: https://github.com/facebookincubator/oomd

[2]: https://docs.kernel.org/accounting/psi.html

replies(3): >>45664907 #>>45665996 #>>45666462 #
2. CGamesPlay ◴[] No.45664907[source]
"earlyoom is just polling"?

> systemd-oomd periodically polls PSI statistics for the system and those cgroups to decide when to take action.

It's unclear if the docs for systemd-oomd are incorrect or misleading; I do see from the kernel.org link that the recommended usage pattern is to use the `poll` system call, which in this context would mean "not polling", if I understand correctly.

replies(2): >>45664986 #>>45665579 #
3. 100721 ◴[] No.45664986[source]
Unrelated to the topic, it seems awfully unintuitive to name a function ‘poll’ if the result is ‘not polling.’ I’m guessing there’s some history and maybe backwards-compatible rewrites?
replies(3): >>45665384 #>>45665391 #>>45665401 #
4. unilynx ◴[] No.45665384{3}[source]
Poll takes a timeout parameter. ‘Not polling’ is just a really long timeout.
5. friendzis ◴[] No.45665391{3}[source]
"Let the underlying platform do the polling and return once the condition is met"
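A minimal sketch of the timeout behaviour described in the two replies above, using Python's `select.poll` on an ordinary pipe rather than a PSI file. The helper name `wait_readable` and the 50 ms timeout are illustrative assumptions, not anything from the thread. The call sleeps in the kernel until the descriptor is ready or the timeout expires; passing `None` as the timeout would wait indefinitely.

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Block until fd is readable or timeout_ms elapses; return True if readable."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    # The process sleeps inside the kernel here; there is no busy loop.
    events = poller.poll(timeout_ms)
    return any(mask & select.POLLIN for _, mask in events)


read_end, write_end = os.pipe()

# Nothing has been written yet, so this returns False after about 50 ms.
nothing_yet = wait_readable(read_end, 50)

# Once data is available, the same call returns without waiting for the timeout.
os.write(write_end, b"x")
has_data = wait_readable(read_end, 50)

os.close(read_end)
os.close(write_end)
```

So "polling" in the earlyoom sense means waking up on a timer to re-check, while the `poll` system call lets the kernel wake the process when something happens.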
6. CGamesPlay ◴[] No.45665401{3}[source]
Specifically, earlyoom’s README says it repeatedly checks (“periodically polls”) the memory pressure, using CPU each time even when there is no change. The “poll” system call waits for the kernel to notify the process that the file has changed, using no CPU until the call resolves. It’s unclear what systemd-oomd does, because it uses the phrase “periodically polls”.
replies(1): >>45666160 #
7. TheDong ◴[] No.45665579[source]
systemd-oomd, oomd, and earlyoom all do poll for when to actually take action on OOM conditions.

What I was trying to say is that the actual information on when there's memory pressure is more accurate for systemd-oomd / oomd because they use PSI, which the kernel itself is updating over time, and they just poll that, while earlyoom is also internally making its own estimates at a lower granularity than the kernel does.
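For reference, the PSI data mentioned above is exposed as plain text in `/proc/pressure/memory`, with a `some` line and a `full` line, each carrying `avg10`, `avg60`, `avg300` and `total` fields (per the kernel PSI documentation linked earlier in the thread). The following is a rough sketch of reading it; the function names and the sample numbers are made up for illustration, and this is not how systemd-oomd or oomd are actually implemented.

```python
def parse_psi(text):
    """Parse the content of a /proc/pressure/<resource> file.

    Returns a dict such as {"some": {"avg10": 1.25, ..., "total": 123456}, ...}.
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *pairs = line.split()
        fields = {}
        for pair in pairs:
            key, value = pair.split("=", 1)
            # "total" is a stall time counter in microseconds; avg* are percentages.
            fields[key] = int(value) if key == "total" else float(value)
        result[kind] = fields
    return result


def read_memory_pressure(path="/proc/pressure/memory"):
    """Read and parse current memory pressure (needs a Linux kernel with PSI enabled)."""
    with open(path) as f:
        return parse_psi(f.read())


# Hypothetical sample in the documented format, so the parser can be tried anywhere.
SAMPLE = (
    "some avg10=1.25 avg60=0.40 avg300=0.10 total=123456\n"
    "full avg10=0.50 avg60=0.10 avg300=0.02 total=45678\n"
)
pressure = parse_psi(SAMPLE)
```

A daemon that "just polls that" would call something like `read_memory_pressure()` on a timer and compare the averages against its own thresholds.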

8. speedgoose ◴[] No.45665996[source]
Thanks, I will try that out.
9. immibis ◴[] No.45666160{4}[source]
The "poll" system call does not wait until a file changes.
replies(1): >>45666622 #
10. geokon ◴[] No.45666462[source]
Do you have any insight into why this isn't included by default in distros like Ubuntu? It's kind of bewildering that the default behavior on Ubuntu is to just lock up the whole system on OOM.
replies(1): >>45666764 #
11. CGamesPlay ◴[] No.45666622{5}[source]
s/the file has changed/it has published new data to the file descriptor/

See https://docs.kernel.org/accounting/psi.html
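The linked kernel page describes this trigger mechanism: a process writes a threshold such as `some 150000 1000000` (150 ms of stall within a 1 s window) to the pressure file, then waits for `POLLPRI` on the same descriptor. Below is a hedged Python translation of that pattern. The helper names `make_trigger` and `wait_for_pressure` are hypothetical, and actually running the wait needs a PSI-enabled Linux kernel and may require elevated privileges.

```python
import os
import select


def make_trigger(kind, stall_us, window_us):
    """Build the trigger string "<some|full> <stall us> <window us>"."""
    if kind not in ("some", "full"):
        raise ValueError("kind must be 'some' or 'full'")
    # The kernel docs limit the window to between 500 ms and 10 s.
    if not 500_000 <= window_us <= 10_000_000:
        raise ValueError("window must be between 500ms and 10s")
    if not 0 < stall_us <= window_us:
        raise ValueError("stall must be positive and no larger than the window")
    # The C example in the kernel docs writes the terminating NUL; mirror that.
    return f"{kind} {stall_us} {window_us}".encode() + b"\0"


def wait_for_pressure(trigger, path="/proc/pressure/memory", timeout_ms=None):
    """Register the trigger, then sleep until the kernel reports it fired.

    Returns True if the threshold was crossed, False on timeout.
    """
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        os.write(fd, trigger)
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        for _, mask in poller.poll(timeout_ms):
            if mask & select.POLLERR:
                raise OSError("PSI event source is gone")
            if mask & select.POLLPRI:
                return True
        return False
    finally:
        os.close(fd)


trigger = make_trigger("some", 150_000, 1_000_000)
# wait_for_pressure(trigger) would block here until memory pressure crosses the threshold.
```

With this pattern the process uses no CPU while it waits, which is the distinction from re-reading the file on a timer.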

12. TheDong ◴[] No.45666764[source]
I'm pretty sure systemd-oomd is enabled by default in Fedora and Ubuntu desktop.

I think it's off on the server variants.

replies(2): >>45667002 #>>45667257 #
13. galangalalgol ◴[] No.45667002{3}[source]
Is there any way to get something like oomd or zram that works on GPU memory? I run into GPU memory leaks more often. It seems to be Electron usually.
replies(1): >>45677987 #
14. geokon ◴[] No.45667257{3}[source]
Kubuntu LTS definitely didn't have it by default. And there are no system settings exposing it (or zram).
15. fireant ◴[] No.45677987{4}[source]
The GPU memory model is quite different from the CPU memory model, with application-level explicit synchronization and coherency and so on. I don't think that transparent compression would be possible, and even if it were, it would surely carry a drastic performance downside.