
804 points jryio | 7 comments
speedgoose ◴[] No.45661785[source]
Looking at the htop screenshot, I notice the lack of swap. You may want to enable earlyoom, so your whole server doesn't go down when a service goes bananas. The Linux kernel OOM killer often triggers a bit too late.

You can also enable zram to compress RAM, so you can over-provision like the pros. A lot of long-running software leaks memory that compresses pretty well.

Here is how I do it on my Hetzner bare-metal servers using Ansible: https://gist.github.com/fungiboletus/794a265cc186e79cd5eb2fe... It also works on VMs.

replies(15): >>45661833 #>>45662183 #>>45662569 #>>45662628 #>>45662841 #>>45662895 #>>45663091 #>>45664508 #>>45665044 #>>45665086 #>>45665226 #>>45666389 #>>45666833 #>>45673327 #>>45677907 #
levkk ◴[] No.45662183[source]
Yeah, no way. As soon as you hit swap, _most_ apps are going to have a bad, bad time. This is well known, so much so that all EC2 instances in AWS disable it by default. Sure, they want to sell you more RAM, but it's also just true that swap doesn't work for today's expectations.

Maybe back in the 90s, it was okay to wait 2-3 seconds for a button click, but today we just assume the thing is dead and reboot.

replies(16): >>45662314 #>>45662349 #>>45662398 #>>45662411 #>>45662419 #>>45662472 #>>45662588 #>>45663055 #>>45663460 #>>45664054 #>>45664170 #>>45664389 #>>45664461 #>>45666199 #>>45667250 #>>45668533 #
bayindirh ◴[] No.45662411[source]
This belief is wrong because a) SSDs make swap almost invisible, so you keep that escape ramp if something goes wrong, and b) SWAP space is no longer just an escape ramp that RAM overflows into.

In the age of microservices and cattle servers, a reboot/reinstall might be cheap, but in the long run it is not. A long-running server, even if it is cattle, is always the better solution: especially with some excess RAM, the server "warms up" with all hot data cached and becomes a low-latency unit in your fleet, provided you pay the required attention to your software development and service configuration.

Secondly, the kernel swaps out unused pages to SWAP, relieving pressure on RAM. So SWAP is often used even if you fill only 1% of your RAM. This allows more hot data to be cached, giving better resource utilization and performance in the long run.

So, "eff it, we ball" is never a good system administration strategy. Even if everything is ephemeral and can be rebooted in three seconds.

Sure, some things like Kubernetes force "no SWAP, period" policies because they kill pods when pressure exceeds some value, but for more traditional setups, it's still valuable.

replies(8): >>45662537 #>>45662599 #>>45662646 #>>45662687 #>>45663237 #>>45663354 #>>45664553 #>>45664705 #
adastra22 ◴[] No.45662646[source]
What pressure? If your ram is underutilized, what pressure are you talking about?

If the slowest drive on the machine is the SSD, how does caching to swap help?

replies(2): >>45662707 #>>45662734 #
bayindirh ◴[] No.45662707[source]
A long-running Linux system uses 100% of its RAM. Every byte not used by applications will be used as disk cache, provided you have read more data than fits in your RAM.

This cache is evictable, but it'll be there eventually.

In the older days, Linux didn't touch unused pages in RAM if RAM was not under pressure, but now it swaps out pages that have gone unused for a long time. This allows more cache space in RAM.

> how does caching to swap help?

I think I failed to convey what I tried to say. Let me retry:

The kernel doesn't cache to SSD. It swaps out unused (not accessed) but unevictable pages to SWAP, assuming that these pages will stay stale for a very long time, allowing more RAM to be used as cache.

When I look at my desktop system, in 12 days the kernel has moved 2592 MB of my RAM to SWAP despite having ~20GB of free space. ~15GB of this free space is used as disk cache.

So, to have 2.5GB more disk cache, the kernel moved 2592 MB of non-accessed pages to SWAP.
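
For anyone who wants to check the same figures on their own machine, here is a minimal sketch that reads them from `/proc/meminfo`-style text. The helper names `parse_meminfo` and `summarize` are made up for illustration, and the sketch assumes the usual `Key:   value kB` layout with the `MemFree`, `Buffers`, `Cached`, `SwapTotal`, and `SwapFree` fields present. The cache figure is only an approximation of what `free` prints as buff/cache, since `free` also counts reclaimable slab memory.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def summarize(info):
    """Return swap in use, approximate page cache, and free RAM, all in MB (1 MB = 1024 kB)."""
    swap_used_kb = info["SwapTotal"] - info["SwapFree"]
    cache_kb = info["Buffers"] + info["Cached"]  # assumption: ignores reclaimable slab
    return {
        "swap_used_mb": swap_used_kb // 1024,
        "cache_mb": cache_kb // 1024,
        "free_mb": info["MemFree"] // 1024,
    }
```

On a Linux box, calling `summarize(parse_meminfo(open("/proc/meminfo").read()))` should give numbers in the same ballpark as `free -m`. A non-zero `swap_used_mb` alongside a large `cache_mb` is the pattern described above.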

replies(3): >>45662776 #>>45663196 #>>45667848 #
1. wallstop ◴[] No.45662776[source]
Edit:

    wallstop@fridge:~$ free -m
                   total        used        free      shared  buff/cache   available
    Mem:           15838        9627        3939          26        2637        6210
    Swap:           4095           0        4095


    wallstop@fridge:~$ uptime
    00:43:54 up 37 days, 23:24,  1 user,  load average: 0.00, 0.00, 0.00
replies(1): >>45662870 #
2. bayindirh ◴[] No.45662870[source]
The command you want to use is "free -m".

This is from another system I have close:

                   total        used        free      shared  buff/cache   available
    Mem:           31881        1423        1042          10       29884       30457
    Swap:            976           2         974
2MB of SWAP used, 1423 MB RAM used, 29GB cache, 1042 MB Free. Total RAM 32 GB.
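
To turn a `free -m` table like the one above into ratios, something along these lines works. This is a sketch only: `parse_free` is a hypothetical helper, not part of any tool, and it assumes the column layout shown in this thread (a header row, then `Mem:` and `Swap:` rows).

```python
def parse_free(output):
    """Parse `free -m` output into {"Mem": {...}, "Swap": {...}}."""
    lines = [ln for ln in output.splitlines() if ln.strip()]
    columns = lines[0].split()
    table = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns than Mem; zip stops at the shorter list.
        table[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return table


# Output quoted from the comment above.
SAMPLE = """\
               total        used        free      shared  buff/cache   available
Mem:           31881        1423        1042          10       29884       30457
Swap:            976           2         974
"""

stats = parse_free(SAMPLE)
mem, swap = stats["Mem"], stats["Swap"]
cache_share = 100 * mem["buff/cache"] / mem["total"]
swap_share = 100 * swap["used"] / swap["total"]
print(f"cache: {cache_share:.1f}% of RAM, swap used: {swap_share:.1f}%")
# prints "cache: 93.7% of RAM, swap used: 0.2%"
```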
replies(3): >>45663312 #>>45663669 #>>45667833 #
3. eru ◴[] No.45663312[source]
If you are interested in human consumption, there's "free --human", which decides on useful units by itself. The "--human" switch is also available for "du --human" or "df --human" or "ls -l --human". It's often abbreviated as "-h", but not always, since that also often stands for "--help".
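
As a rough illustration of what "deciding on useful units" means, here is a small sketch. It is not the algorithm `free` or coreutils actually use: the `human` helper is hypothetical, and it assumes powers of 1024 and one decimal place.

```python
def human(n_bytes):
    """Format a byte count using the largest binary unit that keeps the value below 1024."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the last unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024
```

For example, `human(31881 * 1024**2)` gives `"31.1 GiB"` for the 31881 MB total quoted earlier in the thread.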
replies(1): >>45667223 #
4. wallstop ◴[] No.45663669[source]
Thanks! My other problem was formatting. Just wanted to share that I see 0 swap usage and nowhere near 100% memory usage as a counterpoint.
5. bayindirh ◴[] No.45667223{3}[source]
Thanks, I generally use free -m since my brain can unconsciously parse it after all these years. ls -lh is one of my learned commands though. I type it in automatically when analyzing things.

ls -lrt, ls -lSh and ls -lShr are also very common in my daily use, depending on what I'm doing.

6. ta1243 ◴[] No.45667833[source]
So that 2M of used swap is completely irrelevant. Same on my laptop

                   total        used        free      shared  buff/cache   available
    Mem:           31989       11350        4474        2459       16164       19708
    Swap:           6047          20        6027
My syslog server on the other hand (which does a ton of stuff on disk) does use swap

    Mem:            1919         333          75           0        1511        1403
    Swap:           2047         803        1244
With uptime of 235 days.

If I were to increase this to 8G of RAM instead of 2G, but for argument's sake had to have no swap as the tradeoff, would that be better or worse? Swap fans say worse.

replies(1): >>45667951 #
7. bayindirh ◴[] No.45667951{3}[source]
> So that 2M of used swap is completely irrelevant.

As I noted somewhere, my other system has 2.5GB of SWAP allocated over 13 days. That system is a desktop system and juggles tons of things every day.

I have another server with tons of RAM, and the Kernel decided not to evict anything to SWAP (yet).

> If I were to increase this to 8G of RAM instead of 2G, but for argument's sake had to have no swap as the tradeoff, would that be better or worse? Swap fans say worse.

I'm not a SWAP fan, but I support its use. On the other hand I won't say it'd be worse, but it'd be overkill for that server. Maybe I can try 4, but that doesn't seem to be necessary if these numbers are stable over time.