speedgoose ◴[] No.45661785[source]
Looking at the htop screenshot, I notice the lack of swap. You may want to enable earlyoom, so your whole server doesn't go down when a service goes bananas. The Linux Kernel OOM killer is often a bit too late to trigger.
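
If it helps, a minimal manual setup looks roughly like this (a sketch assuming a Debian/Ubuntu-style box; package name, unit name, and the defaults file can differ on other distros):

    apt install earlyoom
    systemctl enable --now earlyoom
    # thresholds live in /etc/default/earlyoom, e.g. start killing below 5% free RAM:
    # EARLYOOM_ARGS="-m 5 -s 100 --avoid '^(sshd|systemd)$'"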

You can also enable zram to compress RAM, so you can over-provision like the pros. A lot of long-running software leaks memory that compresses pretty well.
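
For zram, one quick route (a sketch using the Debian zram-tools package; systemd's zram-generator is another option) is:

    apt install zram-tools
    # size and algorithm are set in /etc/default/zramswap, e.g. ALGO=zstd, PERCENT=50
    systemctl enable --now zramswap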

Here is how I do it on my Hetzner bare-metal servers using Ansible: https://gist.github.com/fungiboletus/794a265cc186e79cd5eb2fe... It also works on VMs.

Bender ◴[] No.45662841[source]
Another option would be to have more memory than required (over-engineer) and to adjust the OOM score per app, adding early-kill weight to non-critical apps and negative weight to important apps. oom_score_adj is already set to -1000 by OpenSSH, for example.

    NSDJUST=$(pgrep -x nsd); echo -en '-378' > /proc/"${NSDJUST}"/oom_score_adj
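If nsd runs under systemd (assuming a unit named nsd.service), the same weight can also be persisted with a drop-in instead of poking /proc after every restart; a sketch:

    mkdir -p /etc/systemd/system/nsd.service.d
    printf '[Service]\nOOMScoreAdjust=-378\n' > /etc/systemd/system/nsd.service.d/oom.conf
    systemctl daemon-reload && systemctl restart nsd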
Another useful thing to do is effectively disable over-commit on all staging and production servers (a ratio of 0 rather than overcommit_memory = 2 to fully disable it; these do different things, and with overcommit_memory = 0 the kernel still uses its heuristic formula):

    vm.overcommit_memory = 0
    vm.overcommit_ratio = 0
Also set min_free and the reserved-memory values based on installed memory, using a formula from Red Hat that I do not have handy. min_free can vary from 512KB to 16GB depending on installed memory.

    vm.admin_reserve_kbytes = 262144
    vm.user_reserve_kbytes = 262144
    vm.min_free_kbytes = 1024000
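To keep all of those across reboots, one option (the file name is arbitrary) is to drop the same lines into /etc/sysctl.d and reload:

    printf '%s\n' \
        'vm.overcommit_memory = 0' \
        'vm.overcommit_ratio = 0' \
        'vm.admin_reserve_kbytes = 262144' \
        'vm.user_reserve_kbytes = 262144' \
        'vm.min_free_kbytes = 1024000' > /etc/sysctl.d/90-memory.conf
    sysctl --system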
At least that worked for me on about 50,000 physical servers for over a decade; they were not permitted to have swap, and installed memory varied from 144GB to 4TB of RAM. OOM would only occur when the people configuring and pushing code massively over-committed and did not account for memory required by the kernel, or did not follow the best practices defined by Java, but that's a much longer story.

Another option is to limit memory per application in cgroups but that requires more explaining than I am putting in an HN comment.
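
For what it's worth, the short version of the cgroup route on a systemd host (the unit name is just an example, and this skips all the nuance) is a one-liner per service:

    # hard cap plus an earlier reclaim-pressure threshold (cgroup v2)
    systemctl set-property myapp.service MemoryMax=4G MemoryHigh=3G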

Another useful thing is to never OOM kill in the first place on servers that are only doing things in memory and need not commit anything to disk. So don't do this on a disk-backed database; this is for ephemeral nodes that should self-heal. Wait 60 seconds so the DRAC/iLO can capture the crash message, and then earth-shattering kaboom...

    # cattle vs kittens, mooooo...
    kernel.panic = 60
    vm.panic_on_oom = 2
As a funny side note, those options can also be used as a holy hand grenade to intentionally (and unsafely) reboot NFS diskless farms when failing over to entirely different NFS server clusters: set panic to 15 minutes, trigger an OOM panic by setting min_free to 16TB at the command line via Ansible (not in sysctl.conf), swap clusters, ride out the ARP storm, and reconverge.
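
A rough sketch of that trigger as an ad-hoc run (host pattern and timings are illustrative; the values are only set at runtime, not written to sysctl.conf):

    ansible nfs_diskless -b -m ansible.builtin.shell -a \
        'sysctl -w kernel.panic=900 vm.panic_on_oom=2 vm.min_free_kbytes=17179869184'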
benterix ◴[] No.45666014[source]
The lengths people will go to to avoid k8s... (it's very easy on Hetzner Cloud, BTW).
1. carlhjerpe ◴[] No.45669059[source]
Every ClusterAPI infrastructure provider is similarly easy? Or what makes Hetzner Kubernetes extra easy?
2. benterix ◴[] No.45669941[source]
I mentioned Hetzner only because the original article mentions it. To be fair, it is currently harder to use than any managed k8s offering because you need to deploy the control plane yourself (but fortunately there are several projects that make it as easy as it can be, and this is what I was referring to).