←back to thread

115 points saisrirampur | 1 comments | | HN request time: 0s | source
Show context
Joker_vD ◴[] No.44568895[source]
The tl;dr is that Postgres, as any long-running "server" process (especially as a DBMS server!) does not run with SIG_DFL as the handler for SIGTERM; it instead sets up the signal handler that merely records the fact that the signal has happened, in hopes that whatever loops are going on will eventually pick it up. As usual, some loops don't but it's very hard to notice.
replies(2): >>44569721 #>>44570211 #
bbarnett ◴[] No.44569721[source]
Indeed. I've seen DBMSes take close to 10 minutes to gracefully exit, even when idle.

Timeout in sysvinit and service files, for a graceful exit, is typically 900 seconds.

Most modern DBMS daemons will recover if SIGKILL, most of the time, especially if you're using transactions. But startup will then be lagged, as it churns through and resolves on startup.

(unless you've modified the code to short cut things, hello booking.com "we're smarter than you" in 2015 at least, if not today)

replies(1): >>44570734 #
1. sjsdaiuasgdia ◴[] No.44570734[source]
Yeah, as I've said on many incident calls...we can pay for transaction recovery at shutdown or we can pay for it at startup, but it's got to happen somewhere.

The "SIGKILL or keep waiting?" decision comes down to whether you believe the DBMS is actually making progress towards shutdown. Proving that the shutdown is progressing can be difficult, depending on the situation.