
234 points | benocodes | 1 comment
whalesalad No.41836959
So satisfying to do a huge upgrade like this and then see the actual proof in the pudding with all the reduced latencies and query times.
replies(1): >>41837062 #
hu3 No.41837062
Yeah, some numbers caught my attention, like the ~94% reduction in overall database lock time.

And to think they never have to worry about VACUUM. Ahh the peace.

replies(4): >>41837227 #>>41837317 #>>41837626 #>>41838255 #
InsideOutSanta No.41837317
As somebody who has always used MySQL but has always been told I should be using Postgres, I'd love to understand: what are the issues with VACUUM, and what should I be aware of when potentially switching databases?
replies(5): >>41837435 #>>41838160 #>>41838537 #>>41839012 #>>41845775 #
mjr00 No.41838160
Worth reading up on Postgres' MVCC model for concurrency.[0]

Short version is that VACUUM is needed to clean up dead tuples and reclaim disk space. For most cases with smaller amounts of data, auto-vacuum works totally fine. But I've had issues with tables with 100M+ rows that are frequently updated, where auto-vacuum falls behind and stops working completely. That necessitated a full data dump + restore (because we didn't want to double our storage capacity to do a VACUUM FULL). We fixed this by sharding the table and tweaking auto-vacuum to run more frequently, but this isn't stuff you have to worry about in MySQL.
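
To make "tweaking auto-vacuum to run more frequently" concrete, the usual knob is the per-table scale factor. A rough sketch in Python/psycopg2 follows; the table name and thresholds are placeholders rather than whatever settings were actually used here:

    # Sketch only: per-table autovacuum tuning of the kind described above.
    # Table name and thresholds are placeholders, not actual settings.
    import psycopg2

    conn = psycopg2.connect("dbname=app user=admin")  # hypothetical DSN
    conn.autocommit = True
    with conn.cursor() as cur:
        # Trigger autovacuum once ~1% of the table is dead tuples
        # instead of the default 20%, plus a flat 10k-tuple threshold.
        cur.execute("""
            ALTER TABLE big_events SET (
                autovacuum_vacuum_scale_factor = 0.01,
                autovacuum_vacuum_threshold    = 10000
            )
        """)
    conn.close()

A lower scale factor makes autovacuum fire far more often on a big table; whether it can then keep up also depends on the cost-based throttling mentioned further down the thread.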

Honestly if you're a small shop without database/postgres experts and MySQL performance is adequate for you, I wouldn't switch. Newer versions of MySQL have fixed the egregious issues, like silent data truncation on INSERT by default, and it's easier to maintain, in my experience.

[0] https://www.postgresql.org/docs/current/mvcc-intro.html

replies(2): >>41838279 #>>41838746 #
williamdclt No.41838746
As much as I have gripes with the autovac, I’m surprised at the idea of getting to such a broken state. 100M rows is not small, but it's not huge either; how frequent is “frequent updates”? And how long ago was that (there have been a lot of changes in autovac since v9)?

“Stops working completely” should not be a thing. It could be vacuuming more slowly than the update rate (although that’d be surprising), but I don’t know of any reason it would just stop.

That being said, I’ve also had issues with autovac (on Aurora, to be fair; I couldn’t say if it was Aurora-specific), like it running constantly without vacuuming anything, as if there were an old transaction idling (there wasn’t).
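
For what it's worth, when autovac seems busy but dead tuples keep growing, two quick checks are whether any old transaction is pinning the cleanup horizon and whether the dead-tuple counts are actually moving. A sketch (connection details are placeholders, and this isn't tied to the incident above):

    # Sketch: look for long-lived transactions that block tuple cleanup,
    # and for tables where dead tuples keep piling up.
    import psycopg2

    conn = psycopg2.connect("dbname=app user=admin")  # hypothetical DSN
    with conn.cursor() as cur:
        # Oldest open transactions; a long-lived one keeps VACUUM from
        # reclaiming any tuple version its snapshot might still need.
        cur.execute("""
            SELECT pid, state, xact_start, left(query, 60)
            FROM pg_stat_activity
            WHERE xact_start IS NOT NULL
            ORDER BY xact_start
            LIMIT 5
        """)
        for row in cur.fetchall():
            print(row)

        # Dead-tuple counts and when autovacuum last touched each table.
        cur.execute("""
            SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
            FROM pg_stat_user_tables
            ORDER BY n_dead_tup DESC
            LIMIT 10
        """)
        for row in cur.fetchall():
            print(row)
    conn.close()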

replies(2): >>41839294 #>>41839304 #
sgarland No.41839304
On decently-sized tables (100,000,000 is, as you say, not small but not huge), if you haven’t tuned cost limiting and/or various parameters for controlling autovacuum workers, it’s entirely possible for it to effectively do nothing, especially if you’re in the cloud with backing disks that have limited IOPS / throughput.
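
The cost settings in question are the cost-based delay knobs: at the defaults, each autovacuum worker sleeps after accumulating a small amount of I/O "cost", which on throttled cloud disks can mean it barely makes progress. A rough sketch of loosening them (values are illustrative only; on RDS/Aurora these live in the parameter group rather than ALTER SYSTEM):

    # Sketch: relax autovacuum's cost-based throttling. Values are
    # illustrative; on managed services, set these in the parameter
    # group instead of using ALTER SYSTEM.
    import psycopg2

    conn = psycopg2.connect("dbname=app user=admin")  # hypothetical DSN
    conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
    with conn.cursor() as cur:
        # Let each worker do more work per round before it sleeps...
        cur.execute("ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000")
        # ...and sleep for less time between rounds.
        cur.execute("ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '1ms'")
        # Pick up the new values without a restart.
        cur.execute("SELECT pg_reload_conf()")
    conn.close()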

It continues to baffle me why AWS picks some truly terrible defaults for parameter groups. I understand most of them come from the RDBMS defaults, but AWS has the luxury of knowing precisely how many CPUs and how much RAM any given instance has. On any decently-sized instance, it should allocate far more memory for maintenance_work_mem, for example.
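
As an example of fixing that yourself (a sketch: the parameter group name and value are made up, and I'm assuming RDS expresses this setting in kB), you can override maintenance_work_mem in a custom parameter group:

    # Sketch: bump maintenance_work_mem through an RDS parameter group.
    # Group name and value are placeholders; the parameter's unit is kB.
    import boto3

    rds = boto3.client("rds")
    rds.modify_db_parameter_group(
        DBParameterGroupName="my-postgres-params",   # hypothetical group
        Parameters=[{
            "ParameterName": "maintenance_work_mem",
            "ParameterValue": str(2 * 1024 * 1024),  # ~2 GB, expressed in kB
            "ApplyMethod": "immediate",              # dynamic; no reboot needed
        }],
    )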