366 points virtualwhys | 5 comments
halayli ◴[] No.41899794[source]
This topic cannot be discussed in isolation from disks. SSDs write a 4k page at a time, meaning that if you're going to update 1 bit, the disk will read 4k, you update the bit, and it writes back a 4k page in a new slot. So the penalty for copying varies depending on the disk type.
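To put a rough number on that read-modify-write cost, here is a minimal sketch. It assumes the 4k (4096-byte) page size used above, ignores alignment, and the function name `rmw_write_amplification` is made up for illustration; real devices vary in page size and behavior.

```python
import math


def rmw_write_amplification(changed_bytes: float, page_size: int = 4096) -> float:
    """Bytes physically written divided by bytes logically changed.

    Assumes every touched page is rewritten in full and that the change
    starts on a page boundary (both are simplifying assumptions).
    """
    if changed_bytes <= 0:
        raise ValueError("changed_bytes must be positive")
    # Number of whole pages needed to hold the changed bytes.
    pages = math.ceil(changed_bytes / page_size)
    return pages * page_size / changed_bytes


# One bit is 1/8 of a byte, so a full 4096-byte page is rewritten for it.
one_bit = rmw_write_amplification(1 / 8)
# A change that fills exactly one page has no amplification under this model.
full_page = rmw_write_amplification(4096)
```

Under these assumptions the single-bit case is the worst case; the ratio falls toward 1 as the change approaches a whole number of pages.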
replies(2): >>41900275 #>>41904085 #
srcreigh ◴[] No.41900275[source]
Postgres pages are 8kB, so the point is moot.
replies(2): >>41901535 #>>41901808 #
1. halayli ◴[] No.41901808[source]
I am referring to physical pages in an SSD. The 8k pg page maps to 2 pages in a typical SSD. Your comment proves my initial point, which is that write amplification cannot be discussed without talking about disk types and their behavior.
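The mapping can be sketched as simple arithmetic. This assumes 4096-byte physical pages and a byte offset measured from the start of the device; `pages_touched` is a hypothetical helper, not anything Postgres or an SSD exposes.

```python
def pages_touched(offset: int, length: int, page_size: int = 4096) -> int:
    """Count the physical pages overlapped by a write of `length` bytes
    starting at byte `offset` (assumed page size, no other layers)."""
    if length <= 0:
        raise ValueError("length must be positive")
    first = offset // page_size                 # page holding the first byte
    last = (offset + length - 1) // page_size   # page holding the last byte
    return last - first + 1


# An aligned 8192-byte write covers exactly two 4096-byte pages.
aligned = pages_touched(0, 8192)
# The same write shifted by 512 bytes spills into a third page.
misaligned = pages_touched(512, 8192)
```

The "2 pages" figure holds only when the 8k write is aligned to the physical page boundary; a misaligned write touches one more page.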
replies(2): >>41902116 #>>41903957 #
2. emptiestplace ◴[] No.41902116[source]
Huh? It seems you've forgotten that you were just saying that a single bit change would result in a 4096 byte write.
replies(1): >>41906755 #
3. mschuster91 ◴[] No.41903957[source]
> The 8k pg page maps to 2 pages in a typical SSD disk.

You might end up with even more than that due to filesystem metadata (inode records, checksums), metadata of an underlying RAID mechanism or, when working via some sort of networking, stuff like Ethernet frame sizes/MTU.

In an ideal world, there would be a clear interface which a program can use to determine the most sensible minimum changeable unit for any given combination of storage media, HW RAID, transport layer (local attach vs. stuff like iSCSI or NFS), SW RAID (e.g. mdraid), filesystem and filesystem features, so as to avoid unnecessary write amplification.
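For contrast, here is roughly what a program can ask today on a Unix-like system: the filesystem's own reported block sizes via `os.statvfs`. This is only a sketch of one layer; it says nothing about the SSD's physical page size, RAID stripe size, or network transport, and the wrapper name `reported_block_sizes` is an assumption for illustration.

```python
import os


def reported_block_sizes(path: str = ".") -> dict:
    """Return the block sizes the filesystem holding `path` reports.

    Unix-only: os.statvfs is not available on Windows.
    """
    st = os.statvfs(path)
    return {
        "f_bsize": st.f_bsize,    # filesystem block size as reported by statvfs
        "f_frsize": st.f_frsize,  # fundamental block (fragment) size
    }


sizes = reported_block_sizes(".")
print(sizes)
```

The values describe the filesystem layer only, which is the gap the comment is pointing at: nothing here accounts for the layers above or below it.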

4. Tostino ◴[] No.41906755[source]
> a single bit change would result in a 4096 byte write

On (most) SSD hardware, regardless of what software you are using to do the writes.

At least that's how I read their comment.

replies(1): >>41907050 #
5. emptiestplace ◴[] No.41907050{3}[source]
Right, and if pg writes 8192 bytes every time, this is no longer relevant.