I was actually surprised and disappointed by the poor quality of the drivers out there. If you instrument and log the bytes u8g2 sends to the device and how the buffer is sent, you see it’s pretty messy, inefficient and hacky. It seems like everyone is copy-pasting everyone else’s code without fully understanding it.
So in the end I just decided to draw into a local buffer and then send that buffer in one single transaction to the display.
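To make the idea concrete, here is a minimal sketch of that approach, not my actual driver. It assumes a 128x64 monochrome panel with the usual page layout (each byte holds 8 vertically stacked pixels), and `send_data` is a hypothetical callable standing in for whatever I2C or SPI write your platform gives you.

```python
WIDTH, HEIGHT = 128, 64  # assumed panel size, adjust for your display


class FrameBuffer:
    """Local 1-bit-per-pixel buffer, one byte per 8 vertical pixels (assumed layout)."""

    def __init__(self, width=WIDTH, height=HEIGHT):
        self.width = width
        self.height = height
        self.buf = bytearray(width * height // 8)

    def set_pixel(self, x, y, on=True):
        # Silently ignore out-of-range coordinates.
        if not (0 <= x < self.width and 0 <= y < self.height):
            return
        index = (y // 8) * self.width + x  # page row, then column
        mask = 1 << (y % 8)                # bit position inside the page
        if on:
            self.buf[index] |= mask
        else:
            self.buf[index] &= ~mask & 0xFF

    def flush(self, send_data):
        # send_data is hypothetical: it should push the bytes to the
        # display as one data transfer.
        send_data(bytes(self.buf))
```

All drawing only touches `buf`; the bus is used once per frame, when `flush` hands over the full 1024 bytes.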
Once you figure out the datasheet it’s actually very easy to program these displays. For example, to test any SSDxxxx display, you only need to send two bytes: 0xaf (display on) and 0xa5 (all pixels on).
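As a rough sketch of that two-byte test: `send_command` below is a hypothetical callable that writes one command byte over your bus (in command mode, not data mode). Some modules may need extra setup before the panel actually lights up, so treat this as a starting point and check the datasheet for your part.

```python
DISPLAY_ON = 0xAF     # display on
ALL_PIXELS_ON = 0xA5  # force every pixel on, ignoring RAM contents


def panel_test_commands():
    """Return the two command bytes for the smoke test."""
    return bytes([DISPLAY_ON, ALL_PIXELS_ON])


def run_panel_test(send_command):
    # send_command is hypothetical: it writes a single command byte.
    for byte in panel_test_commands():
        send_command(byte)
```

If the whole panel lights up, the wiring and bus setup are working and you can move on to writing real pixel data.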
I am now looking at the SSD1322, which is grayscale and has enough internal RAM for two buffers. That should allow even smoother drawing: write into a non-displayed region of the RAM and then change the display offset.
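The flipping logic I have in mind would look roughly like this. It is only a sketch under assumptions: a panel showing 64 rows out of a 128-row controller RAM, and two hypothetical callables, `write_rows(first_row, pixels)` and `set_start_line(row)`, that wrap whatever commands the datasheet specifies for writing RAM and moving the displayed region.

```python
PANEL_ROWS = 64  # assumed visible height; RAM is assumed to hold two such regions


class PageFlipper:
    """Alternate between two RAM regions so drawing never hits the visible one."""

    def __init__(self, write_rows, set_start_line):
        self.write_rows = write_rows          # hypothetical RAM write helper
        self.set_start_line = set_start_line  # hypothetical "show from row" helper
        self.visible = 0                      # region currently on screen: 0 or 1

    def present(self, pixels):
        hidden = 1 - self.visible
        first_row = hidden * PANEL_ROWS
        self.write_rows(first_row, pixels)  # fill the region that is not shown
        self.set_start_line(first_row)      # then switch the display to it
        self.visible = hidden
```

The point of the design is that the switch is a single short command, so the viewer never sees a half-drawn frame.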