
131 points pgedge_postgres | 8 comments
1. verelo ◴[] No.45534285[source]
Interesting. I always find attempts to build these types of database tools super interesting, but then I think about all the undocumented edge cases that can come up, and they scare me off.

Many many years ago I worked on a monitoring tool that itself needed to be highly available, and we needed a solution like this. Ever since that time I've done everything in my power to avoid it.

What are the real world cases you built this for? And how can someone like me who has been bruised by past experiences get comfortable with it?

replies(6): >>45534920 #>>45534996 #>>45535118 #>>45535625 #>>45537130 #>>45537140 #
2. victor9000 ◴[] No.45534920[source]
What failure cases did you encounter?
3. pgedge_postgres ◴[] No.45534996[source]
Getting some examples of real-world cases to share and will comment back with them ASAP; in the meantime, would you mind sharing what undocumented edge cases you came across and what solutions you explored to handle them? It would help with sharing super relevant use cases :-)
replies(1): >>45540984 #
4. pgedge_postgres ◴[] No.45535118[source]
Just a guess, but some of the undocumented edge cases you saw might be explored in this blog from one of our software engineers, Shaun Thomas. It's all about conflict resolution & avoidance in PostgreSQL, in general: https://www.pgedge.com/blog/living-on-the-edge

If understanding how conflicts are handled in pgEdge is helpful, here's a link to the docs on the subject: https://docs.pgedge.com/spock_ext/conflicts

And the FAQ also delves into it some: https://www.pgedge.com/resources/faq

5. baq ◴[] No.45535625[source]
A typical use case would be anyone who has a global presence but serves users in particular geos (think AWS): you want a global user database, but it’s soooo convenient to be able to join with regional data in a single query.
6. jwr ◴[] No.45537130[source]
> edge cases that can come up and they scare me off

They should! Read some of the excellent Jepsen analyses to see how scary things can be: https://jepsen.io/analyses

7. vyruss ◴[] No.45537140[source]
Local write latency in a geo-distributed database is also important for some use cases.
8. verelo ◴[] No.45540984[source]
I tried to escape this world as quickly as possible, realizing how horrible it was, but the largest issue I ran into was around IO. Creating an environment that was highly fault tolerant while having little to no replication delay meant checking in on the master database frequently. Keeping in mind this was around 2010, I found that the IO load on these databases was substantially larger than on any database that I had ever worked on before. Things like available file handles and other related performance problems came up more frequently than I had ever experienced before, and frankly more frequently than I’ve ever experienced since.

If I were to summarize it, I would just say the performance characteristics were not something I was used to, and they would often surprise me when they occurred, which made maintaining a good quality of life while running this application very challenging.