
    73 points ajhool | 15 comments

    It's common to see here that Postgres hosted in RDS can handle 99% of workloads up to millions of users. I'm building an IoT app with a plan to ingest the IoT traffic into DynamoDB partitioned on user id (I'm quite familiar with the tradeoffs) and keep everything else in Postgres. A few services but not microservices (basically: core service, identity service, IoT data service, notification service). Ingesting and monitoring about 1,000,000 IoT devices daily (1 packet per device per day) and about 1,000,000 users with only 5,000 active users per day (basically we monitor user IoT devices 24/7 but only some 5,000 users will have anomalous results and log in).
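
    As a rough back-of-the-envelope sketch, the ingest volume above can be turned into an average write rate. This assumes packets arrive evenly across the day, which is an assumption and not something stated in the post; real device traffic is often bursty, so the `peak_factor` parameter below is a hypothetical multiplier for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


def peak_rate_per_second(events_per_day: int, peak_factor: float) -> float:
    """Scale the average by a hypothetical burst multiplier (an assumption)."""
    return average_rate_per_second(events_per_day) * peak_factor


if __name__ == "__main__":
    packets_per_day = 1_000_000  # 1 packet per device per day, from the post
    avg = average_rate_per_second(packets_per_day)
    print(f"average ingest: {avg:.1f} packets/s")  # about 11.6 packets/s
    # peak_factor=10 is an invented figure, used only to show the calculation
    print(f"assumed 10x burst: {peak_rate_per_second(packets_per_day, 10):.1f} packets/s")
```

    Under that even-spread assumption, one million daily packets average out to roughly a dozen writes per second; the real sizing question is how concentrated the bursts are.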

    In the database posts & discussions here I sometimes find that the opinions are strong but the numbers are missing. Obviously applications have wide variation in traffic and query complexity so apples to apples comparisons are hard. Still, I would greatly benefit from hearing some real world experiences with numbers.

    Rough approximation database questions for current or prior applications:

    1. How many customers do you have?

    2. What's expected daily traffic? Peak traffic?

    3. What database engine or engines do you use?

    4. How many rows or how much storage does your db have?

    5. What else about your application is relevant for database load?

    6. Microservice, Service, or monolith. Happy with it?

    1. jedberg ◴[] No.43366398[source]
    I think you might be asking the wrong questions. The key questions are queries per second and the median response size of the query.

    For example, at reddit (15 years ago) we had 10x more vote traffic than comment traffic, but we only needed two databases to handle votes (technically only one; the other was just for redundancy).

    But we needed nine comments databases. Mainly because the median query response was so much bigger for comments.
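
    A minimal sketch of why response size can matter more than query count: multiplying queries per second by a typical response size gives a rough estimate of the data a database has to serve. All figures below are invented for illustration and are not reddit's actual numbers; using the median as the "typical" size is also only an approximation, since an exact total would need the mean.

```python
def estimated_bytes_per_second(queries_per_second: float, typical_response_bytes: float) -> float:
    """Rough outbound data rate: query rate times a typical response size."""
    return queries_per_second * typical_response_bytes


if __name__ == "__main__":
    # Hypothetical workloads: votes get 10x the queries but tiny responses,
    # comments get fewer queries but much larger responses.
    votes = estimated_bytes_per_second(queries_per_second=1_000, typical_response_bytes=100)
    comments = estimated_bytes_per_second(queries_per_second=100, typical_response_bytes=20_000)

    print(f"votes:    {votes:,.0f} bytes/s")
    print(f"comments: {comments:,.0f} bytes/s")
    print(f"comments serve {comments / votes:.0f}x the data despite 10x fewer queries")
```

    With these made-up numbers the lower-traffic workload still moves far more data, which is the shape of the vote-versus-comment difference described above.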

    replies(4): >>43367667 #>>43367866 #>>43368412 #>>43371013 #
    2. jascination ◴[] No.43367667[source]
    Unrelated: I love HN; a random database question and fkn jedberg is one of the first responders
    replies(2): >>43368839 #>>43369790 #
    3. cogman10 ◴[] No.43367866[source]
    I'm sure latency matters a lot as well.

    The users of our apps are pretty tolerant of 5 to 10 minute request times for some of our pages, which means we've been able to get away with just a few servers for several TBs of data stored and served. (100+ MB responses are not unusual for us.)

    If we had to rethink and redesign the system to cut down those times, we'd need a lot more databases and a much cleverer storage strategy than we currently have.

    While I'm sure response time for Reddit is really important, I could imagine that an IoT serving system needs almost nothing to hit something like a 10 to 20 second response time for 5k devices.

    replies(1): >>43372650 #
    4. ajhool ◴[] No.43368412[source]
    Thank you very much! Did you start with separate databases or wait until you needed to scale workloads separately before breaking things up?
    replies(1): >>43368844 #
    5. chistev ◴[] No.43368839[source]
    Who is he?
    replies(3): >>43368941 #>>43368942 #>>43375315 #
    6. chistev ◴[] No.43368844[source]
    I imagine they didn't optimize prematurely.
    replies(1): >>43373380 #
    7. RandomBacon ◴[] No.43368941{3}[source]
    Apparently the first paid employee of reddit https://www.jedberg.net
    8. mayli ◴[] No.43368942{3}[source]
    From jedberg.net (https://www.jedberg.net): "I was the first (paid) employee of reddit, and currently run Site Reliability at Netflix."
    9. moralestapia ◴[] No.43369790[source]
    jascinating!
    10. ◴[] No.43371013[source]
    11. gnz11 ◴[] No.43372650[source]
    Your users are tolerant of 5-10 minute page request times? Is that a typo?
    replies(1): >>43372786 #
    12. cogman10 ◴[] No.43372786{3}[source]
    Nope. I'm very fortunate
    replies(1): >>43385809 #
    13. dpkirchner ◴[] No.43373380{3}[source]
    Not to denigrate jedberg but a lot of folks optimized prematurely before realizing they shouldn't. (I've certainly been guilty of it.)
    14. ratg13 ◴[] No.43375315{3}[source]
    You can click on HN usernames to read their profile.
    15. ◴[] No.43385809{4}[source]