
73 points ajhool | 4 comments

It's common to see here that Postgres hosted in RDS can handle 99% of workloads up to millions of users. I'm building an IoT app with a plan to ingest the IoT traffic into DynamoDB partitioned on user ID (I'm quite familiar with the tradeoffs) and keep everything else in Postgres; there's a rough sketch of the planned write path after the questions below. A few services, but not microservices (basically: core service, identity service, IoT data service, notification service). We'll be ingesting and monitoring about 1,000,000 IoT devices daily (1 packet per device per day) and about 1,000,000 users, with only some 5,000 active users per day (basically we monitor users' IoT devices 24/7, but only around 5,000 users will have anomalous results and log in).

In the database posts and discussions here, I sometimes find that the opinions are strong but the numbers are missing. Obviously, applications vary widely in traffic and query complexity, so apples-to-apples comparisons are hard. Still, I would greatly benefit from hearing some real-world experiences with numbers.

Some rough, ballpark database questions for current or prior applications:

1. How many customers do you have?

2. What's expected daily traffic? Peak traffic?

3. What database engine or engines do you use?

4. How many rows or how much storage does your db have?

5. What else about your application is relevant for database load?

6. Microservices, services, or a monolith? Happy with it?
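
For concreteness, here is roughly the write path I have in mind for the DynamoDB side. It's only a sketch: the table name, key names, and sort-key scheme are placeholders, and it assumes AWS credentials and a region are already configured for boto3.

    import json
    import time

    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("iot_readings")  # placeholder table name

    def ingest_packet(user_id: str, device_id: str, payload: dict) -> None:
        """Store one daily device packet, partitioned on user ID as described above."""
        table.put_item(
            Item={
                "user_id": user_id,                        # partition key
                "sk": f"{device_id}#{int(time.time())}",   # sort key: device + timestamp
                "payload": json.dumps(payload),            # ~1 kB packet, stored as an opaque string
            }
        )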

1. wolfgang42 (No.43368475)
Let's do some math.

First, the IoT devices reporting daily. In the absence of further context, I’m going to assume that it doesn’t matter when the devices report, so they can be configured to spread out their load. I’m also going to assume 1kb of data per device, but with an HTTPS API there’s roughly 7kb of overhead that we need to account for when calculating bandwidth. (Source: http://netsekure.org/2010/03/tls-overhead/ . TLS session resumption gets that down to ~300 bytes, but adds implementation complications.)

    $ units -1v '1M req/day' 'req/sec'
        1M req/day = 11.574074 req/sec
    $ units -1v '1M req/day * 8kbyte/req' 'kbyte/sec' # Incoming bandwidth
        1M req/day * 8kbyte/req = 92.592593 kbyte/sec
    $ units -1v '1M req/day * 1kbyte/req' 'years/TB' # Storage
        reciprocal conversion
        1 / (1M req/day * 1kbyte/req) = 2.7379093 years/TB
It looks like our load here is a whopping 12 RPS, and we could handle the traffic on an ISDN line from the 1990s. Data storage is a little trickier; if we can’t compress or delete old data we may have to stop by Best Buy for a new hard drive every half decade or so.
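
If you'd rather sanity-check the arithmetic without `units`, here are the same numbers in a few lines of Python; the 1 kB payload and 7 kB of HTTPS overhead are the assumptions from above.

    SECONDS_PER_DAY = 86_400

    devices = 1_000_000          # one ~1 kB packet per device per day
    payload_kb = 1
    https_overhead_kb = 7        # rough TLS/HTTP overhead, no session resumption

    ingest_rps = devices / SECONDS_PER_DAY
    inbound_kb_per_s = devices * (payload_kb + https_overhead_kb) / SECONDS_PER_DAY
    storage_gb_per_year = devices * payload_kb * 365 / 1e6   # kB -> GB

    print(f"{ingest_rps:.1f} req/s")            # ~11.6 req/s
    print(f"{inbound_kb_per_s:.1f} kB/s in")    # ~92.6 kB/s
    print(f"{storage_gb_per_year:.0f} GB/yr")   # ~365 GB/yr, i.e. ~2.7 years per TB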

Users can’t be configured to load balance themselves, so we’ll be pessimistic and assume that every single one of them logs in to check their device over their morning coffee. We’ll also assume that every time they do that they want all of the data from the last 3 months, though in practice this could probably be summarized before we send it to them.

    $ units -1v '5000 req/1 hour' 'req/sec'
        5000 req/1 hour = 1.3888889 req/sec
    $ units -1v '5000 req/1 hour * 90 kbyte/req' 'MB/sec' # Outgoing bandwidth
        5000 req/1 hour * 90 kbyte/req = 0.125 MB/sec
For this, we have just under 2 RPS, but our responses are quite a lot bigger so the bandwidth is higher—we probably want to move into the early 2000s and upgrade to a DSL connection. Oh, and we also want to make sure our disk can handle the read load—but conveniently, since we’re just pulling raw data and not asking for any complicated joins, these numbers are actually the same. 2 RPS gives us 500ms per request; since a spinning rust drive pessimistically takes 10ms/seek we only have a ~50-seek budget per request so we probably want to make sure to cluster or partition the table by device to improve data locality. (Doing that increases the write load significantly, though, so maybe we want to upgrade to an SSD or think about a database that’s smart enough to do some kind of periodic bulk rebalancing.)
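
The same quick check for the read side, using the pessimistic assumptions above: every active user pulls ~90 days of 1 kB packets within a single hour, and a spinning disk costs ~10 ms per seek.

    active_users = 5_000
    window_s = 3_600             # everyone checks in over one morning-coffee hour
    history_kb = 90              # ~90 daily 1 kB packets per device

    read_rps = active_users / window_s                                 # ~1.4 req/s, call it 2
    outbound_mb_per_s = active_users * history_kb / window_s / 1_000   # ~0.125 MB/s

    seek_ms = 10                 # pessimistic spinning-disk seek time
    budget_ms = 1_000 / 2        # ~500 ms per request at ~2 req/s
    print(f"{read_rps:.2f} req/s, {outbound_mb_per_s:.3f} MB/s out, "
          f"~{budget_ms / seek_ms:.0f} seeks per request")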

Oh, I almost forgot, we'll also want to make sure we have disk space to keep track of all of those idle logins who aren’t doing anything:

    $ units -1v '1M records * 1kbyte/rec' 'GB'
        1M records * 1kbyte/rec = 1 GB
Modern computers—for very relative definitions of modern—are fast, if they don’t get bogged down. Based on my numbers, you could probably run your system from a Macbook in your desk drawer; but it would be trivial to add a requirement which would multiply any of these numbers by several orders of magnitude. The problem with trying to compare architectures is that you have to really understand what your needs are, and not only are they going to be difficult to compare to someone else’s setup (small differences can add up fast when multiplied across requests) but it’s also hard to tell how much effort the other system put into optimizing (or, for that matter, how much effort you want to put into optimizing).
replies(1): >>43368539 #
2. chatmasta (No.43368539)
This is a nice analysis of incoming and outgoing bandwidth, but totally neglects to analyze the compute power required by each query. You can’t answer that without knowing the possible queries that users can send. Do they have a dashboard with a known set of queries? Do they have a GraphQL API that can produce pathological queries? Can they send SQL directly? Are they running aggregations? Do they use filters? Are they querying recent data or all historical data?

And that’s just the read side. You also need to ask about ingestion and transformation. Is the database storing raw events and nothing else? (Is the user happy with that?) Is it append only? Is it being rolled up and summarized into daily partitions? How many transactions per second? How many of those are INSERT vs. UPDATE or DELETE? Which rows are being updated? Only recent or any of them? All of them?

etc…
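
To make just one of those ingestion questions concrete, "rolled up and summarized into daily partitions" might look something like this in miniature; the summary fields here are invented purely for illustration.

    from collections import defaultdict
    from datetime import datetime, timezone

    def daily_rollup(events):
        """Collapse raw (device_id, unix_ts, value) events into one summary row per device per day."""
        summary = defaultdict(lambda: {"count": 0, "min": float("inf"),
                                       "max": float("-inf"), "total": 0.0})
        for device_id, ts, value in events:
            day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
            row = summary[(device_id, day)]
            row["count"] += 1
            row["min"] = min(row["min"], value)
            row["max"] = max(row["max"], value)
            row["total"] += value
        return summary  # what reads hit instead of the raw event stream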

There is no generic answer to this question, and “requests per second” is a reductive and insufficient interpretation of the problem that won’t identify any of the hidden complexity.

replies(1): >>43368899 #
3. wolfgang42 (No.43368899)
The OP doesn’t give any hints as to what users are doing, so, as I said, I just assumed “last 90 days of data” was the only query they’d make. But it’s a fair point that this caused me to leave out CPU power, since it’s basically negligible when you’re just shoveling data off of a disk.

Thanks for expanding on my point about how important it is to actually have detailed requirements when doing performance calculations!

replies(1): >>43368907 #
4. chatmasta (No.43368907)
Yes indeed, OP only asked “how much traffic is required,” which could be zero if his database is locked while trying to respond to some absurd query from one of his users :)