183 points crescit_eundo | 4 comments
1. spectraldrift ◴[] No.45054829[source]
People often forget a message queue is just a simple, high-throughput state machine.

It's tempting to roll your own by polling a database table, but that approach breaks down, sometimes even at fairly low traffic levels. Once you move beyond a simple cron job, you're suddenly fighting row locking and race conditions just to prevent significant duplicate processing. You end up reinventing the wheel, poorly (potentially 5 or 10 times in the same service).

A service like SQS solves this with its state management. A message becomes 'invisible' while being processed. If it's not deleted within the configurable visibility timeout, it transitions back to available. That 'fetch next and mark invisible' state transition is the key, and it's precisely what's so difficult to implement correctly and performantly in a database every single time you need it.
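To make the state transitions concrete, here is a minimal in-memory sketch of the "fetch next and mark invisible" idea. It is not how SQS is implemented; the class name `VisibilityQueue`, its methods, and the injectable `clock` are all made up for illustration, and it is single-threaded with no persistence.

```python
import itertools
import time


class VisibilityQueue:
    """Toy queue: a received message is hidden until its timeout expires."""

    def __init__(self, visibility_timeout, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock          # injectable so tests can control time
        self._messages = {}          # id -> [body, visible_at], insertion-ordered
        self._ids = itertools.count(1)

    def send(self, body):
        msg_id = next(self._ids)
        # A new message is visible immediately.
        self._messages[msg_id] = [body, self._clock()]
        return msg_id

    def receive(self):
        """Return (id, body) for the first visible message and hide it."""
        now = self._clock()
        for msg_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                # State transition: available -> invisible (in flight).
                entry[1] = now + self._timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        """Acknowledge a message; returns False if it no longer exists."""
        return self._messages.pop(msg_id, None) is not None
```

A consumer that crashes never calls `delete`, so the message simply reappears once the timeout passes. The hard part described above is doing the same thing safely across many concurrent consumers, which this sketch does not attempt.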

replies(1): >>45054938 #
2. groone ◴[] No.45054938[source]
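A message becomes invisible in a regular relational database when you use `SELECT ... FOR UPDATE SKIP LOCKED`: the row stays locked, and is skipped by other consumers, until the transaction commits or rolls back.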
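A rough sketch of what that looks like from a worker, assuming PostgreSQL and a DB-API style connection such as psycopg's (the `%s` placeholder style is that driver's). The `jobs` table, its `id` and `payload` columns, and the `process_one` helper are hypothetical names for illustration only.

```python
# Claim one unlocked row; rows locked by other workers are skipped, not waited on.
CLAIM_SQL = (
    "SELECT id, payload FROM jobs "
    "ORDER BY id LIMIT 1 "
    "FOR UPDATE SKIP LOCKED"
)
DELETE_SQL = "DELETE FROM jobs WHERE id = %s"


def process_one(conn, handler):
    """Claim, handle, and delete one job inside a single transaction.

    Returns True if a job was processed, False if none was available.
    Assumes the driver opens a transaction implicitly on first execute.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        if row is None:
            conn.commit()
            return False
        job_id, payload = row
        handler(payload)                 # the row lock is held while this runs
        cur.execute(DELETE_SQL, (job_id,))
        conn.commit()                    # releases the lock and removes the job
        return True
    except Exception:
        conn.rollback()                  # lock released, job becomes claimable again
        raise
    finally:
        cur.close()
```

The trade-off is that the "invisibility" lasts exactly as long as the transaction, so a slow handler means a long-lived transaction, and there is no timeout unless you add one yourself.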
replies(2): >>45055183 #>>45056115 #
3. kerblang ◴[] No.45055183[source]
Overall it's completely feasible to build a message queue with RDBMS _because_ they have locking. You might end up doing extra work compared to some other products that make message queueing easy/fun/so-simple-caveman-etc.

Now if SQS has some super-scalar mega-cluster capability where one instance can deliver 100 billion messages a day across the same group of consumers, ok, I'm impressed, because most MQs can't, because... locking. Thus Kafka (which is not a message queue).

I think the RDBMS MQ should be treated as the "No worse than this" standard - if my fancy new message queueing product is even harder to set up, it isn't worth your trouble. But SQS itself IS pretty easy to use.

4. spectraldrift ◴[] No.45056115[source]
That's totally feasible, and works for small to medium traffic (SQS scales seamlessly from 1 message per year to millions per second).

In practice, I've never seen this implemented correctly in the wild. Most people don't seem to care enough to handle the transactions properly. And if you want extra features like DLQs or metrics on stuck message age, you'll end up with a lot more complexity just to get parity with a standard queue system.

A common library could help with this though.
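For a sense of how much extra bookkeeping that is, here is a small sketch of a table-backed queue with a visibility timeout, a receive count, a dead-letter table, and an oldest-message-age metric. It uses Python's built-in `sqlite3` only so that it runs anywhere; SQLite serializes writers, so `BEGIN IMMEDIATE` stands in for the row-level locking you would get from `SKIP LOCKED` elsewhere. All table, column, and function names are made up, and `now` is passed in explicitly to keep the example deterministic.

```python
import sqlite3

SCHEMA = """
CREATE TABLE queue (
    id INTEGER PRIMARY KEY,
    body TEXT NOT NULL,
    enqueued_at REAL NOT NULL,
    visible_at REAL NOT NULL,
    receive_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE dead_letters (
    id INTEGER PRIMARY KEY,
    body TEXT NOT NULL,
    enqueued_at REAL NOT NULL
);
"""


def open_queue(path=":memory:"):
    # isolation_level=None means autocommit; transactions are started explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.executescript(SCHEMA)
    return conn


def send(conn, body, now):
    conn.execute(
        "INSERT INTO queue (body, enqueued_at, visible_at) VALUES (?, ?, ?)",
        (body, now, now),
    )


def receive(conn, now, timeout=30.0, max_receives=3):
    """Return (id, body) of the next visible message, or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        # Messages that timed out too many times go to the dead-letter table.
        expired = (now, max_receives)
        conn.execute(
            "INSERT INTO dead_letters (id, body, enqueued_at) "
            "SELECT id, body, enqueued_at FROM queue "
            "WHERE visible_at <= ? AND receive_count >= ?",
            expired,
        )
        conn.execute(
            "DELETE FROM queue WHERE visible_at <= ? AND receive_count >= ?",
            expired,
        )
        row = conn.execute(
            "SELECT id, body FROM queue WHERE visible_at <= ? ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is not None:
            # Hide the message and record the delivery attempt.
            conn.execute(
                "UPDATE queue SET visible_at = ?, receive_count = receive_count + 1 "
                "WHERE id = ?",
                (now + timeout, row[0]),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


def delete(conn, msg_id):
    conn.execute("DELETE FROM queue WHERE id = ?", (msg_id,))


def oldest_age(conn, now):
    """Age of the oldest message still in the queue, or None if empty."""
    (oldest,) = conn.execute("SELECT MIN(enqueued_at) FROM queue").fetchone()
    return None if oldest is None else now - oldest
```

Even this leaves out things a real queue gives you, such as extending a lease mid-processing or guarding `delete` against a consumer whose lease already expired.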