
183 points crescit_eundo | 1 comments
spectraldrift ◴[] No.45054829[source]
People often forget a message queue is just a simple, high-throughput state machine.

It's tempting to roll your own by polling a database table, but that approach breaks down, sometimes even at fairly low traffic levels. Once you move beyond a simple cron job, you're suddenly fighting row locking and race conditions just to prevent significant duplicate processing, effectively reinventing the wheel, poorly (potentially 5 or 10 times in the same service).

A service like SQS solves this with its state management. A message becomes 'invisible' while being processed. If it's not deleted within the configurable visibility timeout, it transitions back to available. That 'fetch next and mark invisible' state transition is the key, and it's precisely what's so difficult to implement correctly and performantly in a database every single time you need it.
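To make that state transition concrete, here is a toy in-memory sketch of the receive / delete / timeout cycle. It is not the SQS API: the class name `VisibilityQueue`, its methods, and the injectable `clock` parameter are all made up for illustration, and it ignores persistence, concurrency, and delivery ordering guarantees.

```python
import time


class VisibilityQueue:
    """Toy model of a visibility-timeout queue (hypothetical, not the SQS API)."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock          # injectable so the timeout can be tested
        self._visible_at = {}        # message id -> time it becomes visible again
        self._bodies = {}            # message id -> payload
        self._next_id = 0

    def send(self, body):
        """Add a message; it is visible immediately."""
        msg_id = self._next_id
        self._next_id += 1
        self._bodies[msg_id] = body
        self._visible_at[msg_id] = self._clock()
        return msg_id

    def receive(self):
        """Fetch the oldest visible message and mark it invisible in one step."""
        now = self._clock()
        for msg_id, visible_at in self._visible_at.items():
            if visible_at <= now:
                # The key transition: hide the message until the timeout expires.
                self._visible_at[msg_id] = now + self._timeout
                return msg_id, self._bodies[msg_id]
        return None

    def delete(self, msg_id):
        """Acknowledge a message so it never becomes visible again."""
        self._visible_at.pop(msg_id, None)
        self._bodies.pop(msg_id, None)
```

A consumer calls `receive()`, does its work, then calls `delete()`. If it crashes before deleting, the message simply reappears once the timeout passes, which is the behavior the comment describes.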

replies(1): >>45054938 #
groone ◴[] No.45054938[source]
A message becomes invisible in a regular relational database when you use `SELECT ... FOR UPDATE SKIP LOCKED`.
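A minimal sketch of what that can look like, assuming PostgreSQL 9.5 or later and a DB-API connection such as one from psycopg2. The `jobs` table, its `visible_at` column, and the helper names `claim_next` and `ack` are hypothetical. The `visible_at` column is an addition that imitates a visibility timeout across transactions; `SKIP LOCKED` on its own only hides a row while the claiming transaction stays open.

```python
# Assumed (hypothetical) schema:
#   CREATE TABLE jobs (
#       id         bigserial PRIMARY KEY,
#       payload    text NOT NULL,
#       visible_at timestamptz NOT NULL DEFAULT now()
#   );

# Claim one visible row and push its visible_at into the future.
# SKIP LOCKED makes concurrent workers skip rows another worker is claiming
# instead of blocking on them or picking the same row.
CLAIM_SQL = """
UPDATE jobs
   SET visible_at = now() + make_interval(secs => %s)
 WHERE id = (
         SELECT id
           FROM jobs
          WHERE visible_at <= now()
          ORDER BY id
          LIMIT 1
            FOR UPDATE SKIP LOCKED
       )
RETURNING id, payload
"""

DELETE_SQL = "DELETE FROM jobs WHERE id = %s"


def claim_next(conn, visibility_timeout_s=30):
    """Return (id, payload) for one claimed job, or None if nothing is visible."""
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL, (visibility_timeout_s,))
        row = cur.fetchone()
    conn.commit()
    return row


def ack(conn, job_id):
    """Delete a finished job so it never becomes visible again."""
    with conn.cursor() as cur:
        cur.execute(DELETE_SQL, (job_id,))
    conn.commit()
```

A worker would loop on `claim_next(conn)`, process the payload, then call `ack(conn, job_id)`. If the worker dies in between, the row becomes claimable again once `visible_at` passes.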
replies(2): >>45055183 #>>45056115 #
1. kerblang ◴[] No.45055183[source]
Overall it's completely feasible to build a message queue with RDBMS _because_ they have locking. You might end up doing extra work compared to some other products that make message queueing easy/fun/so-simple-caveman-etc.

Now if SQS has some super-scalar mega-cluster capability where one instance can deliver 100 billion messages a day across the same group of consumers, OK, I'm impressed, because most MQs can't, because... locking. Thus Kafka (which is not a message queue).

I think the RDBMS MQ should be treated as the "No worse than this" standard: if your fancy new message queueing product is even harder to set up, it isn't worth the trouble. But SQS itself IS pretty easy to use.