Show HN: Hatchet – Open-source distributed task queue

1. kcorbitt ◴[08 Mar 24 18:13 UTC] No.39643991[source]▶

I love your vision and am excited to see the execution! I've been looking for exactly this product (postgres-backed task queue with workers in multiple languages and decent built-in observability) for like... 3 years. Every 6 months I'll check in and see if someone has built it yet, evaluate the alternatives, and come away disappointed.

One important feature request that probably would block our adoption: one reason why I prefer a postgres-backed queue over e.g. Redis is just to simplify our infra by having fewer servers and technologies in the stack. Adding in RabbitMQ is definitely an extra dependency I'd really like to avoid.

(Currently we've settled on graphile-worker which is fine for what it does, but leaves a lot of boxes unchecked.)

replies(9): >>39644137 #>>39645512 #>>39646111 #>>39647059 #>>39647179 #>>39650750 #>>39651174 #>>39652574 #>>39652765 #

2. abelanger ◴[08 Mar 24 18:23 UTC] No.39644137[source]▶

>>39643991 (TP) #

Thank you, appreciate the kind words! What boxes are you looking to check?

Yes, I'm not a fan of the RabbitMQ dependency either - see here for the reasoning: https://news.ycombinator.com/item?id=39643940.

It would take some work to replace this with listen/notify in Postgres, less work to replace this with an in-memory component, but we can't provide the same guarantees in that case.

replies(2): >>39647886 #>>39647932 #

3. BenjieGillam ◴[08 Mar 24 19:52 UTC] No.39645512[source]▶

>>39643991 (TP) #

Not sure if you saw it but Graphile Worker supports jobs written in arbitrary languages so long as your OS can execute them: https://worker.graphile.org/docs/tasks#loading-executable-fi...

Would be interested to know what features you feel it’s lacking.

replies(1): >>39647940 #

4. doctorpangloss ◴[08 Mar 24 20:36 UTC] No.39646111[source]▶

>>39643991 (TP) #

Why does the RabbitMQ dependency matter?

It was pretty painless for me to set up and write tests against. The operator works well and is really simple if you want to save money.

I mean, isn’t Hatchet another dependency? Graphile Worker? I like all these things, but why draw the line at one thing over another over essentially aesthetics?

You better start believing in dependencies if you’re a programmer.

replies(3): >>39646667 #>>39647092 #>>39651689 #

5. eska ◴[08 Mar 24 21:24 UTC] No.39646667[source]▶

>>39646111 #

Introducing another piece of software instead of using one you already use anyway introduces new failures. That’s hardly aesthetics.

As a professional I’m allergic to statements like “you better start believing in X”. How can you even have objective discourse at work like that?

replies(1): >>39646910 #

6. doctorpangloss ◴[08 Mar 24 21:47 UTC] No.39646910{3}[source]▶

>>39646667 #

> Introducing another piece of software instead of using one you already use anyway introduces new failures.

Okay, but we're talking about this on a post about using another piece of software.

What is the rationale for, well this additional dependency, Hatchet, that's okay, and its inevitable failures are okay, but this other dependency, RabbitMQ, which does something different, but will have fewer failures for some objective reasons, that's not okay?

Hatchet is very much about aesthetics. What else does Hatchet have going on? It doesn't have a lot of history, it's going to have a lot of bugs. It works as a DSL written in Python annotations, which is very much an aesthetic choice, very much something I see a bunch of AI startups doing, which I personally think is kind of dumb. Like OpenAI tools are "just" JSON schemas, they don't reinvent everything, and yet Trigger, Hatchet, Runloop, etc., they're all doing DSLs. It hews to a specific promotional playbook that is also very aesthetic. Is this not the "objective discourse at work" you are looking for?

I am not saying it is bad, I am saying that 99% of people adopting it will be doing so for essentially aesthetic reasons - and being less knowledgeable about alternatives might describe 50-80% of the audience, but to me, being less knowledgeable as a "professional" is an aesthetic choice. There's nothing wrong with this.

You can get into the weeds about what you meant by whatever you said. I am aware. But I am really saying, I'm dubious of anyone promoting "Use my new thing X which is good because it doesn't introduce a new dependency." It's an oxymoron plainly on its face. It's not in their marketing copy but the author is talking about it here, and maybe the author isn't completely sincere, maybe the author doesn't care and will happily write everything on top of RabbitMQ if someone were willing to pay for it, because that decision doesn't really matter. The author is just being reactive to people's aesthetics, that programmers on social media "like" Postgres more than RabbitMQ, for reasons, and that means you can "only" use one, but that none of those reasons are particularly well informed by experience or whatever, yet nonetheless strongly held.

When you want to explain something that doesn't make objective sense when read literally, okay, it might have an aesthetic explanation that makes more sense.

replies(3): >>39647315 #>>39647710 #>>39649731 #

7. ako ◴[08 Mar 24 22:03 UTC] No.39647059[source]▶

>>39643991 (TP) #

Funny how this is vision now. I started my career 29 years ago at a company that built exactly this, but based on Oracle. The agents would run on Solaris, aix, vax vms, hpux, windows nt, iris, etc. It was also used to create an automated CI/CD pipeline to build all binaries on all these different systems.

replies(2): >>39650305 #>>39651858 #

8. blandflakes ◴[08 Mar 24 22:06 UTC] No.39647092[source]▶

>>39646111 #

And you better start critically assessing dependencies if you're a programmer. They aren't free; this is a wild take.

9. bevekspldnw ◴[08 Mar 24 22:17 UTC] No.39647179[source]▶

>>39643991 (TP) #

You can do a fair amount of this with Postgres using locks out of the box. It’s not super intuitive but I’ve been using just Postgres and locks in production for many years for large task distribution across independent nodes.
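The comment doesn't include code or say which kind of lock is meant. One common way to get this behaviour from Postgres row locks alone is a `SELECT ... FOR UPDATE SKIP LOCKED` claim query, sketched below. The `jobs` table, its columns, and the function name are hypothetical, and `cur` is assumed to be any DB-API cursor (for example from psycopg) opened inside a transaction.

```python
# Hypothetical schema: jobs(id bigserial, payload text, status text).
CLAIM_SQL = """
UPDATE jobs
   SET status = 'running'
 WHERE id = (
         SELECT id
           FROM jobs
          WHERE status = 'pending'
          ORDER BY id
          LIMIT 1
          FOR UPDATE SKIP LOCKED   -- skip rows another worker has locked
       )
RETURNING id, payload
"""


def claim_next_job(cur):
    """Claim one pending job, or return None if none is available.

    `cur` is assumed to be a DB-API cursor; the caller commits the
    surrounding transaction once the row is marked as running.
    """
    cur.execute(CLAIM_SQL)
    row = cur.fetchone()
    if row is None:
        return None
    job_id, payload = row
    return {"id": job_id, "payload": payload}
```

Because locked rows are skipped rather than waited on, two workers running this at the same moment should each get a different row or nothing at all.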

replies(1): >>39648940 #

10. danielovichdk ◴[08 Mar 24 22:32 UTC] No.39647315{4}[source]▶

>>39646910 #

I fully agree with you.

'But I am really saying, I'm dubious of anyone promoting "Use my new thing X which is good because it doesn't introduce a new dependency."'

"Advances in software technology and increasing economic pressure have begun to break down many of the barriers to improved software productivity. The ${PRODUCT} is designed to remove the remaining barriers […]"

It reads like the above quote from the pitch of r1000 in 1985. https://datamuseum.dk/bits/30003882

11. eska ◴[08 Mar 24 23:17 UTC] No.39647710{4}[source]▶

>>39646910 #

> You can get into the weeds about what you meant by whatever you said. I am aware.

> When you want to explain something that doesn't make objective sense when read literally, okay, it might have an aesthetic explanation that makes more sense.

What an attitude and way to kill a discussion. Again, hard for me to imagine that you're able to have objective discussions at work. As you wish, I won't engage in discourse with you so you can feel smart.

12. jaggederest ◴[08 Mar 24 23:41 UTC] No.39647886[source]▶

>>39644137 #

I come to this only as an interested observer, but my experience with listen/notify is that it outperforms rabbitmq/kafka in small to medium operations and has always pleasantly surprised me. You might find out it's a little easier than you think to slim your dependency stack down.

replies(1): >>39651534 #

13. kcorbitt ◴[08 Mar 24 23:47 UTC] No.39647932[source]▶

>>39644137 #

Boxes-wise, I'd like a management interface at least as good as the one Sidekiq had in Rails for years. Would also need some hard numbers around performance and probably a bit more battle-testing before using this in our current product.

14. kcorbitt ◴[08 Mar 24 23:49 UTC] No.39647940[source]▶

>>39645512 #

That's interesting! Would that still involve each worker node needing to have Nodejs installed to run the process that actually reads from the queue? That's doable, but makes the deployment story a little more annoying/complicated if I want a worker that just runs Python or Rust or something.

Feature-wise, the biggest missing pieces from Graphile Worker for me are (1) a robust management web ui and (2) really strong documentation.

replies(1): >>39648175 #

15. BenjieGillam ◴[09 Mar 24 00:29 UTC] No.39648175{3}[source]▶

>>39647940 #

Yes, currently Node is the runtime but we could bundle that up into a binary blob if that would help; one thing to download rather than installing Node and all its dependencies?

A UI is a common request, something I’ve been considering investing effort into. I don’t think we’ll ever have one in the core package, but probably as a separate package/plugin (even a third party one); we’ve been thinking more about the events and APIs such a system would need and making these available, and adding a plugin system to enable tighter integration.

Could you expand on what’s missing in the documentation? That’s been a focus recently (as you may have noticed with the new expanded docusaurus site linked previously rather than just a README), but documentation can always be improved.

16. renegade-otter ◴[09 Mar 24 02:34 UTC] No.39648940[source]▶

>>39647179 #

I wrote about one simple implementation:

https://renegadeotter.com/2023/11/30/job-queues-with-postrgr...

replies(1): >>39649375 #

17. bevekspldnw ◴[09 Mar 24 04:29 UTC] No.39649375{3}[source]▶

>>39648940 #

Looks very similar to my solution. :-)

18. necovek ◴[09 Mar 24 05:57 UTC] No.39649731{4}[source]▶

>>39646910 #

There is some implicit context you are missing here.

Tools like Hatchet are one less dependency for projects already using Postgres: Postgres has become a de facto database to build against.

Compare that to an application built on top of Postgres and using Celery + Redis/RabbitMQ.

Also, it seems like you are confusing aesthetics with ergonomics. Since forever, software developers have tried to improve on all of "aesthetics" (code/system structure appearance), "ergonomics" (how easy/fast is it to build with) and "performance" (how well it works), and the cycle has been continuous (we introduce extra abstractions, then do away with some when it gets overly complex, and on and on).

replies(1): >>39650308 #

19. throwawaymaths ◴[09 Mar 24 08:22 UTC] No.39650305[source]▶

>>39647059 #

Also basically has existed as an open source (pro version has web dashboard and complex task zoo) drop-in library (no sidecar dependencies outside of postgres) in Elixir for years called Oban.

replies(1): >>39650905 #

20. danielovichdk ◴[09 Mar 24 08:23 UTC] No.39650308{5}[source]▶

>>39649731 #

"Since forever, software developers have tried to improve on all of "aesthetics" (code/system structure appearance), "ergonomics" (how easy/fast is it to build with) and "performance" (how well it works), and the cycle has been continuous"

Fast, easy, well, cheap is not a quality measure, but it sure is a way to build more useless abstractions. You tell me which abstractions have made your software twice as effective.

replies(1): >>39651567 #

21. ◴[09 Mar 24 10:26 UTC] No.39650750[source]▶

>>39643991 (TP) #

22. cpursley ◴[09 Mar 24 11:02 UTC] No.39650905{3}[source]▶

>>39650305 #

Yep, it feels like half the Show HN launches are for infrastructure tooling that already exists natively or as plug-and-play libraries for Elixir/Erlang.

I really try to suggest people skip Node and learn a proper backend language with a solid framework with a proven architecture.

replies(1): >>39651364 #

23. rubenfiszel ◴[09 Mar 24 12:14 UTC] No.39651174[source]▶

>>39643991 (TP) #

Windmill is built exactly like that. What box is left unchecked for it, if you had time to review it?

replies(1): >>39655895 #

24. zepolen ◴[09 Mar 24 13:05 UTC] No.39651364{4}[source]▶

>>39650905 #

Oban looks great, how would one run a Python CUDA-based workload on it?

replies(1): >>39651529 #

25. hosh ◴[09 Mar 24 13:44 UTC] No.39651529{5}[source]▶

>>39651364 #

You could shell out to execute with Porcelain, make the Python code a long-running process and use ports, or port your Python code to Nx.

26. hosh ◴[09 Mar 24 13:45 UTC] No.39651534{3}[source]▶

>>39647886 #

How do you handle things when no listeners are available to be notified?

replies(1): >>39653677 #

27. hosh ◴[09 Mar 24 13:54 UTC] No.39651567{6}[source]▶

>>39650308 #

Efficacy has more to do with the specific situation than the tools you use. Rather, it is versatility of a tool that allows someone to take advantage of the situation.

What makes abstractions more versatile has more to do with its composability and expressiveness of those compositions.

An abstraction that attempts to (apparently) reduce complexity without also being composable, is overall less versatile. Usually, something that does one thing well, is designed to also be as simple as possible. Otherwise you are increasing the overall complexity (and reducing reliability or making it fragile instead of anti-fragile) for very little gain.

28. otabdeveloper4 ◴[09 Mar 24 14:16 UTC] No.39651689[source]▶

>>39646111 #

> You better start believing in dependencies if you’re a programmer.

Yeah, faith will be your last resort when the resulting tower of babel fails in hitherto unknown to man modes.

29. sixdimensional ◴[09 Mar 24 14:41 UTC] No.39651858[source]▶

>>39647059 #

Because people don’t know what they don’t know, and, learning from others (along with human knowledge sharing and transfer) doesn’t seem to be what society often prioritizes in general.

Not so much talking about the original post, I think it’s awesome what they are building, and clearly they have learned by observing other things.

30. simplyinfinity ◴[09 Mar 24 16:09 UTC] No.39652574[source]▶

>>39643991 (TP) #

Hope I'm not misunderstanding, but have you checked Gearman? While I haven't used it personally, I've used a similar thing in C#, namely Hangfire.

31. magic_hamster ◴[09 Mar 24 16:33 UTC] No.39652765[source]▶

>>39643991 (TP) #

For what it's worth, RabbitMQ is extremely low maintenance, fire and forget. In the multiple years we've used it in production I can't remember a single time we had an issue with Rabbit or that we needed to do anything after the initial setup.

32. abelanger ◴[09 Mar 24 18:23 UTC] No.39653677{4}[source]▶

>>39651534 #

Presumably there'd be a messages table that you listen/notify on, and you'd replay messages that weren't consumed when a listener rejoins. But yeah, this is the overhead I was referencing.

replies(2): >>39654661 #>>39655861 #

33. jaggederest ◴[09 Mar 24 20:56 UTC] No.39654661{5}[source]▶

>>39653677 #

Yep, but practically speaking, you need those records anyway even if you're using another queue to actually distribute the jobs. At least every system I've ever built of a reasonable size has a job audit table anyway. Plus it's an "Enterprise Feature™" so you can differentiate on it if you like that kind of feature-based pricing.

replies(1): >>39655867 #

34. hosh ◴[10 Mar 24 00:49 UTC] No.39655861{5}[source]▶

>>39653677 #

With the way LISTEN/NOTIFY works, Postgres doesn't keep a record of messages that are not sent. So you cannot replay this. Unless you know something about postgresql that I don't know about.

replies(1): >>39655893 #

35. hosh ◴[10 Mar 24 00:49 UTC] No.39655867{6}[source]▶

>>39654661 #

Postgres's LISTEN/NOTIFY doesn't keep those kinds of records. The whole point of using SKIP LOCKED is so that you can make updates to rows to keep those kinds of messages with concurrent consumers.

replies(1): >>39657159 #

36. yencabulator ◴[10 Mar 24 00:57 UTC] No.39655893{6}[source]▶

>>39655861 #

You insert work-to-be-performed into a table, and use NOTIFY only to wake up consumers that there is more work to be had. Consumers that weren't there at the time of NOTIFY can look at the rows in the table at startup.
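A minimal sketch of the producer side of that pattern, assuming a hypothetical `jobs` table and a channel named `jobs_wakeup` (neither comes from the thread). `cur` is assumed to be a DB-API cursor using the `%s` parameter style, as psycopg does. The row is the durable record; the notification only tells listeners to go and look.

```python
CHANNEL = "jobs_wakeup"  # hypothetical channel name

INSERT_SQL = (
    "INSERT INTO jobs (payload, status) VALUES (%s, 'pending') RETURNING id"
)
NOTIFY_SQL = "SELECT pg_notify(%s, '')"  # empty payload: just a wake-up signal


def enqueue(cur, payload):
    """Insert the job row, then signal listeners, in one transaction.

    Postgres delivers the notification only when the transaction commits,
    so a woken consumer can already see the new row.
    """
    cur.execute(INSERT_SQL, (payload,))
    job_id = cur.fetchone()[0]
    cur.execute(NOTIFY_SQL, (CHANNEL,))
    return job_id
```

A consumer would then query the table for pending rows once at startup, to pick up anything queued while it was away, and again each time a notification arrives.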

replies(1): >>39656434 #

37. yencabulator ◴[10 Mar 24 00:58 UTC] No.39655895[source]▶

>>39651174 #

Note that Hatchet is MIT-licensed and Windmill is AGPL-3; that's enough of a reason for many.

38. hosh ◴[10 Mar 24 03:12 UTC] No.39656434{7}[source]▶

>>39655893 #

I see. So the notify is just to say there is work to be performed, but there is no payload that includes the job. The consumer still has to make a query. If there isn’t enough work, the queries should come back empty. This saves having to poll, but it's not a true push system.

replies(1): >>39657180 #

39. jaggederest ◴[10 Mar 24 06:32 UTC] No.39657159{7}[source]▶

>>39655867 #

Yes. I'm saying you'll manually need to insert some kind of job audit log into a different table. Cheers

40. jaggederest ◴[10 Mar 24 06:36 UTC] No.39657180{8}[source]▶

>>39656434 #

As far as I can tell NOTIFY is fanout, in the sense that it will send a message to all the LISTENing connections, so it wouldn't make sense in that context anyway. It's not one-to-one, it's about making sure that jobs get picked up in a timely fashion. If you're doing something fancier with event sourcing or equivalent, you can send events via NOTIFY, and have clients decide what to do with those events then.

Quoth the manual: "The NOTIFY command sends a notification event together with an optional “payload” string to each client application that has previously executed LISTEN channel for the specified channel name in the current database. Notifications are visible to all users."
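To make those delivery semantics concrete, here is a toy in-memory model. It does not talk to Postgres, and the class and method names are made up for illustration. It captures two properties discussed in this thread: every connection listening at the time receives the notification, and nothing is stored for connections that start listening later.

```python
from collections import defaultdict


class ToyNotifier:
    """In-memory stand-in for LISTEN/NOTIFY delivery; not a Postgres client."""

    def __init__(self):
        # channel name -> list of per-listener inboxes
        self._listeners = defaultdict(list)

    def listen(self, channel):
        """Register a listener and return its inbox (a plain list)."""
        inbox = []
        self._listeners[channel].append(inbox)
        return inbox

    def notify(self, channel, payload=""):
        """Deliver to every current listener; return how many received it."""
        inboxes = self._listeners[channel]
        for inbox in inboxes:
            inbox.append(payload)
        return len(inboxes)


bus = ToyNotifier()
worker_a = bus.listen("jobs")
worker_b = bus.listen("jobs")
bus.notify("jobs", "wake up")      # both workers receive it (fanout)
late_worker = bus.listen("jobs")   # joins afterwards, sees nothing
```

In this model `worker_a` and `worker_b` each hold the payload while `late_worker` stays empty, which is why the durable state has to live in a table rather than in the notification.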

replies(1): >>39657571 #

41. hosh ◴[10 Mar 24 08:19 UTC] No.39657571{9}[source]▶

>>39657180 #

Notify can be triggered with stored procedures to send payloads related to changes to a table. It can be set up to send the id of a row that was inserted or updated, for example. (But WAL replication is usually better for this)
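As a sketch of that setup, the trigger below calls `pg_notify` with the id of each inserted or updated row. The `jobs` table, the `jobs_changed` channel, and the function and trigger names are hypothetical, and `EXECUTE FUNCTION` assumes PostgreSQL 11 or later. The SQL is wrapped in Python only so that it can be applied through a DB-API cursor.

```python
FUNCTION_SQL = """
CREATE OR REPLACE FUNCTION notify_job_change() RETURNS trigger AS $$
BEGIN
    -- NEW is the inserted or updated row; the payload must be text.
    PERFORM pg_notify('jobs_changed', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql
"""

TRIGGER_SQL = """
CREATE TRIGGER jobs_notify
AFTER INSERT OR UPDATE ON jobs
FOR EACH ROW EXECUTE FUNCTION notify_job_change()
"""


def install_trigger(cur):
    """Create the trigger function, then the trigger, one statement at a time."""
    for statement in (FUNCTION_SQL, TRIGGER_SQL):
        cur.execute(statement)
```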

replies(1): >>39660110 #

42. yencabulator ◴[10 Mar 24 15:52 UTC] No.39660110{10}[source]▶

>>39657571 #

Broadcasting the id to a lot of workers is not useful, only one of them should work on the task. Waking up the workers to do a SELECT FOR UPDATE .. SKIP LOCKED is the trick. At best the NOTIFY payload could include the kind of worker that should wake up.
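A sketch of the wake-up side, assuming the NOTIFY payload carries nothing but a worker kind such as `email` (a hypothetical convention). The `claim` and `handle` callables are placeholders: `claim` would run a `SELECT ... FOR UPDATE SKIP LOCKED` query and return a job or `None`. The notification is only a hint; the row lock decides which worker gets each job.

```python
def on_notification(payload, my_kind, claim, handle):
    """React to one NOTIFY payload; return the number of jobs handled.

    An empty payload is treated as "anyone may have work".
    """
    if payload and payload != my_kind:
        return 0  # meant for a different kind of worker; stay asleep
    handled = 0
    # Several workers wake at once, so keep claiming until the query comes
    # back empty; SKIP LOCKED makes the losers move on instead of blocking.
    while (job := claim(my_kind)) is not None:
        handle(job)
        handled += 1
    return handled
```

Workers that lose the race simply see an empty result and go back to waiting, so broadcasting the wake-up costs one cheap query per idle worker.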

replies(1): >>39666056 #

43. ◴[11 Mar 24 09:17 UTC] No.39666056{11}[source]▶

>>39660110 #