←back to thread

114 points KraftyOne | 3 comments | | HN request time: 0.001s | source

Hi HN - I’m Peter, here with Harry (devhawk), and we’re building DBOS Java, an open-source Java library for durable workflows, backed by Postgres.

https://github.com/dbos-inc/dbos-transact-java

Essentially, DBOS helps you write long-lived, reliable code that can survive failures, restarts, and crashes without losing state or duplicating work. As your workflows run, it checkpoints each step they take in a Postgres database. When a process stops (fails, restarts, or crashes), your program can recover from those checkpoints to restore its exact state and continue from where it left off, as if nothing happened.

In practice, this makes it easier to build reliable systems for use cases like AI agents, payments, data synchronization, or anything that takes hours, days, or weeks to complete. Rather than bolting on ad-hoc retry logic and database checkpoints, durable workflows give you one consistent model for ensuring your programs can recover from any failure from exactly where they left off.

This library contains all you need to add durable workflows to your program: there's no separate service or orchestrator or any external dependencies except Postgres. Because it's just a library, you can incrementally add it to your projects, and it works out of the box with frameworks like Spring. And because it's built on Postgres, it natively supports all the tooling you're familiar with (backups, GUIs, CLI tools) and works with any Postgres provider.

If you want to try it out, check out the quickstart:

https://docs.dbos.dev/quickstart?language=java

We'd love to hear what you think! We’ll be in the comments for the rest of the day to answer any questions.

Show context
ivanr ◴[] No.45925417[source]
Hello Peter. Thank you for your work. I really like this approach. I too have been following Temporal and I like it, but I don't think it's a good match for simpler systems.

I've been reading the DBOS Java documentation and have some questions, if you don't mind:

- Versioning; from the looks of it, it's either automatically derived from the source code (presumably bytecode?), or explicitly set for the entire application? This would be too coarse. I don't see auto-configuration working, as a small bug fix would invalidate the version. (Could be less of a problem if you're looking only at method signatures... perhaps add to the documentation?) Similarly, for the explicit setting, a change to the version number would invalidate all existing workflows, which would be cumbersome.

Have you considered relying on serialVersionUID? Or at least allowing explicit configuration on a per workflow basis? Or a fallback method to be invoked if the class signatures change in a backward incompatible way?

Overall, it looks like DBOS is fairly easy to pick up, but having a good story for workflow evolution is going to be essential for adoption. For example, if I have long-running workflows... do I have to keep the old code running until all old instances complete? Is that the idea?

- Transactional use; would it be possible to create a new workflow instance transactionally? If I am using the same database for DBOS and my application, I'd expect my app to do some work and create some jobs for later, reusing my transaction. Similarly, maybe when the jobs are running, I'd perhaps want to use the same transaction? As in, the work is done and then DBOS commits?

I know using the same transaction for both purposes could be tricky. I have, in fact, in the past, used two for job handling. Some guidance in the documentation would be very useful to have.

Thank you.

replies(1): >>45927642 #
KraftyOne ◴[] No.45927642[source]
Thanks for the great questions!

1. Yes, currently versioning is either automatically derived from a bytecode hash or manually set. The intended upgrade mechanism is blue-green deployments where old workflows drain on old code versions while new workflows start on new code versions (you can also manually fork workflows to new code versions if they're compatible). Docs: https://docs.dbos.dev/java/tutorials/workflow-tutorial#workf...

That said, we're working on improvements here--we want it to be easier to upgrade your workflows without blue-green deployments to simplify operating really long-running (weeks-months) workflows.

2. Could I ask what the intended use case is? DBOS workflow creation is idempotent and workflows always recover from the last completed step, so a workflow that goes "database transaction" -> "enqueue child workflow" offers the same atomicity guarantees as a workflow that does "enqueue child workflow as part of database transaction", but the former is (as you say) much simpler to implement.

replies(1): >>45928917 #
ivanr ◴[] No.45928917[source]
> versioning

Here's an example of a common long-running workflow: SaaS trials. Upon trial start, create a workflow to send the customer onboarding messages, possibly inspecting the account state to influence what is sent, and finally close the account that hasn't converted. This will usually take 14-30 days, but could go for months if manually extended (as many organisations are very slow to move).

I think an "escape hatch" would be useful here. On version mismatch, invoke a special function that is given access to the workflow state and let it update the state to align it with the most recent code version.

> transactional enqueuing

A workflow that goes "database transaction" -> "enqueue child workflow" is not safe if the connection with DBOS is lost before the second step completes safely. This would make DBOS unreliable. Doing it the other way round can work provided each workflow checks that the "object" to which it's connected exists. That would deal with the situation where a workflow is created, but the transaction rolls back.

If both the app and DBOS work off the same database connection, you can offer exactly-once guarantees. Otherwise, the guarantees will depend on who commits first.

Personally, I would prefer all work to be done from the same connection so that I can have the benefit of transactions. To me, that's the main draw of DBOS :)

replies(1): >>45932581 #
1. KraftyOne ◴[] No.45932581{3}[source]
> versioning

Yes, something like that is needed, we're working on building a good interface for it.

> transactional enqueueing

But it is safe as long as it's done inside a DBOS workflow. If the connection is lost (process crashes, etc.) after the transaction but before the child is enqueued, the workflow will recover from the last completed step (the transaction) and then proceed to enqueue the child. That's part of the power of workflows--they provide atomicity across transactions.

replies(1): >>45932630 #
2. ivanr ◴[] No.45932630[source]
> > transactional enqueueing > But it is safe as long as it's done inside a DBOS workflow.

Yes, but I was talking about the point at which a new workflow is created. If my transaction completes but DBOS disappears before the necessary workflows are created, I'll have a problem.

Taking my trial example, the onboarding workflow won't have been created and then perhaps the account will continue to run indefinitely, free of charge to the user.

replies(1): >>45933192 #
3. KraftyOne ◴[] No.45933192[source]
Looking at the trial example, the way I would solve it is that when the user clicks "start trial", that starts a small synchronous workflow that first creates a database entry, then enqueues the onboarding workflow, then returns. DBOS workflows are fast enough to use interactively for critical tasks like this.