Today we want to share our TypeScript library for lightweight durable execution. We’ve been working on it since last year and recently released v2.0 with a ton of new features and a major API overhaul.
https://github.com/dbos-inc/dbos-transact-ts
Durable execution means persisting the execution state of your program while it runs, so if it is ever interrupted or crashes, it automatically resumes from where it left off.
Durable execution is useful for a lot of things:
- Orchestrating long-running or business-critical workflows so they seamlessly recover from any failure.
- Running reliable background jobs with no timeouts.
- Processing incoming events (e.g. from Kafka) exactly once
- Running a fault-tolerant distributed task queue
- Running a reliable cron scheduler
- Operating an AI agent, or anything that connects to an unreliable or non-deterministic API.
What’s unique about DBOS’s take on durable execution (compared to, say, Temporal) is that it’s implemented in a lightweight library that’s totally backed by Postgres. All you have to do to use DBOS is “npm install” it and annotate your program with decorators. The decorators store your program’s execution state in Postgres as it runs and recover it if it crashes. There are no other dependencies you have to manage, no separate workflow server–just your program and Postgres.
One big advantage of this approach is that you can add DBOS to ANY TypeScript application–it’s just a library. For example, you can use DBOS to add reliable background jobs or cron scheduling or queues to your Next.js app with no external dependencies except Postgres.
Also, because it’s all in Postgres, you get all the tooling you’re familiar with: backups, GUIs, CLI tools–it all just works.
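To make the checkpoint-and-recover idea concrete, here is a minimal sketch of the general technique, not DBOS's actual implementation. An in-memory Map stands in for Postgres, and `runStep`, `CheckpointStore`, and the `calls` counter are hypothetical names made up for illustration. Steps are synchronous here only to keep the sketch short; real steps would be async.

```typescript
// Hypothetical in-memory stand-in for the Postgres table that holds step outputs.
type CheckpointStore = Map<string, unknown>;

// Run a step at most once per (workflowId, stepIndex).
// On replay, return the saved output instead of executing the step again.
function runStep<T>(
  store: CheckpointStore,
  workflowId: string,
  stepIndex: number,
  fn: () => T,
): T {
  const key = `${workflowId}:${stepIndex}`;
  if (store.has(key)) {
    return store.get(key) as T; // completed before the crash: skip re-execution
  }
  const output = fn();
  store.set(key, output); // checkpoint the output before moving on
  return output;
}

// Counts real executions so the skip-on-replay behavior is observable.
const calls = { stepOne: 0, stepTwo: 0 };

function workflow(
  store: CheckpointStore,
  workflowId: string,
  crashAfterStepOne: boolean,
): number {
  const a = runStep(store, workflowId, 0, () => {
    calls.stepOne += 1;
    return 1;
  });
  if (crashAfterStepOne) {
    throw new Error("simulated crash");
  }
  return runStep(store, workflowId, 1, () => {
    calls.stepTwo += 1;
    return a + 1;
  });
}

function demo(): number {
  const store: CheckpointStore = new Map();
  try {
    workflow(store, "wf-1", true); // first attempt crashes after step one
  } catch {
    // a real process would restart here
  }
  // Recovery: same workflow ID, so step one is read from the store, not re-run.
  return workflow(store, "wf-1", false);
}
```

Calling `demo()` runs each step exactly once across the crash and the recovery, which is the property the decorators are described as providing.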
Want to try DBOS out? Initialize a starter app with:
npx @dbos-inc/create -t dbos-node-starter
Then build and start your app with:
npm install
npm run build
npm run start
Also check out the docs: https://docs.dbos.dev/
We'd love to hear what you think! We’ll be in the comments for the rest of the day to answer any questions you may have.
DBOS makes external asynchronous API calls reliable and crashproof, without needing to rely on an external orchestration service.
Here's a bare-minimum code example of a workflow:
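import { DBOS } from "@dbos-inc/dbos-sdk";

class Example {
  // Each step's output is checkpointed once it completes.
  @DBOS.step()
  static async step_one() {
    // ...
  }

  @DBOS.step()
  static async step_two() {
    // ...
  }

  // The workflow resumes from the last completed step after a crash.
  @DBOS.workflow()
  static async workflow() {
    await Example.step_one();
    await Example.step_two();
  }
}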
The steps can be any TypeScript function. We also have a bunch more examples in our docs: https://docs.dbos.dev/.
Or, if you want to try it yourself, download a template:
npx @dbos-inc/create
Did you do literature research of Smalltalk?
For versioning, each workflow is tagged with the code version that ran it, and we recommend recovering workflows on an executor running the same code version as what the workflow started on. Docs for self hosting: https://docs.dbos.dev/typescript/tutorials/development/self-.... In our hosted service (DBOS Cloud) this is all done automatically.
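As a rough illustration of version-matched recovery, the sketch below filters pending workflows down to the ones an executor should pick up. The `PendingWorkflow` shape, the `appVersion` field name, and `selectRecoverable` are assumptions for this example, not DBOS's API.

```typescript
// Hypothetical record of a workflow that was interrupted before finishing.
interface PendingWorkflow {
  workflowId: string;
  appVersion: string; // code version that started the workflow (assumed field name)
}

// An executor only recovers workflows started by the same code version,
// so a recovered workflow replays against the code that produced its checkpoints.
function selectRecoverable(
  pending: PendingWorkflow[],
  executorVersion: string,
): PendingWorkflow[] {
  return pending.filter((wf) => wf.appVersion === executorVersion);
}

const pendingWorkflows: PendingWorkflow[] = [
  { workflowId: "wf-a", appVersion: "v1" },
  { workflowId: "wf-b", appVersion: "v2" },
];

// An executor running "v2" picks up wf-b and leaves wf-a for a "v1" executor.
const recoverable = selectRecoverable(pendingWorkflows, "v2");
```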
https://supabase.com/blog/durable-workflows-in-postgres-dbos https://news.ycombinator.com/item?id=42379974
We use spot instances for most things to keep costs down and job queues to link steps. Can you provide an example of a distributed workflow setup?
- Which workflows are executing
- What their inputs were
- Which steps have completed
- What their outputs were
Here's a reference for the Postgres tables DBOS uses to manage that state: https://docs.dbos.dev/explanations/system-tables
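One way to picture that state is as a pair of record types. The shapes below are hypothetical and simplified; the field names and status values are assumptions, and the real schema is the one in the system tables reference.

```typescript
// Hypothetical, simplified shapes; not the real DBOS schema.
interface StepRecord {
  stepIndex: number; // position of the step within its workflow
  output: unknown; // recorded output of the completed step
}

interface WorkflowRecord {
  workflowId: string;
  status: "PENDING" | "SUCCESS" | "ERROR"; // assumed status values
  inputs: unknown[]; // arguments the workflow was started with
  completedSteps: StepRecord[];
}

// Which workflows are still executing.
function executingWorkflows(records: WorkflowRecord[]): WorkflowRecord[] {
  return records.filter((r) => r.status === "PENDING");
}

// Index of the first step with no recorded output, assuming steps complete in order.
function nextStepIndex(record: WorkflowRecord): number {
  return record.completedSteps.length;
}

const records: WorkflowRecord[] = [
  {
    workflowId: "wf-1",
    status: "SUCCESS",
    inputs: ["order-1"],
    completedSteps: [
      { stepIndex: 0, output: "charged" },
      { stepIndex: 1, output: "shipped" },
    ],
  },
  {
    workflowId: "wf-2",
    status: "PENDING",
    inputs: ["order-2"],
    completedSteps: [{ stepIndex: 0, output: "charged" }],
  },
];
```

With these four pieces of information, a restarted process knows which workflows to resume and which step each one should continue from.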
It just seems that the “durability” guarantees get less reliable as you add more dependencies on external systems. Or at least, the reliability is subject to the interpretation of whichever application code interacts with the result of these workflows (e.g. the shipping service must know to ignore rows in the local purchase DB if they’re not linked to a committed DBOS transaction).
Where DBOS helps is in ensuring that the entire workflow, including all backup steps, always runs. So if your service is interrupted and that causes the Stripe call to fail, upon restart your program will automatically retry the Stripe call and, if that doesn't work, back out and run the step that closes out the failed purchase.
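The retry-then-back-out pattern can be sketched in plain TypeScript as follows. This shows only the control flow; `Purchase`, `retry`, and `checkout` are hypothetical names, the `charge` callback stands in for the Stripe call, and nothing here is durable on its own. In a durable workflow, each of these calls would be a checkpointed step.

```typescript
type Purchase = { id: string; status: "pending" | "paid" | "cancelled" };

// Call fn up to `attempts` times and rethrow the last error if every attempt fails.
function retry<T>(attempts: number, fn: () => T): T {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// `charge` is a stand-in for the external payment call.
function checkout(
  purchase: Purchase,
  charge: () => void,
  maxAttempts = 3,
): Purchase {
  try {
    retry(maxAttempts, charge);
    return { ...purchase, status: "paid" };
  } catch {
    // Backup step: close out the failed purchase so nothing is left pending.
    return { ...purchase, status: "cancelled" };
  }
}
```

The point of running this inside a durable workflow is that a crash between the charge and the backup step cannot leave the purchase stuck in "pending", because the workflow resumes and finishes one branch or the other.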
That said, we know sometimes you have to do surgery on a long-running workflow, and we're looking at adding better tooling for it. It's completely doable because all the state is stored in Postgres tables (https://docs.dbos.dev/explanations/system-tables).
this is good until the postgres server fills up with load and you need to scale up/fan out work to a bunch of workers? how do you handle that?
(disclosure, former temporal employee, but also no hate meant, i'm all for making more good orchestration choices)
The big advantages of using Postgres are:
1. Simpler architecturally, as there are no external dependencies.
2. You have complete control over your execution state, as it's all in tables on your Postgres server (docs for those tables: https://docs.dbos.dev/explanations/system-tables#system-tabl...)
- Drizzle (we're also a sponsor of Drizzle): https://docs.dbos.dev/typescript/tutorials/orms/using-drizzl...
- Knex: https://docs.dbos.dev/typescript/tutorials/orms/using-knex
- Prisma: https://docs.dbos.dev/typescript/tutorials/orms/using-prisma
More ORM support is on the way.
However, if you're interfacing with a third-party API, then that wouldn't be part of a database transaction (you'll use @DBOS.step instead). The reason is that you don't want to hold database locks when you're not performing database operations.
Docs for Queues and Parallelism: https://docs.dbos.dev/typescript/tutorials/queue-tutorial
For example, if I change the code / transactions in a step, how do you reconcile what state to prepare for which transactions? Would you need to reconcile deleted and duplicated calls to the DB?
Is it possible to mix typescript and python steps?