433 points calcsam | 14 comments

Hi HN, we’re Sam, Shane, and Abhi, and we’re building Mastra (https://mastra.ai), an open-source JavaScript SDK for building agents on top of Vercel’s AI SDK.

You can start a Mastra project with `npm create mastra` and create workflow graphs that can suspend/resume, build a RAG pipeline and write evals, give agents memory, create multi-agent workflows, and view it all in a local playground.

Previously, we built Gatsby, the open-source React web framework. Later, we worked on an AI-powered CRM but it felt like we were having to roll all the AI bits (agentic workflows, evals, RAG) ourselves. We also noticed our friends building AI applications suffering from long iteration cycles: they were getting stuck debugging prompts, figuring out why their agents called (or didn’t call) tools, and writing lots of custom memory retrieval logic.

At some point we just looked at each other and were like, why aren't we trying to make this part easier, and decided to work on Mastra.

Demo video: https://www.youtube.com/watch?v=8o_Ejbcw5s8

One thing we heard from folks is that seeing the input/output of every step, of every run of every workflow, is very useful. So we took XState and built a workflow graph primitive on top with OTel tracing. We wrote the APIs to make control flow explicit: `.step()` for branching, `.then()` for chaining, and `.after()` for merging. We also added `.suspend()`/`.resume()` for human-in-the-loop.

We abstracted the main RAG verbs like `.chunk()`, `embed()`, `.upsert()`, `.query()`, and `rerank()` across document types and vector DBs. We shipped an eval runner with evals like completeness and relevance, plus the ability to write your own.

Then we read the MemGPT paper and implemented agent memory on top of AI SDK with a `lastMessages` key, `topK` retrieval, and a `messageRange` for surrounding context (think `grep -C`).
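
To make the `messageRange` idea concrete, here is a minimal sketch, assuming the messages live in a plain array and the retrieval step has already produced the indices of the matching messages. `expandWithContext` and `ChatMessage` are hypothetical names for illustration only; this is not Mastra's API or implementation.

```typescript
// Hypothetical illustration, not Mastra's API: keep each retrieved message
// plus `messageRange` neighbours on either side, similar to `grep -C`.
type ChatMessage = { id: number; text: string };

function expandWithContext(
  messages: ChatMessage[],
  hitIndices: number[],
  messageRange: number,
): ChatMessage[] {
  // filter() preserves conversation order and never duplicates a message,
  // even when the windows around two hits overlap.
  return messages.filter((_, i) =>
    hitIndices.some((hit) => Math.abs(i - hit) <= messageRange),
  );
}

// Ten dummy messages with ids 0..9.
const conversation: ChatMessage[] = [];
for (let i = 0; i < 10; i++) {
  conversation.push({ id: i, text: `message ${i}` });
}

// Pretend semantic search matched messages 2 and 7 (topK = 2).
const context = expandWithContext(conversation, [2, 7], 1);
console.log(context.map((m) => m.id)); // [1, 2, 3, 6, 7, 8]
```

A real memory layer would presumably merge this window with the most recent `lastMessages`, but that part is left out of the sketch.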

But we still weren’t sure whether our agents were behaving as expected, so we built a local dev playground that lets you curl agents/workflows, chat with agents, view evals and traces across runs, and iterate on prompts with an assistant. The playground uses a local storage layer powered by libsql (thanks Turso team!) and runs on localhost with `npm run dev` (no Docker).

Mastra agents originally ran inside a Next.js app. But we noticed that AI teams’ development was increasingly decoupled from the rest of their organization, so we built Mastra so that you can also run it as a standalone endpoint or service.

Some things people have been building so far: one user automates support for an iOS app he owns with tens of thousands of paying users. Another bundled Mastra inside an Electron app that ingests aerospace PDFs and outputs CAD diagrams. Another is building WhatsApp bots that let you chat with objects like your house.

We did (for now) adopt an Elastic v2 license. The agent space is pretty new, and we wanted to let users do whatever they want with Mastra but prevent, eg, AWS from grabbing it.

If you want to get started:

- On npm: `npm create mastra@latest`
- GitHub repo: https://github.com/mastra-ai/mastra
- Demo video: https://www.youtube.com/watch?v=8o_Ejbcw5s8
- Our website homepage: https://mastra.ai (includes some nice diagrams and code samples on agents, RAG, and links to examples)
- And our docs: https://mastra.ai/docs

Excited to share Mastra with everyone here – let us know what you think!

1. Palmik ◴[] No.43111545[source]
The example from the landing page does not exactly spark joy:

    testWorkflow
     .step(llm)
       .then(decider)
       .then(agentOne)
       .then(workflow)
     .after(decider)
       .then(agentTwo)
       .then(workflow)
      .commit();

At first glance, this looks like a very awkward way to represent the graph from the picture. And this is just a simple "workflow" (the structure of the graph does not depend on the results of the execution), not an agent.
replies(5): >>43111621 #>>43113904 #>>43113922 #>>43114354 #>>43116216 #
2. calcsam ◴[] No.43111621[source]
Thanks! The conditional `when` clauses live on the steps, rather than being represented in the workflow, and in fact when we built this for an example, the last step being called depended on the results of the previous two steps.

How would you simplify this?

replies(2): >>43113743 #>>43113935 #
3. anentropic ◴[] No.43113743[source]
I think the problem is that a 'fluent' chain of calls already expresses a sequence, so the way that 'after' resets the context to start a new branch feels very awkward ... like a GOTO or something

It's telling that the example relies on arbitrary indentation (which a linter will get rid of) to have some hope of comprehending it

Possibly this was all motivated by a desire to avoid nested structures above all?

But for a branching graph a nested structure is more natural. It'd also probably be nicer if the methods were on the task nodes instead of on the workflow, then you could avoid the 'step'/'then' distinction and have something like:

e.g.

    testWorkflow(
        llm
        .then(decider)
        .then(
            agentOne.then(workflow),
            agentTwo.then(workflow),
        )
    )
replies(1): >>43115706 #
4. ◴[] No.43113904[source]
5. jumski ◴[] No.43113922[source]
Yeah, I also found this a bit unintuitive at first. I’m building a workflow engine myself (https://pgflow.dev/pgflow, not released yet), and I’ve been thinking a lot about how to model the DSL for the graph and decided to make dependencies explicit and use method chaining for expansion with other step types.

Here’s how it would look in my system:

    new Flow<string>()
      .step("llm", llmStepHandler)
      .step("decider", ["llm"], deciderStepHandler)
      .step("agentOne", ["decider"], agentOneStepHandler)
      .step("agentTwo", ["decider"], agentTwoStepHandler)
      .step("workflow", ["agentOne", "agentTwo"], workflowStepHandler);
Mine is a DAG, so more constrained than the cyclic graph Mastra supports (if I understand correctly).
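
One thing explicit dependency arrays buy you is that a run order falls out of a plain topological sort. Below is a rough sketch of that, assuming the graph is given as a map from step name to its dependencies. `topoOrder` and `StepDeps` are hypothetical names for illustration; this is not pgflow's or Mastra's actual scheduler.

```typescript
// Hypothetical illustration (Kahn's algorithm), not pgflow or Mastra code.
// Each key is a step name; each value lists the steps it depends on.
type StepDeps = Record<string, string[]>;

function topoOrder(deps: StepDeps): string[] {
  const steps = Object.keys(deps);

  // Number of not-yet-completed dependencies per step.
  const pending: Record<string, number> = {};
  for (const step of steps) pending[step] = deps[step].length;

  // Steps with no dependencies can run immediately.
  const ready = steps.filter((step) => pending[step] === 0);
  const order: string[] = [];

  while (ready.length > 0) {
    const current = ready.shift() as string;
    order.push(current);
    // Completing `current` unblocks every step that depends on it.
    for (const step of steps) {
      if (deps[step].indexOf(current) !== -1) {
        pending[step] -= 1;
        if (pending[step] === 0) ready.push(step);
      }
    }
  }

  // Anything left over sits on a cycle or depends on an undeclared step.
  if (order.length !== steps.length) {
    throw new Error("graph has a cycle or an unknown dependency");
  }
  return order;
}

const order = topoOrder({
  llm: [],
  decider: ["llm"],
  agentOne: ["decider"],
  agentTwo: ["decider"],
  workflow: ["agentOne", "agentTwo"],
});
console.log(order); // ["llm", "decider", "agentOne", "agentTwo", "workflow"]
```

A cyclic graph makes the sketch throw, which is the constraint a DAG-only design relies on.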
6. jumski ◴[] No.43113935[source]
I think it is just easier to comprehend if the edges/dependencies are explicit (as an array for example).
replies(1): >>43118303 #
7. campers ◴[] No.43114354[source]
I got the same feeling when I first looked at the LangChain documentation, back when I wanted to start tinkering with LLM apps.

I built my own TypeScript AI platform https://typedai.dev with an extensive feature list where I've kept iterating on what I find the most ergonomic way to develop, using standard constructs as much as possible. I've coded enough Java streams, RxJS chains, and JavaScript callbacks and Promise chains to know what kind of code I like to read and debug.

I was having a peek at XState, but after I came across https://docs.dbos.dev/ here recently I'm pretty sure that's the path I'll go down for durable execution, to keep building everything with a simple programming model.

replies(3): >>43114882 #>>43116228 #>>43118682 #
8. nwienert ◴[] No.43114882[source]
Kind of similar camp, I checked LangChain and others and ultimately I was like, well, it's not really doing much is it, just adding abstraction on top of what is essentially basic loops and conditional statements, and tbh it feels like in nearly every case I'll never be using them the same way such that some abstraction will help over just making some function helpers myself.

I don't think from first principles there's any broad framework that makes sense to be honest. I'll reach for a specific vector DB, or logging library, but beyond that you'll never convince me your "query-builder" API is going to make me build a better thing when I have the full power of TypeScript already.

Especially when these products start throwing in proprietary features and add-ons with fancy names on top.

9. calcsam ◴[] No.43115706{3}[source]
You’re right that the syntax was inspired by the desire to avoid nested structures. But the syntax here is interesting as well and fairly readable. Worth thinking about!
replies(1): >>43126122 #
10. zeroq ◴[] No.43116216[source]
I knew it would be bad when I saw "by the developers of Gatsby", but this is pure comedy.

JQuery plugin for LLM.

11. jumski ◴[] No.43116228[source]
TypedAI looks solid, was not aware of it! Bookmarked for further research.

Personally I am not fond of the decorator approach and decided to not use it in pgflow (my soon-to-be-released workflow orchestration engine on top of Postgres).

1. I wanted it to be simple to reason about and explicit (being more verbose as a trade-off)

2. There are some issues with supporting decorators (Svelte https://github.com/sveltejs/svelte/issues/11502, and a lot of others).

3. I decided to only support directed acyclic graphs (no loops!) in order to promote simplicity. Will be supporting conditional recursive sub-workflows to provide a way to repeat some steps and be able to branch.

Cheers!

12. calcsam ◴[] No.43118303{3}[source]
We have a ticket to allow this actually!
13. CMCDragonkai ◴[] No.43118682[source]
Can dbos work with CF durable objects?
14. anentropic ◴[] No.43126122{4}[source]
that example syntax is loosely based on CDK code for AWS Step Functions, since I had to write some recently

essentially you're building a DAG so it could be worth checking some other APIs which do a similar thing for inspiration

e.g. it looks like in Airflow you could write it as:

    chain(llm, decider, [agentOne, agentTwo], workflow)
https://airflow.apache.org/docs/apache-airflow/stable/core-c...
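
For a sense of how little machinery that call shape needs, here is a rough TypeScript sketch of a `chain`-style helper that expands its arguments into explicit edges. `chainEdges` and `ChainItem` are hypothetical names, and the handling of two adjacent lists (every node connected to every node) is a simplification of mine, not a claim about Airflow's behaviour.

```typescript
// Hypothetical helper, not Airflow's or Mastra's API: turns a chain-style
// argument list into explicit [from, to] edges.
type ChainItem = string | string[];

function chainEdges(...items: ChainItem[]): [string, string][] {
  const edges: [string, string][] = [];
  for (let i = 0; i + 1 < items.length; i++) {
    const current = items[i];
    const next = items[i + 1];
    // A bare name is treated as a one-element group.
    const from = Array.isArray(current) ? current : [current];
    const to = Array.isArray(next) ? next : [next];
    // Assumption: connect every node in one group to every node in the next.
    for (const a of from) {
      for (const b of to) {
        edges.push([a, b]);
      }
    }
  }
  return edges;
}

const edges = chainEdges("llm", "decider", ["agentOne", "agentTwo"], "workflow");
console.log(edges);
// [["llm","decider"], ["decider","agentOne"], ["decider","agentTwo"],
//  ["agentOne","workflow"], ["agentTwo","workflow"]]
```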