
161 points by segmenta | 1 comment

Hi HN! We’re Arjun, Ramnique, and Akhilesh, and we are building Rowboat (https://www.rowboatlabs.com/), an AI-assisted IDE for building and managing multi-agent systems. You start with a single agent, then scale up to teams of agents that work together, use MCP tools, and improve over time - all through a chat-based copilot.

Our repo is https://github.com/rowboatlabs/rowboat, docs are at https://docs.rowboatlabs.com/, and there’s a demo video here: https://youtu.be/YRTCw9UHRbU

It’s becoming clear that real-world agentic systems work best when multiple agents collaborate, rather than having one agent attempt to do everything. This isn’t too surprising - it’s a bit like how good code consists of multiple functions that each do one thing, rather than cramming everything into one function.

For example, a travel assistant works best when different agents handle specialized tasks: one agent finds the best flights, another optimizes hotel selections, and a third organizes the itinerary. This modular approach makes the system easier to manage, debug, and improve over time.
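To make the decomposition concrete, here is a minimal sketch of that travel assistant as a coordinator delegating to specialized agents. All names and the hard-coded routing are illustrative assumptions, not Rowboat's actual API; in a real system each `handle` function would be an LLM call driven by that agent's instructions.

```python
# Hypothetical sketch: a coordinator delegating to specialized agents.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    handle: Callable[[str], str]  # would wrap instructions + an LLM call

def flights_agent(query: str) -> str:
    return f"[flights] searching options for: {query}"

def hotels_agent(query: str) -> str:
    return f"[hotels] optimizing selection for: {query}"

def itinerary_agent(query: str) -> str:
    return f"[itinerary] assembling plan for: {query}"

AGENTS = {
    "flights": Agent("flights", flights_agent),
    "hotels": Agent("hotels", hotels_agent),
    "itinerary": Agent("itinerary", itinerary_agent),
}

def coordinator(task: str) -> list[str]:
    # In a real system an LLM decides the routing; here it is fixed
    # to keep the example self-contained.
    return [AGENTS[name].handle(task) for name in ("flights", "hotels", "itinerary")]
```

Each agent stays small and single-purpose, which is what makes the overall system easier to debug and improve piecemeal.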

OpenAI’s Agents SDK provides a neat Python library to support this, but building reliable agentic systems requires constant iteration and tweaking - e.g. updating agent instructions (which can quickly get as complex as actual code), connecting tools, and testing the system while incorporating feedback. Rowboat is an AI IDE for doing all of this. Rowboat is to AI agents what Cursor is to code.

We’ve taken a code-like approach to agent instructions (prompts). There are special keywords to directly reference other agents, tools or prompts - which are highlighted in the UI. The copilot is the best way to create and edit these instructions - each change comes with a code-style diff.
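To illustrate the idea of code-like references inside instructions, here is a sketch of extracting such keywords from an instruction string. The `@kind:name` syntax is an assumption made up for this example, not Rowboat's exact grammar:

```python
# Hypothetical sketch: pulling @mention-style references out of an
# agent instruction so the IDE can highlight and link them.
import re

# Assumed grammar: @agent:name, @tool:name, @prompt:name
MENTION_RE = re.compile(r"@(?P<kind>agent|tool|prompt):(?P<name>[\w-]+)")

def extract_mentions(instruction: str) -> list[tuple[str, str]]:
    """Return (kind, name) pairs referenced in an instruction."""
    return [(m.group("kind"), m.group("name"))
            for m in MENTION_RE.finditer(instruction)]

instruction = (
    "When the user asks about flights, call @tool:search-flights and then "
    "hand off to @agent:itinerary-planner."
)
print(extract_mentions(instruction))
# → [('tool', 'search-flights'), ('agent', 'itinerary-planner')]
```

Treating references as structured tokens rather than free text is what lets the copilot produce code-style diffs when it edits instructions.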

You can give agents access to tools by integrating any MCP server or connecting your own functions through a webhook. You can instruct the agents on when to use specific tools via ‘@mentions’ in the agent instruction. To enable quick testing, we added a way to mock tool responses using LLM calls.
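The mocked-tool idea can be sketched like this. In Rowboat the mock response comes from an LLM call; here a deterministic stub stands in for the LLM, and all names are hypothetical:

```python
# Hypothetical sketch: swap real tool calls for mocked responses
# during quick testing, before wiring up MCP servers or webhooks.
import json
from typing import Callable

def llm_mock(tool_name: str, args: dict) -> str:
    # Stand-in for an LLM that fabricates a plausible tool response.
    return json.dumps({"tool": tool_name, "args": args, "result": "mocked"})

def make_tool_caller(real_tools: dict[str, Callable], mock: bool) -> Callable:
    def call(tool_name: str, args: dict) -> str:
        if mock or tool_name not in real_tools:
            return llm_mock(tool_name, args)
        return real_tools[tool_name](**args)
    return call

call = make_tool_caller({}, mock=True)
print(call("search_flights", {"origin": "SFO", "destination": "JFK"}))
```

Because the call site is identical in both modes, agents can be tested end-to-end before any real tool exists.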

Rowboat playground lets you test and debug the assistants as you build them. You can see agent transfers, tool invocations and tool responses in real-time. The copilot has the context of the chat, and can improve the agent instructions based on feedback. For example, you could say ‘The agent shouldn’t have done x here. Fix this’ and the copilot can go and make this fix.

You can integrate agentic systems built in Rowboat into your application via the HTTP API or the Python SDK (‘pip install rowboat’). For example, you can build user-facing chatbots, enterprise workflows and employee assistants using Rowboat.
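As a hedged sketch of what calling such an HTTP API from application code might look like: the endpoint path, payload shape, and auth header below are assumptions for illustration only - the real contract is in the docs at docs.rowboatlabs.com.

```python
# Hedged sketch: building (not sending) a chat request to a
# hypothetical Rowboat-style HTTP API. Fields are assumptions.
import json

def build_chat_request(base_url: str, api_key: str, messages: list[dict]) -> dict:
    return {
        "url": f"{base_url}/chat",  # assumed endpoint path
        "headers": {
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        "body": json.dumps({"messages": messages}),
    }

req = build_chat_request(
    "http://localhost:3000/api",  # assumed self-hosted base URL
    "<api-key>",
    [{"role": "user", "content": "Find me a flight to Tokyo"}],
)
print(req["url"])
```

A request dict like this could then be passed to any HTTP client (e.g. `requests.post`).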

We’ve been working with LLMs since GPT-1 launched in 2018. Most recently, we built Coinbase’s support chatbot after our last AI startup was acquired by them.

Rowboat is Apache 2.0 licensed, giving you full freedom to self-host, modify, or extend it however you like.

We’re excited to share Rowboat with everyone here. We’d love to hear your thoughts!

NitpickLawyer No.43768951
This is cool! Seems like this is what AutoGen Studio wanted to be. And what a lot of "agentic" libs fell short of - a way to chain together stuff by using natural language.

Quick questions (I only looked at the demo video and briefly skimmed the docs, sorry if the qs are explained somewhere):

- it looks to me that a lot of the heavy weight "logic" is handled via prompts (when a new agent is created your copilot edits the "prompts"). Have you tested this w/ various models (and especially any open weights ones) to make sure the flows still work? This reminds me of the very early agent libraries that worked w/ oAI GPTs but not much else.

- if the above assumption is correct, are there plans of using newer libs where a lot of the logic / lifting is done by code instead of simply chaining prompts and hope the model can handle it? (A2A, pydantic, griptape, etc)

akhisud No.43769378
Thanks!

1. That's right - Rowboat's agent instructions are currently written in structured prompt blocks, and a lot of logic does live there (with @mentions for tools, other agents, and reusable prompts). We support oAI GPTs at the moment (we chose to start with the oAI Agents SDK), but we're actively working on expanding to other LLMs as well. One of our community contributors just created a fork for Rowboat + OpenRouter. Re: performance, we expect other closed LLMs to perform comparably, and (with good prompt hygiene + role instructions) open LLMs as well, if individual agent scope is kept precise.

2. We've been discussing both A2A and pydantic! Right now, Rowboat is designed to be prompt-first, but we’re integrating more typed interfaces. Design-wise, it’s likely that prompts will stay central - encoding part of the logic and also acting as the glue layer between more code-based components. Similar to how code has comments, config, and DSLs, agent systems could benefit from human-readable intent even when the core logic is more structured.

Does that make sense?