←back to thread

65 points nickpapciak | 5 comments | | HN request time: 1.043s | source

Hey HN! We’re Abhi, Venkat, Tom, and Nick and we are building Datafruit (https://datafruit.dev/), an AI DevOps agent. We’re like Devin for DevOps. You can ask Datafruit to check your cloud spend, look for loose security policies, make changes to your IaC, and it can reason across your deployment standards, design docs, and DevOps practices.

Demo video: https://www.youtube.com/watch?v=2FitSggI7tg.

Right now, we have two main methods to interact with Datafruit:

(1) automated infrastructure audits— agents periodically scan your environment to find cost optimization opportunities, detect infrastructure drift, and validate your infra against compliance requirements.

(2) chat interface (available as a web UI and through slack) — ask the agent questions for real-time insights, or assign tasks directly, such as investigating spend anomalies, reviewing security posture, or applying changes to IaC resources.

Working at FAANG and various high-growth startups, we realized that infra work requires an enormous amount of context, often more than traditional software engineering. The business decisions, codebase, and cloud itself are all extremely important in any task that has been assigned. To maximize the success of the agents, we do a fair amount of context engineering. Not hallucinating is super important!

One thing which has worked incredibly well for us is a multi-agent system where we have specialized sub-agents with access to specific tool calls and documentation for their specialty. Agents choose to “handoff” to each other when they feel like another agent would be more specialized for the task. However, all agents share the same context (https://cognition.ai/blog/dont-build-multi-agents). We’re pretty happy with this approach, and believe it could work in other disciplines which require high amounts of specialized expertise.

Infrastructure is probably the most mission-critical part of any software organization, and needs extremely heavy guardrails to keep it safe. Language models are not yet at the point where they can be trusted to make changes (we’ve talked to a couple of startups where the Claude Code + AWS CLI combo has taken their infra down). Right now, Datafruit receives read-only access to your infrastructure and can only make changes through pull requests to your IaC repositories. The agent also operates in a sandboxed virtual environment so that it could not write cloud CLI commands if it wanted to!

Where LLMs can add significant value is in reducing the constant operational inefficiencies that eat up cloud spend and delay deadlines—the small-but-urgent ops work. Once Datafruit indexes your environment, you can ask it to do things like:

  "Grant @User write access to analytics S3 bucket for 24 hours"
    -> Creates temporary IAM role, sends least-privilege credentials, auto-revokes tomorrow

  "Find where this secret is used so I can rotate it without downtime"
    -> Discovers all instances of your secret, including old cron-jobs you might not know about, so you can safely rotate your keys


  "Why did database costs spike yesterday?"
    -> Identifies expensive queries, shows optimization options, implements fixes

We charge a straightforward subscription model for a managed version, but we also offer a bring-your-own-cloud model. All of Datafruit can be deployed on Kubernetes using Helm charts for enterprise customers where data can’t leave your VPC. For the time being, we’re installing the product ourselves on customers' clouds. It doesn’t exist in a self-serve form yet. We’ll get there eventually, but in the meantime if you’re interested we’d love for you guys to email us at founders@datafruit.dev.

We would love to hear your thoughts! If you work with cloud infra, we are especially interested in learning about what kinds of work you do which you wish could be offloaded onto an agent.

1. 0xbadcafebee ◴[] No.45107840[source]
As someone who's been doing Infra stuff for two decades, this is very exciting. There is a lot of mindless BS we have to deal with due to shitty tools and services, and AI could save us a lot of time that we'd rather use to create meaningful value.

There is still benefit for non-Infra people. But non-Infra people don't understand system design, so the benefits are limited. Imagine a "mechanic AI". Yes, you could ask it all sorts of mechanic questions, and maybe it could even do some work on the car. But if you wanted to, say, replace the entire engine with a different one, that is a systemic change and has farther reaching implications than an AI will explain, much less perform competently. You need a mechanic to stop you and say, uh, no, please don't change the engine; explain to me what you're trying to do and I'll help you find a better solution. Then you need a real mechanic to manage changing the tires on the moving bus so it doesn't crash into the school. But having an AI could make the mechanic do all of that smoother.

Another thing I'd love to see more AI use of, is people asking the AI for advice. Most devs seem to avoid asking Infra people for architectural/design advice. This leads to them putting together a system using their limited knowledge, and it turns out to be an inferior design to what an Infra person would have suggested. Hopefully they will ask AI for advice in the future.

replies(2): >>45108181 #>>45112470 #
2. nickpapciak ◴[] No.45108181[source]
Glad you find it interesting. A surprising way people are using us right now has been people who are technical but don’t have deep infrastructure expertise, asking datafruit questions about how stuff should be done.

Something we’ve been dealing with is trying to get the agents to not over-complicate their designs, because they have a tendency to do so. But with good prompting they can be very helpful assistants!

replies(1): >>45110316 #
3. 0xbadcafebee ◴[] No.45110316[source]
Yeah it's def gonna be hard. So much of engineering is an amalgam of contexts, restrictions, intentions, best practice, and what you can get away with. An agent honed by a team of experts to keep all those things in mind (and force the user to answer important questions) would be invaluable.

Might be good to train multiple "personalities": one's a startup codebro that will tell you the easiest way to do anything; another will only give you the best practice and won't let you cheat yourself. Let the user decide who they want advice from.

Going further: input the business's requirements first, let that help decide? Just today I was on a call where somebody wants to manually deploy a single EC2 instance to run a big service. My first question is, if it goes down and it takes 2+ days to bring it back, is the business okay with that? That'll change my advice.

replies(1): >>45110523 #
4. nickpapciak ◴[] No.45110523{3}[source]
Yes definitely! That's why we do believe the agents, for the time being, will act as great junior devs that you can offload work onto, while as they get better they can slowly get promoted into more active roles.

The personalities approach sounds fun to experiment with. I'm wondering if you could use SAEs to scan for a "startup codebro" feature in language models. Alas this is not something we get to look into until we think that fine-tuning our own models is the best way to make them better. For now we are betting on in-context learning.

Business requirements are also incredibly valuable. Notion, Slack, and Confluence hold a lot of context, but it can be hard to find. This is something that I think the subagents architecture is great for though.

5. paool ◴[] No.45112470[source]
Funnily enough, the same scenario holds true for actual programmers vs vibe coders.

Even if you manage to prompt an app, you'll still have no idea how the system works.