←back to thread

133 points bloppe | 1 comments | | HN request time: 0.211s | source

I've been working with the Featureform team on their new open-source project, [EnrichMCP][1], a Python ORM framework that helps AI agents understand and interact with your data in a structured, semantic way.

EnrichMCP is built on top of [MCP][2] and acts like an ORM, but for agents instead of humans. You define your data model using SQLAlchemy, APIs, or custom logic, and EnrichMCP turns it into a type-safe, introspectable interface that agents can discover, traverse, and invoke.

It auto-generates tools from your models, validates all I/O with Pydantic, handles relationships, and supports schema discovery. Agents can go from user → orders → product naturally, just like a developer navigating an ORM.

We use this internally to let agents query production systems, call APIs, apply business logic, and even integrate ML models. It works out of the box with SQLAlchemy and is easy to extend to any data source.

If you're building agentic systems or anything AI-native, I'd love your feedback. Code and docs are here: https://github.com/featureform/enrichmcp. Happy to answer any questions.

[1]: https://github.com/featureform/enrichmcp

[2]: https://modelcontextprotocol.io/introduction

Show context
polskibus ◴[] No.44321614[source]
This looks very interesting but I’m not sure how to use it well. Would you mind sharing some prompts that use it and solve a real problem that you encountered ?
replies(1): >>44321807 #
simba-k ◴[] No.44321807[source]
Imagine you're building a support agent for DoorDash. A user asks, "Why is my order an hour late?" Most teams today would build a RAG system that surfaces a help center article saying something like, "Here are common reasons orders might be delayed."

That doesn't actually solve the problem. What you really need is access to internal systems. The agent should be able to look up the order, check the courier status, pull the restaurant's delay history, and decide whether to issue a refund. None of that lives in documentation. It lives in your APIs and databases.

LLMs aren't limited by reasoning. They're limited by access.

EnrichMCP gives agents structured access to your real systems. You define your internal data model using Python, similar to how you'd define models in an ORM. EnrichMCP turns those definitions into typed, discoverable tools the LLM can use directly. Everything is schema-aware, validated with Pydantic, and connected by a semantic layer that describes what each piece of data actually means.

You can integrate with SQLAlchemy, REST APIs, or custom logic. Once defined, your agent can use tools like get_order, get_restaurant, or escalate_if_late with no additional prompt engineering.

It feels less like stitching prompts together and more like giving your agent a real interface to your business.

replies(7): >>44321959 #>>44322443 #>>44322755 #>>44322935 #>>44323084 #>>44323147 #>>44325321 #
polskibus ◴[] No.44322443[source]
are you saying that a current gen LLM can answer such queries with EnrichMCP directly? or does it need guidance via prompts (for example tell it which tables to look at, etc. ) ? I did expose a db schema to LLM before, and it was ok-ish, however often times the devil was in the details (one join wrong, etc.), causing the whole thing to deliver junk answers.

what is your experience with non trivial db schemas?

replies(1): >>44322754 #
simba-k ◴[] No.44322754[source]
So one big difference is that we aren't doing text2sql here, and the framework requires clear descriptions on all fields, entities, and relationships (it literally won't run otherwise).

We also generate a few tools for the LLM specifically to explain the data model to it. It works quite well, even on complex schemas.

The use case is more transactional than analytical, though we've seen it used for both.

I recommend running the openai_chat_agent in examples/ (also supports ollama for local run) and connect it to the shop_api server and ask it a question like : "Find and explain fraud transactions"

replies(1): >>44322831 #
polskibus ◴[] No.44322831[source]
So explicit model description (kind of repeating the schema into explicit model definition) provides better results when used with LLM because it’s closer to the business domain(or maybe the extra step from DDL to business model is what confuses the LLM?). I think I’m failing to grasp why does this approach work better than straight schema fed to Llm.
replies(1): >>44323435 #
1. simba-k ◴[] No.44323435[source]
Yeah, think of it as a data analyst. If I give you a Postgres account with all of our tables in it, you wouldn't even know when to start and would spend tons of time just running queries to figure out what you were looking at.

If I explain the semantic graph, entities, relationships, etc. with proper documentations and descriptions you'd be able to reason about it much faster and more accurately.

A postgres schema might have the data type and a name and a table name vs. all the rich metadata that would be required in EnrichMCP.