28 points jellyotsiro | 26 comments

Hi HN, I am Arlan and I am building Nia (https://trynia.ai), a context layer for AI coding agents. Nia lets tools like Cursor, Claude Code, and other MCP clients index and query real codebases and documentation so they stop hallucinating against outdated or wrong sources, with applications beyond coding agents to any AI system that requires grounded context across domains.

Coding agents are only as good as the context you give them. General models are trained on public code and documentation that is often old, and they usually have no idea what is inside your actual repo, internal wiki, or the exact version of a third party SDK you use. The result is very familiar: you paste URLs and code snippets into the prompt, the agent confidently uses an outdated API or the wrong framework version, and you spend more time verifying and correcting it than if you had written the code yourself. Once models are good enough at generating code, feeding them precise, up-to-date context becomes the bottleneck.

I first ran into this pattern on my own projects a few months ago, when I was still in high school in Kazakhstan, obsessed with codegen tools and trying every coding agent I could find. I saw it again when I got into YC and talked to other teams who were also trying to use agents on real work.

The first version of Nia was basically “my personal MCP server that knows my repos and favorite doc sites so I do not have to paste URLs into Cursor anymore.” Once I saw how much smoother my own workflow became, it felt obvious that this should be a product other people could use too.

Under the hood, Nia is an indexing and retrieval service with an MCP interface and an API. You point it at sources like GitHub repositories, framework or provider docs, SDK pages, PDF manuals, etc. We fetch and parse those with some simple heuristics for code structures, headings, and tables, then normalize them into chunks and build several indexes: a semantic index with embeddings for natural language queries; a symbol and usage index for functions, classes, types, and endpoints; a basic reference graph between files, symbols, and external docs; and regex and file-tree search for cases where you want deterministic matches over raw text.
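To give a feel for the normalize-and-index step, here is a minimal, self-contained sketch. It is illustrative only, not Nia's actual code: the heading splitter and the def/class regex are deliberately crude stand-ins for our parsing heuristics.

```python
import re
from collections import defaultdict

def chunk_by_headings(page_text: str) -> list[str]:
    """Split a fetched doc page into chunks at markdown-style headings."""
    parts = re.split(r"(?m)^(?=#{1,6} )", page_text)
    return [p.strip() for p in parts if p.strip()]

SYMBOL_RE = re.compile(r"\b(?:def|class)\s+([A-Za-z_]\w*)")  # crude code heuristic

def build_indexes(chunks: list[str]):
    lexical = defaultdict(set)   # token -> chunk ids, for deterministic/regex-style search
    symbols = defaultdict(set)   # symbol name -> chunk ids that define it
    for i, chunk in enumerate(chunks):
        for token in re.findall(r"\w+", chunk.lower()):
            lexical[token].add(i)
        for name in SYMBOL_RE.findall(chunk):
            symbols[name].add(i)
    return lexical, symbols

page = "# Setup\npip install somesdk\n\n# API\ndef connect(url): ...\n"
chunks = chunk_by_headings(page)
lexical, symbols = build_indexes(chunks)
print(symbols["connect"])  # {1} -> the chunk that defines connect()
```

The real pipeline adds embeddings for the semantic index and edges for the reference graph on top of this.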

When an agent calls Nia, it sends a natural language query plus optional hints like the current file path, stack trace, or repository. Nia runs a mix of BM25-style search, embedding similarity, and graph walks to rank relevant snippets, and can also return precise locations like “this function definition in this file and the three places it is used” instead of just a fuzzy paragraph. The calling agent then decides how to use those snippets in its own prompt.

One Nia deployment can serve multiple agents and multiple projects at once. For example, you can have Cursor, Claude Code, and a browser-based agent all pointed at the same Nia instance that knows about your monorepo, your internal wiki, and the provider docs you care about. We keep an agent-agnostic session record that tracks which sources were used and which snippets the user accepted. Any MCP client can attach to that session id, fetch the current context, and extend it, so switching tools does not mean losing what has already been discovered.
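To make the ranking step concrete, here is a toy sketch of blending a lexical score with embedding similarity. This is illustrative, not our production code: `embed` stands in for whatever embedding model you use, `alpha` is a made-up blend weight, and the real service layers graph walks and symbol lookups on top before returning snippets.

```python
import math, re
from collections import Counter

def toks(text):
    return re.findall(r"\w+", text.lower())

def lexical_score(query: str, chunk: str, corpus: list[str]) -> float:
    """Simplified tf-idf stand-in for the BM25-style component."""
    counts, n = Counter(toks(chunk)), len(corpus)
    def idf(t):
        return math.log(1 + n / (1 + sum(t in toks(d) for d in corpus)))
    return sum(counts[t] * idf(t) for t in toks(query))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(query: str, chunks: list[str], embed, alpha: float = 0.5):
    """Blend lexical and semantic scores into one ranked list."""
    qv = embed(query)
    scored = [(alpha * lexical_score(query, ch, chunks)
               + (1 - alpha) * cosine(qv, embed(ch)), ch) for ch in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)
```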

A lot of work goes into keeping indexes fresh without reprocessing everything. Background workers periodically refetch configured sources, detect which files or pages changed, and reindex those incrementally. This matters because many of the worst “hallucinations” I have seen are actually the model quoting valid documentation for the wrong version. Fixing that is more about version and change tracking than about model quality.
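The core of that loop is just change detection. A toy version (not our actual workers, which also track versions per source) looks like this:

```python
import hashlib

def refresh(fetched: dict[str, str], seen: dict[str, str], reindex) -> None:
    """fetched maps path/URL -> current content; seen is persisted hash state."""
    for path, content in fetched.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if seen.get(path) != digest:        # new or changed since the last run
            reindex(path, content)          # re-chunk and re-embed only this item
            seen[path] = digest

state: dict[str, str] = {}
refresh({"docs/api.md": "v1 of the page"}, state, lambda p, c: print("reindexed", p))
refresh({"docs/api.md": "v1 of the page"}, state, lambda p, c: print("reindexed", p))  # prints nothing
```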

We ship Nia with a growing set of pre-indexed public sources. Today this includes around 6k packages from common frameworks and provider docs, package search over thousands of libraries from ecosystems like PyPI, npm, and RubyGems, and a pre-indexed /explore page where anyone can contribute their own sources! The idea is that a new user can install Nia, connect nothing, and still get useful answers for common libraries. Then, as soon as you add your own repos and internal docs, those private sources are merged into the same index.

Some examples of how people use Nia so far:

- migrating from one payments provider or API version to another by indexing the provider docs plus example repos and letting the agent propose and iterate on patches;
- answering “how do I do X in this framework” by indexing the framework source directly instead of relying only on official docs that might be stale;
- turning an unfamiliar public codebase into a temporary wiki to self-onboard, where you can ask structural questions and jump to specific files, functions, or commits;
- building a browser agent that answers questions using up-to-date code and docs even when the public documentation lags behind.

Nia is a paid product (https://www.trynia.ai/) but we have a free tier that should be enough for individuals to try it on real projects. Above that there is a self-serve paid plan for heavier individual use, and organization plans with higher limits, SOC 2, seat based billing, and options for teams that want to keep indexing inside their own environment. For private GitHub repos we can clone and index locally so code does not leave your infrastructure.

We store account details and basic telemetry like query counts and errors to operate the service, and we store processed representations of content you explicitly connect (chunks, metadata, embeddings, and small graphs) so we can answer queries. We do not train foundation models on customer content and we do not sell user data.

Beyond coding, I can see Nia playing out in the larger agents space, since providing reliable context is a problem for agentic systems in general. Early signals show people already using Nia for healthcare data, cloning Paul Graham by indexing all of his essays and turning him into an AI agent, using Naval’s archive to build a personalized agent, and more.

I would love to get Nia into the hands of more engineers who are already pushing coding agents hard and see where it breaks. I am especially interested in hearing about failure modes, annoying onboarding steps, places where the retrieval logic is obviously wrong or incomplete, or any security concerns I should address. I will be in the thread to answer questions, share more technical details, and collect any brutal feedback you are willing to give!

1. RomanPushkin ◴[] No.46195184[source]
Having this RAG layer was always another thing to try for me. I haven't coded it myself, and I'm super interested in whether this gives a real boost while working with Claude. Curious from anyone who has already tried the service: what's your feedback? Did you feel you were getting real improvements?
replies(1): >>46195243 #
2. jellyotsiro ◴[] No.46195243[source]
Wouldn’t call it just RAG though. Agentic discovery and semantic search are the way to go right now, so Nia combines both approaches. For example, you can dynamically search through a documentation tree or grep for specific things.
replies(1): >>46195288 #
3. zwaps ◴[] No.46195263[source]
Absolutely insane that we celebrated coding agents getting rid of RAG, only for the next innovation to be RAG
replies(5): >>46195277 #>>46195290 #>>46195303 #>>46195618 #>>46195966 #
4. choilive ◴[] No.46195277[source]
The pendulum swings back.
5. zwaps ◴[] No.46195288{3}[source]
We call it agentic RAG. The retriever is an agent. It’s still RAG
replies(1): >>46195312 #
6. jellyotsiro ◴[] No.46195290[source]
Not exactly just RAG. The shift is agentic discovery paired with semantic search.

Also, most coding agents still combine RAG and agentic search. See the Cursor blog post on how semantic search helps them understand and navigate massive codebases: https://cursor.com/blog/semsearch

7. govping ◴[] No.46195303[source]
The context problem with coding agents is real. We've been coordinating multiple agents on builds - they often re-scan the same files or miss cross-file dependencies. Interested in how Nia handles this - knowledge graph or smarter caching?
replies(1): >>46195351 #
8. jellyotsiro ◴[] No.46195312{4}[source]
Which would be much better than the techniques used in 2023. As context windows increase, combining them becomes even easier.

There are a lot of ways to interpret agentic RAG, pure RAG, etc.

9. 6thbit ◴[] No.46195319[source]
This looks neat; we certainly need more ideas and solutions in this space. I work with large codebases daily and the limits on agentic contexts are constantly evident. I have some questions related to how I would consume a tool like this one:

How does this fare with codebases that change very frequently? I presume background agents re-indexing changes must become a bottleneck at some point for large or very active teams.

If I'm working on a large set of changes modifying lots of files, moving definitions around, etc., meaning I've deviated locally quite a bit from the most up to date index, will Nia be able to reconcile what I'm trying to do locally vs the index, despite my local changes looking quite different from the upstream?

replies(1): >>46195406 #
10. jellyotsiro ◴[] No.46195351{3}[source]
hey! knowledge graphs are also used at runtime but paired with other techniques, since graphs are only useful for relationship queries.
11. jellyotsiro ◴[] No.46195406[source]
great question!

For large and active codebases, we avoid full reindexing. Nia tracks diffs and file level changes, so background workers only reindex what actually changed. We are also building “inline agents” that watch pull requests or recent commits and proactively update the index ahead of your agent queries.

Local vs upstream divergence is a real scenario. Today Nia prioritizes providing external context to your coding agents: packages, provider docs, SDK versions, internal wikis, etc. We can still reconcile with your local code if you point the agent at your local workspace (Cursor and Claude Code already provide that path). We look at file paths, symbol names, and usage references to map local edits to known context. In cases where the delta is large, we surface both the local version and the latest indexed version so the agent understands what changed.
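Roughly, the reconciliation idea looks like this (a toy sketch, not our real code; symbol extraction here is just a def/class regex standing in for the real mapping):

```python
import re

SYM = re.compile(r"\b(?:def|class)\s+([A-Za-z_]\w*)")

def reconcile(local_files: dict[str, str], upstream: dict[str, str]) -> dict:
    """upstream maps symbol name -> last-indexed definition snippet."""
    report = {}
    for path, text in local_files.items():
        for name in SYM.findall(text):
            indexed = upstream.get(name)
            if indexed is not None and indexed not in text:
                # divergence: surface both versions so the agent sees what changed
                report[name] = {"local_file": path, "indexed_version": indexed}
    return report
```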

12. himike ◴[] No.46195482[source]
I love Nia, keep it up Arlan
replies(1): >>46195527 #
13. johnsillings ◴[] No.46195491[source]
super smart. congrats on the launch!
14. mritchie712 ◴[] No.46195493[source]
Cursor promises to do this[0] in the product, so, especially on HN, it'd be best to start with "why this is better than Cursor".

> favorite doc sites so I do not have to paste URLs into Cursor

This is especially confusing, because Cursor has a feature for docs you want to scrape regularly.

0 - https://cursor.com/docs/context/codebase-indexing

replies(1): >>46195523 #
15. jellyotsiro ◴[] No.46195523[source]
The goal here is not to replace Cursor’s own local codebase indexing. Cursor already does that part well. What Nia focuses on is external context: it lets agents pull in accurate information from remote sources like docs, packages, APIs, and broader knowledge bases.
replies(1): >>46195946 #
16. jellyotsiro ◴[] No.46195527[source]
thank you haha!
17. ramzirafih ◴[] No.46195540[source]
Love it.
18. ModernMech ◴[] No.46195560[source]
Hard to follow gif of the thing working without explanation: check

Carousel of a bunch of random companies "using" the product without an indication how or in what capacity: check

List of a bunch of investors, as if that's meaningful to anyone who will use this product rather than people who would invest in it: check

The audacity to ask for actual money for a product that barely exists and is mostly a wrapper around other technology, who are investing in you: check

Claims of great internal success without any proof: check

Testimonials from random Twitter accounts who may or may not be bots or paid, who knows: check

To try it or even get a sense of how it works or what it is, you have to sign up: check

Congrats! Looks like you're set up for a trillion dollar valuation in the AI space!

To be less flippant and more constructive: if you're going to say the thing reduces hallucinations and provides 10x speedups in development, you need to provide proof immediately, or stop making the claim, otherwise there's 0 credibility for this product.

19. ModernMech ◴[] No.46195618[source]
This is happening over and over and over. Prompt engineering is just a form of protocol; context engineering is just cache management. People think LLMs will replace programming languages and runtimes entirely, but so far they have mostly been used to write programs in programming languages, and I've found they're very bad interpreters and compilers. So far, I can't really pick out what exactly LLMs are replacing except the need to press the individual keys on the keyboard, so I still struggle to see them as more than super fancy autocomplete. When the hype is peeled away, we're still left with all the same engineering problems, but now we have added "Sometimes the tool hallucinates and gaslights you".
20. orliesaurus ◴[] No.46195884[source]
Benchmarks?
replies(2): >>46195947 #>>46195949 #
21. jondwillis ◴[] No.46195946{3}[source]
That’s what GP is saying. This is the Docs feature of Cursor. It covers external docs/arbitrary web content.

`@Docs` — will show a bunch of pre-indexed Docs, and you can add whatever you want and it’ll show up in the list. You can see the state of Docs indexing in Cursor Settings.

The UX leaves a bit to be desired, but that’s a problem Cursor seems to have in general.

replies(1): >>46195994 #
22. dang ◴[] No.46195947[source]
Arlan had this in his text but I cut it for brevity - sorry about that! Here's that bit:

In our internal benchmark on bleeding edge SDK and library features, Nia produced the lowest hallucination rate among the context providers and search tools we tested (context7, exa code, etc), and I wrote up the setup and results in a separate blog post: https://www.nozomio.com/blog/nia-oracle-benchmark

23. jellyotsiro ◴[] No.46195949[source]
https://www.nozomio.com/blog/nia-oracle-benchmark
24. dang ◴[] No.46195966[source]
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

"Don't be snarky."

https://news.ycombinator.com/newsguidelines.html

25. jellyotsiro ◴[] No.46195994{4}[source]
yeah, the UX is pretty bad, and so is the overall functionality: it still relies on a static retrieval layer and a limited index scope.

Plus, as I mentioned above, there are many more use cases than just coding. Think docs, APIs, research, knowledge bases, even personal or enterprise data sources the agent needs to explore and validate dynamically.