
230 points by taikon | 8 comments
isoprophlex
Fancy, I think, but again no word on the actual work of turning a few bazillion CSV files and PDFs into a knowledge graph.

I see a lot of these KG tools pop up, but they never solve the first problem I have, which is actually constructing the KG itself.

1. roseway4
You may want to take a look at Graphiti, which accepts plaintext or JSON input and automatically constructs a KG. While it’s primarily designed to enable temporal use cases (where data changes over time), it works just as well with static content.

https://github.com/getzep/graphiti

I’m one of the authors. Happy to answer any questions.
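
The quickstart looks roughly like this (details may have changed since; see the repo README for the current API, and note that the Neo4j connection values below are placeholders):

    import asyncio
    from datetime import datetime, timezone

    from graphiti_core import Graphiti
    from graphiti_core.nodes import EpisodeType

    async def main():
        # Placeholder connection to a local Neo4j instance
        client = Graphiti("bolt://localhost:7687", "neo4j", "password")
        await client.build_indices_and_constraints()

        # Each "episode" of plaintext or JSON is parsed by the LLM into
        # entities and relations, then merged into the temporal graph
        await client.add_episode(
            name="note-1",
            episode_body="Alice joined Acme in 2021 and now leads the data team.",
            source=EpisodeType.text,
            source_description="example note",
            reference_time=datetime.now(timezone.utc),
        )

    asyncio.run(main())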

2. diggan
> Graphiti uses OpenAI for LLM inference and embedding. Ensure that an OPENAI_API_KEY is set in your environment. Support for Anthropic and Groq LLM inferences is available, too.

Don't have time to scan the source code myself, but are you using the OpenAI Python library, so the server URL can easily be changed? I didn't see it exposed by your library, so I'm hoping it can at least be overridden with an env var, letting us use local LLMs instead.

3. diggan
On second look, it seems like you've already rejected a PR trying to add local LLM support: https://github.com/getzep/graphiti/pull/184

> We recommend that you put this on a local fork as we really want the service to be as lightweight and simple as possible as we see this as a good entry point into new developers.

Sadly, it seems like you're recommending forking the library instead of allowing people to use local LLMs. You were smart enough to lock the PR from any further conversation at least :)

4. roseway4
You can override the default OpenAI url using an environment variable (iirc, OPENAI_API_BASE). Any LLM provider / inference server offering an OpenAI-compatible API will work.
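
Concretely, something like this works with any OpenAI-compatible server (the Ollama URL below is just an example; note that v1+ releases of the openai package read OPENAI_BASE_URL, while older ones read OPENAI_API_BASE):

    import os
    from openai import OpenAI

    # v1+ of the openai package reads OPENAI_BASE_URL from the environment
    os.environ["OPENAI_BASE_URL"] = "http://localhost:11434/v1"  # e.g. Ollama
    os.environ["OPENAI_API_KEY"] = "unused-but-required"

    client = OpenAI()  # picks up the base URL set above

    # Equivalent explicit form, if you control client construction:
    client = OpenAI(base_url="http://localhost:11434/v1",
                    api_key="unused-but-required")
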
5. diggan
Granted they use the `openai` Python library (or another library/implementation that reads that same env var), hence my question two comments up...
6. ganeshkrishnan
> uses OpenAI for LLM inference and embedding

This becomes a cyclical hallucination problem: the LLM hallucinates and creates an incorrect graph, which in turn creates even more incorrect knowledge.

We are working on this issue of reducing hallucination in knowledge graphs, and using an LLM is not at all the right way to do it.

7. dramebaaz
Excited to try it! Been looking for a temporally aware way of creating a KG for my journal dataset.
8. sc077y
Actually, the rate of hallucination is not constant across the board. For one, you're doing a sort of synthesis with the LLM, not intense reasoning or retrieval. Second, the problem is segmented into sub-problems, much like how o1 or o3 do with CoT. Thus, the risk of hallucination is significantly lower than with a zero-shot raw LLM or even a naive RAG approach.
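
To illustrate the segmentation idea (purely a sketch of the general technique, not Graphiti's actual prompts or internals):

    from openai import OpenAI

    client = OpenAI()

    def ask(instruction: str, text: str) -> str:
        # One narrow sub-task per call, instead of one "build me a graph" prompt
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # model choice is illustrative
            messages=[
                {"role": "system", "content": instruction},
                {"role": "user", "content": text},
            ],
        )
        return resp.choices[0].message.content

    def extract_graph(text: str) -> dict:
        # Pass 1: entities only -- closer to synthesis than open-ended reasoning
        entities = ask("List the named entities in the text, one per line.", text)
        # Pass 2: relations, constrained to the entities from pass 1, which
        # limits how far a hallucinated triple can stray
        relations = ask(
            "Using ONLY these entities:\n" + entities
            + "\nlist (subject, relation, object) triples stated in the text.",
            text,
        )
        return {"entities": entities, "relations": relations}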