KAG – Knowledge Graph RAG Framework

(github.com)

230 points taikon | 3 comments | 30 Dec 24 02:55 UTC

isoprophlex ◴[30 Dec 24 06:49 UTC] No.42547133[source]▶

Fancy, I think, but again no word on the actual work of turning a few bazillion CSV files and PDFs into a knowledge graph.

I see a lot of these KG tools pop up, but they never solve the first problem I have, which is actually constructing the KG itself.

replies(11): >>42547488 #>>42547556 #>>42547743 #>>42548481 #>>42549416 #>>42549856 #>>42549911 #>>42550327 #>>42551738 #>>42552272 #>>42562692 #

kergonath ◴[30 Dec 24 08:12 UTC] No.42547488[source]▶

>>42547133 #

> I see a lot of these KG tools pop up, but they never solve the first problem I have, which is actually constructing the KG itself.

I have heard good things about GraphRAG [1] (but what a stupid name). I have not had the time to try it properly, but it is supposed to build the knowledge graph itself somewhat transparently, using LLMs. Building the graph is the big stumbling block; at least vector stores are easy to understand and trivial to build.

From the summary on GitHub, it looks like KAG can do this too, but I could not find how to do it in the documentation.

[1] https://microsoft.github.io/graphrag/
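
To illustrate the point about vector stores being easy to build, here is a minimal in-memory sketch. It is only a toy: `embed` is a bag-of-words stand-in for a real embedding model, and `TinyVectorStore` is a made-up name for this example, not part of GraphRAG or KAG.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical minimal store: keeps (vector, text) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=3):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```

Swapping `embed` for a real model and the list for an index is most of what a production store adds. There is nothing comparable for a knowledge graph, where the hard part is deciding what the nodes and edges are.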

replies(3): >>42547518 #>>42547785 #>>42550262 #

1. isoprophlex ◴[30 Dec 24 08:19 UTC] No.42547518[source]▶

>>42547488 #

Indeed, they seem to actually know and show how the sausage is made... but still, there is no fire-and-forget approach for any random dataset. Check out what you need to do if the default isn't working for you (scroll down to e.g. the entity_extraction settings). There is so much complexity to deal with that I'd just roll my own extraction pipeline from the start, rather than learning someone else's complex setup (that you have to tweak for each new use case).

https://microsoft.github.io/graphrag/config/yaml/

replies(2): >>42549293 #>>42549804 #

2. kergonath ◴[30 Dec 24 13:57 UTC] No.42549293[source]▶

>>42547518 (TP) #

> I'd just roll my own extraction pipeline from the start, rather than learning someone else's complex setup

I have to agree. It’s actually quite a good summary of hacking with AI-related libraries these days. A lot of them get complex fast once you step slightly off the intended path. I hope it’ll get better, but unfortunately that is where we are.

3. veggieroll ◴[30 Dec 24 14:56 UTC] No.42549804[source]▶

>>42547518 (TP) #

IMO, as with most other out-of-the-box LLM frameworks, the value is in looking at their prompts [1] and then doing it yourself.

[1] https://github.com/microsoft/graphrag/tree/main/graphrag/pro...
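
For a rough idea of what "doing it yourself" can look like, here is a minimal sketch of a roll-your-own extraction loop. Everything in it is an assumption made for illustration: the prompt is written for this example and is not copied from GraphRAG, `call_llm` is whatever function you use to reach a model, and `fake_llm` only returns canned text so the sketch runs without an API key.

```python
from collections import defaultdict

# Hypothetical prompt written for this sketch; NOT taken from GraphRAG.
EXTRACTION_PROMPT = (
    "Extract factual relations from the text below.\n"
    "Return one relation per line as: subject | relation | object\n\n"
    "Text:\n{text}\n"
)


def parse_triples(llm_output):
    """Keep only well-formed 'subject | relation | object' lines."""
    triples = []
    for line in llm_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples


def build_graph(chunks, call_llm):
    """Build subject -> {(relation, object)} from text chunks.

    call_llm is any function taking a prompt string and returning a string
    (an assumption: plug in your own model client here).
    """
    graph = defaultdict(set)
    for chunk in chunks:
        output = call_llm(EXTRACTION_PROMPT.format(text=chunk))
        for subj, rel, obj in parse_triples(output):
            graph[subj].add((rel, obj))
    return graph


def fake_llm(prompt):
    """Canned response standing in for a real model call."""
    return "GraphRAG | uses | LLMs\nthis line is not a triple\nKAG | is a | RAG framework"
```

Calling `build_graph(["some document text"], fake_llm)` gives a two-node graph and silently drops the malformed line. The real work is everything this skips: chunking the CSVs and PDFs, tuning the prompt per dataset, and merging duplicate entities.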
