230 points | taikon | 1 comment
isoprophlex ◴[] No.42547133[source]
Fancy, I think, but again no word on the actual work of turning a few bazillion CSV files and PDFs into a knowledge graph.

I see a lot of these KG tools pop up, but they never solve the first problem I have, which is actually constructing the KG itself.

1. jeromechoo ◴[] No.42549911[source]
There are two paths to KG generation today, and both are problematic in their own ways: (1) Natural Language Processing (NLP) and (2) LLMs.

NLP is fast but requires a model that is trained on an ontology that works with your data. Once you have one, it's simply a matter of feeding the model your bazillion CSVs and PDFs.
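To make the "ontology first" idea concrete, here's a rough sketch for the easy case of structured CSVs. It is not a trained NLP model; for free text, the model would take the place of the column mapping. The `ONTOLOGY` dict, the column names, and `csv_to_triples` are all made up for illustration.

```python
import csv
import io

# Hypothetical ontology: each CSV column maps to
# (subject type, relation, object type). Names are illustrative only.
ONTOLOGY = {
    "employer": ("Person", "WORKS_AT", "Organization"),
    "city": ("Person", "LIVES_IN", "City"),
}


def csv_to_triples(csv_text, subject_column="name"):
    """Turn CSV rows into (subject, relation, object) triples.

    Only columns declared in ONTOLOGY produce edges, so every edge
    label comes from a fixed vocabulary. Empty cells are skipped.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = row[subject_column].strip()
        for column, (_, relation, _) in ONTOLOGY.items():
            value = (row.get(column) or "").strip()
            if value:
                triples.append((subject, relation, value))
    return triples


sample = "name,employer,city\nAda,Acme,London\nBob,,Paris\n"
triples = csv_to_triples(sample)
```

The hard part is still deciding what goes in the ontology; once that exists, the ingestion loop itself is short.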

LLMs are slow but way easier to start with, as ontologies can be generated on the fly. This is a double-edged sword, however, as LLMs tend to lose fidelity and consistency in edge naming.
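One common way to blunt the edge-naming problem is to post-process whatever the LLM emits against a small canonical vocabulary. A minimal sketch, assuming the LLM output has already been parsed into (subject, edge, object) tuples; the `CANONICAL_EDGES` table and its variants are hypothetical, and a real setup might use embeddings or fuzzy matching instead of exact lookup.

```python
import re

# Hypothetical canonical edge labels and the surface variants mapped onto them.
CANONICAL_EDGES = {
    "WORKS_AT": {"works at", "works for", "employed by", "is employed at"},
    "LOCATED_IN": {"located in", "based in", "headquartered in"},
}


def _clean(label):
    """Lowercase and collapse punctuation, underscores, and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


# Reverse lookup: cleaned variant -> canonical label.
# The canonical label itself is also accepted as a variant.
VARIANT_TO_CANONICAL = {
    _clean(variant): canonical
    for canonical, variants in CANONICAL_EDGES.items()
    for variant in variants | {canonical}
}


def normalize_triples(triples):
    """Split triples into (kept, rejected).

    Kept triples have their edge rewritten to a canonical label.
    Rejected triples have an edge we don't recognise and are returned
    unchanged for manual review rather than silently dropped.
    """
    kept, rejected = [], []
    for subj, edge, obj in triples:
        canonical = VARIANT_TO_CANONICAL.get(_clean(edge))
        if canonical is None:
            rejected.append((subj, edge, obj))
        else:
            kept.append((subj, canonical, obj))
    return kept, rejected


llm_output = [
    ("Ada", "works_for", "Acme"),
    ("Ada", "Employed By", "Acme"),
    ("Acme", "HQ'd in", "Berlin"),
]
kept, rejected = normalize_triples(llm_output)
```

Here the first two triples collapse onto `WORKS_AT`, while the unrecognised `HQ'd in` edge lands in `rejected`. The size of that rejected pile is a decent signal for how far the LLM has drifted from the vocabulary.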

I work in NLP, which is the more widely used approach in practice, as it's far more consistent and explainable on very large corpora. But the difficulty of starting a fresh ontology dead-ends many projects.