
230 points taikon | 5 comments
tessierashpool9 No.42548043
A quick look leaves me with a question:

what exactly is being tokenized? RDF, OWL, Neo4j, ...?

how is the knowledge graph serialized?

replies(1): >>42548750 #
tessierashpool9 No.42548750
Isn't this a key question? Anybody here knowledgeable who'd care to reply?
replies(1): >>42549107 #
1. dartos No.42549107
I worked at a small company experimenting with RAG.

We used Neo4j as the graph database and used the LLM to generate parts of the Cypher queries.
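A minimal sketch of what "the LLM generates parts of the queries" could look like, assuming a slot-filling approach where the model only proposes values for a fixed Cypher template. The template, schema, and function names here are hypothetical, not from any real system:

```python
# Sketch: rather than letting the LLM write whole Cypher queries, constrain it
# to fill slots in a template, then validate the slots against a known schema.

CYPHER_TEMPLATE = (
    "MATCH (a:{label})-[:{rel}]->(b) "
    "WHERE a.name = $name RETURN b.name"
)

# Hypothetical schema the LLM's proposals are checked against.
ALLOWED_LABELS = {"Person", "Company"}
ALLOWED_RELS = {"WORKS_AT", "KNOWS"}

def build_query(label: str, rel: str) -> str:
    """Validate LLM-proposed slot values, then render the query string."""
    if label not in ALLOWED_LABELS or rel not in ALLOWED_RELS:
        raise ValueError("LLM proposed a slot value outside the schema")
    return CYPHER_TEMPLATE.format(label=label, rel=rel)

# An LLM call would propose (label, rel) from the user's question; stubbed here.
query = build_query("Person", "WORKS_AT")
```

The point of the validation step is that the model's output never reaches the database unchecked; only schema-conformant queries get executed.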

replies(1): >>42549132 #
2. tessierashpool9 No.42549132
Not sure this addresses my question. As I understand it, the RAG system augments the knowledge base by representing its content as a graph, but this graph representation needs to be rendered linguistically so that an LLM can digest it by tokenizing and embedding.
replies(1): >>42550264 #
3. dartos No.42550264
There are lots of ways to go about RAG; many don't require graphs at all.

I recommend looking at some simple Cypher queries to get an idea of what's happening.

What I've seen is LLMs being used to identify what relationships a piece of information may have, by comparing it to the kinds of relationships already in your database.

Then, when building the Cypher query, the system uses those relationships to query relevant data.

The LLM never digests the graph. The system around the LLM uses the capabilities of graph data stores to find relevant context for the LLM.
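A toy sketch of that division of labor: the system (not the LLM) walks relationships in the graph store and serializes the retrieved facts into plain text for the prompt. An in-memory dict stands in for the graph database here; the data and function names are illustrative, not from any real system:

```python
# Toy "graph store": adjacency lists of (relation, object) edges.
graph = {
    "GraphRAG": [("USES", "Neo4j"), ("USES", "LLM")],
    "Neo4j": [("QUERIED_WITH", "Cypher")],
    "LLM": [("GENERATES", "Cypher")],
}

def retrieve_context(entity: str, depth: int = 2) -> list[str]:
    """Collect (subject, relation, object) facts up to `depth` hops out."""
    facts, frontier = [], [entity]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for rel, obj in graph.get(node, []):
                facts.append(f"{node} {rel} {obj}")  # serialize edge as text
                next_frontier.append(obj)
        frontier = next_frontier
    return facts

# These text facts get prepended to the prompt; the LLM only ever sees
# serialized triples, never the graph structure itself.
context = retrieve_context("GraphRAG")
```

This is also one answer to the serialization question upthread: by the time anything is tokenized, the graph has already been flattened into ordinary sentences or triples.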

What you'll find with most RAG systems is that the LLM plays a smaller part than you'd think.

It surfaces semantic information (such as conceptual relationships) and generates the final responses. The system around it is where the far more interesting work happens, imo.

replies(1): >>42550436 #
4. tessierashpool9 No.42550436
I'm talking about a knowledge graph that explicitly stores data (i.e., knowledge) as a graph, and the question is how this solution establishes the connection to the LLM so that the LLM uses the data ... anyway, never mind :)
replies(1): >>42558691 #
5. dartos No.42558691
You’re not reading my comments…