This article is about using a knowledge graph to improve a RAG system for complicated, interconnected product information. It covers the problems with RAG, why knowledge graphs might help, and some of the design challenges of using knowledge graphs.
Knowledge Graphs (KGs): A graph-based data structure that represents knowledge in a domain. It contains concepts (e.g. products) and the relationships between them (e.g. a product's applications or spare parts). Using knowledge graphs, we can provide LLMs with the context needed to answer questions about pumps.
The problems with RAG
The reason we are looking into knowledge graphs is that we face some challenges with our RAG system. See a description of our current RAG system here: Building an Advanced AI Chatbot for Technical Documentation: A Project Overview and Design Architecture
Some of the challenges we are facing are:
- Questions about product ranges where the information is found on the individual products (questions that summarize across a group of products)
- Personalization of the answers
  - Different products are available in different countries
  - Differences in which products are pushed and promoted in different countries
- Missing contextual understanding makes it difficult for the LLM to understand the relationships between different pieces of information in the prompt
- It can be difficult to analyse and fix retrieval issues in RAG systems based on vector search engines
- Questions that require advanced reasoning or problem solving. Sometimes it is necessary to understand complicated relationships between different pieces of information in the prompt to answer the question.
- Limitations of vector embedding search mean that the most relevant information is not always found reliably
Why knowledge graphs might help
Knowledge graphs can help with some of the challenges we are facing with our RAG system.
Enables advanced querying
Advanced querying on the graph can uncover complex relationships between pieces of information. This is crucial as LLMs often lack domain knowledge to understand these relationships. A knowledge graph provides the necessary structure and context to create sophisticated queries that enable LLMs to comprehend and utilize the information effectively.
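As a minimal sketch of what such a query can look like, here is a multi-hop aggregation over a hypothetical pump graph using the official neo4j Python driver. The labels and relationship types (Product, ProductRange, SparePart, PART_OF, HAS_SPARE_PART) are assumptions for illustration, not our actual schema:

```python
# Minimal sketch: an "advanced query" against a hypothetical pump graph.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Which spare parts are shared across the products in a range? This kind
# of multi-hop aggregation is natural in a graph but hard for vector search.
query = """
MATCH (r:ProductRange {name: $range_name})<-[:PART_OF]-(p:Product)
      -[:HAS_SPARE_PART]->(s:SparePart)
RETURN s.name AS spare_part, count(DISTINCT p) AS used_by_products
ORDER BY used_by_products DESC
"""

with driver.session() as session:
    for record in session.run(query, range_name="CR"):
        print(record["spare_part"], record["used_by_products"])
```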
Ease of extending the knowledge graph
It is possible to integrate many different sources of information into a knowledge graph and different hierarchies of information can be represented in the graph. New information can be added to the graph and linked to existing information in the graph, so that the new information can be used in the reasoning process. The schema is dynamic and can evolve over time.
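For example, linking newly ingested information to existing nodes can be a single idempotent statement. In this sketch the labels (Document, Product) and the MENTIONS relationship are assumptions; MERGE ensures that re-running an ingestion job does not create duplicates:

```python
# Sketch: adding a new document node and linking it to an existing product.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

link_document = """
MERGE (d:Document {doc_id: $doc_id})
SET d.title = $title
WITH d
MATCH (p:Product {name: $product_name})
MERGE (d)-[:MENTIONS]->(p)
"""

with driver.session() as session:
    session.run(link_document, doc_id="manual-001",
                title="Service manual", product_name="CR 95")
```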
Ease of explaining advanced queries
If the RAG system does not give the correct answer, it is possible to analyse and debug the advanced queries that the system used to generate the answer. More importantly, when the system gives a wrong answer, there is a clear way for developers to correct the system by modifying the knowledge graph or refining the queries.
Ease of exploring the graph
Developers and data scientists can explore the graph and understand the relationships between the different pieces of information in it. This is important because data scientists and developers often lack the domain knowledge to understand these relationships. By exploring the graph, they can generate new ideas for how to improve the system and how to use the information in the graph to do so.
Simplifying the system
The knowledge graph can be used to simplify the system. Instead of having a complicated system with many different parts, the knowledge graph can integrate the different parts of the system: there is a single source of truth, and all the information can be found in a single place. Whether this holds depends on how the knowledge graph is used in the system. If it replaces other components it simplifies the system, but if it merely adds a new layer on top, some design choices add complexity instead of removing it.
Reducing reliance on vector search
Vector search cannot give reliable results for many types of questions, since similarity can mean many different things depending on context. Also, the data the embedding models were trained on might not fit our specific use case. By using a knowledge graph built from structured data, we can ensure that the retrieved information is correct and relevant for the specific task.
Design challenges of using knowledge graphs
There are many interesting design challenges when integrating a knowledge graph with a RAG system. Below are some of these challenges and some possible solutions we have considered.
Generating the knowledge graph
The knowledge graph can be generated from either structured or unstructured data. In the case of unstructured data like PDF documents, an indexing process that uses an LLM to generate the knowledge graph could be an option; LangChain and GraphRAG have tools for this. The indexing can be very expensive, since all the data needs to be processed by a large LLM.
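As a rough sketch, this is what LLM-based graph extraction looks like with LangChain's experimental graph transformer. Package paths move between LangChain versions, so treat the exact imports as assumptions:

```python
# Sketch of LLM-based graph extraction with LangChain.
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

docs = [Document(page_content="The CR 95 is a vertical multistage pump "
                              "used for water supply and boiler feed.")]
# Every chunk is processed by the LLM -- this is the expensive step.
graph_documents = transformer.convert_to_graph_documents(docs)
print(graph_documents[0].nodes)
print(graph_documents[0].relationships)
```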
However, if structured information exists, it provides a good starting point for the knowledge graph. The unstructured information can then be kept in a specialized vector search engine like Azure AI Search, or the text embeddings can be added directly into a knowledge graph that supports vector search, for instance Neo4j.
The task of generating and maintaining a knowledge graph is a major task with many things to consider for a production system. I will leave this for a future article.
Querying the knowledge graph
When the knowledge graph has been created, the next challenge is how to query the knowledge graph based on a user question. Often the user question cannot be mapped directly to a database query in the knowledge graph. We want a chatbot to be able to answer a wide range of very different questions.
Approach 1: Generating the query using an LLM
We can use an LLM to generate the graph query. This is done by providing the schema of the knowledge graph in the prompt and letting the LLM generate the query based on that schema. Often the LLM is poor at generating the correct directions of the relationships in the graph; however, if you are using a Neo4j database, LangChain has a cypher query corrector that can fix this.
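As a minimal sketch, this is roughly what schema-aware query generation looks like with LangChain's GraphCypherQAChain; imports and parameters vary between versions, so treat them as assumptions:

```python
# Sketch: schema-aware Cypher generation. The chain injects the graph
# schema into the prompt, and validate_cypher=True applies the cypher
# query corrector that fixes relationship directions.
from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
from langchain_openai import ChatOpenAI

graph = Neo4jGraph(url="bolt://localhost:7687",
                   username="neo4j", password="password")
llm = ChatOpenAI(model="gpt-4o", temperature=0)

chain = GraphCypherQAChain.from_llm(
    llm,
    graph=graph,
    validate_cypher=True,           # correct relationship directions
    allow_dangerous_requests=True,  # acknowledge generated queries run unchecked
)
result = chain.invoke({"query": "Which spare parts fit the CR 95 pump?"})
print(result["result"])
```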
When testing this, I found it to be bad at generating the correct parameters for the graph query. However, experimentation also showed that most of these errors can be corrected by doing a vector search on the nodes and adding the results to the prompt used for generating the query. I added sentences like "{entity from question} maps to {existing entity} in database" to the query generation prompt, and it greatly improved the system. A prerequisite for doing this is to do named entity extraction and mapping first. I have described how we do this in my other post How to make sense of what a user is asking for in a LLM chatbot.
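A sketch of how such hint sentences could be built, assuming a vector index over node names that follows LangChain's vector-store interface (the extracted entities come from a prior named-entity extraction step):

```python
# Sketch: building "maps to" hints for the query-generation prompt.
def entity_hints(question_entities, vector_index, k=1):
    """Map entities from the user question to existing graph node names."""
    hints = []
    for entity in question_entities:
        # Nearest node name in the embedding space of node names.
        match = vector_index.similarity_search(entity, k=k)[0]
        hints.append(f'"{entity}" maps to "{match.page_content}" in database')
    return "\n".join(hints)

# Appended to the Cypher-generation prompt, this produces lines like:
# "CR95 pump" maps to "CR 95" in database
```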
Another option for correcting errors is to use an automated feedback mechanism: show the LLM the error message and ask it to correct the query. See more details here: Agent - planning with feedback
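A minimal sketch of such a feedback loop, where generate_cypher is a hypothetical function wrapping the LLM call and graph.query runs raw Cypher (as in LangChain's Neo4jGraph):

```python
# Sketch of an automated feedback loop: run the generated query, and on
# failure feed the database error back to the LLM for a corrected query.
def query_with_feedback(graph, question, generate_cypher, max_retries=3):
    error = None
    for _ in range(max_retries):
        cypher = generate_cypher(question, previous_error=error)
        try:
            return graph.query(cypher)
        except Exception as exc:
            error = str(exc)  # shown to the LLM on the next attempt
    raise RuntimeError(f"No working query after {max_retries} attempts: {error}")
```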
My approach to generating the query has been to use a generic pretrained LLM; however, many specialized models have been trained for this task that can be fine-tuned and might provide better accuracy and performance. Try searching for "Text2Cypher" or "Text2SPARQL" on Hugging Face, depending on your query language.
There are many issues with generating the graph query, which I think make it unsuitable in many cases:
- Security: Lack of control over the generated query
- Cost: A big LLM model is expensive to run
- Complexity: The system is difficult to understand and maintain
- Ease of debugging: Many different queries can be generated, and it can be difficult to get an overview of the usage, problems and why the system is not working as expected
- Database optimization: It is difficult to optimize the database for performance when you do not know what queries might be generated
- Performance: An LLM can be slow at generating the query
For these reasons I prefer an alternative approach to generating the query.
Approach 2: Having a library of queries that can be used
Another strategy is to use function calling to select a query and generate a response. However, function calling does not work well when there are too many functions (sometimes called "tools") to choose from. For this reason, we can implement a vector index that searches for the top tools to use before the LLM function-calling step. A function in this case is a knowledge-graph query wrapped in a function with well-defined inputs and outputs.
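A minimal sketch of such a query library, where the tool names, the queries, and the vector-index interface are illustrative assumptions:

```python
# Sketch of a query library: each tool wraps a hand-written, optimized
# Cypher query; a vector index over tool descriptions shortlists
# candidates before the LLM's function-calling step.
from dataclasses import dataclass

@dataclass
class GraphTool:
    name: str
    description: str  # embedded and searched to shortlist tools
    cypher: str       # written and optimized by developers

TOOLS = [
    GraphTool(
        name="spare_parts_for_product",
        description="List spare parts for a given product name.",
        cypher="MATCH (p:Product {name: $product})-[:HAS_SPARE_PART]->(s) "
               "RETURN s.name AS spare_part",
    ),
    GraphTool(
        name="products_in_range",
        description="List all products belonging to a product range.",
        cypher="MATCH (r:ProductRange {name: $range})<-[:PART_OF]-(p) "
               "RETURN p.name AS product",
    ),
]

def shortlist_tools(question, tool_index, k=3):
    # Only the top-k matching tools are exposed to function calling,
    # keeping the tool list small enough for the LLM to choose well.
    return tool_index.similarity_search(question, k=k)
```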
The benefit is that these tools can be made safe and efficient, since the queries are written (or generated) and optimized by developers, and the tool only takes parameters from the LLM. The downside is that it is difficult to make such a system answer a wide range of unpredictable user questions; this approach works best when the system has a narrow purpose.
We need the system to answer a wide range of unpredictable user questions, so the next challenge is how to grow the library of tools without requiring too much manual human effort. Here we can combine the library idea with the first strategy of generating the query using an LLM.
Approach 3: Combining them
The two approaches can be combined into one. Using a library of queries but generating new queries if a relevant query cannot be found.
The top path with the blue background uses approach 2: tool calling with a query library. When no good query is found, we use approach 1 to generate a new query. If the generated query provides a good answer, we save it in the graph library for later manual inspection and approval or rejection (or we find a way to auto-approve them).
In the figure above, you see two decision points, and it is important to handle both of them well. The first decision point, "Has good query?", checks whether one of the queries returned from the graph library can generate relevant context for the question. To do this, we need to run the query and measure the relevance of the results to the question.
Similarly, for the second decision point, "Save graph query?", we need to make sure the query runs successfully and provides relevant and correct responses. Doing this well gives a better answer for the user, but it also enables growing the graph library, since we can quickly experiment with and test newly generated queries. The software functions for evaluating answer relevance can provide value in multiple places in the system: for filtering, as well as for offline evaluation and online monitoring of the retrieval system.
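A minimal sketch of a relevance check reusable at both decision points. Here relevance is scored as embedding similarity between the question and the serialized query results; embed is a hypothetical function returning normalized vectors, the threshold is an assumption, and an LLM-as-judge could be swapped in for higher quality:

```python
# Sketch: a reusable relevance check for the two decision points.
def is_relevant(question, query_results, embed, threshold=0.75):
    """Return True if the query results look relevant to the question."""
    if not query_results:
        return False  # an empty result can never ground an answer
    context = "\n".join(str(row) for row in query_results)
    q_vec, c_vec = embed(question), embed(context)
    similarity = sum(a * b for a, b in zip(q_vec, c_vec))  # cosine for unit vectors
    return similarity >= threshold
```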
Serializing the results into the prompt
The result of a query is a set of nodes and relationships from the knowledge graph. For the LLM to use it, the result needs to be serialized into the prompt so it can be used to generate the answer. This can be done in many ways.
A good approach is to serialize the returned data into tables and then insert it into the prompt. For instance, a node table with name, description and attributes and a relationship table with the relationship type and the nodes it connects.
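A minimal sketch of this table serialization, assuming nodes arrive as dicts and relationships as (source, type, target) triples; the shapes are assumptions:

```python
# Sketch: serializing query results into two markdown tables for the prompt.
def serialize_graph_results(nodes, relationships):
    lines = ["| Name | Description |", "| --- | --- |"]
    lines += [f"| {n['name']} | {n.get('description', '')} |" for n in nodes]
    lines += ["", "| Source | Relationship | Target |", "| --- | --- | --- |"]
    lines += [f"| {s} | {rel} | {t} |" for s, rel, t in relationships]
    return "\n".join(lines)

prompt_context = serialize_graph_results(
    nodes=[{"name": "CR 95", "description": "Vertical multistage pump"}],
    relationships=[("CR 95", "HAS_SPARE_PART", "Shaft seal kit")],
)
```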
Ensuring contextual relevance and managing prompt length: There might be many nodes returned from the query, but only a few of them are relevant to the user's question. The relevant nodes can be selected by vector similarity scoring or semantic reranking, keeping only the nodes most similar and relevant to the user question. Tokens can be counted, and the number of nodes included can be controlled dynamically to ensure an optimal prompt length.
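A sketch of fitting the highest-scoring nodes into a token budget, where score is a hypothetical similarity or reranking function and the model name passed to tiktoken is an assumption:

```python
# Sketch: fill a token budget with the highest-scoring nodes.
import tiktoken

def select_nodes(question, nodes, score, max_tokens=2000):
    enc = tiktoken.encoding_for_model("gpt-4o")
    selected, used = [], 0
    # Spend the budget on the most relevant nodes first.
    for node in sorted(nodes, key=lambda n: score(question, n), reverse=True):
        cost = len(enc.encode(str(node)))
        if used + cost > max_tokens:
            break
        selected.append(node)
        used += cost
    return selected
```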
Conclusion
It is definitely possible to improve a RAG system by integrating a knowledge graph; however, many of the benefits of the knowledge graph can also be achieved with other approaches that might be easier to implement and operationalize. For our project, we have decided to try to solve our challenges using function calling and the advanced search functionality, such as scoring profiles and filters, in our vector search engine (Azure AI Search).
In the longer term, I believe migrating to a KG as the main data source will simplify the system and make it easier to maintain and improve. It moves complexity from the inference part of the system to the data engineering part. Understanding the data and the relationships within it is a key part of building a useful, value-adding system.
By combining the query library approach with the query generation approach, we can build a system that is both maintainable and able to evolve and answer a wide and unpredictable range of user questions.
More to come on this topic in the future.