
Knowledge Graphs and LLMs: How Can We Move from Structured Knowledge to AI-Generated Answers? – Part II.

Knowledge Graphs are therefore an advantage where information does not come from a single source but is scattered across multiple documents, and where identifying the links between individual data points is essential for an accurate response. While vector search primarily measures semantic similarity between individual texts, Knowledge Graphs rely on explicit, structured relationships between entities, which makes them well suited to supporting complex queries and deeper analysis.

Vector search works particularly well for fact-based queries, for example, retrieving a person’s position or the date of an event. However, for more complex questions that require inference, such as “Who sits on multiple boards together with John Doe?”, vector-based search can be less effective. The problem is that vectors cannot capture logical relationships across different sources. If information about John Doe’s board membership is in one document and another person’s board membership is in another, the system cannot automatically link them unless one document contains a direct textual reference to the other.

A Knowledge Graph, on the other hand, stores not only entities (e.g. people, organizations) but also the relationships between them. A query run on a Knowledge Graph can leverage this structured information directly. For instance, if the graph explicitly records board memberships for John Doe and other people, a single query can retrieve all the individuals who sit on multiple boards with him. This type of search can also reveal relationships that are scattered across the text of documents or spread over multiple sources. In essence, rather than relying on matches within a single document, a well-constructed graph links all relevant data points, allowing for more efficient searches and more accurate results.
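To make this concrete, here is a small sketch that models the board-membership example as a graph and answers the question by traversing relationships rather than matching text. The networkx library, the names, and the board labels are illustrative assumptions; in practice, the same question would typically be a short query against a graph database.

```python
# A minimal sketch, assuming a toy graph built with the networkx library.
# The people, boards, and membership edges below are invented for
# illustration; a production Knowledge Graph would live in a graph database.
import networkx as nx

g = nx.Graph()

# Person -- MEMBER_OF -- Board edges, as they might be extracted from
# several separate documents.
g.add_edges_from([
    ("John Doe", "Acme Corp Board"),
    ("John Doe", "Beta Ltd Board"),
    ("Jane Roe", "Acme Corp Board"),
    ("Jane Roe", "Beta Ltd Board"),
    ("Max Muster", "Acme Corp Board"),
])

# "Who sits on multiple boards together with John Doe?"
shared_counts = {}
for board in g.neighbors("John Doe"):
    for person in g.neighbors(board):
        if person != "John Doe":
            shared_counts[person] = shared_counts.get(person, 0) + 1

print({p: n for p, n in shared_counts.items() if n > 1})
# {'Jane Roe': 2} -- found by traversing relationships, not by text similarity
```

Because the answer is assembled from explicit edges, it naturally covers connections scattered across several source documents, which is exactly where plain vector search struggles.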

Automated graph building – easier with LLMs, but not perfect

The advent of LLMs has also revolutionized the process of building Knowledge Graphs. Today, tools such as LLMGraphTransformer can automatically extract entities and relationships from documents and represent them in a graph-ready form. At the same time, modern cloud-based graph databases can be set up in minutes and display the resulting nodes and edges in user-friendly visual interfaces. In this way, much of the manual data processing of the past can be “outsourced” to automation, saving considerable time and human resources.
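As a rough illustration, the sketch below follows the LangChain-style LLMGraphTransformer interface; the model name, the sample sentence, and the exact import paths are assumptions that may vary between library versions.

```python
# A minimal sketch of LLM-based graph extraction, assuming the
# LangChain-style LLMGraphTransformer API; the model name and sample
# text are illustrative assumptions.
from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

docs = [Document(page_content="John Doe joined the board of Acme Corp in 2021.")]

# The LLM proposes nodes (e.g. Person, Organization) and relationships
# (e.g. board membership) in a graph-ready format.
graph_documents = transformer.convert_to_graph_documents(docs)

for gd in graph_documents:
    print(gd.nodes)
    print(gd.relationships)
```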

However, this does not mean that LLM-based graph construction is perfect. In corporate or critical systems, manually validating the automatically extracted information is still largely unavoidable. Machine Learning models usually identify only the most common entity types (e.g., persons, organizations) reliably. As a result, applying them in a slightly atypical domain may require further fine-tuning, sometimes even on a per-project basis. This is particularly relevant when handling specialized data sets in legal, financial, or medical fields.

In the case of a Knowledge Graph built from legal data, you may want to see not only the relationships between individuals and institutions, but also which contracts they are involved in, which court judgments they are mentioned in, or which legislation they are linked to. For this purpose, the default models may not be precise enough, and it may be necessary to define specific entities and relationships and to tailor the model to the specific legal context.
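Staying with the LLMGraphTransformer-style pipeline sketched above, one way to approach this is to constrain extraction to a predefined legal schema; the node and relationship labels below are hypothetical, and the allowed_nodes / allowed_relationships parameters are assumptions about that API rather than a prescribed configuration.

```python
# A minimal sketch of constraining extraction to a legal schema; the
# labels are hypothetical, and allowed_nodes / allowed_relationships are
# assumed from the LangChain-style LLMGraphTransformer API.
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

legal_transformer = LLMGraphTransformer(
    llm=ChatOpenAI(model="gpt-4o", temperature=0),
    allowed_nodes=["Person", "Organization", "Contract", "Judgment", "Legislation"],
    allowed_relationships=["PARTY_TO", "MENTIONED_IN", "GOVERNED_BY"],
)

# Passing documents through legal_transformer.convert_to_graph_documents(...)
# now only yields entities and relationships from this predefined schema,
# which should still be reviewed by experts before entering a production graph.
```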

Although graph building supported by LLMs is faster and more efficient, expert verification and fine-tuning remain essential to ensure reliability. This brings us somewhat back to the initial problem.

LLMs and Knowledge Graphs: Competitors or Allies?

Given the different operating principles of Knowledge Graphs and LLMs, it is legitimate to ask whether these technologies are competitors or allies. We have seen that Knowledge Graphs store information in a structured, hierarchical way, which is important for the accurate, traceable retrieval of context. In contrast, LLMs (like Machine Learning models in general) learn patterns from huge amounts of text. This makes them highly flexible in language processing but does not guarantee the reliability of the information. This difference raises the dilemma of whether LLMs can replace Knowledge Graphs, or whether Knowledge Graphs are in fact needed to make generative models more reliable.

For now, it seems that Knowledge Graphs and LLMs are not competitors but complementary technologies. Knowledge Graphs provide an accurate structure whose sources remain traceable, while LLMs support more complex text processing tasks with their linguistic flexibility and generative capabilities. RAG and GraphRAG solutions are excellent examples of how these two worlds can be effectively combined. At its simplest, such a system first retrieves relevant information from the Knowledge Graph (or another data source), and an LLM then uses this context to provide a comprehensive, coherent answer. The risk of hallucinations is also reduced, as the answers are grounded in verified sources.
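A minimal sketch of this retrieve-then-generate pattern is shown below, reusing the toy board-membership graph from earlier; the model name, the prompt wording, and the shared_boards_with helper are illustrative assumptions, not a reference GraphRAG implementation.

```python
# A minimal retrieve-then-generate sketch: structured facts come from the
# graph, and the LLM only phrases the answer. The model name, prompt wording,
# and the shared_boards_with() helper are illustrative assumptions.
import networkx as nx
from langchain_openai import ChatOpenAI

def shared_boards_with(g: nx.Graph, person: str) -> dict[str, int]:
    """Count how many boards each other person shares with `person`."""
    counts: dict[str, int] = {}
    for board in g.neighbors(person):
        for other in g.neighbors(board):
            if other != person:
                counts[other] = counts.get(other, 0) + 1
    return counts

g = nx.Graph()
g.add_edges_from([
    ("John Doe", "Acme Corp Board"), ("John Doe", "Beta Ltd Board"),
    ("Jane Roe", "Acme Corp Board"), ("Jane Roe", "Beta Ltd Board"),
])

# 1) Retrieve structured facts from the Knowledge Graph ...
facts = shared_boards_with(g, "John Doe")
context = "; ".join(f"{p} shares {n} board(s) with John Doe" for p, n in facts.items())

# 2) ... then let the LLM phrase a grounded answer from that context only.
llm = ChatOpenAI(model="gpt-4o", temperature=0)
answer = llm.invoke(
    f"Answer using only these facts: {context}\n"
    "Question: Who sits on multiple boards together with John Doe?"
)
print(answer.content)
```

Keeping the retrieval step separate from generation is what allows the answer to be traced back to specific nodes and edges in the graph rather than to the model’s internal parameters.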

At the same time, the combination of Knowledge Graphs and LLMs poses significant data management and resource challenges. The effectiveness of such a system depends largely on the sources it relies on and on how accurate, reliable, and unbiased they are. If a Knowledge Graph contains erroneous or biased data, the answers generated by the LLM may be misleading.

However, these technologies are constantly evolving. In recent years, solutions have become more accessible and easier to implement, and are expected to support professionals in the legal, business and scientific fields with increasingly reliable and accurate results.

To sum up, current trends suggest that the combination of Knowledge Graphs and LLMs is fundamentally reshaping information processing and question-answering systems. Until a few years ago, building a reliable Knowledge Graph required lengthy, largely manual work. In contrast, automated tools are now available to create such structured knowledge bases in minutes. Cloud-based graph databases and LLM-based entity and relationship mining methods allow data to be quickly and efficiently organized into informative graph structures.

RAG and GraphRAG methods ensure that LLMs work from verifiable sources, thus reducing the rate of hallucinations. Although this does not guarantee the factual infallibility of such systems, it certainly provides more reliable results than simple response generation. This can be of great importance in areas such as law or finance, where the accuracy of information and the traceability of sources are basic requirements.

Although automated graph building is far from perfect and still needs fine-tuning and expert verification, the trends today seem clear. There is a strong possibility that the AI systems of the near future will increasingly be based on hybrid solutions, where Knowledge Graphs provide the structured information backbone and LLMs provide natural-language, context-rich answers.

This approach will open new horizons for corporate decision support, legal analysis, health research and many other areas, exploiting both verifiable, structured knowledge and the capabilities of advanced Language Models.


István ÜVEGES, PhD is a Computational Linguist researcher and developer at MONTANA Knowledge Management Ltd. and a researcher at the HUN-REN Centre for Social Sciences. His main interests include the social impacts of Artificial Intelligence (Machine Learning), the nature of Legal Language (legalese), the Plain Language Movement, and sentiment- and emotion analysis.