Key Words:
NLP, LLM, Generative Pre-training, KGs, Roadmap, Bidirectional Reasoning
Abstract:
LLMs are black-box models that often fail to capture and access factual knowledge. KGs are structured knowledge models that explicitly store rich factual knowledge. Combinations of KGs and LLMs fall into three frameworks:
- KG-enhanced LLMs: incorporate KGs during the pre-training and inference stages to provide external knowledge, or use KGs to analyze LLMs and provide interpretability.
- LLM-augmented KGs: apply LLMs to KG embedding, KG completion, KG construction, KG-to-text generation, and KG question answering (KGQA).
- Synergized LLMs + KGs: combine the two to enhance performance in knowledge representation and reasoning.
Background
Introduction of LLMs
Encoder-only LLMs
Use only the encoder to encode the sentence and capture the relationships between words.
These models are trained to predict masked words in an input sentence. Typical tasks: text classification and named entity recognition.
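The masked-word objective above can be sketched in a few lines. This is only an illustration of how a training pair is built, not a model: the `[MASK]` symbol, the `mask_tokens` helper, and the whitespace tokenization are assumptions made for the example.

```python
from typing import Dict, List, Tuple

MASK = "[MASK]"  # assumed placeholder; the real symbol depends on the tokenizer


def mask_tokens(tokens: List[str], positions: List[int]) -> Tuple[List[str], Dict[int, str]]:
    """Hide the tokens at `positions` and return (masked input, words to recover)."""
    masked = list(tokens)
    targets: Dict[int, str] = {}
    for pos in positions:
        targets[pos] = tokens[pos]  # the word the encoder is trained to predict
        masked[pos] = MASK
    return masked, targets


tokens = "knowledge graphs store facts as triples".split()
masked, targets = mask_tokens(tokens, [1, 3])
print(masked)   # ['knowledge', '[MASK]', 'store', '[MASK]', 'as', 'triples']
print(targets)  # {1: 'graphs', 3: 'facts'}
```

An encoder-only model sees `masked` and is scored on how well it recovers `targets`, using context from both sides of each gap.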
Encoder-decoder LLMs
Adopt both encoder and decoder modules. The encoder encodes the input into a hidden space, and the decoder generates the target output text. Typical tasks: summarization, translation, and question answering.
Decoder-only LLMs
Adopt only the decoder module to generate the target output text.
Prompt Engineering
A prompt is a sequence of natural language inputs for an LLM that is specified for the task, including:
- Instruction: instructs the model to do a specific task.
- Context: provides the context for the input text or few-shot examples.
- Input text: the text that needs to be processed by the model.
Prompt engineering improves the capacity of LLMs on diverse, complex tasks. Chain-of-thought (CoT) prompting enables complex reasoning through intermediate reasoning steps.
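As a rough sketch, the three prompt parts can be assembled with a small helper. The function name `build_prompt`, the field labels, and the trailing "Let's think step by step." cue (a commonly used zero-shot CoT trigger) are assumptions for illustration, not a fixed format.

```python
from typing import Optional


def build_prompt(
    instruction: str,
    input_text: str,
    context: Optional[str] = None,
    chain_of_thought: bool = False,
) -> str:
    """Join instruction, optional context, and input text into one prompt string."""
    parts = [f"Instruction: {instruction}"]
    if context:
        # Context may hold background text or few-shot examples.
        parts.append(f"Context: {context}")
    parts.append(f"Input: {input_text}")
    if chain_of_thought:
        # Ask the model to write out intermediate reasoning steps.
        parts.append("Let's think step by step.")
    return "\n".join(parts)


prompt = build_prompt(
    instruction="Classify the sentiment as positive or negative.",
    input_text="The battery lasts all day.",
    context="Example: 'The screen cracked in a week.' -> negative",
    chain_of_thought=True,
)
print(prompt)
```

Keeping the parts separate makes it easy to swap the few-shot examples or toggle CoT without rewriting the whole prompt.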
Introduction of KGs
Roadmap
KG-enhanced LLMs
- Pre-training stage
  - Integrating KGs into Training Objective
  - Integrating KGs into LLMs Input
  - KGs Instruction-tuning
- Inference stage
  - Retrieval-Augmented Knowledge Fusion
    - RAG
  - KGs Prompting
- Interpretability
  - KGs for LLM probing
  - KGs for LLM Analysis
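The inference-stage idea (retrieve relevant facts from a KG, then place them in the prompt) can be sketched as follows. The toy KG, the naive string-match retriever, and the prompt layout are all assumptions for illustration; a real system would use entity linking and a proper retriever.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Toy knowledge graph, hand-written for this example.
TOY_KG: List[Triple] = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "field", "physics"),
    ("Warsaw", "capital_of", "Poland"),
]


def retrieve_triples(question: str, kg: List[Triple]) -> List[Triple]:
    """Return triples whose head entity is mentioned in the question (naive match)."""
    q = question.lower()
    return [t for t in kg if t[0].lower() in q]


def kg_prompt(question: str, kg: List[Triple]) -> str:
    """Serialize the retrieved triples as text and prepend them to the question."""
    facts = [f"({h}, {r}, {t})" for h, r, t in retrieve_triples(question, kg)]
    return "Facts:\n" + "\n".join(facts) + f"\nQuestion: {question}\nAnswer:"


question = "Where was Marie Curie born?"
print(kg_prompt(question, TOY_KG))
```

The resulting string would be passed to an LLM as-is, so the model's weights are untouched and the KG can be updated independently.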
LLM-augmented KGs
Knowledge Graph embedding aims to map each entity and relation into a low-dimensional vector space.
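To make the embedding idea concrete, here is a minimal sketch using a TransE-style distance, where a true triple should satisfy head + relation ≈ tail. TransE is not named in these notes and is chosen here only as a familiar example; the 2-d vectors are hand-picked rather than learned.

```python
import math
from typing import Dict, List

# Hand-picked toy embeddings (assumption); real ones are learned from the KG.
entity_vec: Dict[str, List[float]] = {
    "Paris": [1.0, 0.0],
    "France": [1.0, 1.0],
    "Berlin": [0.0, 0.0],
}
relation_vec: Dict[str, List[float]] = {"capital_of": [0.0, 1.0]}


def transe_distance(head: str, relation: str, tail: str) -> float:
    """Euclidean norm of (head + relation - tail); lower means more plausible."""
    h, r, t = entity_vec[head], relation_vec[relation], entity_vec[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


print(transe_distance("Paris", "capital_of", "France"))   # 0.0
print(transe_distance("Berlin", "capital_of", "France"))  # 1.0
```

With these toy vectors the correct triple has the smaller distance, which is the property a trained embedding model is optimized to produce.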
- Text encoders for KG-related tasks
- LLMs process the original corpus and entities for KG construction.
  - End-to-End KG Construction
  - Distilling Knowledge Graphs from LLMs
- KG prompt, KG completion and KG reasoning.
  - PaE (LLM as Encoders)
  - PaG (LLM as Generators)
- LLM-augmented KG-to-text Generation
  - Leveraging Knowledge from LLMs
  - Constructing a Large Weakly KG-text Aligned Corpus
- LLM-augmented KG Question Answering
  - LLMs as Entity/relation Extractors
  - LLMs as Answer Reasoners
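The difference between the PaE and PaG items above can be illustrated by how a triple is turned into model input. This is a sketch under assumptions: the BERT-style `[CLS]`/`[SEP]` tokens and the helper names `pae_input` and `pag_input` are chosen for illustration, and the notes do not prescribe a specific format.

```python
CLS, SEP = "[CLS]", "[SEP]"  # BERT-style special tokens, assumed for illustration


def pae_input(head: str, relation: str, tail: str) -> str:
    """LLM as encoder: encode the full triple; a separate head scores its plausibility."""
    return f"{CLS} {head} {SEP} {relation} {SEP} {tail} {SEP}"


def pag_input(head: str, relation: str) -> str:
    """LLM as generator: give only (head, relation) and let the model generate the tail."""
    return f"{CLS} {head} {SEP} {relation} {SEP}"


print(pae_input("Paris", "capital of", "France"))
# [CLS] Paris [SEP] capital of [SEP] France [SEP]
print(pag_input("Paris", "capital of"))
# [CLS] Paris [SEP] capital of [SEP]
```

In the encoder setting each candidate tail needs its own scoring pass, while in the generator setting the missing entity is produced directly as text.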
Synergized LLMs + KGs
Synergized Knowledge Representation
Aims to design a synergized model that can represent knowledge from both LLMs and KGs.
Synergized Reasoning
- LLM-KG Fusion Reasoning