💥Knowledge Conflicts for LLMs: A Survey

📢News 9/21/2024: Our paper got accepted by EMNLP 2024 as a Main Conference Paper!

📢We've missed your work? Contact Rongwu directly!

This is the repository for the survey paper: Knowledge Conflicts for LLMs: A Survey. ➡️ [中文介绍@机器之心]

🌟Star us for future lookups!🌟

Rongwu Xu¹*, Zehan Qi¹*, Zhijiang Guo², Cunxiang Wang³, Hongru Wang⁴, Yue Zhang³ and Wei Xu¹

1. Tsinghua University; 2. University of Cambridge; 3. Westlake University; 4. The Chinese University of Hong Kong
(* Equal Contribution)

📝 Citation

If you find our survey useful, please consider citing:

@article{xu2024knowledge,
  title={Knowledge Conflicts for LLMs: A Survey},
  author={Xu, Rongwu and Qi, Zehan and Wang, Cunxiang and Wang, Hongru and Zhang, Yue and Xu, Wei},
  journal={arXiv preprint arXiv:2403.08319},
  year={2024}
}

❤️ Recap

We investigate three types of knowledge conflicts: context-memory conflict, inter-context conflict, and intra-memory conflict.

Context-memory conflict: Contextual knowledge (context) can conflict with the parametric knowledge (memory) encapsulated within the LLM's parameters.
Inter-context conflict: Conflict among various pieces of contextual knowledge (e.g., noise, outdated information, misinformation, etc.).
Intra-memory conflict: LLM's parametric knowledge may yield divergent responses to differently phrased queries, which can be attributed to the conflicting knowledge embedded within the LLM's parameters.

This survey reviews the literature on the causes, behaviors, and possible solutions to knowledge conflicts.

Taxonomy of knowledge conflicts: we consider three distinct types of conflicts and analysis causes, behaviors, and solutions.

Type I: Context-memory conflict

I-i: Causes

Temporal Misalignment

Mind the gap: Assessing temporal generalization in neural language models, Lazaridou et al., Neurips 2021.[Paper]
Time Waits for No One! Analysis and Challenges of Temporal Misalignment, Luu et al., NAACL 2022. [Paper]
Time-aware language models as temporal knowledge bases, Dhingra et al., TACL 2022.[Paper]
Towards continual knowledge learning of language models, Jang et al., ICLR 2022, [Paper]
Temporalwiki: A lifelong benchmark for training and evaluating ever-evolving language models, Jang et al., EMNLP 2023, [Paper]
Streamingqa: A benchmark for adaptation to new knowledge over time in question answering models, Liska et al., ICML 2022. [Paper]
Can LMs Generalize to Future Data? An Empirical Analysis on Text Summarization, Cheang et al., EMNLP 2023, [Paper]
RealTime QA: What's the Answer Right Now?, Kasai et al., Neurips 2024, [Paper]

Misinformation Pollution

Attacking open-domain question answering by injecting misinformation, Pan et al., AACL 2023, [Paper]
On the risk of misinformation pollution with large language models, Pan et al., EMNLP 2023, [Paper]
Defending against misinformation attacks in open-domain question answering, Weller et al., EACL 2024, [Paper]
The earth is flat because...: Investigating llms’ belief towards misinformation via persuasive conversation, Xu et al., ACL 2024, [Paper]
Prompt injection attack against llm-integrated applications, Liu et al., arXiv 2024, [Paper]
Benchmarking and defending against indirect prompt injection attacks on large language models, Yi et al., arXiv 2024, [Paper]
Adaptive chameleon or stubborn sloth: Unraveling the behavior of large language models in knowledge conflicts, Xie et al., ICLR 2024, [Paper]
Poisoning web-scale training datasets is practical, Carlini et al., S&P 2024, [Paper]
Can llm-generated misinformation be detected, Chen and Shu, ICLR 2024, [Paper]

I-ii: (Behavior) Analysis

ODQA

Entity-Based Knowledge Conflicts in Question Answering, Longpre et al., EMNLP 2021, [Paper]
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence, Chen et al., EMNLP 2022, [Paper]
Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts, Tan et al., arXiv 2024, [Paper]

General

Adaptive chameleon or stubborn sloth: Unraveling the behavior of large language models in knowledge conflicts, Xie et al., ICLR 2024, [Paper]
RESOLVING KNOWLEDGE CONFLICTS IN LARGE LANGUAGE MODELS, Wang et al., arXiv 2023, [Paper]
Intuitive or Dependent? Investigating LLMs’ Behavior Style to Conflicting Prompts, Ying et al., arXiv 2024, [Paper]
“Merge Conflicts!” Exploring the Impacts of External Distractors to Parametric Knowledge Graphs, Qian et al., arXiv 2023, [Paper]
Studying Large Language Model Behaviors Under Realistic Knowledge Conflicts, arXiv 2024, [Paper]
Characterizing mechanisms for factual recall in language models, EMNLP 2023, [Paper]
Context versus Prior Knowledge in Language Models, ACL 2024, [Paper]
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models, arXiv 2024, [Paper]
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM, Su et al., arXiv 2024, [Paper]

I-iii: (Mitigating) Solutions

Faithful to context

Fine-tuning

Large Language Models with Controllable Working Memory, Li et al., ACL 2023, [Paper]
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models, Gekhman et al., EMNLP 2023, [Paper]
Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment, Xue et al., EMNLP 2023, [Paper]
Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change, Su et al., EMNLP 2022, [Paper]

Prompting

Context-faithful Prompting for Large Language Models, Zhou et al., EMNLP 2023, [Paper]

Decoding

Trusting Your Evidence: Hallucinate Less with Context-aware Decoding, Shi et al., NAACL 2024, [Paper]
Contrastive Decoding: Open-ended Text Generation as Optimization, Li et al., ACL 2023, [Paper]
Tug-of-war between knowledge: Exploring and resolving knowledge conflicts in retrieval-augmented language models, COLING 2024, [Paper]

Inference-time intervention (e.g., tuning heads)

Characterizing mechanisms for factual recall in language models, EMNLP 2023, [Paper]
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models, arXiv 2024, [Paper]

Type II: Inter-context conflict

II-i: Causes

Misinformation

Synthetic lies: Understanding ai-generated misinformation and evaluating algorithmic and human solutions, Zhou et al., CHI 2023, [Paper]
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation, Vergho et al., arXiv 2024, [Paper https://arxiv.org/abs/2401.06920]

Outdated information

A dataset for answering time-sensitive questions, Chen et al., Neurips 2021, [Paper]
SituatedQA: Incorporating extra-linguistic contexts into QA, Zhang et al., EMNLP 2021, [Paper]
Streamingqa: A benchmark for adaptation to new knowledge over time in question answering models, Liska et al., ICML 2022. [Paper]
RealTime QA: What's the Answer Right Now?, Kasai et al., Neurips 2024, [Paper]

II-ii: (Behavior) Analysis

Performance impact

SituatedQA: Incorporating extra-linguistic contexts into QA, Zhang et al., EMNLP 2021, [Paper]
Synthetic Disinformation Attacks on Automated Fact Verification Systems Authors, AAAI 2022, [Paper]
Attacking open-domain question answering by injecting misinformation, Pan et al., AACL 2023, [Paper]
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence, Chen et al., EMNLP 2022, [Paper]
Tug-of-war between knowledge: Exploring and resolving knowledge conflicts in retrieval-augmented language models, Jin et al., LREC-COLING 2024, [Paper]
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM, Su et al., arXiv 2024, [Paper]

Detection ability

CDConv: A Benchmark for Contradiction Detection in Chinese Conversations, Zheng et al., EMNLP 2022, [Paper]
ContraDoc: understanding self-contradictions in documents with large language models, Li et al., arXiv 2023, [Paper https://arxiv.org/abs/2311.09182]
What Evidence Do Language Models Find Convincing?, Wan et al., ACL 2024, [Paper]
Tug-of-war between knowledge: Exploring and resolving knowledge conflicts in retrieval-augmented language models, Jin et al., LREC-COLING 2024, [Paper]

II-III: (Mitigating) Solutions

Eliminating Conflict

WikiContradiction: Detecting Self-Contradiction Articles on Wikipedia, Hsu et al., IEEE Big Data 2021, [Paper]
Topological analysis of contradictions in text, Wu et al., SIGIR 2022, [Paper]
FACTOOL: Factuality Detection in Generative AI-A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios, Chern et al., arXiv 2023, [Paper]
Detecting Misinformation with LLM-Predicted Credibility Signals and Weak Supervision, Leite et al., arXiv 2023, [Paper]

Improving Robustness

Why So Gullible? Enhancing the Robustness of Retrieval-Augmented Models against Counterfactual Noise, Hong et al., arXiv 2024, [Paper]
Defending Against Disinformation Attacks in Open-Domain Question Answering, Weller et al., EACL 2024, [Paper]

Type III: Intra-memory conflict

III-i: Causes

Bias in Training Corpora

On the dangers of stochastic parrots: Can language models be too big, Bender et al., FACCT 2021, [Paper]
Ethical and social risks of harm from language models, Weidinger et al., arXiv 2021, [Paper]
Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions, Elazar et al., arXiv 2023, [Paper]
Studying large language model generalization with influence functions, Grosse et al., arXiv 2023, [Paper]
How pre-trained language models capture factual knowledge? a causal-inspired analysis, Li et al., ACL 2022, [Paper]
Impact of co-occurrence on factual knowledge of large language models, Kang and Choi, EMNLP 2023, [Paper]

Decoding Strategy

Factuality enhanced language models for open-ended text generation, Lee et al., NeurIPS 2022, [Paper]
A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions, Huang et al., arXiv 2023, [Paper]

Knowledge Editing

Unveiling the pitfalls of knowledge editing for large language models, Li et al., ICLR 2024, [Paper]
Editing large language models: Problems, methods, and opportunities, Yao et al., EMNLP 2023, [Paper]

III-ii: (Behavior) Analysis

Self-Inconsistency

Measuring and improving consistency in pretrained language models, Elazar et al., TACL 2021, [Paper]
Methods for measuring, updating, and visualizing factual beliefs in language models, Hase et al., EACL 2023, [Paper]
Knowing what llms do not know: A simple yet effective self-detection method, Zhao et al., NAACL 2024, [Paper]
Statistical knowledge assessment for large language models, Dong et al., NeurIPS 2023, [Paper]
Benchmarking and improving generator-validator consistency of language models, Li et al., ICLR 2024, [Paper]
How pre-trained language models capture factual knowledge? a causal-inspired analysis, Li et al., ACL 2022, [Paper]
Impact of co-occurrence on factual knowledge of large language models, Kang and Choi, EMNLP 2023, [Paper]
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM, Su et al., arXiv 2024, [Paper]

Latent Representation of Knowledge

Dola: Decoding by contrasting layers improves factuality in large language models, Chuang et al., ICLR 2024, [Paper]
Inferencetime intervention: Eliciting truthful answers from a language model, Li et al., NeurIPS 2023, [Paper]

Cross-lingual Inconsistency

Cross-lingual knowledge editing in large language models, Wan et al., arXiv 2023, [Paper]
Cross-lingual consistency of factual knowledge in multilingual language models, Qi et al., EMNLP 2023, [Paper]

III-iii: (Mitigating) Solutions

Improving Consistency

Measuring and improving consistency in pretrained language models, Elazar et al., TACL 2021, [Paper]
Benchmarking and improving generator-validator consistency of language models, Li et al., ICLR 2024, [Paper]
Improving language models meaning understanding and consistency by learning conceptual roles from dictionary, Jang et al., EMNLP 2023, [Paper]
Enhancing selfconsistency and performance of pre-trained language models through natural language inference, Mitchell et al., EMNLP 2022, [Paper]
Knowing what llms do not know: A simple yet effective self-detection method, Zhao et al., NAACL 2024, [Paper]

Improving Factuality

Decoding by contrasting layers improves factuality in large language models, Chuang et al., ICLR 2024, [Paper]
Inferencetime intervention: Eliciting truthful answers from a language model, Li et al., NeurIPS 2023, [Paper]

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
figures		figures
README.md		README.md

pillowsofwind/Knowledge-Conflicts-Survey

Folders and files

Latest commit

History

Repository files navigation