Awesome-LLM-Watermark

An up-to-date collection of papers on Large Language Model (LLM) watermarking

1. LLM watermark
- 1.1. Token-level watermark
- 1.2. Sentence-level watermark (sentence embedding-based watermark)
- 1.3. Model-level watermark
- 1.4. Watermarking detection
2. Attack for watermark
- 2.1. Watermark stealing attack
- 2.2. Watermark removal attack
- 2.3. Watermark spoofing attack
- 2.4. Robust watermark
3. Multi-bit watermark
4. Unbiased watermark
5. Analysis of LLM watermark
6. Survey

1. LLM watermark

1.1. Token-level watermark

A Watermark for Large Language Models
- ICML 2023
- http://arxiv.org/abs/2301.10226
Publicly Detectable Watermarking for Language Models
An Unforgeable Publicly Verifiable Watermark for Large Language Models
- ICLR 2024
- https://openreview.net/forum?id=gMLQwKDY3N
On the Reliability of Watermarks for Large Language Models
- ICLR 2024
- https://openreview.net/forum?id=DEJIDCmWOz
Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring
WatME: Towards Lossless Watermarking Through Lexical Redundancy
- ACL 2024
- https://aclanthology.org/2024.acl-long.496/
- Alias: X-Mark: Towards Lossless Watermarking Through Lexical Redundancy
Towards Optimal Statistical Watermarking
Who Wrote this Code? Watermarking for Code Generation
- ACL 2024
- https://aclanthology.org/2024.acl-long.268
Natural language watermarking via paraphraser-based lexical substitution
- Artificial Intelligence
- https://linkinghub.elsevier.com/retrieve/pii/S000437022300005X
Adaptive Text Watermark for Large Language Models
- ICML 2024
Duwak: Dual Watermarks in Large Language Models
- ACL Findings 2024
- https://aclanthology.org/2024.findings-acl.678
Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs
- NeurIPS Workshop 2024
- https://arxiv.org/pdf/2402.05864

1.2. Sentence-level watermark (sentence embedding-based watermark)

WaterPool: A Watermark Mitigating Trade-offs among Imperceptibility, Efficacy and Robustness
- http://arxiv.org/abs/2405.13517
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation
- NAACL 2024
- http://arxiv.org/abs/2310.03991
k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
- ACL Findings 2024
- http://arxiv.org/abs/2402.11399
A Semantic Invariant Robust Watermark for Large Language Models
- ICLR 2024
- http://arxiv.org/abs/2310.06356
A Robust Semantics-based Watermark for Large Language Model against Paraphrasing
- NAACL Findings 2024
- https://aclanthology.org/2024.findings-naacl.40
Context-aware Watermark with Semantic Balanced Green-red Lists for Large Language Models
- EMNLP 2024
- https://aclanthology.org/2024.emnlp-main.1260
Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models
- ICML 2024
- http://arxiv.org/abs/2402.18059
SEFD: Semantic-Enhanced Framework for Detecting LLM-Generated Text
DeepTextMark: Deep Learning based Text Watermarking for Detection of Large Language Model Generated Text
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
- IEEE S&P 2021
- https://ieeexplore.ieee.org/document/9519400/
PersonaMark: Personalized LLM watermarking for model protection and user attribution
REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models
- USENIX Security 2024

1.3. Model-level watermark

Provable Robust Watermarking for AI-Generated Text
- ICLR 2024
- http://arxiv.org/abs/2306.17439
Watermarking LLMs with Weight Quantization
EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models
Watermarking Counterfactual Explanations
Provably Robust Watermarks for Open-Source Language Models
Learning to Watermark LLM-generated Text via Reinforcement Learning

1.4. Watermarking detection

An Entropy-based Text Watermarking Detection Method
- ACL 2024
- https://aclanthology.org/2024.acl-long.630.pdf
WaterSeeker: Efficient Detection of Watermarked Segments in Large Documents

2. Attack for watermark

2.1. Watermark stealing attack

Large Language Model Watermark Stealing With Mixed Integer Programming
- ACSAC 2024
- http://arxiv.org/abs/2405.19677
Watermark Stealing in Large Language Models
- ICLR 2024 Workshop, ICML 2024
- http://arxiv.org/abs/2402.19361
Bypassing LLM Watermarks with Color-Aware Substitutions
- ACL 2024
- https://aclanthology.org/2024.acl-long.464

2.2. Watermark removal attack

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense
- NeurIPS 2023
- http://arxiv.org/abs/2303.13408
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models
- ACL 2024
- http://arxiv.org/abs/2402.14007
Watermark Smoothing Attacks against Language Models
- http://arxiv.org/abs/2407.14206
De-mark: Watermark Removal in Large Language Models
- https://arxiv.org/pdf/2410.13808
No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
- NeurIPS 2024
Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models
- ICML 2024
WaterPark: A Robustness Assessment of Language Model Watermarking
- http://arxiv.org/abs/2411.13425
$B^4$: A Black-Box Scrubbing Attack on LLM Watermarks
- http://arxiv.org/abs/2411.01222
Can AI-Generated Text be Reliably Detected?
Lost in Overlap: Exploring Watermark Collision in LLMs

2.3. Watermark spoofing attack

Discovering Clues of Spoofed LM Watermarks
- http://arxiv.org/abs/2410.02693
On the Learnability of Watermarks for Language Models
- ICLR 2024
- http://arxiv.org/abs/2312.04469

2.4. Robust watermark

Edit Distance Robust Watermarks for Language Models
- NeurIPS 2024
- https://openreview.net/pdf?id=FZ45kf5pIA
Waterfall: Framework for Robust and Scalable Text Watermarking
Pseudorandom Error-Correcting Codes
- http://arxiv.org/abs/2402.09370
Can Watermarked LLMs be Identified by Users via Crafted Prompts?
- https://openreview.net/forum?id=ujpAYpFDEA

3. Multi-bit watermark

Three Bricks to Consolidate Watermarks for Large Language Models
Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code
Advancing Beyond Identification: Multi-bit Watermark for Large Language Models
- NAACL 2024
- https://aclanthology.org/2024.naacl-long.224
Robust Multi-bit Natural Language Watermarking through Invariant Features
- ACL 2023
- https://aclanthology.org/2023.acl-long.117
Multi-Bit Distortion-Free Watermarking for Large Language Models
Towards Codable Watermarking for Injecting Multi-bits Information to LLMs
- ICLR 2024
- https://openreview.net/forum?id=JYu5Flqm9D
Robust Multi-bit Text Watermark with LLM-based Paraphrasers
PersonaMark: Personalized LLM watermarking for model protection and user attribution
CODEIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code
- EMNLP Findings 2024
- https://aclanthology.org/2024.findings-emnlp.541
Enhancing Watermarked Language Models to Identify Users
CredID: Credible Multi-Bit Watermark for Large Language Models Identification

4. Unbiased watermark

Unbiased Watermark for Large Language Models
- ICLR 2024
- http://arxiv.org/abs/2310.10669
Undetectable Watermarks for Language Models
- COLT 2024
- https://proceedings.mlr.press/v247/christ24a
Robust Distortion-free Watermarks for Language Models
- TMLR 2024
- https://openreview.net/forum?id=FpaCL1MO2C
A Watermark for Low-entropy and Unbiased Generation in Large Language Models
- https://openreview.net/forum?id=hTUrBJqECJ
A Resilient and Accessible Distribution-Preserving Watermark for Large Language Models
- ICML 2024
- https://arxiv.org/abs/2310.07710
Watermarking Language Models with Error Correcting Codes
Scalable watermarking for identifying large language model outputs
- Nature 2024
- https://www.nature.com/articles/s41586-024-08025-4
Multi-Bit Distortion-Free Watermarking for Large Language Models
Distortion-free Watermarks are not Truly Distortion-free under Watermark Key Collisions
- Alias: Pseudo- vs. True-Randomness: Rethinking Distortion-Free Watermarks of Language Models under Watermark Key Collisions

5. Analysis of LLM watermark

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
- ICML workshop
- https://openreview.net/pdf?id=79NfpNZkXW
On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks
- http://arxiv.org/abs/2407.04794
Optimizing Adaptive Attacks against Content Watermarks for Language Models
- http://arxiv.org/abs/2410.02440
Optimizing Watermarks for Large Language Models
- ICML 2024
- https://proceedings.mlr.press/v235/wouters24a.html
Performance Trade-offs of Watermarking Large Language Models
Watermarking Makes Language Models Radioactive
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models
Towards Better Statistical Understanding of Watermarking LLMs
Inevitable Trade-off between Watermark Strength and Speculative Sampling Efficiency for Language Models

6. Survey

A Survey of Text Watermarking in the Era of Large Language Models
- ACM Computing Surveys 2024
- http://arxiv.org/abs/2312.07913
Mark My Words: Analyzing and Evaluating Language Model Watermarks
WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models
- ACL 2024
- https://aclanthology.org/2024.acl-long.83/
SoK: On the Role and Future of AIGC Watermarking in the Era of Gen-AI
SoK: Watermarking for AI-Generated Content
