Awesome papers on LLMs detection

This repo is a curated list of papers on detecting LLM-generated content. It covers the latest papers on detection methods, datasets, attacks, and related topics. We will keep updating this repo to include the most recent papers.

Contents
Training-based Methods
- Black-box
  - 2023
  - 2022
  - 2020
- White-box
  - 2023
  - 2019
Zero-shot Methods
- Black-box
  - 2023
- White-box
  - 2023
  - Before 2020
Watermarking
- Black-box
  - 2023
  - 2022
  - Before 2020
- White-box
  - 2023
  - 2022
Code-detection
Attack
Datasets
- 2023
- 2022 and before
Misc

Training-based

Black-box

2023

Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy [pdf] 02/27/2024
Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs [pdf] 02/19/2024
LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning [pdf] 02/04/2024
Few-Shot Detection of Machine-Generated Text using Style Representations [pdf] 01/12/2024
Token Prediction as Implicit Classification to Identify LLM-Generated Text [pdf] Nov. 15, 2023
AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language Models Denoising [pdf] Nov. 14, 2023
G3Detector: General GPT-Generated Text Detector [pdf]
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [pdf]
GPT Paternity Test: GPT Generated Text Detection with GPT Genetic Inheritance [pdf]

2022

OpenAI Text Classifier [link]
GPTZero [link]
CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning [pdf]
LLMDet: A Large Language Models Detection Tool [pdf]
Multiscale Positive-Unlabeled Detection of AI-Generated Texts [pdf]
RADAR: Robust AI-Text Detection via Adversarial Learning [pdf]
On the Zero-Shot Generalization of Machine-Generated Text Detectors [pdf]
ConDA: Contrastive Domain Adaptation for AI-generated Text Detection [pdf]
From Text to Source: Results in Detecting Large Language Model-Generated Content [pdf]
Ghostbuster: Detecting Text Ghostwritten by Large Language Models [pdf]
Deepfake Text Detection in the Wild [pdf]

2020

Automatic Detection of Generated Text is Easiest when Humans are Fooled [pdf]

White-box

2023

SeqXGPT: Sentence-Level AI-Generated Text Detection [pdf]
Origin Tracing and Detecting of LLMs [pdf]

2019

GLTR: Statistical Detection and Visualization of Generated Text [pdf]
Release strategies and the social impacts of language models [pdf]

Zero-shot

Black-box

2023

Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling [link] 15/02/2024
Raidar: geneRative AI Detection viA Rewriting [link] 23/01/2024
Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text [link]
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature [pdf]
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text [pdf]
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [pdf]
Smaller Language Models are Better Black-box Machine-Generated Text Detectors [pdf]
Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts [pdf]

White-box

2023

Does DetectGPT Fully Utilize Perturbation? Selective Perturbation on Model-Based Contrastive Learning Detector would be Better [pdf] 02/03/2024
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text [pdf]
DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text [pdf]
Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature [pdf]
GPT-who: An Information Density-based Machine-Generated Text Detector [pdf]
Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model [pdf]

Before 2020

Detecting Fake Content with Relative Entropy Scoring [pdf]
Computer-generated text detection using machine learning: A systematic review [pdf]
GLTR: Statistical Detection and Visualization of Generated Text [pdf]

Watermarking

Black-box

2023

Watermarking Text Generated by Black-Box Language Models [pdf]

2022

Tracing text provenance via context-aware lexical substitution [pdf]

Before 2020

Natural language watermarking and tamperproofing [pdf]
Natural language watermarking [pdf]
Natural language watermarking via morphosyntactic alterations [pdf]
The hiding virtues of ambiguity: quantifiably resilient watermarking of natural language text through synonym substitutions [pdf]

White-box

2023

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [link] 02/28/2024
EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models [link] 02/28/2024
Multi-Bit Distortion-Free Watermarking for Large Language Models [link] 02/27/2024
GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick [link] 20/02/2024
k-SEMSTAMP: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text [link] 19/02/2024
Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs [link] 08/02/2024
Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code [link] 30/01/2024
Adaptive Text Watermark for Large Language Models [pdf] 26/01/2024
Optimizing watermarks for large language models [pdf] 31/12/2023
Towards Optimal Statistical Watermarking [pdf] 13/12/2023
On the Learnability of Watermarks for Language Models [pdf] 7/12/2023
Mark My Words: Analyzing and Evaluating Language Model Watermarks [pdf] 3/12/2023
I Know You Did Not Write That! A Sampling-Based Watermarking Method for Identifying Machine Generated Text [pdf] 30/11/2023
Towards Codable Watermarking for Injecting Multi-Bit Information to LLM [pdf] 27/11/2023
Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [pdf] 16/11/2023
Performance Trade-offs of Watermarking Large Language Models [pdf] 16/11/2023
X-Mark: Towards Lossless Watermarking Through Lexical Redundancy [pdf] 16/11/2023
WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models [pdf] 13/11/2023
Publicly Detectable Watermarking for Language Models [pdf] 25/10/2023
Unbiased Watermark for Large Language Models [pdf] 18/10/2023
A watermark for large language models [pdf]
Undetectable Watermarks for Language Models [pdf]
Provable Robust Watermarking for AI-Generated Text [pdf]
Robust Distortion-free Watermarks for Language Models [pdf]
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [pdf]
DiPmark: A Stealthy, Efficient and Resilient Watermark for Large Language Models [pdf]
Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy [pdf]
A Semantic Invariant Robust Watermark for Large Language Models [pdf]
REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models [pdf]
Robust Multi-bit Natural Language Watermarking through Invariant Features [pdf]
Advancing Beyond Identification: Multi-bit Watermark for Language Models [pdf]
Three Bricks to Consolidate Watermarks for Large Language Models [pdf]

2022

My AI safety lecture for UT Effective Altruism [link]

Code-detection

Zero-Shot Detection of Machine-Generated Codes [pdf]
Who Wrote this Code? Watermarking for Code Generation [pdf]

Attack

Watermark Stealing in Large Language Models [pdf] 02/29/2024
Attacking LLM Watermarks by Exploiting Their Strengths [pdf] 02/27/2024
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models [pdf] 02/22/2024
Machine-generated Text Localization [pdf] 02/19/2024
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks [pdf] 02/19/2024
Authorship Obfuscation in Multilingual Machine-Generated Text Detection [pdf] 01/17/2024
Language Model Detectors Are Easily Optimized Against [pdf] 11/28/2023
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts [pdf] 11/14/2023
Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models [pdf] 11/8/2023
Does Human Collaboration Enhance the Accuracy of Identifying LLM-Generated Deepfake Texts? [pdf]
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [pdf]
Red Teaming Language Model Detectors with Language Models [pdf]
Paraphrase Detection: Human vs. Machine Content [pdf]
Large Language Models can be Guided to Evade AI-Generated Text Detection [pdf]
Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index [pdf]
How Reliable Are AI-Generated-Text Detectors? An Assessment Framework Using Evasive Soft Prompts [pdf]
On the Reliability of Watermarks for Large Language Models [pdf]

Datasets

2023

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection [pdf] 02/19/2024
How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection [pdf]
CHEAT: A Large-scale Dataset for Detecting ChatGPT-writtEn AbsTracts [pdf]
Ghostbuster: Detecting Text Ghostwritten by Large Language Models [pdf]
M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection [pdf]
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [pdf]
MGTBench: Benchmarking Machine-Generated Text Detection [pdf]
HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus [pdf]
MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark [pdf]

2022 and before

TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation [pdf]

Misc

Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection [pdf] 02/01/2024
LLM - Detect AI Generated Text. Kaggle. [link]
Can AI-Generated Text be Reliably Detected? [pdf]
On the Possibilities of AI-Generated Text Detection [pdf]
GPT detectors are biased against non-native English writers [pdf]
ChatLog: Recording and Analyzing ChatGPT Across Time [pdf]
On the Zero-Shot Generalization of Machine-Generated Text Detectors [pdf]

If you find this repo useful, please cite our work.

@article{yang2023survey,
  title={A Survey on Detection of LLMs-Generated Content},
  author={Yang, Xianjun and Pan, Liangming and Zhao, Xuandong and Chen, Haifeng and Petzold, Linda and Wang, William Yang and Cheng, Wei},
  journal={arXiv preprint arXiv:2310.15654},
  year={2023}
}
