This repo is a curated list of papers on the detection of LLM-generated content. It covers the latest papers on detection methods, datasets, attacks, and related topics. We will keep updating this repo to include the most recent papers.
- Awesome_papers_on_LLMs_detection
- Contents
- Training-based Methods
- Zero-shot Methods
- Watermarking
- Attack
- Datasets
- Misc
- Detecting Machine-Generated Texts by Multi-Population Aware Optimization for Maximum Mean Discrepancy [pdf] 02/27/2024
- Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs [pdf] 02/19/2024
- LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning [pdf] 02/04/2024
- Few-Shot Detection of Machine-Generated Text Using Style Representations [pdf] 01/12/2024
- Token Prediction as Implicit Classification to Identify LLM-Generated Text [pdf] 11/15/2023
- AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language Models Denoising [pdf] 11/14/2023
- G3Detector: General GPT-Generated Text Detector [pdf]
- GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [pdf]
- GPT Paternity Test: GPT Generated Text Detection with GPT Genetic Inheritance [pdf]
- OpenAI Text Classifier [link]
- GPTZero [link]
- CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning [pdf]
- LLMDet: A Large Language Models Detection Tool [pdf]
- Multiscale Positive-Unlabeled Detection of AI-Generated Texts [pdf]
- RADAR: Robust AI-Text Detection via Adversarial Learning [pdf]
- On the Zero-Shot Generalization of Machine-Generated Text Detectors [pdf]
- ConDA: Contrastive Domain Adaptation for AI-generated Text Detection [pdf]
- From Text to Source: Results in Detecting Large Language Model-Generated Content [pdf]
- Ghostbuster: Detecting Text Ghostwritten by Large Language Models [pdf]
- Deepfake Text Detection in the Wild [pdf]
- Automatic Detection of Generated Text is Easiest when Humans are Fooled [pdf]
- SeqXGPT: Sentence-Level AI-Generated Text Detection [pdf]
- Origin Tracing and Detecting of LLMs [pdf]
- GLTR: Statistical Detection and Visualization of Generated Text [pdf]
- Release strategies and the social impacts of language models [pdf]
- Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling [link] 15/02/2024
- Raidar: geneRative AI Detection viA Rewriting [link] 23/01/2024
- Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text [link]
- DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature [pdf]
- DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text [pdf]
- Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [pdf]
- Smaller Language Models are Better Black-box Machine-Generated Text Detectors [pdf]
- Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts [pdf]
- Does DetectGPT Fully Utilize Perturbation? Selective Perturbation on Model-Based Contrastive Learning Detector would be Better [pdf] 02/03/2024
- DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text [pdf]
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text [pdf]
- Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature [pdf]
- GPT-who: An Information Density-based Machine-Generated Text Detector [pdf]
- Efficient Detection of LLM-generated Texts with a Bayesian Surrogate Model [pdf]
- Detecting Fake Content with Relative Entropy Scoring [pdf]
- Computer-generated text detection using machine learning: A systematic review [pdf]
- GLTR: Statistical Detection and Visualization of Generated Text [pdf]
- Watermarking Text Generated by Black-Box Language Models [pdf]
- Tracing text provenance via context-aware lexical substitution [pdf]
- Natural language watermarking and tamperproofing [pdf]
- Natural language watermarking [pdf]
- Natural language watermarking via morphosyntactic alterations [pdf]
- The hiding virtues of ambiguity: quantifiably resilient watermarking of natural language text through synonym substitutions [pdf]
- Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models [link] 02/28/2024
- EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models [link] 02/28/2024
- Multi-Bit Distortion-Free Watermarking for Large Language Models [link] 02/27/2024
- GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick [link] 20/02/2024
- k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text [link] 19/02/2024
- Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs [link] 08/02/2024
- Provably Robust Multi-bit Watermarking for AI-generated Text via Error Correction Code [link] 30/01/2024
- Adaptive Text Watermark for Large Language Models [pdf] 26/01/2024
- Optimizing watermarks for large language models [pdf] 31/12/2023
- Towards Optimal Statistical Watermarking [pdf] 13/12/2023
- On the Learnability of Watermarks for Language Models [pdf] 7/12/2023
- Mark My Words: Analyzing and Evaluating Language Model Watermarks [pdf] 3/12/2023
- I Know You Did Not Write That! A Sampling-Based Watermarking Method for Identifying Machine Generated Text [pdf] 30/11/2023
- Towards Codable Watermarking for Injecting Multi-Bit Information to LLM [pdf] 27/11/2023
- Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring [pdf] 16/11/2023
- Performance Trade-offs of Watermarking Large Language Models [pdf] 16/11/2023
- X-Mark: Towards Lossless Watermarking Through Lexical Redundancy [pdf] 16/11/2023
- WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models [pdf] 13/11/2023
- Publicly Detectable Watermarking for Language Models [pdf] 25/10/2023
- Unbiased Watermark for Large Language Models [pdf] 18/10/2023
- A watermark for large language models [pdf]
- Undetectable Watermarks for Language Models [pdf]
- Provable Robust Watermarking for AI-Generated Text [pdf]
- Robust Distortion-free Watermarks for Language Models [pdf]
- SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation [pdf]
- DiPmark: A Stealthy, Efficient and Resilient Watermark for Large Language Models [pdf]
- Watermarking Conditional Text Generation for AI Detection: Unveiling Challenges and a Semantic-Aware Watermark Remedy [pdf]
- A Semantic Invariant Robust Watermark for Large Language Models [pdf]
- REMARK-LLM: A Robust and Efficient Watermarking Framework for Generative Large Language Models [pdf]
- Robust Multi-bit Natural Language Watermarking through Invariant Features [pdf]
- Advancing Beyond Identification: Multi-bit Watermark for Language Models [pdf]
- Three Bricks to Consolidate Watermarks for Large Language Models [pdf]
- My AI safety lecture for UT Effective Altruism [link]
- Zero-Shot Detection of Machine-Generated Codes [pdf]
- Who Wrote this Code? Watermarking for Code Generation [pdf]
- Watermark Stealing in Large Language Models [pdf] 02/29/2024
- Attacking LLM Watermarks by Exploiting Their Strengths [pdf] 02/27/2024
- Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models [pdf] 02/22/2024
- Machine-generated Text Localization [pdf] 02/19/2024
- Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks [pdf] 02/19/2024
- Authorship Obfuscation in Multilingual Machine-Generated Text Detection [pdf] 01/17/2024
- Language Model Detectors Are Easily Optimized Against [pdf] 11/28/2023
- A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts [pdf] 11/14/2023
- Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models [pdf] 11/8/2023
- Does Human Collaboration Enhance the Accuracy of Identifying LLM-Generated Deepfake Texts? [pdf]
- Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [pdf]
- Red Teaming Language Model Detectors with Language Models [pdf]
- Paraphrase Detection: Human vs. Machine Content [pdf]
- Large Language Models can be Guided to Evade AI-Generated Text Detection [pdf]
- Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index [pdf]
- How Reliable Are AI-Generated-Text Detectors? An Assessment Framework Using Evasive Soft Prompts [pdf]
- On the Reliability of Watermarks for Large Language Models [pdf]
- M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection [pdf] 02/19/2024
- How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection [pdf]
- CHEAT: A Large-scale Dataset for Detecting ChatGPT-writtEn AbsTracts [pdf]
- Ghostbuster: Detecting Text Ghostwritten by Large Language Models [pdf]
- M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection [pdf]
- GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [pdf]
- MGTBench: Benchmarking Machine-Generated Text Detection [pdf]
- HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus [pdf]
- MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark [pdf]
- TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation [pdf]
- Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection [pdf] 02/01/2024
- LLM - Detect AI Generated Text (Kaggle competition) [link]
- Can AI-Generated Text be Reliably Detected? [pdf]
- On the Possibilities of AI-Generated Text Detection [pdf]
- GPT detectors are biased against non-native English writers [pdf]
- ChatLog: Recording and Analyzing ChatGPT Across Time [pdf]
- On the Zero-Shot Generalization of Machine-Generated Text Detectors [pdf]
If you find this repo useful, please cite our work:
@article{yang2023survey,
title={A Survey on Detection of LLMs-Generated Content},
author={Yang, Xianjun and Pan, Liangming and Zhao, Xuandong and Chen, Haifeng and Petzold, Linda and Wang, William Yang and Cheng, Wei},
journal={arXiv preprint arXiv:2310.15654},
year={2023}
}