This repository contains a curated list of research papers on model extraction attacks and defenses in machine learning, spanning surveys, attack papers, and defense papers.
Within each section, papers are sorted by publication year in descending order.
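For readers new to the topic, the sketch below illustrates the generic black-box extraction setting that many of the attack papers in this list build on: the attacker queries a victim model through its prediction interface, records the returned labels, and trains a surrogate model to mimic it. It is a minimal, illustrative sketch only, assuming scikit-learn and NumPy are available; the toy victim, random query distribution, query budget, and surrogate architecture are placeholder choices, not the method of any specific paper listed here.

```python
# Minimal sketch of query-based (black-box) model extraction.
# Toy setup for illustration; not the method of any particular paper below.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# "Victim" model -- in practice this would sit behind a prediction API.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
victim = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Attacker: issue synthetic queries and observe only the predicted labels.
query_budget = 5000                        # assumed budget, purely illustrative
queries = rng.normal(size=(query_budget, 20))
stolen_labels = victim.predict(queries)    # hard-label access only

# Train a surrogate ("stolen") model on the query/label pairs.
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Fidelity: how often the surrogate agrees with the victim on fresh inputs.
probe = rng.normal(size=(1000, 20))
agreement = np.mean(surrogate.predict(probe) == victim.predict(probe))
print(f"surrogate-victim agreement: {agreement:.2%}")
```

Most of the defense papers in the later tables target exactly this loop, for example by perturbing the victim's outputs, detecting anomalous query patterns, or embedding watermarks that survive into the surrogate.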
Year | Title | Venue | Paper Link | Code Link |
---|---|---|---|---|
2024 | A Survey of Graph Neural Networks in Real world: Imbalance, Noise, Privacy and OOD Challenges | arXiv | Link | |
2024 | Graph neural networks: a survey on the links between privacy and security | Artificial Intelligence Review | Link | |
2024 | Trustworthy Graph Neural Networks: Aspects, Methods and Trends | Proceedings of the IEEE | Link | |
2024 | Safety in Graph Machine Learning: Threats and Safeguards | arXiv | Link | |
2024 | SoK: All You Need to Know About On-Device ML Model Extraction - The Gap Between Research and Practice | USENIX Security | Link | |
2024 | Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey | arXiv | Link | |
2023 | A Survey on Privacy in Graph Neural Networks: Attacks, Preservation, and Applications | arXiv | Link | |
2023 | A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability | arXiv | Link | |
2023 | A survey on large language model (LLM) security and privacy: The Good, The Bad, and The Ugly | High-Confidence Computing | Link | |
2023 | I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences | ACM Computing Surveys | Link | |
2023 | A Taxonomic Survey of Model Extraction Attacks | IEEE International Conference on Cyber Security and Resilience (CSR) | Link | |
2023 | A Survey of Privacy Attacks in Machine Learning | ACM Computing Surveys | Link | |
2023 | Adversarial Attack and Defense on Graph Data: A Survey | IEEE Transactions on Knowledge and Data Engineering | Link | |
2023 | Model Extraction Attacks Revisited | arXiv | [Link](https://arxiv.org/abs/2312.05386) | |
2022 | Privacy and Robustness in Federated Learning: Attacks and Defenses | IEEE Transactions on Neural Networks and Learning Systems | Link | |
2022 | Towards Security Threats of Deep Learning Systems: A Survey | IEEE Transactions on Software Engineering | Link | |
2021 | A Critical Overview of Privacy in Machine Learning | IEEE Security & Privacy | Link | |
2021 | When Machine Learning Meets Privacy: A Survey and Outlook | ACM Computing Surveys | Link | |
2020 | An Overview of Privacy in Machine Learning | arXiv | Link | |
2020 | Privacy-preserving deep learning on machine learning as a service—a comprehensive survey | IEEE Access | Link | |
2020 | Privacy and security issues in deep learning: A survey | IEEE Access | Link | |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2024 | Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models | Large Language Models | arXiv | Link | |
2024 | LLM-FIN: Large Language Models Fingerprinting Attack on Edge Devices | Large Language Models | ISQED | Link | |
2024 | Model Extraction Attack against On-device Deep Learning with Power Side Channel | Deep Neural Networks | ISQED | Link | |
2024 | Efficient Model-Stealing Attacks Against Inductive Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2024 | A realistic model extraction attack against graph neural networks | Graph Neural Networks | Knowledge-Based Systems | Link | |
2024 | Large Language Models for Link Stealing Attacks Against Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2024 | Link Stealing Attacks Against Inductive Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2024 | Stealing the Invisible: Unveiling Pre-Trained CNN Models through Adversarial Examples and Timing Side-Channels | CNN | arXiv | Link | |
2024 | SwiftTheft: A Time-Efficient Model Extraction Attack Framework Against Cloud-Based Deep Neural Networks | Deep Neural Networks | Chinese Journal of Electronics | Link | |
2024 | Model Extraction Attack Without Natural Images | Deep Neural Networks | ACNS Workshops | ||
2024 | DEMISTIFY: Identifying On-device Machine Learning Models Stealing and Reuse Vulnerabilities in Mobile Apps | Mobile ML Models | ICSE | Link | |
2024 | Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble Based Sample Selection | Deep Neural Networks | WACV | Link | |
2024 | Trained to Leak: Hiding Trojan Side-Channels in Neural Network Weights | Neural Networks | HOST | Link | |
2024 | A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning | Split Learning Models | CVPR | Link | |
2024 | Prompt Stealing Attacks Against Large Language Models | Large Language Models | arXiv | Link | |
2024 | Stealing Part of a Production Language Model | Large Language Models | arXiv | Link | |
2024 | Fully Exploiting Every Real Sample: SuperPixel Sample Gradient Model Stealing | Deep Neural Networks | CVPR | Link | |
2024 | Teach LLMs to Phish: Stealing Private Information from Language Models | Large Language Models | arXiv | Link | |
2024 | Data-Free Hard-Label Robustness Stealing Attack | Deep Neural Networks | AAAI | Link | |
2024 | Large Language Model Watermark Stealing With Mixed Integer Programming | Large Language Models | arXiv | Link | |
2024 | PRSA: PRompt Stealing Attacks against Large Language Models | Large Language Models | arXiv | Link | |
2024 | QuantumLeak: Stealing Quantum Neural Networks from Cloud-based NISQ Machines | Quantum Neural Networks | arXiv | Link | |
2024 | Privacy Backdoors: Stealing Data with Corrupted Pretrained Models | Pretrained Models | arXiv | Link | |
2024 | AugSteal: Advancing Model Steal With Data Augmentation in Active Learning Frameworks | Active Learning Models | IEEE TIFS | Link | |
2024 | Towards Model Extraction Attacks in GAN-Based Image Translation via Domain Shift Mitigation | GANs | AAAI | Link | |
2024 | Stealing Image-to-Image Translation Models With a Single Query | Image-to-Image Translation Models | arXiv | Link | |
2024 | A two-stage model extraction attack on GANs with a small collected dataset | GANs | Computers & Security | Link | |
2024 | Layer Sequence Extraction of Optimized DNNs Using Side-Channel Information Leaks | Deep Neural Networks | IEEE TCAD | Link | |
2024 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Large Language Models | arXiv | Link | |
2024 | Prompt Stealing Attacks Against Text-to-Image Generation Models | Text-to-Image Models | arXiv | Link | |
2024 | An Empirical Evaluation of the Data Leakage in Federated Graph Learning | Graph Neural Networks | IEEE Transactions on Network Science and Engineering | Link | |
2024 | AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models | Knowledge Graphs | arXiv | Link | |
2024 | Adversarial Attacks on Fairness of Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2024 | Unveiling Memorization in Code Models | Code Models | ICSE | Link | |
2024 | Unveiling the Secrets without Data: Can Graph Neural Networks Be Exploited through Data-Free Model Extraction Attacks? | Graph Neural Networks | USENIX Security | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2023 | Explanation leaks: Explanation-guided model extraction attacks | Explainable AI Models | Information Sciences | Link | |
2023 | Model Leeching: An Extraction Attack Targeting LLMs | Large Language Models | arXiv | Link | |
2023 | Are Diffusion Models Vulnerable to Membership Inference Attacks? | Diffusion Models | ICML | Link | |
2023 | Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation | Large Language Models | arXiv | Link | |
2023 | Model Extraction Attacks on Split Federated Learning | Federated Learning Models | arXiv | Link | |
2023 | Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders | Image Encoders | arXiv | Link | |
2023 | Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity | Graph Neural Networks | arXiv | Link | |
2023 | Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack | Machine Learning Models | NeurIPS | Link | |
2023 | D-DAE: Defense-Penetrating Model Extraction Attacks | Deep Neural Networks | IEEE S&P | Link | |
2023 | Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Models | Image Translation Models | arXiv | Link | |
2023 | AUTOLYCUS: Exploiting Explainable AI (XAI) for Model Extraction Attacks against White-Box Models | White-Box Models | arXiv | Link | |
2023 | Efficient Model Extraction by Data Set Stealing, Balancing, and Filtering | Machine Learning Models | IEEE IoT Journal | Link | |
2023 | DivTheft: An Ensemble Model Stealing Attack by Divide-and-Conquer | Machine Learning Models | IEEE TDSC | Link | |
2023 | Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks | Large Language Models | arXiv | Link | |
2023 | A Theoretical Insight into Attack and Defense of Gradient Leakage in Transformer | Transformer Models | arXiv | Link | |
2023 | When the Curious Abandon Honesty: Federated Learning Is Not Private | Federated Learning Models | IEEE European Symposium on Security and Privacy (EuroS&P) | Link | |
2023 | Making Watermark Survive Model Extraction Attacks in Graph Neural Networks | Graph Neural Networks | IEEE International Conference on Communications | Link | |
2023 | A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots | Machine Learning Models | arXiv | Link | |
2023 | Private Graph Extraction via Feature Explanations | Graph Neural Networks | Proceedings on Privacy Enhancing Technologies | Link | |
2023 | Extracting Privacy-Preserving Subgraphs in Federated Graph Learning using Information Bottleneck | Federated Graph Models | ACM Asia CCS | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2022 | Canary Extraction in Natural Language Understanding Models | NLU Models | arXiv | Link | |
2022 | Are Large Pre-Trained Language Models Leaking Your Personal Information? | Large Language Models | arXiv | Link | |
2022 | Text Revealer: Private Text Reconstruction via Model Inversion Attacks against Transformers | Transformers | arXiv | Link | |
2022 | Precise Extraction of Deep Learning Models via Side-Channel Attacks on Edge/Endpoint Devices | Deep Learning Models | ESORICS | ||
2022 | DeepSteal: Advanced Model Extractions Leveraging Efficient Weight Stealing in Memories | Deep Neural Networks | IEEE S&P | Link | |
2022 | DualCF: Efficient Model Extraction Attack from Counterfactual Explanations | Machine Learning Models | FAccT | Link | |
2022 | GAME: Generative-Based Adaptive Model Extraction Attack | Machine Learning Models | ESORICS | ||
2022 | Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks | Deep Neural Networks | NeurIPS | Link | |
2022 | On the Difficulty of Defending Self-Supervised Learning against Model Extraction | Self-Supervised Learning Models | ICML | Link | |
2022 | Black-Box Dissector: Towards Erasing-Based Hard-Label Model Stealing Attack | Machine Learning Models | ECCV | ||
2022 | Imitated Detectors: Stealing Knowledge of Black-box Object Detectors | Object Detectors | ACM MM | Link | |
2022 | StolenEncoder: Stealing Pre-trained Encoders in Self-supervised Learning | Self-Supervised Learning Models | ACM CCS | Link | |
2022 | Model Stealing Attacks Against Vision-Language Models | Vision-Language Models | OpenReview | Link | |
2022 | ES Attack: Model Stealing Against Deep Neural Networks Without Data Hurdles | Deep Neural Networks | IEEE TETCI | Link | |
2022 | Data Stealing Attack on Medical Images: Is It Safe to Export Networks from Data Lakes? | Medical Image Models | MICCAI | ||
2022 | Towards Data-Free Model Stealing in a Hard Label Setting | Machine Learning Models | CVPR | Link | |
2022 | Careful What You Wish For: on the Extraction of Adversarially Trained Models | Adversarially Trained Models | PST | Link | |
2022 | Enhance Model Stealing Attack via Label Refining | Machine Learning Models | ICSP | Link | |
2022 | Demystifying Arch-hints for Model Extraction: An Attack in Unified Memory System | Machine Learning Models | arXiv | Link | |
2022 | SNIFF: Reverse Engineering of Neural Networks With Fault Attacks | Neural Networks | IEEE Transactions on Reliability | Link | |
2022 | High-Fidelity Model Extraction Attacks via Remote Power Monitors | Machine Learning Models | AICAS | Link | |
2022 | User-Level Label Leakage from Gradients in Federated Learning | Federated Learning Models | arXiv | Link | |
2022 | Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs | NLP Models | arXiv | Link | |
2022 | Effect Verification of a Feature Extraction Method Based on Graph Convolutional Networks | Graph Convolutional Networks | International Conference on Machine Learning, Cloud Computing and Intelligent Mining (MLCCIM) | Link | |
2022 | Towards Extracting Graph Neural Network Models via Prediction Queries (Student Abstract) | Graph Neural Networks | Proceedings of the AAAI Conference on Artificial Intelligence | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2021 | GraphMI: Extracting Private Graph Data from Graph Neural Networks | Graph Neural Networks | IJCAI | Link | |
2021 | Extracting Training Data from Large Language Models | Large Language Models | USENIX Security | Link | |
2021 | MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation | Machine Learning Models | CVPR | Link | |
2021 | Data-Free Model Extraction | Machine Learning Models | CVPR | Link | |
2021 | Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization | Graph Neural Networks | arXiv | Link | |
2021 | Watermarking Graph Neural Networks by Random Graphs | Graph Neural Networks | ISDFS | Link | |
2021 | Model Stealing Attacks Against Inductive Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2021 | Tenet: A Neural Network Model Extraction Attack in Multi-core Architecture | Neural Networks | GLSVLSI | Link | |
2021 | Towards Extracting Graph Neural Network Models via Prediction Queries (Student Abstract) | Graph Neural Networks | AAAI | Link | |
2021 | Stealing Machine Learning Parameters via Side Channel Power Attacks | Machine Learning Models | ISVLSI | Link | |
2021 | Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks | GANs | ACSAC | Link | |
2021 | Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack | Machine Learning Models | arXiv | Link | |
2021 | SEAT: Similarity Encoder by Adversarial Training for Detecting Model Extraction Attack Queries | Machine Learning Models | AISec | Link | |
2021 | Stealing Deep Reinforcement Learning Models for Fun and Profit | Deep Reinforcement Learning Models | ASIA CCS | Link | |
2021 | InverseNet: Augmenting Model Extraction Attacks with Training Data Inversion | Machine Learning Models | IJCAI | Link | |
2021 | Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction | Sequential Recommenders | RecSys | Link | |
2021 | Hermes Attack: Steal DNN Models with Lossless Inference Accuracy | Deep Neural Networks | USENIX Security | Link | |
2021 | Deep Neural Network Fingerprinting by Conferrable Adversarial Examples | Deep Neural Networks | arXiv | Link | |
2021 | MEGEX: Data-Free Model Extraction Attack against Gradient-Based Explainable AI | Explainable AI Models | arXiv | Link | |
2021 | Model Extraction and Adversarial Transferability, Your BERT is Vulnerable! | BERT | NAACL | Link | |
2021 | Leveraging Partial Model Extractions using Uncertainty Quantification | Machine Learning Models | CloudNet | Link | |
2021 | Model Extraction and Adversarial Attacks on Neural Networks Using Switching Power Information | Neural Networks | ICANN | ||
2021 | Killing One Bird with Two Stones: Model Extraction and Attribute Inference Attacks against BERT-based APIs | BERT Models | arXiv | Link | |
2021 | Dataset Inference: Ownership Resolution in Machine Learning | Machine Learning Models | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2020 | DeepSniffer: A DNN Model Extraction Framework Based on Learning Architectural Hints | Deep Neural Networks | ASPLOS | Link | |
2020 | Practical Side-Channel Based Model Extraction Attack on Tree-Based Machine Learning Algorithm | Tree-Based ML Models | ACNS Workshops | ||
2020 | Stealing Links from Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2020 | Special-Purpose Model Extraction Attacks: Stealing Coarse Model with Fewer Queries | Machine Learning Models | TrustCom | Link | |
2020 | Extraction of Complex DNN Models: Real Threat or Boogeyman? | Deep Neural Networks | EDSMLS | ||
2020 | ActiveThief: Model Extraction Using Active Learning and Unannotated Public Data | Machine Learning Models | AAAI | Link | |
2020 | Cryptanalytic Extraction of Neural Network Models | Neural Networks | CRYPTO | ||
2020 | Stealing Your Data from Compressed Machine Learning Models | Machine Learning Models | DAC | Link | |
2020 | Model Extraction Attacks on Recurrent Neural Networks | Recurrent Neural Networks | Journal of Information Processing | ||
2020 | Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints | Neural Networks | ASPLOS | Link | |
2020 | Leaky DNN: Stealing Deep-Learning Model Secret with GPU Context-Switching Side-Channel | Deep Neural Networks | DSN | Link | |
2020 | Best-Effort Adversarial Approximation of Black-Box Malware Classifiers | Malware Classifiers | arXiv | Link | |
2020 | High Accuracy and High Fidelity Extraction of Neural Networks | Neural Networks | arXiv | Link | |
2020 | Thieves on Sesame Street! Model Extraction of BERT-based APIs | BERT | ICLR | Link | |
2020 | Exploring connections between active learning and model extraction | Machine Learning Models | USENIX Security | ||
2020 | Black-Box Ripper: Copying black-box models using generative evolutionary algorithms | Machine Learning Models | arXiv | Link | |
2020 | Cache Telepathy: Leveraging Shared Resource Attacks to Learn DNN Architectures | Deep Neural Networks | USENIX Security | Link | |
2020 | Open DNN Box by Power Side-Channel Attack | Deep Neural Networks | IEEE TCAS II | Link | |
2020 | Reverse-engineering deep ReLU networks | Deep ReLU Networks | ICML | Link | |
2020 | Model extraction from counterfactual explanations | Machine Learning Models | arXiv | Link | |
2020 | GANRED: GAN-based Reverse Engineering of DNNs via Cache Side-Channel | Deep Neural Networks | CCSW | Link | |
2020 | DeepEM: Deep Neural Networks Model Recovery through EM Side-Channel Information Leakage | Deep Neural Networks | HOST | Link | |
2020 | A Framework for Evaluating Gradient Leakage Attacks in Federated Learning | Federated Learning Models | arXiv | Link | |
2020 | Quantifying (Hyper) Parameter Leakage in Machine Learning | Machine Learning Models | IEEE International Conference on Multimedia Big Data (BigMM) | Link | |
2020 | CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples | Deep Learning Models | NDSS | Link | |
2020 | Stealing Black-Box Functionality Using The Deep Neural Tree Architecture | Black-Box Models | arXiv | Link | |
2020 | Model Extraction Attacks and Defenses on Cloud-Based Machine Learning Models | Cloud-Based ML Models | IEEE Communications Magazine | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2019 | Stealing Neural Networks via Timing Side Channels | Neural Networks | arXiv | Link | |
2019 | PRADA: Protecting Against DNN Model Stealing Attacks | Deep Neural Networks | EuroS&P | Link | |
2019 | A framework for the extraction of Deep Neural Networks by leveraging public data | Deep Neural Networks | arXiv | Link | |
2019 | Adversarial Model Extraction on Graph Neural Networks | Graph Neural Networks | arXiv | Link | |
2019 | Efficiently Stealing your Machine Learning Models | Machine Learning Models | WPES | Link | |
2019 | GDALR: An Efficient Model Duplication Attack on Black Box Machine Learning Models | Machine Learning Models | ICSCAN | Link | |
2019 | Stealing Knowledge from Protected Deep Neural Networks Using Composite Unlabeled Data | Deep Neural Networks | IJCNN | Link | |
2019 | Knockoff Nets: Stealing Functionality of Black-Box Models | Machine Learning Models | CVPR | Link | |
2019 | Model Reconstruction from Model Explanations | Machine Learning Models | FAT* | Link | |
2019 | Model Weight Theft With Just Noise Inputs: The Curious Case of the Petulant Attacker | Machine Learning Models | arXiv | Link | |
2019 | Model-Extraction Attack Against FPGA-DNN Accelerator Utilizing Correlation Electromagnetic Analysis | DNN Accelerators | FCCM | Link | |
2019 | CSI NN: Reverse Engineering of Neural Network Architectures Through Electromagnetic Side Channel | Neural Networks | USENIX Security | Link | |
2019 | Topology Attack and Defense for Graph Neural Networks: An Optimization Perspective | Graph Neural Networks | arXiv | Link | |
2019 | Adversarial Examples on Graph Data: Deep Insights into Attack and Defense | Graph Neural Networks | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2018 | Model Extraction Warning in MLaaS Paradigm | Machine Learning Models | ACSAC | Link | |
2018 | Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data | CNNs | IJCNN | Link | |
2018 | Active Deep Learning Attacks under Strict Rate Limitations for Online API Calls | Machine Learning Models | HST | Link | |
2018 | Stealing Hyperparameters in Machine Learning | Machine Learning Models | IEEE S&P | Link | |
2018 | Generative Adversarial Networks for Black-Box API Attacks with Limited Training Data | Machine Learning APIs | ISSPIT | Link | |
2018 | Towards Reverse-Engineering Black-Box Neural Networks | Neural Networks | arXiv | Link | |
2018 | Reverse engineering convolutional neural networks through side-channel information leaks | CNNs | DAC | Link | |
2018 | Model Extraction and Active Learning | Machine Learning Models | arXiv | Link | |
2018 | Adversarial Attacks on Neural Networks for Graph Data | Graph Neural Networks | SIGKDD | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2017 | Practical Black-Box Attacks against Machine Learning | Machine Learning Models | ASIA CCS | Link | |
2017 | Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models | Machine Learning Models | arXiv | Link | |
2017 | How to steal a machine learning classifier with deep learning | Machine Learning Classifiers | HST | Link | |
2017 | Ensemble Adversarial Training: Attacks and Defenses | Machine Learning Models | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2016 | Stealing Machine Learning Models via Prediction APIs | Machine Learning Models | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2014 | Adding Robustness to Support Vector Machines Against Adversarial Reverse Engineering | Support Vector Machines | CIKM | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2024 | Defense Against Model Extraction Attacks on Recommender Systems | Recommender Systems | WSDM | Link | |
2024 | GNNGuard: A Fingerprinting Framework for Verifying Ownerships of Graph Neural Networks | Graph Neural Networks | OpenReview | Link | |
2024 | Privacy-Enhanced Graph Neural Network for Decentralized Local Graphs | Graph Neural Networks | IEEE TIFS | Link | |
2024 | GENIE: Watermarking Graph Neural Networks for Link Prediction | Graph Neural Networks | arXiv | Link | |
2024 | PreGIP: Watermarking the Pretraining of Graph Neural Networks for Deep Intellectual Property Protection | Graph Neural Networks | arXiv | Link | |
2024 | Inversion-Guided Defense: Detecting Model Stealing Attacks by Output Inverting | Machine Learning Models | IEEE TIFS | Link | |
2024 | Defending against model extraction attacks with OOD feature learning and decision boundary confusion | Machine Learning Models | Computers & Security | Link | |
2024 | SAME: Sample Reconstruction against Model Extraction Attacks | Machine Learning Models | AAAI | Link | |
2024 | Model Stealing Detection for IoT Services Based on Multi-Dimensional Features | IoT Models | IEEE IoT Journal | Link | |
2024 | MisGUIDE: Defense Against Data-Free Deep Learning Model Extraction | Deep Learning Models | arXiv | Link | |
2024 | Adversarial Sparse Teacher: Defense Against Distillation-Based Model Stealing Attacks Using Adversarial Examples | Machine Learning Models | arXiv | Link | |
2024 | Construct a Secure CNN Against Gradient Inversion Attack | CNN | PAKDD | ||
2024 | A Comprehensive Defense Framework Against Model Extraction Attacks | Machine Learning Models | IEEE TDSC | Link | |
2024 | HODA: Hardness-Oriented Detection of Model Extraction Attacks | Machine Learning Models | IEEE TIFS | Link | |
2024 | Adaptive and robust watermark against model extraction attack | Machine Learning Models | arXiv | Link | |
2024 | Defense against Model Extraction Attack by Bayesian Active Watermarking | Machine Learning Models | OpenReview | Link | |
2024 | Poisoning-Free Defense Against Black-Box Model Extraction | Machine Learning Models | ICASSP | Link | |
2024 | Reliable Model Watermarking: Defending Against Theft without Compromising on Evasion | Machine Learning Models | arXiv | Link | |
2024 | Efficient Model Stealing Defense with Noise Transition Matrix | Machine Learning Models | CVPR | Link | |
2024 | Making models more secure: An efficient model stealing detection method | Machine Learning Models | Computers and Electrical Engineering | Link | |
2024 | TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment | Transformer Models | arXiv | Link | |
2024 | Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data | Deep Neural Networks | arXiv | Link | |
2024 | QUEEN: Query Unlearning against Model Extraction | Machine Learning Models | arXiv | Link | |
2024 | Bident Structure for Neural Network Model Protection | Neural Networks | SCITEPRESS | Link | |
2024 | ModelGuard: Information-Theoretic Defense Against Model Extraction Attacks | Machine Learning Models | USENIX Security | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2023 | GrOVe: Ownership Verification of Graph Neural Networks using Embeddings | Graph Neural Networks | arXiv | Link | |
2023 | Making Watermark Survive Model Extraction Attacks in Graph Neural Networks | Graph Neural Networks | IEEE International Conference on Communications | | |
2023 | Exposing Model Theft: A Robust and Transferable Watermark for Thwarting Model Extraction Attacks | Machine Learning Models | CIKM | Link | |
2023 | Defending against model extraction attacks with physical unclonable function | Machine Learning Models | Information Sciences | Link | |
2023 | Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks | Deep Neural Networks | ACM MM | Link | |
2023 | APMSA: Adversarial Perturbation Against Model Stealing Attacks | Machine Learning Models | IEEE TIFS | Link | |
2023 | Deep Neural Network Watermarking against Model Extraction Attack | Deep Neural Networks | ACM MM | Link | |
2023 | Bucks for Buckets (B4B): Active Defenses Against Stealing Encoders | Encoder Models | NeurIPS | Link | |
2023 | Categorical Inference Poisoning: Verifiable Defense Against Black-Box DNN Model Stealing Without Constraining Surrogate Data and Query Times | Deep Neural Networks | IEEE TIFS | Link | |
2023 | GAP: Differentially Private Graph Neural Networks with Aggregation Perturbation | Graph Neural Networks | USENIX Security | Link | |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2022 | CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks | Text Generation Models | arXiv | Link | |
2022 | Defending against Model Stealing via Verifying Embedded External Features | Machine Learning Models | AAAI | Link | |
2022 | Monitoring-Based Differential Privacy Mechanism Against Query Flooding-Based Model Extraction Attack | Machine Learning Models | IEEE TDSC | Link | |
2022 | DynaMarks: Defending Against Deep Learning Model Extraction Using Dynamic Watermarking | Deep Learning Models | arXiv | Link | |
2022 | SeInspect: Defending Model Stealing via Heterogeneous Semantic Inspection | Machine Learning Models | ESORICS | ||
2022 | Model Stealing Defense against Exploiting Information Leak through the Interpretation of Deep Neural Nets | Deep Neural Networks | IJCAI | Link | |
2022 | How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection | Machine Learning Models | arXiv | Link | |
2022 | A Framework for Understanding Model Extraction Attack and Defense | Machine Learning Models | arXiv | Link | |
2022 | Increasing the Cost of Model Extraction with Calibrated Proof of Work | Machine Learning Models | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2021 | Watermarking Graph Neural Networks by Random Graphs | Graph Neural Networks | ISDFS | Link | |
2021 | BODAME: Bilevel Optimization for Defense Against Model Extraction | Machine Learning Models | arXiv | Link | |
2021 | A protection method of trained CNN model with a secret key from unauthorized access | CNN | APSIPA TSIP | Link | |
2021 | SEAT: Similarity Encoder by Adversarial Training for Detecting Model Extraction Attack Queries | Machine Learning Models | AISec | Link | |
2021 | Entangled Watermarks as a Defense against Model Extraction | Machine Learning Models | USENIX Security | Link | |
2021 | Stateful Detection of Model Extraction Attacks | Machine Learning Models | arXiv | Link | |
2021 | NeurObfuscator: A Full-stack Obfuscation Tool to Mitigate Neural Architecture Stealing | Neural Networks | HOST | Link | |
2021 | DAS-AST: Defending Against Model Stealing Attacks Based on Adaptive Softmax Transformation | Machine Learning Models | ISC | Link | |
2021 | DAWN: Dynamic Adversarial Watermarking of Neural Networks | Neural Networks | ACM MM | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2020 | Prediction Poisoning: Towards Defenses Against DNN Model Stealing Attacks | Deep Neural Networks | arXiv | Link | |
2020 | Perturbing Inputs to Prevent Model Stealing | Machine Learning Models | CNS | Link | |
2020 | Defending Against Model Stealing Attacks With Adaptive Misinformation | Machine Learning Models | CVPR | Link | |
2020 | Protecting DNNs from Theft using an Ensemble of Diverse Models | Deep Neural Networks | OpenReview | Link | |
2020 | A Protection against the Extraction of Neural Network Models | Neural Networks | arXiv | Link | |
2020 | Preventing Neural Network Weight Stealing via Network Obfuscation | Neural Networks | Intelligent Computing | ||
2020 | Information Laundering for Model Privacy | Machine Learning Models | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2019 | Defending Against Neural Network Model Stealing Attacks Using Deceptive Perturbations | Neural Networks | SPW | Link | |
2019 | PRADA: Protecting Against DNN Model Stealing Attacks | Deep Neural Networks | EuroS&P | Link | |
2019 | BDPL: A Boundary Differentially Private Layer Against Machine Learning Model Extraction Attacks | Machine Learning Models | ESORICS | ||
2019 | MCNE: An End-to-End Framework for Learning Multiple Conditional Network Representations of Social Network | Social Network Models | arXiv | Link |
Year | Title | Target Model | Venue | Paper Link | Code Link |
---|---|---|---|---|---|
2018 | Model Extraction Warning in MLaaS Paradigm | Machine Learning Models | ACSAC | Link | |
2018 | Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks | Deep Neural Networks | Research in Attacks, Intrusions, and Defenses (RAID) | | |