Data-Free Inversion of Suspicious Trojan Inputs via Latent Diffusion

DISTIL is a data-free trigger-inversion method for deep neural networks: it reconstructs malicious backdoor triggers without relying on extensive datasets or strong assumptions about trigger appearance. A diffusion-based generator, guided by the target classifier, iteratively produces candidate triggers that align with the model's internal representations associated with malicious behavior. This guidance narrows the search space and makes trigger reconstruction more reliable, so the reconstructed triggers can be used to distinguish clean models from trojaned ones. In the authors' empirical evaluations on benchmark datasets, DISTIL outperforms existing methods in accuracy, which makes it an adaptable defense against backdoor attacks.
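To make the idea of classifier-guided iterative generation concrete, here is a minimal toy sketch. It is not the DISTIL implementation: there is no latent diffusion model and no deep network here. As a stand-in, it runs annealed Langevin sampling under a standard Gaussian prior, with the update nudged by the gradient of a target-class log-probability from a tiny linear classifier. All names (`W`, `guidance_grad`, `guided_sample`, `guidance_scale`) and all hyperparameters are assumptions made for illustration only.

```python
import math
import random

# Hypothetical toy classifier (an assumption, not the DISTIL model):
# logits = W @ x, with 3 classes over a 4-dimensional "latent" vector.
W = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(x):
    """Class probabilities of the toy linear classifier for input x."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return softmax(logits)


def guidance_grad(x, target):
    """Gradient of log p(target | x) with respect to x (closed form for linear logits)."""
    p = class_probs(x)
    return [
        W[target][j] - sum(p[k] * W[k][j] for k in range(len(W)))
        for j in range(len(x))
    ]


def guided_sample(target, steps=200, step_size=0.1, guidance_scale=5.0, seed=0):
    """Annealed Langevin sampling under a N(0, I) prior, steered toward `target`."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(len(W[0]))]  # start from pure noise
    for t in range(steps):
        sigma = 1.0 - (t + 1) / steps  # noise level, annealed linearly to 0
        g = guidance_grad(x, target)
        x = [
            xi
            + step_size * (-xi + guidance_scale * gi)  # prior score (-x) plus guidance
            + math.sqrt(2.0 * step_size) * sigma * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, g)
        ]
    return x


candidate = guided_sample(target=2)
probs = class_probs(candidate)
print("target-class probability:", round(probs[2], 3))
```

In this sketch the classifier gradient plays the role of the guidance signal and the Gaussian prior plays the role of the generator; in a real setting both would be replaced by the trained diffusion model and the suspicious classifier, and the candidate would be an image-space trigger rather than a 4-dimensional vector.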

🚀 Accepted to ICCV 2025!

Demos

Citation

Please cite our work if you use the codebase:

Hossein Mirzaei, Zeinab Sadat Taghavi, Sepehr Rezaee, Masoud Hadi, Moein Madadi, Mackenzie W Mathis
DISTIL: Data-Free Inversion of Suspicious Trojan Inputs via Latent Diffusion
ICCV 2025

AdaptiveMotorControlLab/DISTIL