Increasing depth, distribution distillation, and model soup: erasing backdoor triggers for deep neural networks
Yijian Zhang, Tianxing Zhang, Qi Liu, Guangling Sun, Hanzhou Wu
Abstract

Deep neural networks are vulnerable to backdoor attacks, in which an adversary injects a trigger-embedded set into the training process. Inputs stamped with the trigger yield incorrect predictions, whereas clean inputs remain unaffected. To erase latent triggers from a model, this work introduces increasing depth, distribution distillation, and model soup (ID3MS), a defense that requires no prior knowledge of the triggers and relies only on a small clean set. The depth of the backdoor model is increased by adding fully connected layer(s) at the penultimate layer. With their classification layers removed, the original backdoor model and the increased-depth model serve as teacher and student, respectively. Through distribution distillation, the student refits the distribution of the clean set and thereby erases the backdoor triggers. The classification layer is then restored to the distilled student, and model soup is used to ensemble a collection of models generated with various fine-tuning hyperparameters. Experimental results show that ID3MS outperforms existing defenses against several attacks across datasets.
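The final model-soup step of the pipeline can be sketched as follows. This is an illustrative uniform weight average over toy parameter dictionaries, not the authors' implementation: the function name `model_soup` and all parameter names and values are hypothetical, and real models would hold tensors rather than lists of floats.

```python
def model_soup(state_dicts):
    """Uniformly average a collection of model state dicts (toy sketch)."""
    n = len(state_dicts)
    soup = {}
    for k in state_dicts[0]:
        # Element-wise mean of parameter k across all fine-tuned models.
        soup[k] = [sum(sd[k][i] for sd in state_dicts) / n
                   for i in range(len(state_dicts[0][k]))]
    return soup

# Three hypothetical models fine-tuned with different hyperparameters.
m1 = {"fc.weight": [0.2, 0.4], "fc.bias": [0.1]}
m2 = {"fc.weight": [0.4, 0.2], "fc.bias": [0.3]}
m3 = {"fc.weight": [0.6, 0.6], "fc.bias": [0.2]}

souped = model_soup([m1, m2, m3])
```

Averaging the weights (rather than ensembling predictions) keeps inference cost identical to a single model, which is the appeal of the model-soup technique.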

© 2022 SPIE and IS&T
Yijian Zhang, Tianxing Zhang, Qi Liu, Guangling Sun, and Hanzhou Wu "Increasing depth, distribution distillation, and model soup: erasing backdoor triggers for deep neural networks," Journal of Electronic Imaging 31(6), 063005 (8 November 2022). https://doi.org/10.1117/1.JEI.31.6.063005
Received: 25 June 2022; Accepted: 17 October 2022; Published: 8 November 2022
KEYWORDS: Tumor growth modeling, Data modeling, Performance modeling, Statistical modeling, Fourier transforms, Defense and security, Neural networks
