Paper
23 August 2022
Defense against adversarial attack by feature distillation and metric learning
Xiang Xiang, Yi Xu, Pengfei Zhang, Xiaoming Ju
Proceedings Volume 12305, International Symposium on Artificial Intelligence Control and Application Technology (AICAT 2022); 123051N (2022) https://doi.org/10.1117/12.2645603
Event: International Symposium on Artificial Intelligence Control and Application Technology (AICAT 2022), 2022, Hangzhou, China
Abstract
In recent years, deep neural networks have achieved high accuracy in many classification tasks, including speech recognition, object detection, and image classification. Although deep neural networks are robust to random noise, carefully crafted perturbations that are imperceptible to the human eye can be added to the network input and still cause the model to output wrong predictions. To defend against such adversarial samples, we propose an adversarial training method that combines feature distillation and metric learning. A fixed teacher network is first pretrained on clean samples, while the student network is adversarially trained on adversarial samples. During training, the teacher's middle-layer features of clean samples guide the student's middle-layer features of adversarial samples, so that the adversarial features are repaired inside the student network. In addition, to account for the relationship between adversarial and clean samples in the student network, a metric learning loss is introduced on the student's middle-layer features, so that an adversarial sample lies closer to its clean counterpart than to confusing samples. This makes the deep neural network model more robust. Finally, we perform gray-box, white-box, and black-box attacks to verify the effectiveness of our method; our algorithm significantly outperforms state-of-the-art adversarial training algorithms.
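As a rough illustration of the training objective described above, the sketch below combines a classification loss on adversarial inputs, a feature-distillation loss that pulls the student's middle-layer features of adversarial samples toward the fixed teacher's features of clean samples, and a triplet-style metric loss that keeps an adversarial sample closer to its clean counterpart than to a confusing negative. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the `return_features=True` interface, the choice of negative samples, and the loss weights `lambda_kd`, `lambda_metric`, and `margin` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of one training step. Assumes teacher/student models that
# return (logits, mid_layer_features) when called with return_features=True.
# All weights and the negative-sampling scheme are assumptions, not the paper's settings.
def train_step(student, teacher, x_clean, x_adv, y, optimizer,
               lambda_kd=1.0, lambda_metric=0.5, margin=1.0):
    teacher.eval()  # teacher is pretrained on clean data and kept fixed
    with torch.no_grad():
        _, feat_t_clean = teacher(x_clean, return_features=True)

    logits_adv, feat_s_adv = student(x_adv, return_features=True)
    _, feat_s_clean = student(x_clean, return_features=True)

    # 1) Standard adversarial-training classification loss on adversarial inputs.
    loss_ce = F.cross_entropy(logits_adv, y)

    # 2) Feature distillation: the teacher's clean middle-layer features guide
    #    (repair) the student's middle-layer features of adversarial inputs.
    loss_kd = F.mse_loss(feat_s_adv, feat_t_clean)

    # 3) Metric learning: an adversarial sample should lie closer to its clean
    #    counterpart than to a confusing sample. Here the negative is simply a
    #    shuffled clean feature from the same batch (an assumption).
    neg_idx = torch.randperm(x_clean.size(0))
    loss_metric = F.triplet_margin_loss(
        anchor=feat_s_adv.flatten(1),
        positive=feat_s_clean.flatten(1),
        negative=feat_s_clean[neg_idx].flatten(1),
        margin=margin)

    loss = loss_ce + lambda_kd * loss_kd + lambda_metric * loss_metric
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, `x_adv` would be generated on the fly from `x_clean` with an attack such as PGD before each step, as is common in adversarial training.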
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xiang Xiang, Yi Xu, Pengfei Zhang, and Xiaoming Ju "Defense against adversarial attack by feature distillation and metric learning", Proc. SPIE 12305, International Symposium on Artificial Intelligence Control and Application Technology (AICAT 2022), 123051N (23 August 2022); https://doi.org/10.1117/12.2645603
KEYWORDS
Data modeling, Defense and security, Statistical modeling, Neural networks, Feature extraction, Detection and tracking algorithms, Image classification