Presentation + Paper
28 October 2022
Using non-linear activation functions to increase robustness of AI models to adversarial attacks
Itai Dror, Raz Birman, Aviram Lachmani, David Shmailov, Ofer Hadar
Abstract
Image classification tasks leverage Convolutional Neural Networks (CNNs) to yield accurate results that surpass their predecessor hand-crafted algorithms. Applicable use cases include autonomous driving, face recognition, medical imaging, and more. Along with the growing use of AI image classification applications, we see emerging research on the robustness of such models to adversarial attacks, which exploit unique vulnerabilities of Artificial Intelligence (AI) models to skew their classification results. While not visible to the Human Visual System (HVS), these attacks mislead the algorithms and yield wrong classification results. To be incorporated securely in real-world applications, AI-based image classification algorithms require protection that increases their robustness to adversarial attacks. We propose replacing the commonly used Rectified Linear Unit (ReLU) Activation Function (AF), which is piecewise linear, with non-linear AFs to increase robustness to adversarial attacks. This approach has been considered in recent research and is motivated by the observation that non-linear AFs tend to diminish the effect of adversarial perturbations across the layers of a Deep Neural Network (DNN). To validate the approach, we applied Fast Gradient Sign Method (FGSM) and HopSkipJump (HSJ) attacks to a classification model trained on the MNIST dataset. We then replaced the model's AF with non-linear AFs (Sigmoid, GELU, ELU, SELU, and Tanh). We conclude that while attacks on the original model have a 100% success rate, the attack success rate drops by an average of 10% when non-linear AFs are used.
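For readers unfamiliar with FGSM, the attack perturbs an input in the direction of the sign of the loss gradient with respect to that input: x_adv = x + eps * sign(grad_x L(x, y)). The sketch below is a minimal, hedged illustration on a toy logistic classifier with assumed random weights; it is not the paper's MNIST/CNN setup, only a demonstration of the FGSM step itself.

```python
import numpy as np

# Toy setup with assumed values -- NOT the paper's model, purely illustrative.
rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(w, x, y):
    # Binary cross-entropy of a logistic model p = sigmoid(w . x).
    p = sigmoid(w @ x)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def grad_wrt_input(w, x, y):
    # d/dx of the binary cross-entropy above: (sigmoid(w . x) - y) * w
    return (sigmoid(w @ x) - y) * w


w = rng.normal(size=8)   # "trained" weights (assumed)
x = rng.normal(size=8)   # clean input sample
y = 1.0                  # true label
eps = 0.1                # attack budget

# FGSM step: move each input coordinate by eps in the sign of the gradient.
x_adv = x + eps * np.sign(grad_wrt_input(w, x, y))

# The perturbation stays inside an L-infinity ball of radius eps,
# which is why such attacks can be imperceptible to the HVS.
assert np.max(np.abs(x_adv - x)) <= eps + 1e-12
```

For this convex toy model the FGSM step provably increases the loss; in the paper's setting, the reported effect is that swapping ReLU for non-linear AFs (e.g. Tanh, GELU) reduces how often such perturbations flip the classification.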
Conference Presentation
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Itai Dror, Raz Birman, Aviram Lachmani, David Shmailov, and Ofer Hadar "Using non-linear activation functions to increase robustness of AI models to adversarial attacks", Proc. SPIE 12275, Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies VI, 122750I (28 October 2022); https://doi.org/10.1117/12.2638358
KEYWORDS
Artificial intelligence
Image classification
Defense and security
Image processing
Machine learning