Evaluating the efficacy of different adversarial training strategies

Roger Sengphanith; Diego Marez; Jane N. Berk; Shibin Parameswaran

doi:10.1117/12.2678377

12 June 2023

Proceedings Volume 12538, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications V; 125381W (2023) https://doi.org/10.1117/12.2678377
Event: SPIE Defense + Commercial Sensing, 2023, Orlando, Florida, United States

Abstract

Adversarial training (AT) is considered the most effective strategy for defending a machine learning model against adversarial attacks. There are many ways to perform AT, but the underlying principle is the same: augment the training data with adversarial examples. In this work, we investigate the efficacy of four different adversarial example generation strategies for AT of a given classification model. The four methods represent different categories of attack and data. Specifically, two of the adversarial generation algorithms perform attacks in the pixel domain, while the other two operate in the latent space of the data. Separately, two of the methods generate adversarial data samples designed to lie near the model's decision boundaries, while the other two generate generic adversarial examples (not necessarily at the boundary). The adversarial examples from these methods are used to adversarially train models on MNIST and CIFAR10. In the absence of a good metric for measuring the robustness of a model, capturing the effect of AT with a single metric can be a challenge. Hence, we opt to evaluate the robustness improvements of the adversarially trained models using a variety of empirical metrics introduced in the literature that measure the local Lipschitz value of a network (CLEVER), smoothness of decision boundaries, robustness to adversarial perturbations, and defense transferability.
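The principle stated in the abstract, augmenting the training data with adversarial examples, can be illustrated with a minimal sketch. The abstract does not name the four generation strategies studied, so the example below substitutes a generic pixel-domain attack (the fast gradient sign method, FGSM) applied to a toy logistic-regression classifier. The function names, the toy model, and the parameters `eps`, `lr`, and `epochs` are all assumptions made for illustration and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Probability of class 1 under a logistic-regression model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bce(p, y, tiny=1e-12):
    # Binary cross-entropy loss, clipped to avoid log(0).
    p = min(max(p, tiny), 1.0 - tiny)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # For this model the loss gradient w.r.t. the input is (p - y) * w.
    # FGSM steps each input coordinate by eps in the sign of that gradient.
    p = predict(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

def train(data, epochs, lr, eps=0.0):
    # Plain SGD; when eps > 0, each clean sample is paired with an
    # adversarial copy generated against the current model (the AT step).
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            batch = [(x, y)]
            if eps > 0:
                batch.append((fgsm(w, b, x, y, eps), y))
            for xb, yb in batch:
                err = predict(w, b, xb) - yb
                w = [wi - lr * err * xi for wi, xi in zip(w, xb)]
                b -= lr * err
    return w, b

# Hypothetical toy data: two features, binary labels.
toy_data = [([0.0, 0.0], 0), ([1.0, 1.0], 1), ([0.1, 0.2], 0), ([0.9, 0.8], 1)]
w_std, b_std = train(toy_data, epochs=200, lr=0.5)           # standard training
w_at, b_at = train(toy_data, epochs=200, lr=0.5, eps=0.1)    # adversarial training
```

Calling `train` with `eps=0.0` gives ordinary training, so the same routine can produce a baseline and an adversarially trained model for comparison. A latent-space or boundary-seeking generator would replace `fgsm` in this loop; how the paper's methods do that is not described in the abstract.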

Citation

Roger Sengphanith, Diego Marez, Jane N. Berk, and Shibin Parameswaran "Evaluating the efficacy of different adversarial training strategies", Proc. SPIE 12538, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications V, 125381W (12 June 2023); https://doi.org/10.1117/12.2678377
