An Empirical Look at Gradient-based Black-box Adversarial Attacks on Deep Neural Networks Using One-point Residual Estimates


Abstract

In recent years, a great deal of research has studied the optimisation of generating adversarial examples for Deep Neural Networks (DNNs) in a black-box setting. Gradient-based techniques that craft adversarial images with as few input-output queries to the attacked model as possible have been studied extensively. However, existing work has not coherently examined the effect of different gradient estimation techniques. In this paper, a new one-point residual estimate is compared to the known two-point estimates. The findings show that the one-point residual estimate is not a viable option for reducing the number of queries to the attacked model. The attack accuracy obtained with the one-point residual estimate remains the same for weaker models; for stronger models, there is a slight decrease in accuracy at identical distortion levels. All estimates are tested with different PGD attacks on the MNIST and F-MNIST datasets using a 3-layer convolutional network.
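For readers unfamiliar with the two estimator families compared above, the sketch below illustrates their standard forms from the zeroth-order optimisation literature inside a simplified PGD-style loop. It is not the thesis code: the toy quadratic loss, step sizes, and function names are assumptions for demonstration only. The two-point estimate spends two new queries per random direction, whereas the one-point residual estimate issues a single new query per iteration and reuses the value returned by the previous query.

```python
import numpy as np

def two_point_estimate(f, x, mu, rng, n_dirs=10):
    # Two new queries per random direction: f(x + mu*u) and f(x - mu*u).
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        u /= np.linalg.norm(u)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / n_dirs

def one_point_residual_estimate(f, x, mu, rng, prev_value):
    # One new query per iteration; the previous iteration's query value is reused.
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    value = f(x + mu * u)
    g = (value - prev_value) / mu * u
    return g, value

if __name__ == "__main__":
    # Toy quadratic "loss" standing in for the attacked black-box model.
    rng = np.random.default_rng(0)
    f = lambda x: float(np.sum(x ** 2))
    x0 = np.ones(16)
    x = x0.copy()
    eps, alpha, mu = 0.5, 0.05, 0.01

    prev_value = f(x)  # one initial query to seed the residual estimate
    for _ in range(50):
        g, prev_value = one_point_residual_estimate(f, x, mu, rng, prev_value)
        # PGD-style step: ascend the sign of the estimated gradient, then
        # project back into the L-infinity ball of radius eps around x0.
        x = x + alpha * np.sign(g)
        x = np.clip(x, x0 - eps, x0 + eps)
    print("final toy loss:", f(x))
```

In an actual attack, `f` would be the loss of the attacked classifier on the current image, so each call to it counts as one query; the loop structure itself is the same.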
