Defending against Backdoor Attack on Deep Neural Networks

Kaidi Xu; Sijia Liu; Pin-Yu Chen; Pu Zhao; Xue Lin

doi:10.48550/arxiv.2002.12162

Preprint

Open access

arXiv.org

21 Jun 2021

Files and links (1)

url

https://doi.org/10.48550/arxiv.2002.12162

Preprint (Author's original); arXiv.org - Non-exclusive license to distribute; Open

Abstract

Computer Science - Cryptography and Security

Computer Science - Learning

Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, they have recently been found to be vulnerable to adversarial attacks. In this paper, we focus on the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) such that the trained DNN misclassifies examples that contain this trigger. Specifically, we carefully study the effect of both real and synthetic backdoor attacks on the internal response of vanilla and backdoored DNNs through the lens of Grad-CAM. Moreover, we show that the backdoor attack induces a significant bias in neuron activation in terms of the $\ell_\infty$ norm of an activation map compared to its $\ell_1$ and $\ell_2$ norms. Spurred by our results, we propose $\ell_\infty$-based neuron pruning to remove the backdoor from the backdoored DNN. Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
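To make the idea of ranking neurons by the $\ell_\infty$ norm of their activation maps concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not state which layer is inspected, which inputs are used to collect activations, how many neurons are pruned, or whether the highest- or lowest-scoring neurons are removed, so all of these are assumptions here: the function names `channel_norms`, `linf_prune_mask` and `apply_mask`, the `(N, C, H, W)` activation layout, and the `prune_largest` flag are hypothetical.

```python
import numpy as np


def channel_norms(acts):
    """Mean per-channel l1, l2 and l_inf norms of activation maps.

    acts: array of shape (N, C, H, W), i.e. N inputs, C channels (neurons),
    and an H x W activation map per channel. This layout is an assumption.
    """
    n, c = acts.shape[0], acts.shape[1]
    flat = np.abs(acts.reshape(n, c, -1))  # (N, C, H*W)
    return {
        "l1": flat.sum(axis=2).mean(axis=0),
        "l2": np.sqrt((flat ** 2).sum(axis=2)).mean(axis=0),
        "linf": flat.max(axis=2).mean(axis=0),
    }


def linf_prune_mask(acts, num_prune, prune_largest=True):
    """Boolean keep-mask over channels, ranked by mean l_inf norm.

    prune_largest is an assumption: the abstract does not say which end of
    the ranking is pruned, so the direction is left configurable.
    """
    scores = channel_norms(acts)["linf"]
    order = np.argsort(scores)
    if prune_largest:
        order = order[::-1]
    mask = np.ones(scores.shape[0], dtype=bool)
    mask[order[:num_prune]] = False
    return mask


def apply_mask(acts, mask):
    """Zero out the activation maps of pruned channels."""
    return acts * mask[None, :, None, None]


# Toy data: channel 2 has one sharp spike, so its l_inf norm stands out
# far more than its l1 or l2 norm does.
rng = np.random.default_rng(0)
acts = rng.random((8, 4, 5, 5))
acts[:, 2, 0, 0] = 10.0
mask = linf_prune_mask(acts, num_prune=1)
pruned = apply_mask(acts, mask)
```

In the toy data, a single spiked pixel dominates the $\ell_\infty$ score of channel 2 while contributing comparatively little to its $\ell_1$ norm, which illustrates why a localized trigger response might be easier to see under $\ell_\infty$ than under the other two norms.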

Details

Title: Defending against Backdoor Attack on Deep Neural Networks
Creators: Kaidi Xu; Sijia Liu; Pin-Yu Chen; Pu Zhao; Xue Lin
Publication Details: arXiv.org
Resource Type: Preprint
Language: English
Academic Unit: Computer Science (Computing)
Other Identifier: 991021871353304721
