Computer Science - Cryptography and Security; Computer Science - Learning
Although deep neural networks (DNNs) have achieved great success in various
computer vision tasks, it has recently been found that they are vulnerable to
adversarial attacks. In this paper, we focus on the so-called \textit{backdoor
attack}, which injects a backdoor trigger into a small portion of the training
data (also known as data poisoning) such that the trained DNN misclassifies
examples that carry this trigger. To be specific, we
carefully study the effect of both real and synthetic backdoor attacks on the
internal response of vanilla and backdoored DNNs through the lens of Grad-CAM.
Moreover, we show that the backdoor attack induces a significant bias in neuron
activation in terms of the $\ell_\infty$ norm of an activation map compared to
its $\ell_1$ and $\ell_2$ norms. Spurred by our results, we propose
\textit{$\ell_\infty$-based neuron pruning} to remove the backdoor from the
backdoored DNN. Experiments show that our method can effectively decrease the
attack success rate while maintaining a high classification accuracy on clean
images.
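
To make the idea of scoring neurons by the norm of their activation maps concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes activations from one convolutional layer are available as an array of shape (N, C, H, W), treats each channel as a "neuron", and prunes the k channels with the largest mean $\ell_\infty$ norm. The abstract does not state the selection rule, so the direction (largest first), the value of k, and the function names `channel_norms`, `prune_mask`, and `apply_mask` are all assumptions made for illustration.

```python
import numpy as np

def channel_norms(acts, order):
    """Mean per-channel norm of activation maps.

    acts  : array of shape (N, C, H, W), activations of one layer.
    order : 1, 2, or "inf", selecting the l1, l2, or l-infinity norm.
    Returns an array of shape (C,).
    """
    n, c = acts.shape[0], acts.shape[1]
    flat = np.abs(acts.reshape(n, c, -1))  # flatten each H x W map
    if order == "inf":
        per_map = flat.max(axis=2)         # largest single activation
    elif order == 1:
        per_map = flat.sum(axis=2)
    elif order == 2:
        per_map = np.sqrt((flat ** 2).sum(axis=2))
    else:
        raise ValueError("order must be 1, 2, or 'inf'")
    return per_map.mean(axis=0)            # average over the N inputs

def prune_mask(scores, k):
    """0/1 mask that zeroes the k channels with the largest score.

    Assumption: high-scoring channels are the ones to remove; the
    source abstract does not specify the selection rule.
    """
    mask = np.ones_like(scores, dtype=float)
    if k > 0:
        mask[np.argsort(scores)[-k:]] = 0.0
    return mask

def apply_mask(acts, mask):
    """Zero out pruned channels in an (N, C, H, W) activation array."""
    return acts * mask[None, :, None, None]

# Toy example: channel 1 has one large, localized spike, which
# dominates the l-infinity score far more than the l1 score.
acts = np.zeros((2, 3, 2, 2))
acts[:, 0] = 1.0
acts[:, 1, 0, 0] = 10.0
acts[:, 2] = 2.0

linf = channel_norms(acts, "inf")
mask = prune_mask(linf, k=1)
pruned = apply_mask(acts, mask)
```

In the toy data, the spiky channel stands out by a factor of five under the $\ell_\infty$ score but only slightly under the $\ell_1$ score, which is the kind of contrast between norms that the abstract describes. In a real network the mask would be applied to the layer's output (or the corresponding filters would be zeroed) before evaluating clean accuracy and attack success rate.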
Details
Title
Defending against Backdoor Attack on Deep Neural Networks
Creators
Kaidi Xu
Sijia Liu
Pin-Yu Chen
Pu Zhao
Xue Lin
Publication Details
arXiv.org
Resource Type
Preprint
Language
English
Academic Unit
Computer Science (Computing)
Other Identifier
991021871353304721