Please use this identifier to cite or link to this item:
https://doi.org/10.21256/zhaw-3863
Publication type: | Conference paper |
Type of review: | Peer review (publication) |
Title: | Trace and detect adversarial attacks on CNNs using feature response maps |
Authors: | Amirian, Mohammadreza; Schwenker, Friedhelm; Stadelmann, Thilo |
DOI: | 10.1007/978-3-319-99978-4_27; 10.21256/zhaw-3863 |
Proceedings: | Artificial Neural Networks in Pattern Recognition |
Page(s): | 346–358 |
Conference details: | 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR), Siena, Italy, 19-21 September 2018 |
Issue Date: | 2018 |
Series: | Lecture Notes in Computer Science |
Series volume: | 11081 |
Publisher / Ed. Institution: | Springer |
ISBN: | 978-3-319-99977-7; 978-3-319-99978-4 |
Language: | English |
Subjects: | Model interpretability; Feature visualization; Diagnostic |
Subject (DDC): | 005: Computer programming, programs and data |
Abstract: | The existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. The attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer – they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers – “feature responses” to a given input – have been helpful to visualize for a human “debugger” what the CNN “looks at” while computing its output. In this work, we propose a novel detection method for adversarial examples to prevent attacks. We do so by tracking adversarial perturbations in feature responses, allowing for automatic detection using average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet. |
URI: | https://digitalcollection.zhaw.ch/handle/11475/8027 |
Fulltext version: | Accepted version |
License (according to publishing contract): | Licence according to publishing contract |
Department: | School of Engineering |
Organisational Unit: | Institute of Computer Science (InIT) |
Published as part of the ZHAW project: | QualitAI - Quality control of industrial products via deep learning on images |
Appears in collections: | Publikationen School of Engineering |
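The abstract names average local spatial entropy as the quantity used for automatic detection. The record does not give the paper's exact definition, so the following is only a minimal sketch of one plausible reading: split a 2-D feature response map into non-overlapping patches, compute the Shannon entropy of the quantized intensities in each patch, and average over all patches. The function names, the patch size, the number of bins, and the assumption that intensities lie in [0, 1] are all illustrative choices, not taken from the paper.

```python
import math
from collections import Counter


def patch_entropy(values, bins=8):
    """Shannon entropy (in bits) of intensities quantized into `bins` levels.

    Assumes intensities are scaled to [0, 1]; values outside are clamped.
    """
    levels = [max(0, min(int(v * bins), bins - 1)) for v in values]
    counts = Counter(levels)
    n = len(levels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def average_local_spatial_entropy(fmap, patch=3, bins=8):
    """Mean patch entropy over non-overlapping `patch` x `patch` windows.

    `fmap` is a 2-D list of floats (a hypothetical feature response map).
    Rows and columns that do not fill a complete window are ignored.
    """
    rows, cols = len(fmap), len(fmap[0])
    entropies = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            window = [fmap[r + i][c + j]
                      for i in range(patch) for j in range(patch)]
            entropies.append(patch_entropy(window, bins))
    if not entropies:
        raise ValueError("feature map is smaller than one patch")
    return sum(entropies) / len(entropies)


# A constant map carries no local variation; a tiled 0/1 pattern does.
flat = [[0.5] * 4 for _ in range(4)]
tiled = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
flat_score = average_local_spatial_entropy(flat, patch=2)
tiled_score = average_local_spatial_entropy(tiled, patch=2)
```

A detector built on such a score would compare it against a decision threshold. The record does not state how that threshold is chosen, so it would have to be taken from the full paper or calibrated on data.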
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
ANNPR_2018c.pdf | Accepted Version | 2.95 MB | Adobe PDF |
Amirian, M., Schwenker, F., & Stadelmann, T. (2018). Trace and detect adversarial attacks on CNNs using feature response maps [Conference paper]. Artificial Neural Networks in Pattern Recognition, 346–358. https://doi.org/10.1007/978-3-319-99978-4_27