Trace and detect adversarial attacks on CNNs using feature response maps

Amirian, Mohammadreza; Schwenker, Friedhelm; Stadelmann, Thilo

doi:10.1007/978-3-319-99978-4_27

Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-3863

Publication type:	Conference paper
Review type:	Peer review (publication)
Title:	Trace and detect adversarial attacks on CNNs using feature response maps
Authors:	Amirian, Mohammadreza; Schwenker, Friedhelm; Stadelmann, Thilo
DOI:	10.1007/978-3-319-99978-4_27; 10.21256/zhaw-3863
Proceedings:	Artificial Neural Networks in Pattern Recognition
Pages:	346–358
Conference details:	8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR), Siena, Italy, 19-21 September 2018
Publication date:	2018
Series:	Lecture Notes in Computer Science
Series volume:	11081
Publisher / issuing institution:	Springer
ISBN:	978-3-319-99977-7; 978-3-319-99978-4
Language:	English
Keywords:	Model interpretability; Feature visualization; Diagnostic
Subject (DDC):	005: Computer programming, programs and data
Abstract:	The existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. The attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer – they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers – “feature responses” to a given input – have been helpful to visualize for a human “debugger” what the CNN “looks at” while computing its output. In this work, we propose a novel detection method for adversarial examples to prevent attacks. We do so by tracking adversarial perturbations in feature responses, allowing for automatic detection using average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet.
URI:	https://digitalcollection.zhaw.ch/handle/11475/8027
Full-text version:	Accepted version
License (according to publishing contract):	License according to publishing contract
Department:	School of Engineering
Organizational unit:	Institut für Informatik (InIT)
Published as part of the ZHAW project:	QualitAI - Quality control of industrial products via deep learning on images
Appears in collections:	Publikationen School of Engineering
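
The abstract names average local spatial entropy of a feature response map as the detection statistic. The following is only a minimal sketch of what such a score could look like, not the authors' implementation: it assumes the feature response has already been reduced to a 2-D grayscale map scaled to [0, 1], and the function names, the non-overlapping 8x8 patches, and the 16 histogram bins are all illustrative choices that do not come from this record.

```python
import numpy as np


def local_entropy(patch, bins=16):
    """Shannon entropy (in bits) of the intensity histogram of one patch.

    Assumes patch values lie in [0, 1]; `bins` is an illustrative choice.
    """
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist[hist > 0] / hist.sum()  # drop empty bins, normalise to probabilities
    return float(-(p * np.log2(p)).sum())


def average_local_spatial_entropy(response, patch=8, bins=16):
    """Mean entropy over non-overlapping patch x patch windows of a 2-D map.

    `response` stands in for a grayscale feature response map; how that map
    is obtained from the CNN is outside the scope of this sketch.
    """
    h, w = response.shape
    entropies = [
        local_entropy(response[i:i + patch, j:j + patch], bins)
        for i in range(0, h - patch + 1, patch)
        for j in range(0, w - patch + 1, patch)
    ]
    return float(np.mean(entropies))
```

A detector built on such a score would compare it against a threshold chosen on held-out data; the threshold and the patch and bin settings would need to be tuned and are not specified here.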

Files in this resource:

File	Description	Size	Format
ANNPR_2018c.pdf	Accepted Version	2.95 MB	Adobe PDF


Amirian, M., Schwenker, F., & Stadelmann, T. (2018). Trace and detect adversarial attacks on CNNs using feature response maps [Conference paper]. Artificial Neural Networks in Pattern Recognition, 346–358. https://doi.org/10.1007/978-3-319-99978-4_27

Amirian, M., Schwenker, F. and Stadelmann, T. (2018) ‘Trace and detect adversarial attacks on CNNs using feature response maps’, in Artificial Neural Networks in Pattern Recognition. Springer, pp. 346–358. Available at: https://doi.org/10.1007/978-3-319-99978-4_27.

M. Amirian, F. Schwenker, and T. Stadelmann, “Trace and detect adversarial attacks on CNNs using feature response maps,” in Artificial Neural Networks in Pattern Recognition, 2018, pp. 346–358. doi: 10.1007/978-3-319-99978-4_27.

AMIRIAN, Mohammadreza, Friedhelm SCHWENKER and Thilo STADELMANN, 2018. Trace and detect adversarial attacks on CNNs using feature response maps. In: Artificial Neural Networks in Pattern Recognition. Conference paper. Springer. 2018. pp. 346–358. ISBN 978-3-319-99977-7

Amirian, Mohammadreza, Friedhelm Schwenker, and Thilo Stadelmann. 2018. “Trace and Detect Adversarial Attacks on CNNs Using Feature Response Maps.” Conference paper. In Artificial Neural Networks in Pattern Recognition, 346–58. Springer. https://doi.org/10.1007/978-3-319-99978-4_27.

Amirian, Mohammadreza, et al. “Trace and Detect Adversarial Attacks on CNNs Using Feature Response Maps.” Artificial Neural Networks in Pattern Recognition, Springer, 2018, pp. 346–58, https://doi.org/10.1007/978-3-319-99978-4_27.

All resources in this repository are protected by copyright unless otherwise indicated.