A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data

Ulmer, Markus; Zgraggen, Jannik; Goren Huber, Lilach

doi:10.36001/ijphm.2024.v15i1.3589

Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-30284

Publication type:	Article in scientific journal
Type of review:	Peer review (publication)
Title:	A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data
Authors:	Ulmer, Markus; Zgraggen, Jannik; Goren Huber, Lilach
et al.:	No
DOI:	10.36001/ijphm.2024.v15i1.3589; 10.21256/zhaw-30284
Published in:	International Journal of Prognostics and Health Management
Volume(Issue):	15
Issue:	1
Publication date:	26-Jan-2024
Publisher / Ed. Institution:	Prognostics and Health Management Society
ISSN:	2153-2648
Language:	English
Keywords:	Deep learning; Machine learning; Anomaly detection; Fully unsupervised learning; Contaminated data; Time series; Data refinement; Fault detection; Acoustic sensor data; Aircraft engine
Subject (DDC):	006: Special computer methods
Abstract:	Anomaly detection (AD) tasks have been solved using machine learning algorithms in various domains and applications. The great majority of these algorithms use normal data to train a residual-based model, and assign anomaly scores to unseen samples based on their dissimilarity with the learned normal regime. The underlying assumption of these approaches is that anomaly-free data is available for training. This is, however, often not the case in real-world operational settings, where the training data may be contaminated with a certain fraction of abnormal samples. Training with contaminated data, in turn, inevitably leads to a deteriorated AD performance of the residual-based algorithms. In this paper, we introduce a framework for a fully unsupervised refinement of contaminated training data for AD tasks. The framework is generic and can be applied to any residual-based machine learning model. We demonstrate the application of the framework to two public datasets of multivariate time series machine data from different application fields. We show its clear superiority over the naive approach of training with contaminated data without refinement. Moreover, we compare it to the ideal, unrealistic reference in which anomaly-free data would be available for training. Since the approach exploits information from the anomalies, and not only from the normal regime, it is comparable to, and often outperforms, the ideal baseline as well.
URI:	https://digitalcollection.zhaw.ch/handle/11475/30284
Full-text version:	Published version
License (according to publishing contract):	CC BY 3.0: Attribution 3.0 Unported
Department:	School of Engineering
Organisational unit:	Institut für Datenanalyse und Prozessdesign (IDP)
Appears in collections:	Publikationen School of Engineering
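
The abstract describes residual-based anomaly scoring and the effect of contaminated training data, but this record does not spell out the refinement procedure itself. The sketch below is therefore only a generic illustration, not the framework of the paper: a least-squares line stands in for "any residual-based model", and `refine_and_fit` is a hypothetical trimming loop that refits after discarding the highest-residual samples. The synthetic data, the 8 % contamination level, and the parameters `drop_fraction` and `n_rounds` are all assumptions made for the example. Unlike the approach in the paper, this simple loop does not exploit information from the anomalies.

```python
import numpy as np

def fit_linear(x, y):
    """Least-squares line fit; a stand-in for any residual-based model."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def anomaly_score(model, x, y):
    """Residual-based score: absolute deviation from the model prediction."""
    slope, intercept = model
    return np.abs(y - (slope * x + intercept))

def refine_and_fit(x, y, drop_fraction=0.1, n_rounds=3):
    """Hypothetical refinement: repeatedly drop the highest-residual samples and refit."""
    keep = np.ones(len(x), dtype=bool)
    model = fit_linear(x, y)
    for _ in range(n_rounds):
        scores = anomaly_score(model, x, y)          # score all training samples
        threshold = np.quantile(scores, 1.0 - drop_fraction)
        keep = scores <= threshold                   # retain the low-residual part
        model = fit_linear(x[keep], y[keep])
    return model, keep

# Synthetic contaminated training set (assumed for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=500)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.2, size=500)  # normal regime: y = 2x + 1
is_anomaly = np.zeros(500, dtype=bool)
is_anomaly[:40] = True                               # 8 % of samples are abnormal
y[is_anomaly] += 5.0                                 # anomalies are shifted upward

naive_model = fit_linear(x, y)                       # trained on contaminated data
refined_model, keep = refine_and_fit(x, y)           # trained after refinement

# Error of the estimated intercept relative to the true normal regime.
naive_intercept_error = abs(naive_model[1] - 1.0)
refined_intercept_error = abs(refined_model[1] - 1.0)
```

In this toy setting the contaminated fit is pulled toward the anomalies, which shrinks their residuals and inflates those of normal samples; comparing `naive_intercept_error` with `refined_intercept_error` makes that deterioration visible. Setting `drop_fraction` requires a guess at the contamination level, which is one reason a real fully unsupervised method needs something more principled than this loop.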

Files in this item:

File	Description	Size	Format
2024_Ulmer-etal_Generic-ML-framework-for-fully-unsupervised-AD.pdf		1.55 MB	Adobe PDF


Ulmer, M., Zgraggen, J., & Goren Huber, L. (2024). A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data. International Journal of Prognostics and Health Management, 15(1). https://doi.org/10.36001/ijphm.2024.v15i1.3589

Ulmer, M., Zgraggen, J. and Goren Huber, L. (2024) ‘A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data’, International Journal of Prognostics and Health Management, 15(1). Available at: https://doi.org/10.36001/ijphm.2024.v15i1.3589.

M. Ulmer, J. Zgraggen, and L. Goren Huber, “A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data,” International Journal of Prognostics and Health Management, vol. 15, no. 1, Jan. 2024, doi: 10.36001/ijphm.2024.v15i1.3589.

ULMER, Markus, Jannik ZGRAGGEN and Lilach GOREN HUBER, 2024. A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data. International Journal of Prognostics and Health Management. 26 January 2024. Vol. 15, no. 1. DOI 10.36001/ijphm.2024.v15i1.3589

Ulmer, Markus, Jannik Zgraggen, and Lilach Goren Huber. 2024. “A Generic Machine Learning Framework for Fully-Unsupervised Anomaly Detection with Contaminated Data.” International Journal of Prognostics and Health Management 15 (1). https://doi.org/10.36001/ijphm.2024.v15i1.3589.

Ulmer, Markus, et al. “A Generic Machine Learning Framework for Fully-Unsupervised Anomaly Detection with Contaminated Data.” International Journal of Prognostics and Health Management, vol. 15, no. 1, Jan. 2024, https://doi.org/10.36001/ijphm.2024.v15i1.3589.
