Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-30284
Publication type: Article in scientific journal
Type of review: Peer review (publication)
Title: A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data
Authors: Ulmer, Markus
Zgraggen, Jannik
Goren Huber, Lilach
et al.: No
DOI: 10.36001/ijphm.2024.v15i1.3589
10.21256/zhaw-30284
Published in: International Journal of Prognostics and Health Management
Volume(Issue): 15
Issue: 1
Issue Date: 26-Jan-2024
Publisher / Ed. Institution: Prognostics and Health Management Society
ISSN: 2153-2648
Language: English
Subjects: Deep learning; Machine learning; Anomaly detection; Fully unsupervised learning; Contaminated data; Time series; Data refinement; Fault detection; Acoustic sensor data; Aircraft engine
Subject (DDC): 006: Special computer methods
Abstract: Anomaly detection (AD) tasks have been solved using machine learning algorithms in various domains and applications. The great majority of these algorithms use normal data to train a residual-based model, and assign anomaly scores to unseen samples based on their dissimilarity to the learned normal regime. The underlying assumption of these approaches is that anomaly-free data is available for training. This is, however, often not the case in real-world operational settings, where the training data may be contaminated with a certain fraction of abnormal samples. Training with contaminated data, in turn, inevitably leads to deteriorated AD performance of the residual-based algorithms. In this paper, we introduce a framework for a fully unsupervised refinement of contaminated training data for AD tasks. The framework is generic and can be applied to any residual-based machine learning model. We demonstrate the application of the framework to two public datasets of multivariate time series machine data from different application fields. We show its clear superiority over the naive approach of training with contaminated data without refinement. Moreover, we compare it to the ideal, unrealistic reference in which anomaly-free data would be available for training. Since the approach exploits information from the anomalies, and not only from the normal regime, it is comparable to, and often outperforms, the ideal baseline as well.
URI: https://digitalcollection.zhaw.ch/handle/11475/30284
Fulltext version: Published version
License (according to publishing contract): CC BY 3.0: Attribution 3.0 Unported
Department: School of Engineering
Organisational Unit: Institute of Data Analysis and Process Design (IDP)
Appears in collections: Publikationen School of Engineering

Files in This Item:
File: 2024_Ulmer-etal_Generic-ML-framework-for-fully-unsupervised-AD.pdf
Size: 1.55 MB
Format: Adobe PDF
Ulmer, M., Zgraggen, J., & Goren Huber, L. (2024). A generic machine learning framework for fully-unsupervised anomaly detection with contaminated data. International Journal of Prognostics and Health Management, 15(1). https://doi.org/10.36001/ijphm.2024.v15i1.3589

