Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-3761
Title: Speaker identification and clustering using convolutional neural networks
Authors : Lukic, Yanick
Vogt, Carlo
Dürr, Oliver
Stadelmann, Thilo
Proceedings: 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)
Conference details: 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), Vietri sul Mare, Italy, 13-16 Sept. 2016
Publisher / Ed. Institution : IEEE
Issue Date: Sep-2016
License (according to publishing contract) : Licence according to publishing contract
Type of review: Peer review (Publication)
Language : English
Subjects : Datalab; Speaker identification; Speaker clustering; Deep learning
Subject (DDC) : 004: Computer science
Abstract: Deep learning, especially in the form of convolutional neural networks (CNNs), has triggered substantial improvements in computer vision and related fields in recent years. This progress is attributed to the shift from designing features and subsequent individual sub-systems towards learning features and recognition systems end to end from nearly unprocessed data. For speaker clustering, however, it is still common to use handcrafted processing chains such as MFCC features and GMM-based models. In this paper, we use simple spectrograms as input to a CNN and study the optimal design of those networks for speaker identification and clustering. Furthermore, we elaborate on the question of how to transfer a network trained for speaker identification to speaker clustering. We demonstrate our approach on the well-known TIMIT dataset, achieving results comparable with the state of the art, without the need for handcrafted features.
Department: School of Engineering
Organisational Unit: Institute of Applied Information Technology (InIT)
Institute of Data Analysis and Process Design (IDP)
Publication type: Conference Paper
DOI : 10.1109/MLSP.2016.7738816
10.21256/zhaw-3761
ISBN: 978-1-5090-0746-2
URI: https://digitalcollection.zhaw.ch/handle/11475/7087
Other identifiers : INSPEC Accession Number: 16449884
Appears in Collections: Publikationen School of Engineering

Files in This Item:
File: MLSP_2016.pdf
Size: 897.9 kB
Format: Adobe PDF

