Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-3761
Full metadata record
DC Field | Value | Language
dc.contributor.author | Lukic, Yanick | -
dc.contributor.author | Vogt, Carlo | -
dc.contributor.author | Dürr, Oliver | -
dc.contributor.author | Stadelmann, Thilo | -
dc.date.accessioned | 2018-06-19T12:32:11Z | -
dc.date.available | 2018-06-19T12:32:11Z | -
dc.date.issued | 2016 | -
dc.identifier.isbn | 978-1-5090-0746-2 | de_CH
dc.identifier.other | INSPEC Accession Number: 16449884 | de_CH
dc.identifier.uri | https://digitalcollection.zhaw.ch/handle/11475/7087 | -
dc.description.abstract | Deep learning, especially in the form of convolutional neural networks (CNNs), has triggered substantial improvements in computer vision and related fields in recent years. This progress is attributed to the shift from designing features and subsequent individual sub-systems towards learning features and recognition systems end to end from nearly unprocessed data. For speaker clustering, however, it is still common to use handcrafted processing chains such as MFCC features and GMM-based models. In this paper, we use simple spectrograms as input to a CNN and study the optimal design of those networks for speaker identification and clustering. Furthermore, we elaborate on the question of how to transfer a network, trained for speaker identification, to speaker clustering. We demonstrate our approach on the well-known TIMIT dataset, achieving results comparable with the state of the art – without the need for handcrafted features. | de_CH
dc.language.iso | en | de_CH
dc.publisher | IEEE | de_CH
dc.rights | Licence according to publishing contract | de_CH
dc.subject | Datalab | de_CH
dc.subject | Speaker identification | de_CH
dc.subject | Speaker clustering | de_CH
dc.subject | Deep learning | de_CH
dc.subject.ddc | 006: Special computer methods | de_CH
dc.title | Speaker identification and clustering using convolutional neural networks | de_CH
dc.type | Conference: Paper | de_CH
dcterms.type | Text | de_CH
zhaw.departement | School of Engineering | de_CH
zhaw.organisationalunit | Institut für Informatik (InIT) | de_CH
zhaw.organisationalunit | Institut für Datenanalyse und Prozessdesign (IDP) | de_CH
dc.identifier.doi | 10.21256/zhaw-3761 | -
dc.identifier.doi | 10.1109/MLSP.2016.7738816 | de_CH
zhaw.conference.details | 26th IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2016), Vietri sul Mare, Italy, 13-16 Sept. 2016 | de_CH
zhaw.funding.eu | No | de_CH
zhaw.originated.zhaw | Yes | de_CH
zhaw.publication.status | acceptedVersion | de_CH
zhaw.publication.review | Peer review (publication) | de_CH
zhaw.title.proceedings | 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP) | de_CH
zhaw.webfeed | Datalab | de_CH
zhaw.webfeed | Information Engineering | de_CH
zhaw.webfeed | Machine Perception and Cognition | de_CH
Appears in collections: Publikationen School of Engineering

Files in This Item:
File | Description | Size | Format
MLSP_2016.pdf | | 897.9 kB | Adobe PDF
Lukic, Y., Vogt, C., Dürr, O., & Stadelmann, T. (2016). Speaker identification and clustering using convolutional neural networks. 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP). https://doi.org/10.21256/zhaw-3761
Lukic, Y. et al. (2016) ‘Speaker identification and clustering using convolutional neural networks’, in 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE. Available at: https://doi.org/10.21256/zhaw-3761.
Y. Lukic, C. Vogt, O. Dürr, and T. Stadelmann, “Speaker identification and clustering using convolutional neural networks,” in 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), 2016. doi: 10.21256/zhaw-3761.
LUKIC, Yanick, Carlo VOGT, Oliver DÜRR and Thilo STADELMANN, 2016. Speaker identification and clustering using convolutional neural networks. In: 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP). Conference paper. IEEE. 2016. ISBN 978-1-5090-0746-2
Lukic, Yanick, Carlo Vogt, Oliver Dürr, and Thilo Stadelmann. 2016. “Speaker Identification and Clustering Using Convolutional Neural Networks.” Conference paper. In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE. https://doi.org/10.21256/zhaw-3761.
Lukic, Yanick, et al. “Speaker Identification and Clustering Using Convolutional Neural Networks.” 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), IEEE, 2016, https://doi.org/10.21256/zhaw-3761.

