Audiomate : a Python package for working with audio datasets

Büchi, Matthias; Ahlenstorf, Andreas

doi:10.21105/joss.02135

Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-22925

Publication type:	Article in scientific journal
Type of review:	Peer review (publication)
Title:	Audiomate : a Python package for working with audio datasets
Authors:	Büchi, Matthias; Ahlenstorf, Andreas
et al.:	No
DOI:	10.21105/joss.02135; 10.21256/zhaw-22925
Published in:	Journal of Open Source Software
Volume:	5
Issue:	52
Page(s):	2135
Issue Date:	2020
Publisher / Ed. Institution:	Open Journals
ISSN:	2475-9066
Language:	English
Subject (DDC):	005: Computer programming, programs and data
Abstract:	Machine learning tasks in the audio domain frequently require large training datasets. In recent years, numerous datasets have been made available for various purposes, for example, (Snyder, Chen, & Povey, 2015) and (Ardila et al., 2019). Unfortunately, most of these datasets are stored in widely differing formats. As a consequence, machine learning practitioners have to convert datasets into other formats before they can be used or combined. Furthermore, common tasks like reading, partitioning, or shuffling datasets have to be implemented over and over again for each format and require intimate knowledge of the formats. We propose Audiomate, a Python toolkit, to solve this problem. Audiomate provides a uniform programming interface for working with numerous datasets. Knowledge about the structure or on-disk format of the datasets is not necessary. Audiomate facilitates and simplifies a wide range of tasks:
• Reading and writing numerous dataset formats using a uniform programming interface, for example (Snyder et al., 2015), (Panayotov, Chen, Povey, & Khudanpur, 2015) and (Ardila et al., 2019).
• Accessing metadata, such as speaker information and labels.
• Reading audio data (single files, batches of files).
• Retrieving information about the data (e.g., number of speakers, total duration).
• Merging multiple datasets (e.g., combining two speech datasets).
• Splitting data into smaller subsets (e.g., creating training, validation, and test sets with a reasonable distribution of classes).
• Validating data against specific requirements (e.g., checking whether all samples were assigned a label).
URI:	https://digitalcollection.zhaw.ch/handle/11475/22925
Fulltext version:	Published version
License (according to publishing contract):	CC BY 4.0: Attribution 4.0 International
Departement:	School of Engineering
Organisational Unit:	Institute of Computer Science (InIT)
Appears in collections:	Publikationen School of Engineering

Files in This Item:

File	Description	Size	Format
2021_Buechi-Ahlenstorf_audiomate-Python-package.pdf		165.98 kB	Adobe PDF

Büchi, M., & Ahlenstorf, A. (2020). Audiomate : a Python package for working with audio datasets. Journal of Open Source Software, 5(52), 2135. https://doi.org/10.21105/joss.02135

Büchi, M. and Ahlenstorf, A. (2020) ‘Audiomate : a Python package for working with audio datasets’, Journal of Open Source Software, 5(52), p. 2135. Available at: https://doi.org/10.21105/joss.02135.

M. Büchi and A. Ahlenstorf, “Audiomate : a Python package for working with audio datasets,” Journal of Open Source Software, vol. 5, no. 52, p. 2135, 2020, doi: 10.21105/joss.02135.

BÜCHI, Matthias und Andreas AHLENSTORF, 2020. Audiomate : a Python package for working with audio datasets. Journal of Open Source Software. 2020. Bd. 5, Nr. 52, S. 2135. DOI 10.21105/joss.02135

Büchi, Matthias, and Andreas Ahlenstorf. 2020. “Audiomate : A Python Package for Working with Audio Datasets.” Journal of Open Source Software 5 (52): 2135. https://doi.org/10.21105/joss.02135.

Büchi, Matthias, and Andreas Ahlenstorf. “Audiomate : A Python Package for Working with Audio Datasets.” Journal of Open Source Software, vol. 5, no. 52, 2020, p. 2135, https://doi.org/10.21105/joss.02135.