|Publication type:||Article in scientific journal|
|Type of review:||Peer review (publication)|
|Title:||Audiomate : a Python package for working with audio datasets|
|Published in:||Journal of Open Source Software|
|Publisher / Ed. Institution:||Open Journals|
|Subject (DDC):||005: Computer programming, programs and data|
|Abstract:||Machine learning tasks in the audio domain frequently require large training datasets. In recent years, numerous datasets have been made available for various purposes, for example (Snyder, Chen, & Povey, 2015) and (Ardila et al., 2019). Unfortunately, most of these datasets are stored in widely differing formats. As a consequence, machine learning practitioners have to convert datasets into other formats before they can be used or combined. Furthermore, common tasks like reading, partitioning, or shuffling datasets have to be reimplemented for each format and require intimate knowledge of the formats. We propose Audiomate, a Python toolkit, to solve this problem. Audiomate provides a uniform programming interface for working with numerous datasets. Knowledge of the structure or on-disk format of the datasets is not necessary. Audiomate facilitates and simplifies a wide range of tasks: • Reading and writing numerous dataset formats through a uniform programming interface, for example (Snyder et al., 2015), (Panayotov, Chen, Povey, & Khudanpur, 2015) and (Ardila et al., 2019) • Accessing metadata, such as speaker information and labels • Reading audio data (single files, batches of files) • Retrieving information about the data (e.g., number of speakers, total duration) • Merging multiple datasets (e.g., combining two speech datasets) • Splitting data into smaller subsets (e.g., creating training, validation, and test sets with a reasonable distribution of classes) • Validating data against specific requirements (e.g., checking whether all samples were assigned a label)|
|Fulltext version:||Published version|
|License (according to publishing contract):||CC BY 4.0: Attribution 4.0 International|
|Departement:||School of Engineering|
|Organisational Unit:||Institute of Applied Information Technology (InIT)|
|Appears in collections:||Publikationen School of Engineering|
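The idea in the abstract (format-specific readers feeding one uniform in-memory representation, on which merging, splitting, and validation are written once) can be illustrated with a minimal sketch. This is not Audiomate's API: the names `Utterance`, `Corpus`, `read_tsv`, `read_scp`, `merge`, and `split`, as well as both input formats, are hypothetical and chosen only for illustration. The split shown is a plain seeded random split and does not balance classes.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Utterance:
    """One audio sample with optional metadata (hypothetical structure)."""
    idx: str
    path: str
    speaker: Optional[str] = None
    label: Optional[str] = None
    duration: float = 0.0  # seconds


@dataclass
class Corpus:
    """Format-independent container; every reader returns one of these."""
    utterances: Dict[str, Utterance] = field(default_factory=dict)

    def add(self, utt: Utterance) -> None:
        self.utterances[utt.idx] = utt

    @property
    def total_duration(self) -> float:
        return sum(u.duration for u in self.utterances.values())

    @property
    def num_speakers(self) -> int:
        return len({u.speaker for u in self.utterances.values() if u.speaker})

    def unlabeled(self) -> List[str]:
        """Validation helper: ids of utterances without a label."""
        return sorted(i for i, u in self.utterances.items() if u.label is None)


def read_tsv(lines: Iterable[str]) -> Corpus:
    """Hypothetical format: id<TAB>path<TAB>speaker<TAB>label<TAB>duration."""
    corpus = Corpus()
    for line in lines:
        idx, path, speaker, label, duration = line.rstrip("\n").split("\t")
        corpus.add(Utterance(idx, path, speaker, label, float(duration)))
    return corpus


def read_scp(lines: Iterable[str]) -> Corpus:
    """Hypothetical format: 'id path' per line, no further metadata."""
    corpus = Corpus()
    for line in lines:
        idx, path = line.split(maxsplit=1)
        corpus.add(Utterance(idx, path.strip()))
    return corpus


def merge(corpora: List[Corpus]) -> Corpus:
    """Combine corpora, prefixing ids with the corpus index to avoid clashes."""
    merged = Corpus()
    for n, corpus in enumerate(corpora):
        for u in corpus.utterances.values():
            merged.add(Utterance(f"{n}-{u.idx}", u.path, u.speaker, u.label, u.duration))
    return merged


def split(corpus: Corpus, proportions: Dict[str, float], seed: int = 0) -> Dict[str, List[str]]:
    """Seeded random split of utterance ids; the last subset takes the remainder."""
    ids = sorted(corpus.utterances)
    random.Random(seed).shuffle(ids)
    total = sum(proportions.values())
    names = list(proportions)
    subsets: Dict[str, List[str]] = {}
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = len(ids)
        else:
            end = start + round(len(ids) * proportions[name] / total)
        subsets[name] = ids[start:end]
        start = end
    return subsets
```

Once both readers return the same `Corpus` type, a call such as `merge([read_tsv(a_lines), read_scp(b_lines)])` followed by `split(...)` works without the caller knowing either on-disk layout, which is the property the abstract describes.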
Files in This Item:
|2021_Buechi-Ahlenstorf_audiomate-Python-package.pdf||165.98 kB||Adobe PDF|
Büchi, M., & Ahlenstorf, A. (2020). Audiomate : a Python package for working with audio datasets. Journal of Open Source Software, 5(52), 2135. https://doi.org/10.21105/joss.02135
Büchi, M. and Ahlenstorf, A. (2020) ‘Audiomate : a Python package for working with audio datasets’, Journal of Open Source Software, 5(52), p. 2135. Available at: https://doi.org/10.21105/joss.02135.
M. Büchi and A. Ahlenstorf, “Audiomate : a Python package for working with audio datasets,” Journal of Open Source Software, vol. 5, no. 52, p. 2135, 2020, doi: 10.21105/joss.02135.
Büchi, Matthias, and Andreas Ahlenstorf. “Audiomate : A Python Package for Working with Audio Datasets.” Journal of Open Source Software, vol. 5, no. 52, 2020, p. 2135, https://doi.org/10.21105/joss.02135.