Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-30250
Publication type: Conference: Paper
Type of review: Peer review (publication)
Title: Text-to-speech pipeline for Swiss German : a comparison
Authors: Bollinger, Tobias
Deriu, Jan Milan
Vogel, Manfred
et al.: No
DOI: 10.48550/arXiv.2305.19750
10.21256/zhaw-30250
Conference details: 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023
Publication date: Jun-2023
Publisher / issuing institution: arXiv
Language: English
Keywords: Speech synthesis; Text to speech
Subject (DDC): 410.285: Computational linguistics
430: German
Abstract: In this work, we studied the synthesis of Swiss German speech using different Text-to-Speech (TTS) models. We evaluated the TTS models on three corpora and found that VITS models performed best; hence, we used them for further testing. We also introduce a new method to evaluate TTS models by letting the discriminator of a trained vocoder GAN model predict whether a given waveform is human or synthesized. In summary, our best model delivers speech synthesis for different Swiss German dialects with previously unachieved quality.
URI: https://digitalcollection.zhaw.ch/handle/11475/30250
Full-text version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: School of Engineering
Organisational unit: Centre for Artificial Intelligence (CAI)
Published as part of the ZHAW project: End-to-End Low-Resource Speech Translation for Swiss German Dialects
Appears in collections: Publikationen School of Engineering

Files in this resource:
File: 2023_Bollinger-etal_Text-to-speech-pipeline-for-Swiss-German.pdf
Size: 1.03 MB
Format: Adobe PDF
Bollinger, T., Deriu, J. M., & Vogel, M. (2023, June). Text-to-speech pipeline for Swiss German : a comparison. 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023. https://doi.org/10.48550/arXiv.2305.19750
Bollinger, T., Deriu, J.M. and Vogel, M. (2023) ‘Text-to-speech pipeline for Swiss German : a comparison’, in 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023. arXiv. Available at: https://doi.org/10.48550/arXiv.2305.19750.
T. Bollinger, J. M. Deriu, and M. Vogel, “Text-to-speech pipeline for Swiss German : a comparison,” in 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023, Jun. 2023. doi: 10.48550/arXiv.2305.19750.
BOLLINGER, Tobias, Jan Milan DERIU and Manfred VOGEL, 2023. Text-to-speech pipeline for Swiss German : a comparison. In: 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023. Conference paper. arXiv. June 2023
Bollinger, Tobias, Jan Milan Deriu, and Manfred Vogel. 2023. “Text-to-Speech Pipeline for Swiss German : A Comparison.” Conference paper. In 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023. arXiv. https://doi.org/10.48550/arXiv.2305.19750.
Bollinger, Tobias, et al. “Text-to-Speech Pipeline for Swiss German : A Comparison.” 8th Swiss Text Analytics Conference – SwissText 2023, Neuchâtel, Switzerland, 12-14 June 2023, arXiv, 2023, https://doi.org/10.48550/arXiv.2305.19750.


All resources in this repository are protected by copyright unless otherwise indicated.