Evaluating audiovisual source separation in the context of video conferencing

Inan, Berkay; Cernak, Milos; Grabner, Helmut; Tukuljac, Helena Peic; Pena, Rodrigo C. G.; Ricaud, Benjamin

doi:10.21437/Interspeech.2019-2671

Publication type:	Conference paper
Review type:	Peer review (publication)
Title:	Evaluating audiovisual source separation in the context of video conferencing
Authors:	Inan, Berkay; Cernak, Milos; Grabner, Helmut; Tukuljac, Helena Peic; Pena, Rodrigo C. G.; Ricaud, Benjamin
et al.:	No
DOI:	10.21437/Interspeech.2019-2671
Proceedings:	Proceedings Interspeech 2019
First page:	4579
Last page:	4583
Conference details:	Interspeech 2019, Graz, Austria, 15-19 September 2019
Date of publication:	2019
Publisher / editing institution:	International Speech Communication Association (ISCA)
Language:	English
Keywords:	Speech enhancement; Source separation; Multi-modal; Audiovisual
Subject (DDC):	621.3: Electrical, communications and control engineering
Abstract:	Source separation involving mono-channel audio is a challenging problem, in particular for speech separation where source contributions overlap both in time and frequency. This task is of high interest for applications such as video conferencing. Recent progress in machine learning has shown that the combination of visual cues, coming from the video, can increase the source separation performance. Starting from a recently designed deep neural network, we assess its ability and robustness to separate the visible speakers’ speech from other interfering speeches or signals. We test it for different configurations of video recordings where the speaker’s face may not be fully visible. We also assess the performance of the network with respect to different sets of visual features from the speakers’ faces.
URI:	https://digitalcollection.zhaw.ch/handle/11475/18478
Full-text version:	Published version
License (according to publishing contract):	License according to publishing contract
Department:	School of Engineering
Organisational unit:	Institut für Datenanalyse und Prozessdesign (IDP)
Appears in collections:	Publications School of Engineering

Files in this resource:

There are no files associated with this resource.

Inan, B., Cernak, M., Grabner, H., Tukuljac, H. P., Pena, R. C. G., & Ricaud, B. (2019). Evaluating audiovisual source separation in the context of video conferencing [Conference paper]. Proceedings Interspeech 2019, 4579–4583. https://doi.org/10.21437/Interspeech.2019-2671

Inan, B. et al. (2019) ‘Evaluating audiovisual source separation in the context of video conferencing’, in Proceedings Interspeech 2019. International Speech Communication Association (ISCA), pp. 4579–4583. Available at: https://doi.org/10.21437/Interspeech.2019-2671.

B. Inan, M. Cernak, H. Grabner, H. P. Tukuljac, R. C. G. Pena, and B. Ricaud, “Evaluating audiovisual source separation in the context of video conferencing,” in Proceedings Interspeech 2019, 2019, pp. 4579–4583. doi: 10.21437/Interspeech.2019-2671.

INAN, Berkay, Milos CERNAK, Helmut GRABNER, Helena Peic TUKULJAC, Rodrigo C. G. PENA and Benjamin RICAUD, 2019. Evaluating audiovisual source separation in the context of video conferencing. In: Proceedings Interspeech 2019. Conference paper. International Speech Communication Association (ISCA). 2019. pp. 4579–4583.

Inan, Berkay, Milos Cernak, Helmut Grabner, Helena Peic Tukuljac, Rodrigo C. G. Pena, and Benjamin Ricaud. 2019. “Evaluating Audiovisual Source Separation in the Context of Video Conferencing.” Conference paper. In Proceedings Interspeech 2019, 4579–83. International Speech Communication Association (ISCA). https://doi.org/10.21437/Interspeech.2019-2671.

Inan, Berkay, et al. “Evaluating Audiovisual Source Separation in the Context of Video Conferencing.” Proceedings Interspeech 2019, International Speech Communication Association (ISCA), 2019, pp. 4579–83, https://doi.org/10.21437/Interspeech.2019-2671.

All resources in this repository are protected by copyright unless otherwise indicated.