Leveraging large amounts of weakly supervised data for multi-language sentiment classification

Deriu, Jan Milan; Lucchi, Aurelien; De Luca, Valeria; Severyn, Aliaksei; Müller, Simone; Cieliebak, Mark; Hofmann, Thomas; Jaggi, Martin

doi:10.1145/3038912.3052611

Please use this identifier to cite or link to this resource: https://doi.org/10.21256/zhaw-1525

Publication type:	Conference paper
Review type:	Not specified
Title:	Leveraging large amounts of weakly supervised data for multi-language sentiment classification
Authors:	Deriu, Jan Milan; Lucchi, Aurelien; De Luca, Valeria; Severyn, Aliaksei; Müller, Simone; Cieliebak, Mark; Hofmann, Thomas; Jaggi, Martin
DOI:	10.1145/3038912.3052611; 10.21256/zhaw-1525
Proceedings:	Proceedings of the 26th International Conference on World Wide Web
Page(s):	1045
Pages to:	1052
Conference details:	26th International World Wide Web Conference Committee (IW3C2), Perth, Australia, 3-7 April 2017
Publication date:	2017
Publisher / Editing institution:	Association for Computing Machinery
ISBN:	9781450349130
Language:	English
Keywords:	Sentiment Analysis
Subject (DDC):	006: Special computer methods
Abstract:	This paper presents a novel approach for multi-lingual sentiment classification in short texts. This is a challenging task as the amount of training data in languages other than English is very limited. Previously proposed multi-lingual approaches typically require establishing a correspondence to English, for which powerful classifiers are already available. In contrast, our method does not require such supervision. We leverage large amounts of weakly-supervised data in various languages to train a multi-layer convolutional network and demonstrate the importance of pre-training such networks. We thoroughly evaluate our approach on various multi-lingual datasets, including the recent SemEval-2016 sentiment prediction benchmark (Task 4), where we achieved state-of-the-art performance. We also compare the performance of our model trained individually for each language to a variant trained for all languages at once. We show that the latter model reaches slightly worse – but still acceptable – performance when compared to the single language model, while benefiting from better generalization properties across languages.
URI:	https://digitalcollection.zhaw.ch/handle/11475/1851
Full text version:	Published version
License (according to publishing contract):	License according to publishing contract
Department:	School of Engineering
Organisational unit:	Institut für Informatik (InIT)
Published as part of the ZHAW project:	DeepText: Intelligente Textanalyse mit Deep Learning
Appears in collections:	Publikationen School of Engineering

Files in this item:

File	Description	Size	Format
p1045-deriu.pdf		3.78 MB	Adobe PDF

Deriu, J. M., Lucchi, A., De Luca, V., Severyn, A., Müller, S., Cieliebak, M., Hofmann, T., & Jaggi, M. (2017). Leveraging large amounts of weakly supervised data for multi-language sentiment classification [Conference paper]. Proceedings of the 26th International Conference on World Wide Web, 1045–1052. https://doi.org/10.1145/3038912.3052611

Deriu, J.M. et al. (2017) ‘Leveraging large amounts of weakly supervised data for multi-language sentiment classification’, in Proceedings of the 26th International Conference on World Wide Web. Association for Computing Machinery, pp. 1045–1052. Available at: https://doi.org/10.1145/3038912.3052611.

J. M. Deriu et al., “Leveraging large amounts of weakly supervised data for multi-language sentiment classification,” in Proceedings of the 26th International Conference on World Wide Web, 2017, pp. 1045–1052. doi: 10.1145/3038912.3052611.

DERIU, Jan Milan, Aurelien LUCCHI, Valeria DE LUCA, Aliaksei SEVERYN, Simone MÜLLER, Mark CIELIEBAK, Thomas HOFMANN and Martin JAGGI, 2017. Leveraging large amounts of weakly supervised data for multi-language sentiment classification. In: Proceedings of the 26th International Conference on World Wide Web. Conference paper. Association for Computing Machinery. 2017. pp. 1045–1052. ISBN 9781450349130

Deriu, Jan Milan, Aurelien Lucchi, Valeria De Luca, Aliaksei Severyn, Simone Müller, Mark Cieliebak, Thomas Hofmann, and Martin Jaggi. 2017. “Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification.” Conference paper. In Proceedings of the 26th International Conference on World Wide Web, 1045–52. Association for Computing Machinery. https://doi.org/10.1145/3038912.3052611.

Deriu, Jan Milan, et al. “Leveraging Large Amounts of Weakly Supervised Data for Multi-Language Sentiment Classification.” Proceedings of the 26th International Conference on World Wide Web, Association for Computing Machinery, 2017, pp. 1045–52, https://doi.org/10.1145/3038912.3052611.