Title: Author profiling with bidirectional RNNs using attention with GRUs : notebook for PAN at CLEF 2017
Authors : Kodiyan, Don
Hardegger, Florin
Neuhaus, Stephan
Cieliebak, Mark
Proceedings: CLEF 2017 Evaluation Labs and Workshop – Working Notes Papers
Volume(Issue) : 1866
Conference details: CLEF 2017 Evaluation Labs and Workshop – Working Notes Papers, Dublin, Ireland, 11-14 September 2017
Publisher / Ed. Institution : RWTH Aachen
Issue Date: 2017
License (according to publishing contract) : Licence according to publishing contract
Type of review: Peer review (Publication)
Language : English
Subjects : Gender Classification; Author Profiling
Subject (DDC) : 004: Computer science
005: Computer programming, programs and data
Abstract: This paper describes our approach to the Author Profiling Shared Task at PAN 2017. The goal was to classify the gender and language variety of a Twitter user based solely on their tweets. Author profiling can be applied in fields such as marketing, security and forensics. Twitter already uses similar techniques to deliver personalized advertisements to its users. PAN 2017 provided a corpus for this purpose in four languages: English, Spanish, Portuguese and Arabic. To solve the problem we used a deep learning approach, a family of methods that has recently shown success in Natural Language Processing. Our submitted model consists of a bidirectional Recurrent Neural Network implemented with Gated Recurrent Units (GRUs) combined with an attention mechanism. We achieved an average accuracy over all languages of 75.31% in gender classification and 85.22% in language variety classification.
Department: School of Engineering
Organisational Unit: Institute of Applied Information Technology (InIT)
Publication type: Conference Paper
DOI : 10.21256/zhaw-1531
ISSN: 1613-0073
Appears in Collections: Publikationen School of Engineering

Files in This Item:
File: kodiyan17-notebook.pdf
Size: 361.5 kB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.