|Publication type:||Conference other|
|Type of review:||Peer review (abstract)|
|Title:||Predicting CEFR levels of student essays in placement tests using an automated essay scoring tool in R: a corpus-based approach|
|Conference details:||8th International Conference on Writing Analytics, Zürich, 5-6 September 2019|
|Subjects:||Automated essay scoring; Placement testing for writing; Computerized assessment; EFL writing assessment|
|Subject (DDC):||410.285: Computational linguistics; 808: Rhetoric and writing|
|Abstract:||There is growing interest in automated essay scoring (AES) and in the individual measures that can be automatically calculated and used in AES, including readability, lexical diversity and complexity indices. These are particularly attractive for second-language learning, given their potential to assist teachers in writing-skills assessment and diagnostics. Although relatively little has been done specifically with regard to AES and CEFR-level prediction (Yannakoudakis et al. 2018: 253), some CEFR-based AES tools are currently available, either as standalone tools (e.g., Cambridge English's writeandimprove.com) or as part of comprehensive placement tests (e.g., the Pearson English Placement test). These tools, however, may not adequately meet an institution's need for the quick rating of large numbers of texts. Free tools require manual entry of individual texts and are thus inefficient to use. Bulk grading is possible with subscription-based tools, and full placement tests provide automatic assessment, but both options may be cost-prohibitive. This paper describes the design, results and further development of an AES tool and CEFR-level prediction algorithm created and experimentally implemented at a major University of Applied Sciences in Switzerland as part of an online English placement test. In line with current research, the algorithm was developed using a prediction-accuracy, pseudo-black-box approach (see Vanhove et al. 2019, Yannakoudakis 2013) on a small training corpus of texts with known CEFR levels (N = 50). Written and run entirely in an R environment (R Core Team 2017), with the koRpus package (Michalke 2018) as its workhorse, the tool can be integrated with other advanced text analyses available in R. Because it handles bulk grading of large numbers of texts, it is well suited to placement testing; the algorithm is also efficient, calculating the CEFR levels of 400 student essays in 15 minutes of runtime.
While gold-standard (human) validation evidence is still required, external validation checks support the accuracy of the tool: for a selection of texts with official CEFR levels, the tool's predictions were closer to those levels than the predictions produced by other online AES systems. Further research perspectives and dissemination in the form of a Shiny web app and an R package will also be discussed.|
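The pipeline sketched in the abstract (koRpus-based feature extraction on a small training corpus with known CEFR levels, followed by a pseudo-black-box prediction step) might look roughly as follows in R. This is a hedged illustration, not the authors' actual code: the koRpus calls (`tokenize`, `readability`, `lex.div`) are real package functions, but the file paths, the choice of features, and the use of ordered logistic regression (`MASS::polr`) as the prediction model are assumptions made for the sketch.

```r
## Minimal sketch (not the authors' algorithm): koRpus feature extraction
## plus an illustrative ordinal model for CEFR-level prediction.
library(koRpus)
library(koRpus.lang.en)   # English language support for koRpus
library(MASS)             # polr() for ordered logistic regression

# One row of readability / lexical-diversity features per essay.
# Slot access (rd@Flesch$RE etc.) follows the koRpus result objects.
essay_features <- function(path) {
  tagged <- tokenize(path, lang = "en")
  rd <- readability(tagged, index = c("Flesch", "FOG"), quiet = TRUE)
  ld <- lex.div(tagged, measure = c("MTLD", "MATTR"), quiet = TRUE)
  data.frame(flesch = rd@Flesch$RE,    # Flesch Reading Ease
             fog    = rd@FOG$FOG,      # Gunning FOG index
             mtld   = ld@MTLD$MTLD)    # lexical diversity (MTLD)
}

# Hypothetical training corpus: essay files plus human-assigned CEFR levels
train <- do.call(rbind, lapply(training_paths, essay_features))
train$cefr <- factor(known_levels, ordered = TRUE,
                     levels = c("A1", "A2", "B1", "B2", "C1", "C2"))

# One plausible "pseudo-black-box" predictor: ordered logistic regression
fit <- polr(cefr ~ flesch + fog + mtld, data = train, Hess = TRUE)

# Bulk grading: predict a CEFR level for every essay in a directory
new_feats <- do.call(rbind,
                     lapply(list.files("essays/", full.names = TRUE),
                            essay_features))
predict(fit, newdata = new_feats)
```

Because feature extraction is a plain R function over file paths, bulk grading reduces to an `lapply` over a directory, which is consistent with the runtime figure reported in the abstract (400 essays in roughly 15 minutes).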
|Fulltext version:||Published version|
|License (according to publishing contract):||Licence according to publishing contract|
|Organisational Unit:||Institute of Language Competence (ILC)|
|Appears in collections:||Publikationen Angewandte Linguistik|
Files in This Item:
There are no files associated with this item.