Please use this identifier to cite or link to this item:
Publication type: Article in scientific journal
Type of review: Peer review (publication)
Title: Extraction of transforming sequences and sentence histories from writing process data : a first step towards linguistic modeling of writing
Authors: Mahlow, Cerstin
Ulasik, Malgorzata Anna
Tuggener, Don
et al.: No
DOI: 10.1007/s11145-021-10234-6
Published in: Reading and Writing
Issue Date: 1-Jan-2022
Publisher / Ed. Institution: Springer
ISSN: 0922-4777
Language: English
Subjects: Writing process; Keystroke-logging; Transforming sequence; Text history; Sentence history; Written text production; Linguistic modeling
Subject (DDC): 808: Rhetoric and writing
Abstract: Producing written texts is a non-linear process: in contrast to speech, writers are free to change already written text at any place at any point in time. Linguistic considerations are likely to play an important role, but so far, no linguistic models of the writing process exist. We present an approach for the analysis of writing processes with a focus on linguistic structures based on the novel concepts of transforming sequences, text history, and sentence history. The processing of raw keystroke logging data and the application of natural language processing tools allows for the extraction and filtering of product and process data to be stored in a hierarchical data structure. This structure is used to re-create and visualize the genesis and history for a text and its individual sentences. Focusing on sentences as primary building blocks of written language and full texts, we aim to complement established writing process analyses and, ultimately, to interpret writing timecourse data with respect to linguistic structures. To enable researchers to explore this view, we provide a fully functional implementation of our approach as an open-source software tool and visualizations of the results. We report on a small-scale exploratory study in German where we used our tool. The results indicate both the feasibility of the approach and that writers actually revise on a linguistic level. The latter confirms the need for modeling written text production from the perspective of linguistic structures beyond the word level.
Further description: Online first; part of the special issue "Methods for understanding writing process by analysis of writing timecourse". Acquired under the Swiss National Licences (
Fulltext version: Published version
License (according to publishing contract): CC BY 4.0: Attribution 4.0 International
Department: Applied Linguistics
School of Engineering
Organisational Unit: Centre for Artificial Intelligence (CAI)
Institute of Language Competence (ILC)
Appears in collections: Publikationen Angewandte Linguistik

Files in This Item:
File: 2022_Mahlow-etal_Transforming-sequence-sentence-histories-extraction.pdf
Size: 4.5 MB
Format: Adobe PDF
