Writing Research
Shedding light on the cognitive processes that underlie writing.
Envisioning the future of learning technologies.
Writing effectively is crucial for success in education and the workplace, but this skill is hard to master. Students who write in a non-native language and those who have dyslexia (the most common learning disability) are especially challenged.
My research on writing (the CyWrite Project) seeks (1) to improve our understanding of cognitive processes that underlie writing, and (2) to envision the next generation of learning technology that can help students become more proficient writers.
Learn about CyWrite in this one-minute video.
Research Findings
Automated writing evaluation (AWE) supports learning by providing precisely targeted formative feedback. However, research into the use of existing AWE systems reveals issues with the performance of automatic error detection and with the way feedback is worded and presented to the learner (see paper [1]).

To improve the accuracy and usefulness of AWE feedback, I began developing my own AWE system, CyWrite, that would serve as a platform for experiments and could be easily modified for research or practice. Pilot work on this system was funded by ISU's College of Liberal Arts and Sciences through a competitive intramural grant (Signature Research Initiative, 2013–2016, $300K, PI Volker Hegelheimer). My role on this grant was that of a co-investigator leading the development of the system.
The CyWrite Project team (2014). Left to right: Joe Geluso, Dr. Volker Hegelheimer, Aysel Saricaoglu, Dr. Evgeny Chukharev-Hudilainen, Hui-Hsien Feng, Dr. Carol Chapelle, Dr. Jim Ranalli
First, my students and I looked for robust ways of extracting linguistic features from moderately ungrammatical texts produced by English language learners, aiming to give learners formative feedback that was accurate, interpretable, and actionable. We found that statistical parsers (i.e., software that automatically identifies the syntactic structure of a sentence based on a manually annotated training corpus), such as Stanford CoreNLP, could yield syntactic information usable for detecting certain types of grammatical errors with mal-rules, i.e., explicit, formal, computer-readable descriptions of ungrammatical structures that combine constituency, dependency, and surface text patterns. In currently available formalisms (e.g., TGrep), however, the complexity of these mal-rules was prohibitively high.

I therefore developed a simple but scalable declarative formalism for writing such rules. The formalism trans-compiles into Prolog programs that operate on syntactic trees, part-of-speech tags, syntactic dependency information, and surface text. It turned out to be easy for students in applied linguistics to use, and was adopted by five PhD students for their dissertation research. A prototype error detection system presented in paper [2] outperforms existing AWE tools in the detection of certain error types.
2. Feng, H.-H.*, Saricaoglu, A.*, & Chukharev-Hudilainen, E. (2016). Automated error detection for developing grammar proficiency of ESL learners. CALICO Journal, 33(1), 49–70.
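For illustration only, here is a minimal Python sketch of the kind of dependency-based mal-rule described above. It uses the spaCy parser as a stand-in for Stanford CoreNLP and encodes a single toy rule (a third-person-singular verb with a plural nominal subject); CyWrite's actual formalism and its Prolog back end are not reproduced here.

# A toy illustration of a dependency-based mal-rule, using spaCy as a
# stand-in parser. Not CyWrite's actual rule syntax.
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline: tagger + parser

def flag_agreement_errors(text):
    """Flag verbs tagged third-person singular whose subject is a plural noun."""
    doc = nlp(text)
    flags = []
    for token in doc:
        if token.tag_ == "VBZ":  # e.g., "writes", "has"
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            if any(s.tag_ == "NNS" for s in subjects):
                flags.append((token.text, [s.text for s in subjects]))
    return flags

print(flag_agreement_errors("The students writes essays every week."))
# Expected output, if the parser analyzes the sentence as intended:
# [('writes', ['students'])]

Real mal-rules additionally combine constituency and surface-text constraints with dependency information, which is what made a dedicated formalism necessary.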
Using the same formalism, my student Aysel Saricaoglu and I developed a system that captures linguistic expressions of cause and effect in essays written by college students. Drawing on the framework of Systemic Functional Linguistics, the system measures the sophistication of the students' causal discourse and provides immediate formative feedback. The development and evaluation of the system are presented in paper [3]. The system's pedagogical effectiveness was empirically demonstrated in Saricaoglu's doctoral dissertation (2015).
3. Chukharev-Hudilainen, E., & Saricaoglu, A.* (2014/2016). Causal discourse analyzer: Improving automated feedback on academic ESL writing. Computer Assisted Language Learning, 29(3), 494–516.
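As a rough, purely illustrative sketch of the surface end of such analysis, the snippet below counts occurrences of a small, assumed inventory of causal markers in an essay. The marker list and categories are assumptions made for the example; the actual analyzer's categories are grounded in Systemic Functional Linguistics and are not reproduced here.

# A rough sketch of surface-level causal-marker counting. The marker
# inventory and its categories are illustrative assumptions only.
import re
from collections import Counter

CAUSAL_MARKERS = {
    "conjunction": ["because", "since", "so that"],
    "connective": ["therefore", "as a result", "consequently", "thus"],
    "verb": ["cause", "causes", "lead to", "leads to", "result in", "results in"],
}

def count_causal_markers(essay):
    """Count causal markers by (assumed) category in a student essay."""
    text = essay.lower()
    counts = Counter()
    for category, markers in CAUSAL_MARKERS.items():
        for marker in markers:
            counts[category] += len(re.findall(r"\b" + re.escape(marker) + r"\b", text))
    return counts

print(count_causal_markers(
    "Pollution causes health problems. As a result, many cities restrict traffic."))
# Counts one 'verb' marker ("causes") and one 'connective' ("as a result").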
A more fundamental limitation of existing AWE systems is that their feedback, like the feedback conventionally provided by human instructors, is based only on the products of writing (i.e., the texts that students submit). Feedback on final texts is blind to accurate but disfluent text production. It provides no learning opportunity where, for example, a student struggles for a long time with a particular grammatical construction but eventually produces a well-formed sentence, perhaps by avoiding the attempted construction altogether. Disfluent processing of grammar demands cognitive resources that could otherwise be directed to higher-level aspects of writing, and as a result the writer is likely to produce text with poorer content, structure, and argumentation. A writer who stops in the middle of a sentence to worry about spelling or making verbs agree with their subjects may, in a literal sense, forget what they were going to say next. Disfluent production of specific constructions indicates incomplete learning and signals the need for intervention.

A main focus of my research has therefore been to develop a system that will provide process-based feedback to learners. Such a system requires an understanding of how the cognitive processes that occur in the writer's mind manifest in observable behaviors that can be captured automatically. Pauses in writing, i.e., points where the writer "stops and thinks", provide valuable insights into the nature of the underlying cognitive processing. Information about where a writer pauses can be obtained by logging the writer's keystrokes and recording the time of each keypress with millisecond accuracy.

Disfluencies in a writer's production can then be identified as inter-key intervals that exceed a particular threshold. This threshold, however, is often selected arbitrarily.
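For concreteness, a minimal sketch of this thresholding approach is shown below; the log format and the 2-second cutoff are illustrative assumptions (and, as argued next, any fixed cutoff is ultimately arbitrary).

# A minimal sketch of threshold-based pause detection in a keystroke log.
# The log format (list of (timestamp_ms, key) tuples) and the 2000 ms
# threshold are illustrative assumptions.

def find_pauses(keylog, threshold_ms=2000):
    """Return (position, interval) pairs where the inter-key interval
    exceeds the threshold."""
    pauses = []
    for i in range(1, len(keylog)):
        interval = keylog[i][0] - keylog[i - 1][0]
        if interval > threshold_ms:
            pauses.append((i, interval))
    return pauses

keylog = [(0, "T"), (150, "h"), (300, "e"), (2900, " "), (3050, "c")]
print(find_pauses(keylog))  # [(3, 2600)]: a 2.6 s pause before the space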
In paper [4], which won the 2016 John R. Hayes Award for Excellence in Writing Research, I argue that an empirically justified threshold for the identification of pauses will vary as a function of the student's typing skill. I propose to model inter-key intervals with a combined normal and exponential (ex-Gaussian) distribution, which typically provides an excellent fit to keystroke data. I argue that the parameters of the ex-Gaussian distribution may, with caution, also be interpreted psycholinguistically.
4. Chukharev-Hudilainen, E. (2014). Pauses in spontaneous written communication: A keystroke logging study. Journal of Writing Research, 6(1), 61–84.
John R. Hayes Award for Excellence in Writing Research, 2016. Photo by Carol Chapelle
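As a sketch of what such modeling can look like in practice, the snippet below fits an ex-Gaussian to a vector of simulated inter-key intervals using SciPy's exponnorm distribution (which parameterizes the ex-Gaussian with a shape parameter K = tau/sigma). The data and the estimation procedure are illustrative, not those of the study.

# A sketch of fitting an ex-Gaussian (exponentially modified normal)
# distribution to inter-key intervals with SciPy. The data are simulated.
import numpy as np
from scipy.stats import exponnorm

rng = np.random.default_rng(0)
# Simulated inter-key intervals (ms): a Gaussian "motor" component plus an
# exponential tail that produces occasional long pauses.
ikis = rng.normal(loc=180, scale=40, size=2000) + rng.exponential(scale=120, size=2000)

# SciPy's exponnorm(K, loc, scale) corresponds to mu = loc, sigma = scale,
# and tau = K * sigma (the mean of the exponential component).
K, mu, sigma = exponnorm.fit(ikis)
tau = K * sigma
print(f"mu = {mu:.1f} ms, sigma = {sigma:.1f} ms, tau = {tau:.1f} ms")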
I believe that understanding how best to support learning-to-write necessarily involves an understanding of the underlying psycholinguistic processes. This has led to my involvement in a large-scale international collaborative project to collect a carefully controlled sample of keyboarded responses to a picture-naming task in 14 languages (presented in paper [5]). In particular, this dataset provides the keystroke timing norms needed for identifying and classifying word-level disfluencies in text production.
5. Torrance, M., Nottbusch, G., Alves, R. A., Arfé, B., Chanquoy, L., Chukharev-Hudilainen, E., Dimakos, I., Fidalgo, R., Hyönä, J., Jóhannesson, Ó. I., Madjarov, G., Pauly, D., Uppstad, P. H., van Waes, L., Vernon, M., & Wengelin, Å. (forthcoming). Timed written picture naming in 14 European languages. Behavior Research Methods.
Current Funding
My ongoing work on writing has been funded through a National Science Foundation grant that focuses on developing a prototype of a web-based text editor that captures students' engagement in composition processes through combined eye tracking and keystroke logging. Process capture is entirely unobtrusive: the writing experience is identical to that provided by any other low-feature word processor (e.g., Microsoft WordPad). Eye tracking is performed by a consumer-priced ($695) device, GazePoint GP3, mounted under the computer screen.

This grant has allowed us to explore the possibility that such combined eye-tracking and keystroke-logging technology may detect important aspects of the students' writing process and provide real-time feedback. Additionally, recording where learners look and what they type, and allowing this record to be played back, might help writing instructors teach some of the techniques that research shows can improve writing skills.
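As a purely hypothetical sketch of how the two data streams might be combined for playback or analysis, the snippet below aligns each keystroke with the gaze sample nearest to it in time. The record formats are assumptions made for the example, not CyWrite's actual data model or the eye tracker's API.

# A hypothetical sketch of aligning keystroke events with eye-gaze samples
# by timestamp. Record formats are illustrative assumptions.
import bisect

def align_gaze_to_keys(keystrokes, gaze_samples):
    """For each (t_ms, key) keystroke, find the gaze sample (t_ms, x, y)
    closest in time. Both streams are assumed to be sorted by time."""
    gaze_times = [g[0] for g in gaze_samples]
    aligned = []
    for t, key in keystrokes:
        i = bisect.bisect_left(gaze_times, t)
        # Compare the neighboring samples and keep the closer one.
        candidates = gaze_samples[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda g: abs(g[0] - t))
        aligned.append((t, key, nearest[1], nearest[2]))
    return aligned

keystrokes = [(1000, "a"), (1210, "b")]
gaze_samples = [(990, 412, 305), (1195, 418, 306), (1230, 640, 480)]
print(align_gaze_to_keys(keystrokes, gaze_samples))
# [(1000, 'a', 412, 305), (1210, 'b', 418, 306)]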
Chukharev-Hudilainen, E. (PI), & Ranalli, J. (Co-PI) (2015–2017). EAGER: Exploiting Keystroke Logging and Eye-Tracking to Support the Learning of Writing. National Science Foundation award No. IIS-1550122, $299,519.