Vocabulary Research
Adaptive spaced repetition.
Terminology extraction from corpora.
Vocabulary is one of the most challenging aspects of learning a second language. My work has focused on improving spaced repetition, an approach widely used in existing software, by drawing on what is known about the mental lexicon and language interference.
Research Findings
Paper [1], which won the CALICO Journal Best Article Award, presents a probabilistic finite-state model of a learner's lexical memory. The model tracks the transition of words between the following states: new → being actively learned → learned for the short term → consolidated.
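As a rough illustration of the four-state progression described above, here is a minimal Python sketch. It is not the model from paper [1]: the names `WordState`, `ADVANCE_PROB`, and `next_state`, the advancement probabilities, and the rule that an error sends a word back to active learning are all assumptions made for the example.

```python
import random
from enum import Enum


class WordState(Enum):
    NEW = 0
    LEARNING = 1       # being actively learned
    SHORT_TERM = 2     # learned for the short term
    CONSOLIDATED = 3


# Hypothetical chance of advancing one state after a correct response.
ADVANCE_PROB = {
    WordState.NEW: 1.0,
    WordState.LEARNING: 0.5,
    WordState.SHORT_TERM: 0.3,
    WordState.CONSOLIDATED: 0.0,
}


def next_state(state, correct, rng=random):
    """Return the word's state after one practice response."""
    if not correct:
        # Assumption: an error returns an already-introduced word to active learning.
        return state if state is WordState.NEW else WordState.LEARNING
    if rng.random() < ADVANCE_PROB[state]:
        return WordState(state.value + 1)
    return state


# Usage: follow one word through a short sequence of responses.
rng = random.Random(0)
state = WordState.NEW
for correct in [True, True, False, True]:
    state = next_state(state, correct, rng)
print(state)
```

Passing a seeded `random.Random` instance keeps a simulated run reproducible, which is convenient when comparing different probability settings.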

I designed a prototype web-based learning system to operationalize this model in three-phase adaptive instruction: introduction of new words, initial massed practice, and final distributed (spaced) practice. The system generates multiple types of online activities: multiple-choice translation (definition), listening comprehension, semantic sorting, speeded lexical decision, picture matching, definition matching, fill-in-the-blank, and spelling. The system selects both activity types and target words dynamically according to the learner model, so as to maximize expected learning gains for the student.
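The selection step can be pictured as choosing the word and activity pair with the highest estimated gain. The sketch below assumes a hypothetical lookup table, `EXPECTED_GAIN`, with invented values and only two activity types; how the actual system estimates learning gains is not described here.

```python
# Hypothetical expected-gain estimates, indexed by word state and activity type.
EXPECTED_GAIN = {
    "new": {"multiple_choice": 0.6, "spelling": 0.1},
    "learning": {"multiple_choice": 0.4, "spelling": 0.3},
    "short_term": {"multiple_choice": 0.1, "spelling": 0.5},
    "consolidated": {"multiple_choice": 0.0, "spelling": 0.05},
}


def select_next(word_states):
    """Pick the (word, activity) pair with the highest estimated gain.

    word_states maps each target word to its current state name.
    Raises ValueError if word_states is empty.
    """
    candidates = [
        (gain, word, activity)
        for word, state in word_states.items()
        for activity, gain in EXPECTED_GAIN[state].items()
    ]
    _, word, activity = max(candidates)
    return word, activity


# Usage: a newly introduced word outranks a consolidated one.
print(select_next({"apple": "consolidated", "river": "new"}))
```

A greedy choice like this is the simplest reading of "maximize expected learning gains"; a real scheduler would also need to account for timing between repetitions.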

The paper also reports a randomized, controlled, double-blind experiment in which student participants spent, on average, just under three minutes a day on system-generated activities over the course of one semester. The system yielded a 195% increase in long-term vocabulary retention (relative to controls), as assessed by an independent examiner who did not know the participants personally and was blind to group assignments.
1. Chukharev-Hudilainen, E., & Klepikova, T. A. (2014/2016). The effectiveness of computer-based spaced repetition in foreign language vocabulary instruction: a double-blind study. CALICO Journal, 33(3), 334-354.
Article [2] is a short paper published in a Russian peer-reviewed journal. It presents two small-scale experiments that point towards the utility of text-to-speech (TTS) in generating vocabulary-learning activities. TTS quality had no significant effect on learning gains, although students strongly preferred high-quality voices.
2. Klepikova, T. A., & Chukharev-Hudilainen, E. (2013). Технологии синтеза речи в обучении лексике английского языка (= Speech synthesis technologies in teaching English vocabulary). Известия СПбГУЭФ, 1, 75–79.
Current Funding
A central feature of my vocabulary learning system is that it allows the teacher to specify any word list, based on the current demands of the curriculum. When activities are generated for students, the system must consider relationships among the words that are to be learned. For example, using synonyms of the target word as distractors in a multiple-choice exercise may confuse the student. WordNet is a reliable source of information about lexical relations for both general and academic vocabulary. However, it lacks coverage of discipline-specific terms. My interest in the automatic extraction of lexical relations from disciplinary corpora has led to my involvement as a co-investigator in an NSF-funded project that seeks to extract such relations for the civil engineering domain.
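The synonym problem described above can be illustrated with a small Python sketch that keeps synonyms of the target word out of the distractor pool. The `SYNONYMS` table and the `pick_distractors` function are hypothetical; in practice the synonym sets might come from WordNet or from relations extracted from a disciplinary corpus.

```python
# Hypothetical synonym sets, written by hand for this example.
SYNONYMS = {
    "big": {"large", "huge"},
    "large": {"big", "huge"},
    "huge": {"big", "large"},
}


def pick_distractors(target, word_list, synonyms, k=3):
    """Return up to k words from word_list that are neither the target nor its synonyms."""
    blocked = synonyms.get(target, set()) | {target}
    return [word for word in word_list if word not in blocked][:k]


# Usage: "large" and "huge" are excluded as distractors for "big".
words = ["large", "river", "huge", "window", "big", "stone"]
print(pick_distractors("big", words, SYNONYMS))
```

The same filter extends to other lexical relations (for example, near-synonyms in a specialized domain) by adding them to the blocked set.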
Jeong, H. D. (PI), Chukharev-Hudilainen, E. (Co-PI), & Gilbert, S. (Co-PI) (2016–2019). A natural language based data retrieval engine for automated digital data extraction for civil infrastructure projects. National Science Foundation award No. 1635309, $285,305.
I have led the development of an online vocabulary learning tool, Linguatorium Lexis, which is informed by the outcomes of my research.

Linguatorium Lexis is available through the A. A. Hudyakov Center for Linguistic Research.