How can one build an efficient word list? What are the limits of word frequency measures? These questions are relevant to readability.

First, a word about the context: word lists are used to identify difficulties and to improve teaching material, whereas word frequency is used in psycholinguistics to measure cognitive processing. This topic thus spans education science, psycholinguistics and corpus linguistics.

Coxhead’s Academic Word List

The Academic Word List by Averil Coxhead is a good example of this approach. She finds that students are generally not familiar with academic vocabulary, giving the following examples: substitute, underlie, establish and inherent (p. 214). According to her, such words are “supportive” but not “central” (these two adjectives could serve as good examples themselves).

She starts from principles of corpus linguistics and states that since “a register such as academic texts encompasses a variety of subregisters”, one has to balance the corpus.

Coxhead’s methodology is interesting; as one can see, she had probably read the works of Douglas Biber or John Sinclair, to name just a few. (AWL stands for Academic Word List.)

“To establish whether the AWL maintains high coverage over academic texts other than those in the Academic Corpus, I compiled a second corpus of academic texts in English, using the same criteria and sources to select texts and dividing them into the same four disciplines.”
“To establish that the AWL is truly an academic word list rather than a general-service word list, I developed a collection of 3,763,733 running words of fiction texts.” (p. 224)

The first test determines whether the list is relevant enough, whereas the second tells whether it is selective enough. Both aim at detecting whether the list does what it is supposed to do. This seems like fairly good practice to me; it is perhaps no wonder that this article has been cited nearly 800 times!
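The first of these tests, coverage, can be sketched in a few lines. The word list and corpus below are toy placeholders, not Coxhead’s actual data; the function simply measures what share of a corpus’s running words a list accounts for:

```python
def coverage(word_list, corpus_tokens):
    """Return the proportion of corpus tokens found in word_list."""
    vocab = set(word_list)
    hits = sum(1 for token in corpus_tokens if token in vocab)
    return hits / len(corpus_tokens)

# Toy example (placeholder data, not the AWL or the Academic Corpus)
awl_sample = ["establish", "inherent", "substitute", "underlie"]
tokens = "the results establish an inherent bias in the substitute measure".split()
print(round(coverage(awl_sample, tokens), 2))  # 3 of 10 tokens covered: 0.3
```

Running the same function over an academic corpus and a fiction corpus would then contrast relevance against selectivity, in the spirit of Coxhead’s two tests.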

Word Frequency vs. Contextual Diversity

The next research topic I would like to tackle concerns word frequency and word frequency lists. Adelman, Brown and Quesada give a good picture of it:

“It appears that repeated experience with or exposure to a particular word makes it more readable or identifiable. A key assumption of theoretical explanations of the word frequency (WF) effect is that the effect is due to the number of experiences with a word; each (and every) exposure has a long-term influence on accessibility.” (p. 3)

They distinguish connectionist models (which learn upon each experience of a word) from lexicon-based ones, in which the accessibility of individual lexical entries is governed by frequency. They also refer to research on memory, in which scholars consider the separation of exposures in time and context.

They investigate a measure of “contextual diversity”, which they define as follows: “A normative measure of a word’s CD may be obtained by counting the number of passages (documents) in a corpus that contain that word” (p. 4).
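This definition is straightforward to implement. The sketch below counts, for each word, the number of documents containing it, alongside plain token counts for comparison; the three documents are an invented toy corpus:

```python
from collections import Counter

def contextual_diversity(documents):
    """Count, for each word, the number of documents containing it."""
    cd = Counter()
    for doc in documents:
        cd.update(set(doc.lower().split()))  # set(): count each document once
    return cd

def word_frequency(documents):
    """Plain token counts over all documents, for comparison."""
    wf = Counter()
    for doc in documents:
        wf.update(doc.lower().split())
    return wf

docs = [
    "the cat sat on the mat",
    "the dog barked at the cat",
    "a train left the station",
]
cd = contextual_diversity(docs)
wf = word_frequency(docs)
print(cd["the"], wf["the"])  # "the" occurs in 3 documents but 5 times overall
```

The gap between the two counts is exactly what the authors exploit: a word repeated many times within few contexts gets a high WF but a low CD.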

In fact, contextual diversity turns out to be a better predictor of reaction times, and thus it may also be relevant for assessing text readability. Their study comes to the following conclusion (CD stands for contextual diversity and WF for word frequency):

“In both word naming and lexical decision contextual diversity was more predictive of reaction times than word frequency. Moreover, CD had a unique effect such that high CD led to fast responses, whilst WF had no unique effect or a suppressor effect with high WF leading to slow responses. This implies there is a CD effect, but no facilitatory effect of WF per se.” (p. 11)

Finally, they infer from their results that these “motivate a theory of reading based on principles from memory research” (p. 13). Adelman et al. are not the first researchers to study the impact of contextual diversity, but they give a good account of its importance.

Towards a more efficient word frequency measure

Starting from these results, Brysbaert and New try to provide a more efficient word frequency measure.

Among other issues, they discuss the corpus question: how big should it be, and what kinds of texts should be included? In their opinion, “for most practical purposes, a corpus of 16–30 million words suffices for reliable word frequency norms” (p. 980).

Their previous research showed that film and television subtitles, as an alternative source of language use, outperform measures derived from books and Internet searches. What makes subtitles so particular is their vocabulary: they mostly contain concrete words (and few conceptual ones), and long words tend to be avoided.

The next question is whether one should use only lemmas or all inflected forms to build the list:

“Our analyses with the entire Elexicon suggest that, for most practical purposes, lemma frequencies in English are not more informative than WF frequencies. This also seems to be the conclusion reached by Baayen in his most recent articles” (p. 984)
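The distinction can be made concrete: word-form frequencies count each inflected form separately, while lemma frequencies sum over all forms of a lemma. The miniature lemma map below is a hand-made toy, not a real lemmatizer:

```python
from collections import Counter

# Hypothetical miniature lemma map (a real study would use a lemmatizer)
LEMMA = {"walk": "walk", "walks": "walk", "walked": "walk", "walking": "walk"}

def lemma_frequencies(wordform_counts):
    """Aggregate word-form counts into lemma counts."""
    lemmas = Counter()
    for form, count in wordform_counts.items():
        lemmas[LEMMA.get(form, form)] += count  # unknown forms map to themselves
    return lemmas

wf = Counter({"walk": 10, "walked": 7, "walking": 3})
print(lemma_frequencies(wf)["walk"])  # 20
```

Brysbaert and New’s point is that this extra aggregation step buys little in practice, at least for English.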

They list the practical implications of the superiority of the contextual diversity measure (p. 987): corpora collected for this purpose should be designed with the so-called CD frequency measure in mind.

  • A corpus consisting of a large number of small excerpts is better than a corpus consisting of a small number of large excerpts (at least 3,000 different samples needed, with presumably not much gain to be expected above 10,000 samples)
  • It may not be good to use samples that succeed each other rapidly in time
  • Samples of moderate size are required (a few hundred words to a few thousand words)

As a conclusion, the authors give an idea of the registers to use:

“The two most interesting language registers currently available are Internet discussion groups and subtitles. […] On the basis of the English findings, frequencies based on discussion groups seem to be indicated for words longer than seven letters, whereas for short words subtitle frequencies are better.” (p. 988)

I plan to test their assumptions using the subtitle corpora available for German.


  • J. S. Adelman, G. D. A. Brown, and J. F. Quesada, “Contextual diversity, not word frequency, determines word-naming and lexical decision times”, Psychological Science, vol. 17, iss. 9, pp. 814-823, 2006.
  • M. Brysbaert and B. New, “Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English”, Behavior Research Methods, vol. 41, iss. 4, pp. 977-990, 2009.
  • A. Coxhead, “A New Academic Word List”, TESOL Quarterly, vol. 34, iss. 2, pp. 213-238, 2000.