Here is what I took away from reading this book:

* E. Castello, Text Complexity and Reading Comprehension Tests, Bern: Peter Lang, 2008.

Notional framework

To begin with, Castello identifies two types of complexity. Research in this field attempts to quantify both inherent complexity, i.e. the complexity or difficulty of a text per se, and receiver-oriented complexity, i.e. complexity considered in terms of the relation between reader and text.

He cites C.J. Alderson and L. Merlini Barbaresi (strangely enough, we are not related, as far as I know) for their definition of linguistic complexity, M. Halliday and T. Gibson regarding lexical information, and S. Urquhart and C. Weir for their work on different types of reading.


Erik Castello uses a series of measures, most notably:

  1. word-related:
     - type/token ratio
     - word frequency lists
     - lexical variation
     - lexical density: difference between lexical and grammatical words, multi-word units, research on word families
  2. clause-related:
     - clause type ratio, meaning for instance the ratio between hypotactic clauses and clause complexes
     - grammatical intricacy
  3. sentence-related:
     - readability formulas
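
To make a few of these measures concrete, here is a minimal sketch in Python. It is my own illustration, not Castello's procedure. The function-word list is a tiny hypothetical stand-in for a real stop-word list or part-of-speech tagger, and the Flesch Reading Ease score is only one well-known readability formula that I picked as an example. The function names are mine as well.

```python
import re

# Hypothetical, deliberately tiny list of grammatical (function) words.
# A real study would rely on a full word list or a part-of-speech tagger.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "is", "are", "was", "were", "be", "it", "this", "that", "with", "for",
}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def type_token_ratio(tokens):
    """Number of distinct word forms divided by the number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tokens):
    """Share of tokens that are not in the function-word list."""
    if not tokens:
        return 0.0
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(lexical) / len(tokens)


def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease computed from raw counts (higher = easier)."""
    return (206.835
            - 1.015 * (n_words / n_sentences)
            - 84.6 * (n_syllables / n_words))


sample = "The cat sat on the mat and the dog sat on the cat."
tokens = tokenize(sample)
print("tokens:", len(tokens))
print("type/token ratio:", round(type_token_ratio(tokens), 3))
print("lexical density:", round(lexical_density(tokens), 3))
```

The type/token ratio depends heavily on text length, so it only makes sense to compare texts of similar size. The syllable, word and sentence counts for the readability score are passed in by hand here because automatic syllable counting is a heuristic of its own.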

He mentions an interesting idea: trying to capture the writer's intention in a given situation, which can then be compared with discourse-level measures (see previous post).

He remarks that grammatical intricacy and lexical density are complementary variables.
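
As a rough illustration of how the two variables can be read side by side, here is a small sketch. It assumes one common operationalisation, which may differ from Castello's: grammatical intricacy as the mean number of clauses per clause complex, and lexical density as the share of lexical words among all words. The `AnnotatedText` structure is hypothetical and the counts are invented, standing in for hand annotation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnnotatedText:
    """Hand-annotated counts for one text (hypothetical structure)."""
    clauses_per_complex: List[int]  # clauses in each clause complex
    lexical_words: int              # number of lexical (content) words
    total_words: int                # number of all running words


def grammatical_intricacy(text):
    """Mean number of clauses per clause complex."""
    if not text.clauses_per_complex:
        return 0.0
    return sum(text.clauses_per_complex) / len(text.clauses_per_complex)


def lexical_density(text):
    """Share of lexical words among all running words."""
    return text.lexical_words / text.total_words if text.total_words else 0.0


# Invented counts, for illustration only.
text_a = AnnotatedText(clauses_per_complex=[4, 3, 5], lexical_words=30, total_words=80)
text_b = AnnotatedText(clauses_per_complex=[1, 2, 1], lexical_words=45, total_words=80)

for name, text in (("A", text_a), ("B", text_b)):
    print(name,
          "intricacy:", round(grammatical_intricacy(text), 2),
          "density:", round(lexical_density(text), 2))
```

With these made-up numbers, text A packs its content into many clauses per clause complex, while text B uses fewer clauses but more lexical words. Reporting the two figures together gives a fuller profile than either one alone.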

Elements of structural analysis

Castello draws on discourse semantics to try to understand ‘how the choice and use of certain cohesive items and structures contribute to the complexity of some tests of the corpus’.

He does so first by using J.R. Martin’s framework of ‘participants’ (people, entities, places, things) to draw an ‘identification system network’, and also by using M. Halliday’s lexico-grammatical approach.

According to Halliday, there are two types of resources. On the one hand, cohesive resources ‘manage the flow of discourse’; he also speaks of ‘semantic links’, which are to be found within or across sentences. On the other hand, structural resources do not go beyond the clause complex and can be divided into subtypes, mainly ‘information structure’ (cohesive chains of grammatical and lexical resources) expressing identity or similarity, and ‘thematic structure’. All this creates ‘texture’.

Within the nominal groups, Halliday defines categories of elements (for instance deictics) and a logical structure.


Having taken measurements on a corpus of reading comprehension tests, Castello suggests taking advantage of different methodologies, since comparing different types of data can be revealing. The ‘characteristics of the texts, those of the readers and those of the tasks’ are to be taken into account.

A text can be inherently complex while the task of understanding it is rather easy, and the reverse also happens. Both quantitative and qualitative measures should therefore be used, as they give complementary indications.

My remarks

To me, the linguistic theory and the different frameworks were more important than the tests themselves, so I paid close attention to the references, which were sometimes new to me, as German-speaking research has its own frameworks and linguistic theories. I learned a few things about the approach to complexity. In fact, Text Complexity and Reading Comprehension Tests by Erik Castello gives interesting insights into this linguistic phenomenon.

Still, I think the structural elements are mostly theoretical and hard to apply to real-life situations. Moreover, it is a bit difficult to get an overall picture of the book, since its various parts may not be sufficiently tied to a global framework. Nonetheless, the difference between ‘text complexity’ and ‘test and task difficulty’ is clear enough.

Although this study is well documented, the conclusions are not really striking. It is another attempt to quantify and qualify linguistic complexity, which shows there is still a lot of work to do on this particular topic.

