Complexity & Readability - Bits of Language: corpus linguistics, NLP and text analytics

Comparison of Features for Automatic Readability Assessment: review

I read an interesting article “featuring” an up-to-date comparison of current work in the field of readability assessment:

“A Comparison of Features for Automatic Readability Assessment”, Lijun Feng, Martin Jansche, Matt Huenerfauth, Noémie Elhadad, 23rd International Conference on Computational Linguistics (COLING 2010), Poster Volume, pp. 276-284.

I am mainly interested in the features they use, so here is a quick review:

Corpus and tools

Corpus: a sample from the Weekly Reader
OpenNLP to extract named entities and resolve co-references
the Weka toolkit for machine learning

Features

Four subsets of discourse features:

more ...