Recently, Jean-Philippe Magué told me about the newly introduced text stats on Amazon. A good summary by Gabe Habash on the Publishers Weekly news blog describes the prospects and potential interest of this new software: Book Lies: Readability is Impossible to Measure. The stats seem to have been available since last summer. I decided to contribute to the discussion on Amazon’s text readability statistics: to what extent are they reliable and useful?
Gabe Habash compares several well-known books and concludes that sentence length is the determining factor in the readability measures used by Amazon. In fact, the readability formulas (Fog Index, Flesch Index and Flesch-Kincaid Index; for an explanation see Amazon’s text readability help) are centered on word length and sentence length, which is convenient but far from always appropriate.
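To make the dependence on sentence and word length concrete, here is a minimal sketch of the three formulas in their commonly published form. I do not know how Amazon actually implements them, and the function names and the sample counts are my own assumptions for illustration.

```python
# Standard published forms of the three readability formulas.
# This is a sketch, not Amazon's implementation; all names are hypothetical.

def flesch_reading_ease(words, sentences, syllables):
    """Higher score = easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gunning_fog(words, sentences, complex_words):
    """Complex words are those with three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

# Made-up counts: the same 100 words, cut into 5 sentences or into 2.
for sentences in (5, 2):
    print(
        sentences,
        round(flesch_reading_ease(100, sentences, 150), 2),
        round(flesch_kincaid_grade(100, sentences, 150), 2),
        round(gunning_fog(100, sentences, 10), 2),
    )
```

With these invented counts, the five-sentence version scores roughly 59.6, 9.9 and 12.0. Merging the same words into two long sentences shifts all three scores toward "harder" although not a single word has changed, which is the effect Habash points out.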
There is another metric named ‘word complexity’, which Amazon defines as follows: ‘A word is considered “complex” if it has three or more syllables’ (source: complexity help). I wonder what happens in the case of proper nouns like (again…) Schwarzenegger. There are cases where syllable recognition is not that easy for an algorithm that was programmed and tested to perform well on English words.
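As an illustration of the problem, here is a rough sketch of the kind of vowel-group heuristic often used to count syllables in English text. It is an assumption on my part that anything similar runs behind Amazon's stats; the function names and the silent-e rule are hypothetical choices made for this example.

```python
import re

def count_syllables(word):
    """Naive English heuristic: one syllable per group of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final "e" as silent ("make"), except in "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def is_complex(word):
    """'Complex' in the sense quoted above: three or more syllables."""
    return count_syllables(word) >= 3

for name in ("Schwarzenegger", "Hermione", "readability"):
    print(name, count_syllables(name), is_complex(name))
```

This particular heuristic happens to get Schwarzenegger right (four syllables), but it counts only two for Hermione, which has four, so the name would not be flagged as complex. Rules tuned on common English words can fail in either direction on proper nouns.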