Age Suitability and Readability Analysis

Age Suitability Analysis: Are my children old enough to read these books?

Movies are routinely rated and classified as suitable for a certain age and audience. No such rating system has been established for books yet. Books, however, differ in content, writing style, topic, genre, and other age-related aspects, and thus may not be appropriate for all audiences either. According to experts in the domain, not only the readability of the text but also the topics addressed, the complexity of the storyline, and the emotions evoked in the reader have to be taken into account.

The goal of this work is to find an effective method for measuring and representing the age suitability of textual documents, enabling parents and experts to assess the different age-relevant aspects without reading the whole book.

Within the project, we developed methods to measure the relevant aspects computationally. The results of the automatic text analysis are presented to the user in a dashboard-like visualization (the screenshot below shows the book “Harry Potter and the Deathly Hallows”). Each contributing aspect is visualized separately. The upper left panel depicts the result of the emotion detection, the middle panel gives readability indicators, and the right panel shows the result of the topic detection algorithm. The lower part of the screen depicts the storyline complexity of the book. Each row represents a different character, and each rectangle in a row shows whether and how actively that character takes part in the corresponding section of the plot. The more saturated the color of a cell, the more frequently the character appears in the respective section (only visible in the detail representation). Drilling down to the next level allows a more detailed exploration of the different aspects.
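The character-by-section grid described above can be illustrated with a minimal sketch. It is not the project's implementation: the function name character_activity, the equal-sized token sections, and the matching of characters by a single case-insensitive name token are all assumptions made for illustration. The returned values lie between 0 and 1 and could be mapped to color saturation.

```python
import re
from typing import Dict, List


def character_activity(
    text: str, characters: List[str], n_sections: int = 10
) -> Dict[str, List[float]]:
    """Relative mention frequency per character and section, scaled to 0..1.

    Hypothetical helper: sections are equal-sized token ranges, and a
    character counts as present when its (single-word) name occurs.
    """
    if n_sections <= 0:
        raise ValueError("n_sections must be positive")

    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]

    # Split the token stream into exactly n_sections contiguous ranges.
    bounds = [i * len(tokens) // n_sections for i in range(n_sections + 1)]
    sections = [tokens[bounds[i]:bounds[i + 1]] for i in range(n_sections)]

    # Raw mention counts: one row per character, one cell per section.
    counts = {
        name: [section.count(name.lower()) for section in sections]
        for name in characters
    }

    # Scale by the largest cell so the most active cell gets saturation 1.0.
    peak = max((c for row in counts.values() for c in row), default=0)
    return {
        name: [c / peak if peak else 0.0 for c in row]
        for name, row in counts.items()
    }


sample = "Harry met Ron. Harry left. Ron and Hermione stayed. Hermione read."
grid = character_activity(sample, ["Harry", "Ron", "Hermione"], n_sections=2)
for name, row in grid.items():
    print(name, row)
```

A real system would presumably need coreference resolution and alias handling (nicknames, pronouns), which this sketch deliberately leaves out.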

Visual Readability Analysis

Our tool VisRA for visual readability analysis supports writers in revising a draft version of a document. In contrast to standard tools for readability analysis, VisRA shows not only which sentences need to be revised but also why. Automatic algorithms working in the background determine the reasons for poor readability, such as complex sentence structure, difficult vocabulary, or too many nominal forms.
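To make the idea of reporting both which sentence is hard to read and why more concrete, here is a rough sketch. It does not reproduce VisRA's features: the three indicators (sentence length as a stand-in for structural complexity, share of long words for difficult vocabulary, share of words with typical nominalization suffixes for nominal forms), the suffix list, and all thresholds are assumptions chosen for illustration.

```python
import re
from typing import Dict, List

# Assumed suffix list; a crude proxy for nominal forms in English.
NOMINAL_SUFFIXES = ("tion", "sion", "ment", "ness", "ity")


def sentence_indicators(text: str) -> List[Dict[str, object]]:
    """Compute simple per-sentence readability indicators."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    results = []
    for sentence in sentences:
        words = re.findall(r"[A-Za-z]+", sentence)
        if not words:
            continue
        n = len(words)
        results.append({
            "sentence": sentence,
            "length": n,
            "long_word_ratio": sum(len(w) >= 7 for w in words) / n,
            "nominal_ratio": sum(
                w.lower().endswith(NOMINAL_SUFFIXES) for w in words
            ) / n,
        })
    return results


def flag_reasons(
    indicators: Dict[str, object],
    max_length: int = 25,
    max_long_ratio: float = 0.3,
    max_nominal_ratio: float = 0.15,
) -> List[str]:
    """Name the reasons (if any) why a sentence may be hard to read."""
    reasons = []
    if indicators["length"] > max_length:
        reasons.append("complex sentence structure")
    if indicators["long_word_ratio"] > max_long_ratio:
        reasons.append("difficult vocabulary")
    if indicators["nominal_ratio"] > max_nominal_ratio:
        reasons.append("many nominal forms")
    return reasons


draft = (
    "The cat sat. "
    "The implementation of the regulation required extensive documentation."
)
report = [(ind["sentence"], flag_reasons(ind)) for ind in sentence_indicators(draft)]
for sentence, reasons in report:
    print(sentence, "->", reasons or "ok")
```

Keeping each indicator separate, instead of collapsing everything into a single readability score, is what allows the reason for a flag to be shown to the writer.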

From a computer science perspective, one of the goals of the project is to find semantically understandable indicators for a text property (in this case readability) (semi-)automatically. Furthermore, the tool integrates different visual representations that account for differences in document size and in the availability of information about the physical and logical layout of the documents.

More information about this and related work can be found in the following publications.