Micro-Linguistic Feature Detection and Annotation

Theoretical (Analytical):

Practical (Implementation):

Literature Work:

Overview and Problem Statement

Text documents contain various micro-linguistic features, such as sentiment, topics, agreement, and emotion. In the visual analysis of linguistic data, researchers often need to analyze how the micro-linguistic features in a text relate to one another. These features form the basis of scores and measures computed from linguistic data. The aim of this project is to develop a visual framework for the efficient annotation of micro-linguistic features, independent of text type. Features can be generated at many granularities, e.g., words, multi-word expressions, sentences, paragraphs, documents, and document collections.


Tasks

  • Review frequently used linguistic features (in collaboration with our partners in linguistics).
  • Derive an abstract framework for the simple calculation of micro-linguistic features.
  • Implement methods for the extraction of linguistic features from a text corpus.
  • Implement a visual interface for the interactive categorization, aggregation, and renaming of the extracted features.
  • Generate a tagged document structure as an output of your tool.
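
The extraction and tagging tasks above could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions, not the project's prescribed design: the names FeatureAnnotationSketch, Annotation, FeatureExtractor, and LexiconSentiment are hypothetical, the two-word sentiment lexicons are toy data, and the inline tag syntax is just one possible output format. Each annotation stores a character span together with its granularity, so the same structure could in principle hold word-, sentence-, or document-level features.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FeatureAnnotationSketch {

    /** A feature value attached to the character span [start, end) at a given granularity. */
    record Annotation(int start, int end, String granularity, String feature, String value) {}

    /** Hypothetical extractor interface: one implementation per micro-linguistic feature. */
    interface FeatureExtractor {
        List<Annotation> extract(String text);
    }

    /** Toy word-level sentiment extractor backed by tiny hard-coded lexicons (illustrative only). */
    static class LexiconSentiment implements FeatureExtractor {
        private static final Set<String> POSITIVE = Set.of("good", "great");
        private static final Set<String> NEGATIVE = Set.of("bad", "poor");
        private static final Pattern WORD = Pattern.compile("\\p{L}+");

        @Override
        public List<Annotation> extract(String text) {
            List<Annotation> out = new ArrayList<>();
            Matcher m = WORD.matcher(text);
            while (m.find()) {
                String word = m.group().toLowerCase();
                if (POSITIVE.contains(word)) {
                    out.add(new Annotation(m.start(), m.end(), "word", "sentiment", "positive"));
                } else if (NEGATIVE.contains(word)) {
                    out.add(new Annotation(m.start(), m.end(), "word", "sentiment", "negative"));
                }
            }
            return out;
        }
    }

    /**
     * Renders annotations as inline tags. Assumes the annotations are sorted by
     * start offset and do not overlap; no escaping of markup characters is done.
     */
    static String toTagged(String text, List<Annotation> annotations) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        for (Annotation a : annotations) {
            sb.append(text, pos, a.start());
            sb.append('<').append(a.feature())
              .append(" value=\"").append(a.value()).append("\">");
            sb.append(text, a.start(), a.end());
            sb.append("</").append(a.feature()).append('>');
            pos = a.end();
        }
        sb.append(text, pos, text.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "The food was great but the service was bad.";
        List<Annotation> found = new LexiconSentiment().extract(text);
        System.out.println(toTagged(text, found));
    }
}
```

Keeping annotations as stand-off spans and rendering the tags only at export time means that renaming or aggregating features in a visual interface would change the annotation list, not the underlying text. A real tool would also need to handle overlapping and nested spans, which this sketch deliberately ignores.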


Requirements

  • Good knowledge of information visualization and natural language processing.
  • Good programming skills in Java and JavaScript/D3.


Scope/Duration/Start

  • Scope: Bachelor/Master
  • 6-month project, 3-month thesis (Bachelor) / 6-month thesis (Master)
  • Start: immediately