Micro-Linguistic Feature Detection and Annotation

Theoretical (Analytical):
Practical (Implementation):
Literature Work:
Overview and Problem Statement
Text documents contain various micro-linguistic features, e.g., sentiment, topics, agreement, and emotion. In the visual analysis of linguistic data, researchers often need to analyze the relations among the micro-linguistic features in a text. These features form the basis of scores and measures derived from linguistic data. The aim of this project is to develop a visual framework for the efficient annotation of micro-linguistic features, independent of text type. Features can be generated at many granularities, e.g., words, multi-word expressions, sentences, paragraphs, documents, and document collections.
Tasks
- Review frequently used linguistic features (in collaboration with our partners in linguistics).
- Derive an abstract framework for the simple calculation of micro-linguistic features.
- Implement methods for the extraction of linguistic features from a text corpus.
- Implement a visual interface for the interactive categorization, aggregation, and renaming of the extracted features.
- Generate a tagged document structure as an output of your tool.
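To make the tasks above more concrete, the following is a minimal sketch of how extracted features could be stored as span-based annotations at two granularities (word and sentence) and written out as a simple tagged (standoff) structure. Everything in it is an assumption made for illustration: the class name FeatureAnnotationSketch, the Annotation fields, the tiny sentiment lexicon, the naive sentence splitting at '.', '!' and '?', and the tab-separated output format are hypothetical and not prescribed by the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: all names below are illustrative, not part of the project.
final class FeatureAnnotationSketch {

    // Granularities listed in the project description.
    enum Granularity { WORD, MULTI_WORD, SENTENCE, PARAGRAPH, DOCUMENT, COLLECTION }

    // One feature value attached to the character span [start, end) of the text.
    static final class Annotation {
        final int start;
        final int end;
        final Granularity level;
        final String feature;
        final String value;

        Annotation(int start, int end, Granularity level, String feature, String value) {
            this.start = start;
            this.end = end;
            this.level = level;
            this.feature = feature;
            this.value = value;
        }
    }

    // Toy lexicon (assumption); a real extractor would rely on an NLP library or model.
    private static final Set<String> POSITIVE =
            new HashSet<>(Arrays.asList("good", "great", "efficient"));
    private static final Set<String> NEGATIVE =
            new HashSet<>(Arrays.asList("bad", "poor", "slow"));

    private static String label(int score) {
        return score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
    }

    // Tags lexicon words, then aggregates their scores into one label per sentence.
    static List<Annotation> annotate(String text) {
        List<Annotation> out = new ArrayList<>();
        int sentenceStart = -1; // -1 means no sentence is currently open
        int score = 0;
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (sentenceStart < 0 && !Character.isWhitespace(c)) {
                sentenceStart = i;
            }
            if (Character.isLetter(c)) {
                int start = i;
                while (i < text.length() && Character.isLetter(text.charAt(i))) {
                    i++;
                }
                String word = text.substring(start, i).toLowerCase(Locale.ROOT);
                if (POSITIVE.contains(word)) {
                    out.add(new Annotation(start, i, Granularity.WORD, "sentiment", "positive"));
                    score++;
                } else if (NEGATIVE.contains(word)) {
                    out.add(new Annotation(start, i, Granularity.WORD, "sentiment", "negative"));
                    score--;
                }
                continue; // i already points past the word
            }
            if (c == '.' || c == '!' || c == '?') {
                out.add(new Annotation(sentenceStart, i + 1, Granularity.SENTENCE,
                        "sentiment", label(score)));
                sentenceStart = -1;
                score = 0;
            }
            i++;
        }
        if (sentenceStart >= 0) { // trailing sentence without a terminator
            out.add(new Annotation(sentenceStart, text.length(), Granularity.SENTENCE,
                    "sentiment", label(score)));
        }
        return out;
    }

    // Standoff output: one line per annotation (start, end, level, feature=value, covered text).
    static String toStandoff(String text, List<Annotation> annotations) {
        StringBuilder sb = new StringBuilder();
        for (Annotation a : annotations) {
            sb.append(a.start).append('\t').append(a.end).append('\t')
              .append(a.level).append('\t')
              .append(a.feature).append('=').append(a.value).append('\t')
              .append(text, a.start, a.end).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "The tool is great. The old one was slow and bad.";
        System.out.print(toStandoff(text, annotate(text)));
    }
}
```

Keeping annotations separate from the text (standoff) rather than inserting inline tags is one possible design choice: spans at different granularities may nest or overlap, and an interactive interface could then rename, merge, or regroup features without rewriting the document itself.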
Requirements
- Good knowledge of information visualization and natural language processing.
- Good programming skills in Java and JavaScript/D3.
Scope/Duration/Start
- Scope: Bachelor/Master
- Duration: 6-month project, 3-month thesis (Bachelor) / 6-month thesis (Master)
- Start: immediately