Visual Exploration and Semantic Text Analysis of Multimodal Big Data

Communication Episodes

Theoretical (Analytical):

Practical (Implementation):

Literature Work:


Overview

Organized crime units and terrorist cells are often characterized by complex group structures and are involved in cross-border activities. As part of criminal investigations, law enforcement collects vast amounts of multimodal data (text, images, videos, OSINT). This data is highly variable: it mixes structured and unstructured information, sometimes with metadata and sometimes without.

To analyze this heterogeneous data, it is necessary to prepare and correlate the different data sources and then find the complex structures hidden within. Here we focus mainly on the semantic analysis of text data and plan to investigate automatic multi-language semantic entity extraction and supporting domain-specific ontologies for intelligent searches. We further plan to design advanced visualization methods that allow the exploration and analysis of the semantic text data, and to combine these results with the other multimodal data in an interactive Visual Analytics application.

Problem Statements

  • How can entities be extracted from multi-language texts, and how can visualization be used to correlate different entities (using statistics, semantics, or rule-based methods)?
  • How can domain-specific ontologies be generated and continuously adapted, and how can they support users when searching multi-language texts?
  • How can knowledge from the semantic text analysis and all the other multimodal data sources be correlated and put into context to generate a complex knowledge graph, detecting hidden structures?
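As a rough illustration of the statistical option named in the first question, the following minimal sketch scores entity pairs by pointwise mutual information (PMI) over document-level co-occurrence. It assumes the entities have already been extracted by some NER step; the function name `entity_pmi` and the sample entities are hypothetical, and PMI is only one of many possible correlation measures, not a method prescribed by this project.

```python
import math
from collections import Counter
from itertools import combinations


def entity_pmi(docs):
    """Score entity pairs by pointwise mutual information over documents.

    docs: list of sets, each holding the entity strings found in one document.
    Returns a dict mapping a sorted (entity_a, entity_b) tuple to its PMI.
    """
    n_docs = len(docs)
    single = Counter()  # number of documents mentioning each entity
    pair = Counter()    # number of documents mentioning both entities
    for entities in docs:
        single.update(entities)
        pair.update(combinations(sorted(entities), 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        # PMI = log2(P(a, b) / (P(a) * P(b))), with document frequencies
        # used as probability estimates.
        scores[(a, b)] = math.log2(n_ab * n_docs / (single[a] * single[b]))
    return scores


# Hypothetical, already-extracted entities (one set per document).
docs = [
    {"Alice", "Bob", "Hamburg"},
    {"Alice", "Bob"},
    {"Carol", "Hamburg"},
    {"Carol", "Vienna"},
]
scores = entity_pmi(docs)
for pair_key, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair_key, round(score, 2))
```

Pairs that appear together more often than their individual frequencies would suggest receive a positive score. With so few documents the estimates are very noisy, so a real analysis would likely need smoothing or a minimum-count threshold.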

Tasks

  • Named Entity Recognition in multiple languages, with automatic language detection.
  • Correlation of named entities, exploring statistical, semantic, or rule-based methods.
  • Generation and interactive modification of domain-specific ontologies, based on user searches as well as manual user modification.
  • Supporting language-independent search queries for concepts and ontologies.
  • Summarizing and visualizing the results from the above analysis steps interactively.
  • Visual aggregation of the heterogeneous analysis results from semantic text analysis with other multimodal analysis results for exploration, based on a knowledge graph or other knowledge aggregation methods.
  • Use visual analytics to detect hidden structures and interrelations inside the aggregated analysis results.
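To make the last two tasks more concrete, here is a minimal sketch of aggregating relations from several modalities into one graph and listing its connected components. The `(source, entity_a, entity_b)` triple format, the function names, and all sample data are assumptions made for illustration; a real knowledge graph would carry typed edges and provenance, and connected components are only the simplest possible notion of a "hidden structure".

```python
from collections import defaultdict


def build_graph(relations):
    """Build an undirected adjacency map from (source, entity_a, entity_b) triples."""
    graph = defaultdict(set)
    for _source, a, b in relations:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the connected components of the graph as a list of sorted lists."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical relations; the source labels and entity names are invented.
relations = [
    ("text", "Alice", "Bob"),
    ("image", "Bob", "Warehouse 7"),
    ("video", "Carol", "Dave"),
    ("text", "Dave", "Warehouse 9"),
]
graph = build_graph(relations)
components = connected_components(graph)
print(components)
```

In this toy data, "Alice" and "Warehouse 7" end up in the same component only because a text-derived relation and an image-derived relation share the entity "Bob", which is the kind of cross-modal link the aggregation step is meant to surface.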

Opportunities

  • Work on cutting edge research that has important real-life implications.
  • Option to align the topics of seminar, project, and thesis, and to continue working in this field with more specialized topics in a later thesis.

Requirements

  • Highly motivated
  • Experience in text analysis (Document Analysis) and NLP recommended (or readiness to get up to speed)
  • Excellent programming skills in Python / D3 / Visualization or comparable

Scope/Duration/Start

It is possible to work only on a specific sub-aspect of the proposed problems and tasks. Feel free to discuss your preferences with us!

  • Scope: Bachelor / Master
  • Project / Thesis Duration (Bachelor): 3 months + 3 months
  • Project / Thesis Duration (Master): 6 months + 6 months
  • Start: Planned in February 2020

Contact

References

  • Seebacher, D., Fischer, M. T., Sevastjanova, R., Keim, D. A., & El-Assady, M. (2019). Visual Analytics of Conversational Dynamics. In EuroVis Workshop on Visual Analytics (EuroVA).
  • Sacha, D., Jentner, W., Zhang, L., Stoffel, F., Ellis, G., & Keim, D. (2017). Applying Visual Interactive Dimensionality Reduction to Criminal Intelligence Analysis. VALCRI White Paper Series, 1.
  • El-Assady, M., Sevastjanova, R., Gipp, B., Keim, D. A., & Collins, C. (2017). NEREx: Named-Entity Relationship Exploration in Multi-Party Conversations. Computer Graphics Forum, The Eurographics Association and John Wiley & Sons Ltd., pp. 213-225.