ClusterFlow - Visual analytics framework for interactive clustering workflows

Theoretical (Analytical):

Practical (Implementation):

Literature Work:


The project will develop a client-server architecture that pairs a backend for clustering computations with a web-based visual interface for clustering workflows. It is inspired by an existing approach to cluster analysis with self-organizing maps.

Problem Statement

Arriving at a desired visualization of a given dataset usually takes several rounds of refinement and filtering. At the same time, many different clustering algorithms exist. This project aims to embed several of these algorithms into a common clustering workflow that lets the user switch between them iteratively and compare their results.
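One way to picture such a common workflow is a shared interface that every algorithm implements, so that results can be swapped and compared. The following is only a minimal sketch under stated assumptions: the names ClusteringAlgorithm, KMeans, WorkflowDemo and randIndex are hypothetical and not part of the existing system, the k-means variant uses a simple deterministic seeding (the first k rows), and the Rand index is just one possible comparison measure.

```java
import java.util.Arrays;

/** Hypothetical common contract: every algorithm maps data rows to cluster labels. */
interface ClusteringAlgorithm {
    String name();
    int[] cluster(double[][] data);
}

/** Plain k-means (Lloyd's algorithm); the first k rows seed the centroids (an assumption). */
final class KMeans implements ClusteringAlgorithm {
    private final int k;
    private final int maxIterations;

    KMeans(int k, int maxIterations) {
        this.k = k;
        this.maxIterations = maxIterations;
    }

    public String name() { return "k-means(k=" + k + ")"; }

    public int[] cluster(double[][] data) {
        if (data.length < k) throw new IllegalArgumentException("need at least k rows");
        int n = data.length, dim = data[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = data[c].clone();
        int[] labels = new int[n];
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach each row to its nearest centroid.
            for (int i = 0; i < n; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double dist = 0;
                    for (int j = 0; j < dim; j++) {
                        double diff = data[i][j] - centroids[c][j];
                        dist += diff * diff;
                    }
                    if (dist < bestDist) { bestDist = dist; best = c; }
                }
                if (labels[i] != best) { labels[i] = best; changed = true; }
            }
            if (!changed && iter > 0) break; // assignments are stable
            // Update step: move each centroid to the mean of its rows.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[labels[i]]++;
                for (int j = 0; j < dim; j++) sums[labels[i]][j] += data[i][j];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // keep the old centroid for an empty cluster
                for (int j = 0; j < dim; j++) centroids[c][j] = sums[c][j] / counts[c];
            }
        }
        return labels;
    }
}

final class WorkflowDemo {
    /** Rand index: fraction of row pairs on which two labelings agree (same vs. different cluster). */
    static double randIndex(int[] a, int[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("labelings must have equal length >= 2");
        }
        long agree = 0, total = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if ((a[i] == a[j]) == (b[i] == b[j])) agree++;
                total++;
            }
        }
        return (double) agree / total;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {10, 10}, {20, 20}, {0, 1}, {10, 11}, {20, 21}};
        ClusteringAlgorithm first = new KMeans(2, 100);
        ClusteringAlgorithm second = new KMeans(3, 100);
        int[] a = first.cluster(data);
        int[] b = second.cluster(data);
        System.out.println(first.name() + " -> " + Arrays.toString(a));
        System.out.println(second.name() + " -> " + Arrays.toString(b));
        System.out.println("Rand index: " + randIndex(a, b));
    }
}
```

Because both runs go through the same interface, a workflow step could swap in another algorithm (for example a self-organizing map) without changing the comparison code.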


Tasks

  • Get familiar with state-of-the-art data mining and machine learning algorithms and with the current system.
  • Design and implement several interactive algorithm visualizations and embed them in a workflow.
  • Extend the backend (Java) with a variety of algorithms and develop a web-based visual interface (HTML/JavaScript/D3).
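To illustrate how the Java backend and a D3-based front end might exchange results, here is a small sketch that encodes cluster assignments as JSON. The payload shape (the fields algorithm, points, id, values and cluster) and the class name ClusterResultJson are assumptions made for illustration only. A real backend would more likely rely on a JSON library behind a REST resource than on hand-built strings.

```java
import java.util.StringJoiner;

/** Hypothetical helper that turns a clustering result into a JSON string for the web client. */
final class ClusterResultJson {
    private ClusterResultJson() {}

    static String toJson(String algorithm, double[][] points, int[] labels) {
        if (points.length != labels.length) {
            throw new IllegalArgumentException("one label per point is required");
        }
        StringJoiner items = new StringJoiner(",", "[", "]");
        for (int i = 0; i < points.length; i++) {
            StringJoiner coords = new StringJoiner(",", "[", "]");
            for (double v : points[i]) {
                // JSON has no literal for NaN or infinity, so reject them early.
                if (Double.isNaN(v) || Double.isInfinite(v)) {
                    throw new IllegalArgumentException("non-finite value at point " + i);
                }
                coords.add(Double.toString(v));
            }
            items.add("{\"id\":" + i + ",\"values\":" + coords + ",\"cluster\":" + labels[i] + "}");
        }
        return "{\"algorithm\":\"" + escape(algorithm) + "\",\"points\":" + items + "}";
    }

    /** Escapes only backslashes and double quotes; control characters are not handled in this sketch. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        double[][] points = {{0.0, 1.5}, {10.0, 11.0}};
        int[] labels = {0, 1};
        System.out.println(toJson("k-means", points, labels));
    }
}
```

On the client side, a flat list of points with a cluster field like this could be bound directly to visual marks, for example by mapping the cluster value to a color scale.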


Requirements

  • Basic knowledge of machine learning, data mining algorithms, and visual analytics
  • Advanced programming skills in Java
  • Web programming skills (HTML/JavaScript/D3.js)
  • Good conceptual skills (software architectures)
  • Basic knowledge of REST, client-server communication, Ajax/Jersey, and JSON
  • Useful: Git, Maven


  • Scope: Master
  • 6-month project
  • 6-month thesis



References

  • D. Sacha, M. Kraus, J. Bernard, M. Behrisch, T. Schreck, Y. Asano, D. A. Keim:
    SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance. IEEE Transactions on Visualization and Computer Graphics (2017)
  • Dominik Sacha, Michael Sedlmair, Leishi Zhang, John Aldo Lee, Jaakko Peltonen, Daniel Weiskopf, Stephen C. North, Daniel A. Keim:
    What you see is what you can change: Human-centered machine learning by interactive visualization. Neurocomputing 268: 164-175 (2017)