Interactive Multi-objective Decision Tree Building

Theoretical (Analytical):

Practical (Implementation):

Literature Work:


Overview

Binary classification is a common task in machine learning, for instance to detect spam messages. Given a large enough dataset, algorithms can build such classifiers automatically by optimizing a single measure, such as accuracy or F1-score. However, some applications demand that more perspectives be taken into account than a single measure. For instance, in medical diagnosis, not only the accuracy of the diagnostic test is relevant, but also the expected distribution of false positives and false negatives, the time needed to measure all attributes required for the test, and potential side effects of measurement procedures, such as radiation exposure during a CT scan. To find a good classifier that balances these demands, multiple candidate models have to be constructed and compared. A visual analytics tool can help in this process.
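
To make the idea of several perspectives concrete, the following minimal Python sketch summarizes one classifier along four objectives. The names `Objectives`, `evaluate`, and `attribute_cost`, as well as the choice to charge every attribute the model uses (a worst-case simplification), are assumptions made for illustration and are not prescribed by the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objectives:
    accuracy: float
    false_positive_rate: float
    false_negative_rate: float
    measurement_cost: float


def evaluate(y_true, y_pred, used_attributes, attribute_cost):
    """Summarize one binary classifier along several objectives.

    y_true, y_pred: lists of 0/1 labels (1 = positive class).
    used_attributes: names of the attributes the model needs to measure.
    attribute_cost: hypothetical per-attribute cost, e.g. minutes per measurement.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    negatives = tn + fp
    positives = tp + fn
    return Objectives(
        accuracy=(tp + tn) / len(pairs),
        false_positive_rate=fp / negatives if negatives else 0.0,
        false_negative_rate=fn / positives if positives else 0.0,
        # Simplification: every distinct attribute used by the model is charged once.
        measurement_cost=sum(attribute_cost[a] for a in set(used_attributes)),
    )
```

In a medical setting, `attribute_cost` could just as well encode radiation dose or monetary cost instead of time; the structure stays the same.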

Problem Statement

Your task is to implement a visual analytics prototype that enables analysts to construct decision trees interactively. In contrast to previous approaches, your prototype will integrate the different perspectives mentioned above and focus on the multi-objective comparison between classifier models. Consequently, there will be a strong focus on the interactive parts of the visualization. You will use the machine to offer suggestions, but the model building process will be driven by the human analyst. Thus, it is crucial to design clear and detailed visual interfaces that allow for seamless interactions.
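
One possible way for the machine to offer suggestions while the analyst stays in control is to rank candidate splits for the currently selected node and let the analyst pick one. The sketch below uses information gain on numeric attributes; this criterion, the function names `entropy` and `suggest_splits`, and the data layout are assumptions for illustration, not part of the task description.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def suggest_splits(rows, labels, top_k=3):
    """Rank candidate splits 'attribute <= threshold' by information gain.

    rows: list of dicts mapping attribute name to a numeric value.
    labels: class label per row.
    Returns up to top_k tuples (gain, attribute, threshold), best first.
    """
    base = entropy(labels)
    candidates = []
    for attribute in rows[0]:
        values = sorted({row[attribute] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[attribute] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[attribute] > threshold]
            remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            candidates.append((base - remainder, attribute, threshold))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]
```

The interface could display the returned list next to the selected node, and the analyst remains free to ignore it and define a split manually.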

Tasks

  • Design an interface for decision tree building and comparison.
  • Implement the visual analytics prototype.
  • Show the applicability of your prototype in one or two use cases.
  • Optionally, integrate automatic algorithms to compare interactively constructed decision trees to automatically induced models.
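
For the comparison of interactively constructed and automatically induced trees, a common starting point is to keep only those models that no other model beats on every objective (the Pareto front). The following is a minimal sketch under the assumption that each model is described by a tuple of objectives that should all be minimized; the model names and numbers are made up for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(models):
    """Return the names of all models that are not dominated by another model.

    models: dict mapping a model name to a tuple of objectives to minimize.
    """
    front = []
    for name, objectives in models.items():
        dominated = any(
            dominates(other, objectives)
            for other_name, other in models.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical objective tuples: (error rate, false negative rate, measurement cost)
candidates = {
    "manual_tree": (0.10, 0.05, 55),
    "auto_tree": (0.08, 0.12, 55),
    "small_tree": (0.15, 0.10, 10),
    "bad_tree": (0.20, 0.15, 60),
}
print(pareto_front(candidates))
```

Objectives that should be maximized, such as accuracy, can be converted by negating them or by using the error rate instead.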

Requirements

  • Good programming skills (preferably Python/Flask and D3).
  • Strong interest in designing and implementing visual interfaces.
  • Some knowledge of binary classification and decision trees.

Scope/Duration/Start

  • Scope: Bachelor / Master
  • Project/Thesis Duration: 3 months / 3 months (Bachelor), 6 months / 6 months (Master)
  • Start: immediately

Contact

References

  • Stef van den Elzen and Jarke J. van Wijk (2011) BaobabView: Interactive Construction and Analysis of Decision Trees, In: Proc. Conf. Visual Analytics Science and Technology, doi:10.1109/VAST.2011.6102453
  • Thomas Mühlbacher, Lorenz Linhardt, Torsten Möller, and Harald Piringer (2018) TreePOD: Sensitivity-Aware Selection of Pareto-Optimal Decision Trees, In: Trans. Visualization and Computer Graphics 24(1), 174–183, doi:10.1109/TVCG.2017.2745158