Binary classification is a common task in machine learning, for instance to detect spam messages. Given a large enough dataset, algorithms can build such classifiers automatically by optimizing a single measure, such as accuracy or F1-score. However, some applications require taking more perspectives into account than a single measure. In medical diagnosis, for instance, not only the accuracy of the diagnostic test is relevant, but also the expected distribution of false positives and false negatives, the time needed to measure all attributes required for the test, and potential side effects of measurement procedures, for example radiation exposure during a CT scan. To find a good classifier that balances these demands, multiple candidate models have to be constructed and compared. A visual analytics tool can support this process.
Your task is to implement a visual analytics prototype that enables analysts to construct decision trees interactively. In contrast to previous approaches, your prototype will integrate the different perspectives mentioned above into the workflow and focus on the multi-objective comparison of classifier models. Consequently, there will be a strong focus on the interactive parts of the visualization. The machine will offer suggestions, but the model-building process will be driven by the human analyst. It is therefore crucial to design clear and detailed visual interfaces that allow for seamless interaction.
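To make the idea of multi-objective comparison concrete, the following is a minimal sketch of how candidate models could be filtered down to the Pareto-optimal ones before they are shown to the analyst. It is an illustration under stated assumptions, not a prescribed design: the `Candidate` class, the choice of three objectives (error rate, false-negative rate, measurement time in minutes), and all numbers are hypothetical, and every objective is assumed to be minimized.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A candidate decision tree with its objective values (hypothetical)."""
    name: str
    # Assumed order: (error rate, false-negative rate, measurement time in min).
    # All objectives are treated as "lower is better".
    objectives: Tuple[float, ...]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b everywhere and strictly better once."""
    pairs = list(zip(a.objectives, b.objectives))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that are not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up example values, for illustration only.
candidates = [
    Candidate("shallow_tree", (0.12, 0.08, 5.0)),
    Candidate("deep_tree", (0.07, 0.05, 45.0)),
    Candidate("ct_tree", (0.09, 0.06, 60.0)),
    Candidate("cheap_tree", (0.15, 0.04, 10.0)),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this made-up data, `ct_tree` is worse than `deep_tree` on every objective and is therefore dropped, while the remaining three trees each represent a different trade-off. Choosing among those is exactly the decision left to the human analyst.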
- Stef van den Elzen and Jarke J. van Wijk (2011) BaobabView: Interactive Construction and Analysis of Decision Trees, In: Proc. Conf. Visual Analytics Science and Technology, doi:10.1109/VAST.2011.6102453
- Thomas Mühlbacher, Lorenz Linhardt, Torsten Möller, and Harald Piringer (2018) TreePOD: Sensitivity-Aware Selection of Pareto-Optimal Decision Trees, In: Trans. Visualization and Computer Graphics 24(1), 174–183, doi:10.1109/TVCG.2017.2745158