Classifier models are among the most popular machine-learning algorithms. Classification tasks cover a wide range of applications, e.g. spam filtering, face detection, and fraud detection. Often, a group of classifier models working together, combining their results into a single outcome, performs better than any single model running independently. In this context, diversity among models (e.g. classifiers that make different errors) is a well-known factor that usually favors superior performance of classifier ensembles. However, this does not always hold: it depends on the data, the classification task at hand, and the models involved. This thesis aims to produce visual representations of ensembles built from distinct classifier models. We will work with several datasets and distinct classification problems. In the end, we will search for patterns that help explain the role of diversity in ensembles of classifiers.
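The benefit of diversity can be illustrated with a minimal Java sketch (toy, hand-made predictions; the class and method names are hypothetical, not part of any library): three classifiers that each misclassify a different instance reach only 5/6 accuracy individually, yet their majority vote is correct on every instance because their errors never overlap.

```java
// Minimal majority-vote sketch with toy data (all names are illustrative).
public class MajorityVoteDemo {
    // Each row holds one classifier's predictions on 6 instances.
    // The classifiers are "diverse": each errs on a different instance.
    static final int[][] PREDICTIONS = {
        {1, 1, 1, 1, 0, 1},  // classifier A: wrong on instance 4
        {1, 0, 1, 1, 1, 1},  // classifier B: wrong on instance 1
        {1, 1, 0, 1, 1, 1},  // classifier C: wrong on instance 2
    };
    static final int[] TRUE_LABELS = {1, 1, 1, 1, 1, 1};

    // Binary majority vote over all classifiers for one instance.
    static int majorityVote(int instance) {
        int votesForOne = 0;
        for (int[] clf : PREDICTIONS) {
            votesForOne += clf[instance];
        }
        return 2 * votesForOne > PREDICTIONS.length ? 1 : 0;
    }

    // Fraction of instances where the predictions match the true labels.
    static double accuracy(int[] preds) {
        int correct = 0;
        for (int i = 0; i < TRUE_LABELS.length; i++) {
            if (preds[i] == TRUE_LABELS[i]) correct++;
        }
        return (double) correct / TRUE_LABELS.length;
    }

    public static void main(String[] args) {
        for (int c = 0; c < PREDICTIONS.length; c++) {
            System.out.println("classifier " + c + " accuracy: " + accuracy(PREDICTIONS[c]));
        }
        int[] ensemble = new int[TRUE_LABELS.length];
        for (int i = 0; i < TRUE_LABELS.length; i++) {
            ensemble[i] = majorityVote(i);
        }
        System.out.println("ensemble accuracy: " + accuracy(ensemble));  // 1.0 here
    }
}
```

Note that if the three classifiers instead all failed on the same instances (no diversity), the majority vote would inherit exactly those errors, which is the sense in which diversity, not ensembling alone, drives the improvement.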
- Good knowledge of the Java programming language
- Basic knowledge of, or interest in learning about, machine-learning classifiers
- Scope: Bachelor (3 months project + 3 months thesis)
- Zhou, Zhi-Hua. Ensemble methods: foundations and algorithms. CRC press, 2012.
- Brown, Gavin, and Ludmila I. Kuncheva. "Good" and "bad" diversity in majority vote ensembles. International Workshop on Multiple Classifier Systems. Springer Berlin Heidelberg, 2010.
- Talbot, Justin, et al. EnsembleMatrix: interactive visualization to support machine learning with multiple classifiers. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2009.