Analyzing the Behavior of Reinforcement Learning Agents


Deep Reinforcement Learning (Deep RL) has attracted great interest in the research community due to its versatility, its potential, and its success at tasks, such as game playing, that were previously thought to be very difficult to tackle with machine learning. Yet understanding, interpreting, and validating the learned behavior of agents remains very challenging. In this project, you will work on a software environment to run, evaluate, and compare algorithms in a principled way. Specifically, we are looking for ways to analyze and compare agents based on their behavioral characteristics, and to study how different architectures and algorithms affect these characteristics.

Problem Statement

When evaluating the quality of trained reinforcement learning agents, researchers often look only at training metrics such as the achieved reward. Especially in more complex scenarios, it is challenging to infer the actual capabilities and behavior of agents from reward alone.

By jointly analyzing metrics and the actual agent behavior in a dynamic way, we can gain much better insight into the quality of a trained agent. We may furthermore be able to understand the effect of certain algorithmic or environment design decisions on the learning process and the final results.
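To illustrate why reward alone can hide behavioral differences, the following minimal sketch compares two logged episodes that reach the same return with different action patterns. The episode logs, the function names, and the choice of total variation distance as a behavioral measure are all assumptions made for this example; they are not part of the project framework.

```python
from collections import Counter


def episode_return(rewards):
    """Sum of rewards collected in one episode."""
    return sum(rewards)


def action_distribution(actions, n_actions):
    """Relative frequency of each discrete action in one episode."""
    counts = Counter(actions)
    total = len(actions)
    return [counts[a] / total for a in range(n_actions)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical logged episodes (illustrative data only, not real results).
agent_a = {"actions": [1, 1, 1, 1], "rewards": [0.0, 0.0, 0.0, 1.0]}
agent_b = {
    "actions": [1, 0, 1, 0, 1, 1, 1, 1],
    "rewards": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
}

return_a = episode_return(agent_a["rewards"])
return_b = episode_return(agent_b["rewards"])
dist_a = action_distribution(agent_a["actions"], n_actions=2)
dist_b = action_distribution(agent_b["actions"], n_actions=2)
behavior_gap = total_variation(dist_a, dist_b)

print("returns:", return_a, return_b)
print("action distributions:", dist_a, dist_b)
print("total variation distance:", behavior_gap)
```

Both agents obtain the same return, while their action distributions differ. A behavioral analysis would aim to surface exactly this kind of difference, ideally with richer descriptors than action frequencies.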


  • Familiarize yourself with the basic concepts of Reinforcement Learning.
  • Select and integrate environments (following the OpenAI Gym standard) for shared benchmarking of agents.
  • Train agents with varying algorithms and apply hyperparameter tuning. This step can build on existing code.
  • Run trained agents and make first observations of their behavioral characteristics.
  • Implement and integrate a visualization to analyze agent behavior. The visualization can be integrated into an existing software framework.
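The step of running agents and recording their behavior could be sketched as follows. This is a minimal example under stated assumptions: `CorridorEnv` is a hypothetical toy environment written for this illustration, and its `reset`/`step` signatures only mirror the interface of recent Gym releases (0.26 and later) without depending on the library. The policies and the fields of the recorded episode summary are likewise placeholders for trained agents and for whatever data a visualization would need.

```python
import random
from collections import Counter


class CorridorEnv:
    """Hypothetical toy environment with a Gym-style reset/step interface."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.pos = 0
        self.t = 0

    def reset(self, seed=None):
        # The seed is accepted for interface compatibility but unused here.
        self.pos = 0
        self.t = 0
        return self.pos, {}

    def step(self, action):
        # Action 0 moves left, action 1 moves right; position is clipped at 0.
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        self.t += 1
        terminated = self.pos >= self.length - 1
        truncated = self.t >= self.max_steps and not terminated
        reward = 1.0 if terminated else 0.0
        return self.pos, reward, terminated, truncated, {}


def rollout(env, policy, episodes=10):
    """Run a policy and record a per-episode summary of its behavior."""
    records = []
    for _ in range(episodes):
        obs, _ = env.reset()
        actions, total = [], 0.0
        done = False
        while not done:
            action = policy(obs)
            obs, reward, terminated, truncated, _ = env.step(action)
            actions.append(action)
            total += reward
            done = terminated or truncated
        records.append(
            {
                "return": total,
                "length": len(actions),
                "action_counts": Counter(actions),
            }
        )
    return records


def always_right(obs):
    """Stand-in for a trained agent: always moves right."""
    return 1


_rng = random.Random(0)


def random_policy(obs):
    """Stand-in for an untrained agent: picks a uniformly random action."""
    return _rng.randrange(2)


right_records = rollout(CorridorEnv(), always_right, episodes=3)
random_records = rollout(CorridorEnv(), random_policy, episodes=3)
for name, recs in [("always_right", right_records), ("random", random_records)]:
    for rec in recs:
        print(name, rec["return"], rec["length"], dict(rec["action_counts"]))
```

The per-episode records are kept as plain dictionaries so they can be serialized and passed to a separate visualization front end. A real setup would replace the toy environment and policies with the selected benchmark environments and trained agents.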


Good programming skills in Python and JavaScript/TypeScript.

Familiarity with machine learning concepts.

Basic knowledge of Deep Learning architectures and algorithms.




  • Scope: Bachelor/Master
  • 3 Month Project, 3 Month Thesis
  • Start: immediately



[1] DQNViz: A Visual Analytics Approach to Understand Deep Q-Networks, Wang, Gou, Shen, Yang, 2018, IEEE VIS

[2] Unmasking Clever Hans predictors and assessing what machines really learn, Lapuschkin et al., 2019, Nature Communications

[3] Deep Reinforcement Learning that Matters, Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger, 2017, AAAI