Learning What to Attend to: Using Bisimulation Metrics to Explore and Improve Upon What a Deep Reinforcement Learning Agent Learns

Abstract

Recent years have seen a surge of algorithms and architectures for deep
Reinforcement Learning (RL), many of which have shown remarkable success for
various problems. Yet, little work has attempted to relate the performance of
these algorithms and architectures to what the resulting deep RL agents
actually learn, and whether this corresponds to what they should ideally learn.
Such a comparison may allow for both an improved understanding of why certain
algorithms or network architectures perform better than others and the
development of methods that specifically address discrepancies between what is
and what should be learned.