Potential-based reward shaping as a tool to safely incorporate auxiliary information

Many sources of reward exist in the wild, but it is difficult to integrate them into a reinforcement learning problem without altering the global learning target and inadvertently creating loopholes for the agent to exploit. I will present potential-based reward shaping (PBRS) as a suitable framework for doing this safely, describe two techniques for rendering external information into the form that PBRS requires, and discuss their empirical performance when the external information is some form of expert data.

Friday, March 10, 2017 - 11:00
Inria, room A00
Anna Harutyunyan
Université Libre de Bruxelles