Reliability in Batch Reinforcement Learning

This presentation considers the batch setting in reinforcement learning and investigates algorithms that ensure, to some extent, that the new policy is at least as good as the baseline policy, that is, the policy used to collect the data. We will focus in particular on SPIBB, an approach that reproduces the baseline policy whenever the model is too uncertain about a specific state-action pair. We will give a brief overview of three papers on the subject: SPIBB, Soft-SPIBB, and Estimated-baseline SPIBB. More details can be found here: http://aka.ms/spibb.
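The central idea of the abstract, copying the baseline wherever the data is too scarce, can be illustrated for a single state with the minimal sketch below. It is not the speaker's implementation. It assumes that uncertainty is measured by comparing state-action visit counts with a threshold, and the names `spibb_greedy_step`, `pi_b`, `q`, `counts`, and `n_wedge` are hypothetical. Estimating the action values from the batch is omitted.

```python
import numpy as np

def spibb_greedy_step(pi_b, q, counts, n_wedge):
    """One-state policy improvement step in the spirit of SPIBB (sketch).

    pi_b    : baseline action probabilities for one state, shape (A,)
    q       : estimated action values for that state, shape (A,)
    counts  : times each action was observed in that state, shape (A,)
    n_wedge : assumed count threshold; below it a pair is "uncertain"
    """
    pi_b = np.asarray(pi_b, dtype=float)
    q = np.asarray(q, dtype=float)
    uncertain = np.asarray(counts) < n_wedge

    # Uncertain actions keep exactly their baseline probability.
    pi = np.where(uncertain, pi_b, 0.0)

    # The remaining probability mass goes to the best well-estimated action.
    if not uncertain.all():
        free_mass = pi_b[~uncertain].sum()
        best = np.flatnonzero(~uncertain)[np.argmax(q[~uncertain])]
        pi[best] += free_mass
    return pi

# Illustrative numbers only: the third action was seen once, so it is copied.
policy = spibb_greedy_step([0.5, 0.3, 0.2], [1.0, 2.0, 3.0], [10, 10, 1], n_wedge=5)
```

In this example the third action has the highest estimated value but too few observations, so it keeps its baseline probability of 0.2, and the remaining 0.8 moves to the second action, the best of the well-observed ones. If every action in the state is uncertain, the function returns the baseline unchanged.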

Date: Tuesday, December 17, 2019, 14:00
Location: Inria, Salle Plénière
Speaker: Romain Laroche
Affiliation: Microsoft Research (Montréal)