Efficient exploration in sequential decision making problems


I will discuss recent results on designing more adaptive bandit algorithms. Our first approach is based on the bootstrap method and leads to a more efficient, data-dependent algorithm for the multi-armed bandit problem. Our second approach is a model-selection method for bandit problems. As an example of its usefulness, when the reward function is largely independent of the context, the method automatically converges to the simpler and more efficient non-contextual algorithm.
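
For readers unfamiliar with the general idea, the following is a minimal, hypothetical Python sketch of bootstrap-style exploration for a stochastic multi-armed bandit, assuming rewards in [0, 1]. It is illustrative only and is not the algorithm presented in the talk: each round, every arm's observed rewards (padded with two assumed pseudo-rewards, 0 and 1, so the resample retains enough randomness to keep exploring) are resampled with replacement, and the arm with the highest bootstrap mean is played.

import numpy as np

def bootstrap_bandit(n_rounds, arms, rng):
    # Illustrative sketch only, not the speaker's method.
    n_arms = len(arms)
    observed = [[] for _ in range(n_arms)]            # rewards seen per arm
    for _ in range(n_rounds):
        scores = []
        for a in range(n_arms):
            history = observed[a] + [0.0, 1.0]        # pad with assumed pseudo-rewards
            sample = rng.choice(history, size=len(history), replace=True)
            scores.append(float(np.mean(sample)))     # bootstrap mean as the arm's score
        a_star = int(np.argmax(scores))
        observed[a_star].append(arms[a_star]())       # play the arm, record the reward
    return observed

# Example with Bernoulli arms (success probabilities chosen only for illustration)
rng = np.random.default_rng(0)
arms = [lambda p=p: float(rng.random() < p) for p in (0.3, 0.5, 0.7)]
pulls = bootstrap_bandit(2000, arms, rng)
print([len(h) for h in pulls])                        # pull counts should favour the best arm

The resampling noise plays the role that posterior sampling or confidence bonuses play in Thompson sampling or UCB, and it shrinks as each arm's history grows, which is what makes the exploration data-dependent in this toy version.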

Dates: 
Thursday, September 12, 2019 - 11:00
Location: 
Inria, room A00
Speaker(s): 
Yasin Abbasi-Yadkori
Affiliation(s): 
VinAI