Best of Both: Beating Stochastic and Adversarial Semi-Bandits Simultaneously and Optimally

Can a bandit algorithm be optimal in both the stochastic and the adversarial regime without knowing which one it faces? Surprisingly, the answer is yes, and the solution is a simple member of the well-studied family of online mirror descent (OMD) and follow-the-regularized-leader (FTRL) algorithms. Its simplicity gives hope that the results can be extended to bandits with richer structure.
In this talk, I will introduce the general proof framework that enables us to analyse OMD/FTRL in stochastic environments. I will then show how this framework can be applied to combinatorial semi-bandits and discuss the difficulties that arise under full-bandit feedback.

Monday, May 13, 2019 - 11:00
Inria, room A00
Julian Zimmert
University of Copenhagen