The great successes of learning theory characterize how fast we can learn the best parameters for the hardest possible learning task. However, both in statistical learning and in online learning, the hardest possible learning task is often pathological. For example, in classification it is most difficult to estimate parameters when the class labels are (close to) random, but in this case there is no point in doing classification anyway. Similarly, in online learning with expert advice, the hardest case happens when the experts are all making random guesses. But then there is nothing to gain from learning the best expert.

It is therefore interesting to look for ways of characterizing the difficulty of learning tasks, and indeed there exists a whole zoo of conditions that allow learning at rates faster than what is achievable on the worst-case task. I will give a series of examples illustrating the most important conditions that allow such fast rates, and then explain how these conditions may all be understood as essentially equivalent, at least for bounded losses.

Based on the following papers:

- Fast Rates in Statistical and Online Learning. Van Erven, Grünwald, Mehta, Reid, Williamson. JMLR, 2015. (Special issue dedicated to the memory of Alexey Chervonenkis.)
- Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning. Koolen, Grünwald, Van Erven. NIPS, 2016.