The great successes of learning theory characterize how fast we can
learn the best parameters for the hardest possible learning task.
However, both in statistical learning and in online learning, the
hardest possible learning task is often pathological. For example, in
classification it is most difficult to estimate parameters when the
class labels are (close to) random, but in this case there is no point
in doing classification anyway. Similarly, in online learning with
expert advice, the hardest case occurs when the experts are all making
random guesses. But then there is nothing to gain from learning which
expert is best.
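For concreteness, a hedged sketch of the gap at stake (standard bounds from the experts literature, stated informally; see the papers below for precise conditions): with K experts over T rounds, the minimax regret grows like the square root of T, while under benign stochastic conditions, such as an i.i.d. gap Delta between the mean loss of the best expert and the rest, the regret can be bounded independently of T:

```latex
% Worst-case vs. fast-rate regret for prediction with K experts over T rounds
% (informal sketch; constants and exact conditions omitted)
\[
  R_T = O\bigl(\sqrt{T \ln K}\bigr)
  \quad\text{(minimax, e.g.\ when all expert losses look random),}
\]
\[
  R_T = O\!\left(\frac{\ln K}{\Delta}\right)
  \quad\text{(i.i.d.\ losses with gap } \Delta \text{ between the best expert and the rest).}
\]
```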
It is therefore interesting to look at ways of characterizing the
difficulty of learning tasks, and indeed there exists a whole zoo of
different conditions that allow learning at rates faster than the
worst-case rate. I will give a series
of examples illustrating the most important conditions that allow such
fast rates, and then explain how these conditions may all be understood
as being essentially equivalent, at least for bounded losses.
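One representative example of such a condition is the Bernstein condition, one of the conditions treated in the JMLR paper below; the following is a sketch, with loss of predictor f written as ell_f and f* the risk minimizer over the class F:

```latex
% Bernstein condition with exponent beta in [0,1] and constant B > 0:
\[
  \mathbb{E}\bigl[(\ell_f - \ell_{f^*})^2\bigr]
  \;\le\; B \,\bigl(\mathbb{E}[\ell_f - \ell_{f^*}]\bigr)^{\beta}
  \qquad \text{for all } f \in \mathcal{F}.
\]
% For bounded losses, beta = 1 yields rates of order 1/n (up to log factors),
% while beta = 0 is vacuous and recovers the worst-case 1/sqrt(n) rate.
```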
Based on the following papers:
- Fast Rates in Statistical and Online Learning - Van Erven, Grünwald,
Mehta, Reid, Williamson. JMLR, 2015. (Special issue dedicated to the
memory of Alexey Chervonenkis.)
- Combining Adversarial Guarantees and Stochastic Fast Rates in Online
Learning - Koolen, Grünwald, Van Erven. NIPS, 2016.