A. Mensch (ENS Paris): Functional smoothing for sparse and cost-informed prediction of output distributions

To facilitate training with gradients, supervised learning methods often
transform the selection of a single element within a set of outputs into the
prediction of a probability distribution over this set (using e.g. the softmax
operator). In this talk, we will understand this transformation as a functional
smoothing of the output selection mechanism. Engineering this Nesterov
smoothing yields new modelling perspectives. First, we will observe that
selecting an output within a combinatorial set (e.g. a sequence of tags) is
often solved using dynamic programming (DP) algorithms. Smoothing turns DP
algorithms into differentiable operators, which may predict potentially sparse
probabilities over the output set. Second, we will design a smoothing that
takes into account a cost function defined on the output set. This approach
transforms the softmax operator into a cost-informed geometric softmax, which
has the further capability of predicting distributions over a continuous set.
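
As a small illustration of the smoothing idea, the sketch below compares two smoothed versions of the max operator on a finite set of scores. Smoothing with an entropy term gives the log-sum-exp function, whose gradient is the softmax (always dense). Smoothing with a squared L2 norm gives a gradient equal to the Euclidean projection onto the simplex, known as sparsemax, which can assign exactly zero probability to some outputs. The function names, the `gamma` parameter, and the choice of the squared L2 regularizer are assumptions made for this example; they are not taken from the talk and do not cover the DP or cost-informed constructions.

```python
import math


def softmax(scores, gamma=1.0):
    """Gradient of the entropy-smoothed max, gamma * logsumexp(scores / gamma).

    `gamma` (an illustrative name) controls the smoothing strength.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / gamma) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Gradient of the max smoothed with a squared L2 norm.

    Computes the Euclidean projection of `scores` onto the probability
    simplex using the standard sort-and-threshold procedure.
    """
    sorted_scores = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(sorted_scores, start=1):
        cumsum += z_k
        # Keep the threshold of the largest k satisfying the support condition.
        if 1 + k * z_k > cumsum:
            tau = (cumsum - 1) / k
    return [max(s - tau, 0.0) for s in scores]


scores = [1.0, 0.5, -1.0]
dense = softmax(scores)     # every output gets a strictly positive probability
sparse = sparsemax(scores)  # the lowest-scoring output gets probability zero
```

Both outputs sum to one, but only the second places zero mass on the lowest-scoring element, which is the sense in which a smoothed selection can yield sparse probabilities.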

Thursday, November 14, 2019 - 11:00 to 12:00
Inria B21
Arthur Mensch