T. Le Van (Inria Magnet): Semiring Rank Matrix Factorisation

Rank data, in which each row is a complete or partial ranking of
available items (columns), is ubiquitous. It can be used to
represent, for instance, user preferences, gene expression levels,
and the outcomes of sports events. While rank data has
been analysed in the data mining literature, mining patterns in such
data has so far not received much attention.
In this talk, I will discuss matrix factorisation based methods for
pattern set mining in rank data. First, I will discuss a general
framework called Semiring Rank Matrix Factorisation. The framework
employs semiring theory rather than traditional linear algebra for
matrix factorisation, which results in a more elegant way of
aggregating rankings.
Subsequently, I will introduce two instantiations of the framework:
Sparse RMF and ranked tiling. Sparse RMF mines a set of sparse rank
vectors that succinctly summarise a given rank matrix and show its
main categories of rankings. Ranked tiling discovers a set of regions
in a rank matrix that contain high ranks. Such regions are
interesting because they can reveal local associations between
subsets of the rows and subsets of the columns of the matrix.
Finally, I will discuss how to use ranked tiling to formally define
the concept of driver pathways, from which we can find cancer
subtypes, i.e., groups of tumour samples having the same molecular
mechanism driving tumorigenesis.
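
To give a rough feel for what replacing linear algebra with a semiring can mean, the sketch below computes a matrix product in which the usual sum over the inner dimension is replaced by a maximum. The abstract does not say which semiring the framework uses, so the (max, times) choice, the function name max_times_product, and the small matrices C and F are all assumptions made purely for illustration, not the speaker's method.

```python
def max_times_product(C, F):
    """Matrix product over the (max, *) semiring (an assumed choice):
    out[i][j] = max over t of C[i][t] * F[t][j]."""
    inner = len(F)
    cols = len(F[0])
    return [
        [max(row[t] * F[t][j] for t in range(inner)) for j in range(cols)]
        for row in C
    ]


# Hypothetical data: C says which of 2 rank patterns each of 3 rows uses,
# F holds the 2 sparse rank patterns over 4 items (0 = item not ranked).
C = [[1, 0],
     [0, 1],
     [1, 1]]
F = [[4, 3, 0, 0],
     [0, 2, 4, 1]]

approx = max_times_product(C, F)
for row in approx:
    print(row)
```

In this toy case the third row uses both patterns. An ordinary matrix product would add the two ranks of the second item (3 + 2 = 5), which falls outside the 0 to 4 rank scale, whereas the maximum keeps the aggregated value a valid rank.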

Thursday, April 6, 2017 - 10:00 to 11:00
Inria B31
Thanh Le Van
Inria Magnet