"Le Séminaire Palaisien" | Boris Muzellec and Sophie Donnet on machine learning and statistics
Optimal transport (OT) has recently gained popularity in the machine learning community, both as a way to measure the discrepancy between two probability measures and as a principled method to transform a distribution into another. Yet, in most cases OT does not admit a closed-form expression and either has to be evaluated through a costly optimization problem, or approximated by regularizing this problem.
Alternatively, a powerful approach consists in keeping the problem as it is and regularizing the data instead, so as to fall back on cases that can be solved efficiently. In particular, representing the data using elliptical distributions, which are fully described by their mean vector and covariance matrix, leads to one of the very few cases of closed-form expressions for OT. Indeed, for such distributions, the squared 2-Wasserstein distance decomposes as the sum of the squared Euclidean distance between the means and the squared Bures distance between the covariance matrices. The Bures distance in turn defines a Riemannian metric on the set of positive semi-definite matrices.
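To make the closed form concrete, here is a minimal NumPy sketch for the Gaussian case, where the squared Bures distance is Tr(A) + Tr(B) - 2 Tr((A^{1/2} B A^{1/2})^{1/2}). The function names (`psd_sqrt`, `bures_squared`, `wasserstein2_squared`) are illustrative choices and not taken from the talk, and the eigendecomposition-based square root is only one simple way to evaluate the formula, not the scalable algorithm the abstract refers to.

```python
import numpy as np


def psd_sqrt(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    mat = (mat + mat.T) / 2.0  # remove tiny asymmetries from round-off
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against small negative values
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def bures_squared(cov_a, cov_b):
    """Squared Bures distance between two covariance matrices."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    value = np.trace(cov_a) + np.trace(cov_b) - 2.0 * np.trace(cross)
    return max(float(value), 0.0)  # clamp round-off below zero


def wasserstein2_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    mean_term = float(np.sum((mean_a - mean_b) ** 2))
    return mean_term + bures_squared(cov_a, cov_b)


if __name__ == "__main__":
    # Commuting (diagonal) covariances: the Bures term reduces to the squared
    # Euclidean distance between the square roots of the diagonals.
    value = wasserstein2_squared(
        np.array([0.0, 0.0]), np.diag([1.0, 4.0]),
        np.array([3.0, 4.0]), np.diag([9.0, 16.0]),
    )
    print(value)  # close to 25 (means) + 8 (covariances) = 33
```

In the diagonal example the covariance term is (1 - 3)^2 + (2 - 4)^2 = 8, which gives a quick sanity check on the general formula.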
In this talk, we show how the Bures-Wasserstein distance can be used in machine learning applications by presenting scalable algorithms for computing and differentiating the Bures metric. In particular, we show that a suitable reparameterization makes it possible to emulate Riemannian gradient descent in a projection-free Euclidean setting. Finally, we show that the Bures-Wasserstein geometry can seamlessly incorporate other methods for approximating OT, such as low-dimensional projections or entropic regularization, and propose applications to probabilistic word embeddings.
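The abstract does not say which reparameterization is meant. As an illustration only, the sketch below assumes the covariance is factored as A = L Lᵀ and runs plain Euclidean gradient descent on the factor L, so every iterate stays positive semi-definite without any projection step. It relies on the gradient of the squared Bures distance with respect to an invertible A, which is I - T, where T = A^{-1/2} (A^{1/2} B A^{1/2})^{1/2} A^{-1/2} is the matrix of the optimal transport map from N(0, A) to N(0, B). The names `transport_matrix` and `descend_factor` are hypothetical.

```python
import numpy as np


def psd_sqrt(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    mat = (mat + mat.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T


def transport_matrix(cov_a, cov_b):
    """Matrix T with T @ cov_a @ T == cov_b; cov_a is assumed invertible."""
    root_a = psd_sqrt(cov_a)
    inv_root_a = np.linalg.inv(root_a)
    return inv_root_a @ psd_sqrt(root_a @ cov_b @ root_a) @ inv_root_a


def descend_factor(factor, cov_target, step=0.25, n_steps=60):
    """Minimize the squared Bures distance to cov_target over the factor L.

    Assumed parameterization: cov = L @ L.T, so no projection onto the
    PSD cone is ever needed.
    """
    dim = factor.shape[0]
    for _ in range(n_steps):
        cov = factor @ factor.T
        grad_cov = np.eye(dim) - transport_matrix(cov, cov_target)
        # Chain rule through cov = L @ L.T with a symmetric grad_cov.
        grad_factor = 2.0 * grad_cov @ factor
        factor = factor - step * grad_factor
    return factor


if __name__ == "__main__":
    start = np.array([[2.0, 0.0], [0.5, 1.0]])
    target = np.array([[3.0, 1.0], [1.0, 2.0]])
    final = descend_factor(start, target)
    print(np.round(final @ final.T, 6))  # approaches the target covariance
```

Under this assumed parameterization, one step of size `step` maps L Lᵀ to M (L Lᵀ) M with M = (1 - 2·step) I + 2·step T, which is the point at time 2·step on the Wasserstein geodesic from the current covariance to the target. This is one way in which a Euclidean update on L can mimic a Riemannian step.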
Modeling relations between individuals is a classical question in the social sciences, ecology, and other fields. To uncover a latent structure in the data, a popular approach consists in clustering individuals according to their observed patterns of interaction. Stochastic block models (SBM) and latent block models (LBM) are standard tools for clustering individuals according to their behavior in a single network. However, from an integrative point of view, individuals are not involved in a single network but take part in several networks, resulting in a potentially complex multipartite network. In this contribution, we propose a stochastic block model able to handle multipartite networks, thus providing a clustering of the individuals based on their connection behavior in more than one network. Our model extends both the LBM and the SBM. The parameters (such as the marginal probabilities of assignment to blocks and the matrix of connection probabilities between blocks) are estimated through a variational Expectation-Maximization procedure. The numbers of blocks are chosen with the Integrated Completed Likelihood criterion, a penalized likelihood criterion. The relevance of our methodology is illustrated on two datasets from ecology and ethnobiology.
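As a reminder of what the parameters mentioned above describe, here is a minimal sketch of the generative side of a plain, single-network SBM: each node draws a block from the marginal assignment probabilities, and each pair of nodes is connected with a probability that depends only on their two blocks. This is not the multipartite model of the talk, and it does not cover the variational EM or ICL steps. The function name `sample_sbm` and all numerical values are illustrative assumptions.

```python
import numpy as np


def sample_sbm(n_nodes, block_probs, connect_probs, rng):
    """Draw block labels and an undirected adjacency matrix from a binary SBM."""
    block_probs = np.asarray(block_probs, dtype=float)
    connect_probs = np.asarray(connect_probs, dtype=float)

    # Latent block of each node, drawn from the marginal assignment probabilities.
    blocks = rng.choice(len(block_probs), size=n_nodes, p=block_probs)

    # edge_probs[i, j] is the connection probability between the blocks of i and j.
    edge_probs = connect_probs[blocks][:, blocks]

    # Draw each pair once (strict upper triangle), then mirror: no self-loops.
    draws = rng.random((n_nodes, n_nodes)) < edge_probs
    upper = np.triu(draws, k=1)
    adjacency = (upper | upper.T).astype(int)
    return blocks, adjacency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blocks, adjacency = sample_sbm(
        n_nodes=50,
        block_probs=[0.6, 0.4],
        connect_probs=[[0.8, 0.05], [0.05, 0.6]],
        rng=rng,
    )
    print(adjacency.shape, np.bincount(blocks, minlength=2))
```

Inference goes in the opposite direction: given only `adjacency`, the procedure described above estimates `block_probs`, `connect_probs` and the block labels, with the number of blocks selected by the penalized criterion.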