« Le Séminaire Palaisien » | Clément Bonnet & Avetik Karagulyan
Each seminar session is divided into two scientific presentations of 40 minutes each: a 30-minute talk followed by 10 minutes of questions. Clément Bonnet & Avetik Karagulyan will lead the January 2026 session!
Registration is free but mandatory, within the limit of available seats. A buffet will be served after the seminar.
Many applications in machine learning involve data represented as probability distributions. The emergence of such data calls for radically new techniques to design tractable distances and gradient flows on probability distributions over these (infinite-dimensional) objects. For instance, being able to flow labeled datasets is a core task for applications ranging from domain adaptation to transfer learning or dataset distillation. In this setting, we propose to represent each class by the associated conditional distribution of features, and to model the dataset as a mixture distribution supported on these classes (which are themselves probability distributions), so that labeled datasets can be seen as probability distributions over probability distributions. We endow this space with a metric structure from optimal transport, namely the Wasserstein over Wasserstein (WoW) distance, derive a differential structure on this space, and define WoW gradient flows. The latter make it possible to design dynamics over this space that decrease a given objective functional. We apply our framework to transfer learning and dataset distillation tasks, leveraging our gradient flow construction as well as novel tractable functionals that take the form of Maximum Mean Discrepancies with Sliced-Wasserstein-based kernels between probability distributions. We also introduce a new sliced distance specifically designed for the space of probability distributions over probability distributions, built from projections using the Busemann function.
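As a rough sketch of the underlying construction (our notation, a simplified view rather than the speakers' exact definitions): if (X, d) is the feature space, the Wasserstein-2 distance on P_2(X) is

W_2^2(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \int_{X \times X} d(x, y)^2 \, \mathrm{d}\pi(x, y),

and the WoW distance is obtained by using (P_2(X), W_2) itself as the ground metric space: for two distributions over distributions \mathbb{M} and \mathbb{N},

\mathrm{WoW}^2(\mathbb{M}, \mathbb{N}) = \inf_{\Pi \in \Pi(\mathbb{M}, \mathbb{N})} \int W_2(\mu, \nu)^2 \, \mathrm{d}\Pi(\mu, \nu).

In this picture, a labeled dataset with class weights w_c and conditional feature distributions \mu_c is modeled as the mixture \mathbb{M} = \sum_c w_c \, \delta_{\mu_c}, i.e. a probability distribution supported on probability distributions.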
Federated sampling algorithms have recently gained great popularity in the machine learning and statistics communities. In this talk we will present the problem of sampling in the federated learning framework and discuss variants of algorithms called Error Feedback Langevin algorithms (ELF) that are designed for this setting. In particular, we will discuss how combining EF21 and EF21-P with federated Langevin Monte Carlo improves on prior methods.
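For orientation, here is a minimal Python sketch of federated Langevin Monte Carlo with an EF21-style error-feedback step, in which each client transmits only a compressed correction to its previously communicated gradient estimate. The top-k compressor, the function names, and the exact placement of the compression are illustrative assumptions on our part, not the precise ELF algorithms presented in the talk.

import numpy as np

def top_k(v, k):
    # Contractive compressor: keep the k largest-magnitude coordinates, zero the rest.
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def federated_langevin_ef(grad_fns, theta0, step, n_iters, k, rng=None):
    # grad_fns[i] returns the gradient of client i's potential U_i; the target is
    # proportional to exp(-sum_i U_i). Each client keeps a local gradient estimate g[i]
    # and only communicates the compressed difference to its new gradient (EF21-style).
    rng = np.random.default_rng() if rng is None else rng
    theta = theta0.astype(float).copy()
    g = [grad(theta) for grad in grad_fns]        # one uncompressed warm-up round
    for _ in range(n_iters):
        g_sum = np.sum(g, axis=0)                 # server aggregates client estimates
        noise = rng.standard_normal(theta.shape)
        theta = theta - step * g_sum + np.sqrt(2.0 * step) * noise   # unadjusted Langevin step
        for i, grad in enumerate(grad_fns):
            g[i] = g[i] + top_k(grad(theta) - g[i], k)   # compressed error-feedback update
    return theta

# Toy usage: two clients whose potentials sum to ||theta||^2 / 2, i.e. a standard Gaussian target.
grad_fns = [lambda t: 0.5 * t, lambda t: 0.5 * t]
sample = federated_langevin_ef(grad_fns, theta0=np.zeros(10), step=0.05, n_iters=2000, k=3)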