The Palaisien Seminar

« Seminar Le Palaisien » | Alejandro de la Concha & Azadeh Khaleghi

Event location
ENSAE, room 1001


The "Séminaire Palaisien" brings together, on the first Tuesday of every month, the broad Saclay research community working on statistics and machine learning.

Each seminar session is divided into two scientific presentations of 40 minutes each: 30 minutes of presentation and 10 minutes of questions.

Alejandro de la Concha and Azadeh Khaleghi will speak at the April 2023 session.

Registration is free but required, and places are subject to availability. A sandwich lunch is provided.

12:15pm-12:55pm: Alejandro de la Concha "Collaborative likelihood-ratio estimation over graphs"

Abstract: Assuming we have i.i.d. observations from two unknown probability density functions (pdfs), p and p′, likelihood-ratio estimation (LRE) is an elegant approach to comparing the two pdfs by relying only on the available data, without knowing the pdfs explicitly. In this paper we introduce a graph-based extension of this problem: suppose each node v of a fixed graph has access to observations coming from two unknown node-specific pdfs, p_v and p′_v; the goal is then to compare the respective p_v and p′_v of each node by also integrating information provided by the graph structure. This setting is interesting when the graph conveys some sort of 'similarity' between the node-wise estimation tasks, which suggests that the nodes can collaborate to solve their individual tasks more efficiently while limiting the data shared among them. Our main contribution is a distributed non-parametric framework for graph-based LRE, called GRULSIF, that combines in a novel way elements from f-divergence functionals, kernel methods, and multitask learning. Among the several applications of LRE, we choose two-sample hypothesis testing to develop a proof of concept for our graph-based learning framework. In our experiments, our approach compares favorably with state-of-the-art non-parametric statistical tests that are applied at each node independently and thus disregard the graph structure.
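For readers unfamiliar with likelihood-ratio estimation, the following is a minimal single-node sketch of how a ratio can be estimated from samples alone, without estimating either density. It is not GRULSIF and does not use any graph structure. It assumes a regularized least-squares fit of the relative ratio p / (αp + (1 − α)p′) with Gaussian kernels, in the style of RuLSIF-type estimators. The function names and the parameters `alpha`, `sigma`, `lam` and `n_centers` are illustrative choices, not taken from the talk.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def fit_relative_ratio(x, x_prime, alpha=0.1, sigma=1.0, lam=0.1,
                       n_centers=50, seed=0):
    """Least-squares fit of r(z) ~ p(z) / (alpha*p(z) + (1-alpha)*p'(z)).

    x holds samples from p, x_prime holds samples from p'.
    All hyperparameters are illustrative assumptions (no tuning is done).
    """
    rng = np.random.default_rng(seed)
    # Kernel centers: a random subset of the samples from p.
    idx = rng.choice(len(x), size=min(n_centers, len(x)), replace=False)
    centers = x[idx]

    phi_x = gaussian_kernel(x, centers, sigma)         # shape (n, b)
    phi_xp = gaussian_kernel(x_prime, centers, sigma)  # shape (n', b)

    # Empirical second moment under the mixture alpha*p + (1-alpha)*p'.
    H = (alpha * phi_x.T @ phi_x / len(x)
         + (1.0 - alpha) * phi_xp.T @ phi_xp / len(x_prime))
    # Empirical first moment under p.
    h = phi_x.mean(axis=0)

    # Closed-form minimizer of the ridge-regularized quadratic objective.
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda z: gaussian_kernel(z, centers, sigma) @ theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(500, 1))        # samples from p
    x_prime = rng.normal(2.0, 1.0, size=(500, 1))  # samples from p'
    ratio = fit_relative_ratio(x, x_prime)
    # The estimate should be larger where p dominates than where p' does.
    print(ratio(np.array([[-1.0], [3.0]])))
```

In the graph setting described in the abstract, one such estimation problem would exist per node, with the node-level problems coupled through the graph; the sketch above leaves that coupling out entirely.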

1pm-1:45pm: Azadeh Khaleghi "Oblivious Data for Fairness with Kernels"

Abstract: We investigate the problem of algorithmic fairness in the case where sensitive and non-sensitive features are available and one aims to generate new, ‘oblivious’, features that closely approximate the non-sensitive features, and are only minimally dependent on the sensitive ones. We study this question in the context of kernel methods. We analyze a relaxed version of the Maximum Mean Discrepancy criterion which does not guarantee full independence but makes the optimization problem tractable. We derive a closed-form solution for this relaxed optimization problem and complement the result with a study of the dependencies between the newly generated features and the sensitive ones. Our key ingredient for generating such oblivious features is a Hilbert-space-valued conditional expectation, which needs to be estimated from data. We propose a plug-in approach and demonstrate how the estimation errors can be controlled. While our techniques help reduce the bias, we would like to point out that no post-processing of any dataset could possibly serve as an alternative to well-designed experiments.
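As background on the Maximum Mean Discrepancy (MMD) named in the abstract, here is a minimal sketch of the standard biased estimate of the squared MMD between two samples with a Gaussian kernel. It is neither the relaxed criterion analyzed in the paper nor the oblivious-feature construction. The function names and the bandwidth `sigma` are assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between x and y."""
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(300, 2))
    y = rng.normal(0.0, 1.0, size=(300, 2))  # same distribution as x
    z = rng.normal(1.5, 1.0, size=(300, 2))  # shifted distribution
    print("same distribution:   ", mmd2_biased(x, y))
    print("shifted distribution:", mmd2_biased(x, z))
```

A small value indicates that the two samples are hard to tell apart in the kernel's feature space, which is the sense in which MMD-type criteria can quantify how much one set of features still reflects another.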

Reference: S. Grünewälder, A. Khaleghi, Oblivious Data for Fairness with Kernels, Journal of Machine Learning Research, 22(208): 1-36, 2021.