
[👥 WORKSHOP] "Mathematical Foundations of AI" - 6th edition

Venue: Amphi 55B, SCAI - 4 place Jussieu, 75005 Paris


See you on December 10, 2025, at SCAI (Paris) for the sixth edition of the day dedicated to the mathematical foundations of AI!

Registration is closed!

The “Mathematical Foundations of AI” day is organized jointly by the DataIA Institute and SCAI, in association with several scientific societies: the Jacques Hadamard Mathematical Foundation (FMJH), the Paris Mathematical Sciences Foundation (FSMP), the MALIA group of the French Statistical Society, and the Francophone Machine Learning Society (SSFAM). It aims to provide an overview of some promising research directions at the interface between statistical learning and AI.

It is part of the Maths & AI network in the Île-de-France region, of which both the FMJH and DataIA are members.

This new edition will focus on issues of identifiability, whether for tensor analysis, neural networks, or generative AI. The day will feature three plenary presentations and a spotlight presentation by renowned researchers and specialists in the field:

  • François Malgouyres (University of Toulouse), specialist in tensors and tensor identifiability issues;
  • Elisabeth Gassiat (Orsay Mathematics Laboratory), professor and leading statistician whose research includes the identifiability of VAEs;
  • Pavlo Mozharovskyi (Télécom ParisTech), professor and recognized expert in explainability whose research includes concept-based learning;
  • Konstantin Usevich (CRAN, CNRS), researcher specializing in tensor decompositions and low-rank approximations.

This day is also an opportunity for young researchers to present their work through short presentations.


Organizing Committee

  • Marianne Clausel (UniversitĂ© de Lorraine)
  • Emilie Chouzenoux (INRIA Saclay, Institut DataIA)

 

Scientific Committee

  • Ricardo Borsoi (CNRS, CRAN)
  • StĂ©phane ChrĂ©tien (Univ. Lyon 2)
  • Sylvain Le Corff (Sorbonne UniversitĂ©)
  • Myriam Tami (CentraleSupĂ©lec)

 

9:00 - 10:00am | Keynote 1: François Malgouyres (Université de Toulouse)

Geometry-induced regularization and identifiability of deep ReLU networks

Abstract: The first part of the presentation will use a simple, educational example to introduce the mathematical results developed in the second part, in order to make the concepts accessible to as many people as possible. Due to implicit regularization that favors “good” networks, neural networks with a large number of parameters do not generally overfit. Related phenomena that are still poorly understood include the properties of flat minima, saddle-to-saddle dynamics, and neuron alignment. To analyze these phenomena, we study the local geometry of deep ReLU neural networks. We show that, for a fixed architecture, as the weights vary, the image of a sample X forms a set whose local dimension changes. The parameter space is thus partitioned into regions where this local dimension remains constant. The local dimension is invariant with respect to the natural symmetries of ReLU networks (i.e., positive scale changes and neuron permutations). We then establish that the geometry of the network induces regularization, with the local dimension constituting a key measure of regularity. Furthermore, we relate the local dimension to a new notion of flatness of minima as well as to saddle-to-saddle dynamics. For networks with one hidden layer, we also show that the local dimension is related to the number of linear regions perceived by X, which sheds light on the effect of regularization. This result is supported by experiments and linked to neuron alignment. Finally, I will present experiments on MNIST that highlight the regularization induced by geometry in this context, and I will conclude by connecting properties of the local dimension to the local identifiability of the network parameters.
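For readers who want a concrete handle on the central quantity, the sketch below estimates a network's local dimension as the numerical rank of the Jacobian of the map from parameters to outputs on a fixed sample X. This is a minimal illustration in the spirit of the abstract, not the talk's exact definition; the architecture, sample size, and tolerance are illustrative assumptions.

```python
# Minimal sketch: local dimension of a one-hidden-layer ReLU network at a
# parameter point, estimated as the rank of the Jacobian of the map
# theta -> f_theta(X) for a fixed sample X (illustrative, not the talk's
# exact definition).
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 3, 5                  # sample size, input dim, hidden width
X = rng.normal(size=(n, d))         # fixed sample

def forward(theta):
    # f(x) = v . relu(W x + b), parameters flattened into theta
    W = theta[:h * d].reshape(h, d)
    b = theta[h * d:h * d + h]
    v = theta[h * d + h:]
    return np.maximum(X @ W.T + b, 0.0) @ v     # outputs on X, shape (n,)

p = h * d + 2 * h                   # total number of parameters
theta0 = rng.normal(size=p)

# Finite-difference Jacobian of theta -> f_theta(X), shape (n, p)
eps = 1e-6
J = np.stack([(forward(theta0 + eps * e) - forward(theta0 - eps * e)) / (2 * eps)
              for e in np.eye(p)], axis=1)

# Its numerical rank is the local dimension of the image of X as the weights
# vary; it is invariant under positive rescalings and neuron permutations.
print("local dimension:", np.linalg.matrix_rank(J, tol=1e-6))
print("parameter count:", p)
```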
 

Biography: François Malgouyres is a professor at the University of Toulouse (France). His research focuses on the theoretical and methodological foundations of deep learning, with a particular interest in understanding the mathematical structure of neural networks. He has worked on network geometry, parameter identifiability, function approximation using neural networks, weight quantization in recurrent networks, and the design of orthogonal convolutional layers. He has also taken an interest in the straight-through estimator, the reference algorithm for quantized weight optimization, and its applications to sparse signal reconstruction. Before joining the University of Toulouse, François Malgouyres was a lecturer at Paris Nord University, a postdoctoral fellow at the University of California, Los Angeles (UCLA), and, before that, a doctoral student at ENS Paris-Saclay (then located in Cachan).


10:00 - 10:30am | Coffee Break

10:30 - 11:00am | Spotlight Talk: Konstantin Usevich (CRAN, CNRS)

Identifiability of Deep Polynomial Neural Networks

Abstract: Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability, a key property for ensuring interpretability, remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, including architectures with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly compared to the activation degrees. Our proofs are constructive and center on a connection between deep PNNs, low-rank tensor decompositions, and Kruskal-type uniqueness theorems. We also settle an open conjecture on the dimension of the neurovarieties of PNNs, and provide new bounds on the activation degrees required for them to reach the expected dimension.
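As a concrete instance of the model class, the sketch below builds a two-layer polynomial network with monomial activation t -> t^2 and exhibits one of its scaling symmetries, the kind of trivial non-uniqueness that identifiability statements must quotient out. The sizes and the degree are illustrative assumptions, not taken from the talk.

```python
# Minimal sketch: a two-layer polynomial neural network (activation t**2,
# no biases) and a scaling symmetry of its parameters (illustrative sizes).
import numpy as np

rng = np.random.default_rng(1)
d, h = 3, 4
W1 = rng.normal(size=(h, d))
W2 = rng.normal(size=(1, h))

def pnn(x, W1, W2, deg=2):
    return W2 @ (W1 @ x) ** deg     # f(x) = W2 . (W1 x)^deg, elementwise power

# Rescale hidden unit 0 by lam and compensate in the next layer by lam**-deg:
lam = 2.5
W1b, W2b = W1.copy(), W2.copy()
W1b[0] *= lam
W2b[:, 0] /= lam ** 2

x = rng.normal(size=d)
print(pnn(x, W1, W2), pnn(x, W1b, W2b))  # identical outputs: parameters are
# at best identifiable up to such symmetries (and unit permutations).
```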

 

Biography: Konstantin Usevich is a CNRS researcher (chargé de recherche) at CRAN (Centre de Recherche en Automatique de Nancy) and a member of the SiMul group. His research interests are in linear and multilinear algebra and optimization, with a focus on tensor decompositions, low-rank approximations, and their applications in statistics, signal processing, and machine learning. He received his PhD from St. Petersburg University (Russia) in 2011. Prior to joining the CNRS in 2017, he was a postdoc at the University of Southampton (UK), Vrije Universiteit Brussel (Belgium), and GIPSA-lab (Grenoble, France).

11:00am - 12:00pm | Keynote 2: Elisabeth Gassiat (Orsay Mathematics Laboratory)

Title (TBA)

Abstract:

Biography:

12:00 - 12:45pm | Short contributed talks (3 x 15 min)

Rémi VAUCHER (Halias Technology)

Signatures and Quiver Representations: don't be afraid to use Algebra in Causality

Understanding and testing causal relationships is a central challenge in modern artificial intelligence. In this talk, we introduce a mathematical perspective on causality based on two theoretical tools: path signatures and quiver representations. Signatures provide a hierarchical and universal description of temporal data, enabling the detection of differential causality. Quiver representations then offer an algebraic framework in which these relations can be encoded and tested in a structured and interpretable way. This approach bridges algebra, geometry and machine learning, suggesting new avenues for causal inference in dynamic settings. We will present the core mathematical ideas and illustrate their potential through examples. The Quiver Representations part is a joint work with Antoine Caradot.
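As a taste of the first tool, the sketch below computes the level-1 and level-2 signature terms of a piecewise-linear path from the standard iterated-integral formulas; the antisymmetric part of the second level (the Lévy area) is the kind of order-sensitive feature that supports directionality arguments. The example path is an illustrative assumption.

```python
# Minimal sketch: level-1 and level-2 terms of the path signature of a
# piecewise-linear path (standard iterated-integral formulas).
import numpy as np

def signature_level2(path):
    """path: array of shape (N+1, d) listing the vertices of the path."""
    x0, deltas = path[0], np.diff(path, axis=0)
    S1 = path[-1] - x0                      # level 1: total increment
    S2 = np.zeros((path.shape[1],) * 2)     # level 2: S2[i,j] = int (X^i - x0^i) dX^j
    for xk, dk in zip(path[:-1], deltas):
        S2 += np.outer(xk - x0, dk) + 0.5 * np.outer(dk, dk)
    return S1, S2

path = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
S1, S2 = signature_level2(path)
levy_area = (S2 - S2.T) / 2     # order-sensitive: reversing the path flips it
print(S1, levy_area, sep="\n")
```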


Manal BENHAMZA (CentraleSupélec)

Counterfactual Robustness: a framework to analyze the robustness of Causal Generative Models across interventions

Data generation using generative models is one of the fastest-growing fields of artificial intelligence. However, such models are black boxes trained on huge datasets and lack interpretability properties. Causality is a natural framework for incorporating expert knowledge into deep generative models. Other expected benefits of causal generative models are fairness, transparency, and robustness of the generation process. To the best of our knowledge, while many works have analyzed the robustness of general generative models, surprisingly none have focused on their causal counterparts, even though their robustness is a common claim. In the present paper, we introduce the fundamental concept of counterfactual robustness, which evaluates how sensitive causal generative models are to interventions with respect to distribution shifts. Through a series of experiments on synthetic and real-life datasets, we demonstrate that not all the studied causal generative models are equal with respect to counterfactual robustness. More surprisingly, we show that not all causal interventions are equally robust either. We provide a simple explanation based on the causal mechanisms between the variables, which is theoretically grounded in the case of an extended CausalVAE. Our in-depth analysis also yields an efficient way to identify the most robust intervention based on prior knowledge of the causal graph.
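The notion can be previewed on a toy structural causal model: below, a hand-written linear SCM X -> Y -> Z is intervened on, and we check how the interventional output reacts to a shift in the input distribution. This is only a schematic illustration of comparing interventions under distribution shifts, not the paper's formal metric or models; the SCM and the shift are illustrative assumptions.

```python
# Toy illustration (not the paper's metric): sensitivity of an intervened
# quantity to a distribution shift, for a linear SCM X -> Y -> Z.
import numpy as np

rng = np.random.default_rng(2)

def sample_z(n, do_y=None, x_shift=0.0):
    x = rng.normal(loc=x_shift, size=n)         # X := U_x (+ optional shift)
    y = 0.8 * x + 0.1 * rng.normal(size=n)      # Y := 0.8 X + U_y
    if do_y is not None:
        y = np.full(n, do_y)                    # hard intervention do(Y = y0)
    return -0.5 * y + 0.1 * rng.normal(size=n)  # Z := -0.5 Y + U_z

base    = sample_z(100_000, do_y=1.0)
shifted = sample_z(100_000, do_y=1.0, x_shift=2.0)
print(abs(base.mean() - shifted.mean()))  # ~0: do(Y) cuts the X -> Z path,
# so this intervention is robust to shifts upstream of Y; interventions on
# other nodes need not be, mirroring the abstract's mechanism-based point.
```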


Ali AGHABABAEI (Université Grenoble Alpes)

Unified Framework for Pre-trained Neural Network Compression via Decomposition and Optimized Rank Selection

Modern deep neural networks often contain millions of parameters, making them impractical for deployment on resource-constrained devices. In this talk, I present RENE (Rank adapt tENsor dEcomposition), a unified framework that combines tensor decomposition with automatic rank selection to efficiently compress pre-trained neural networks. Unlike traditional approaches that rely on manually chosen or grid-searched ranks, RENE performs continuous rank optimization through a multi-step search strategy, exploring large rank spaces while keeping memory and computation manageable. The method identifies layer-wise optimal ranks without requiring training data and subsequently fine-tunes the decomposed model through a lightweight distillation process. Experiments on benchmark datasets, covering both convolutional and transformer architectures, demonstrate superior compression rates with strong accuracy preservation.
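The basic compression step such methods build on can be shown in its simplest matrix form: factor a pre-trained weight matrix with a truncated SVD and pick the rank from a retained-energy threshold. RENE itself optimizes ranks of tensor decompositions across layers; this stand-in, and the 0.95 threshold, are illustrative assumptions only.

```python
# Minimal sketch of low-rank compression with automatic rank selection
# (a simplified stand-in for tensor-decomposition compression, not RENE).
import numpy as np

rng = np.random.default_rng(3)
# A synthetic "pre-trained" weight matrix with a decaying spectrum:
W = rng.normal(size=(512, 256)) @ rng.normal(size=(256, 256)) / 16.0

U, s, Vt = np.linalg.svd(W, full_matrices=False)
energy = np.cumsum(s**2) / np.sum(s**2)
r = int(np.searchsorted(energy, 0.95)) + 1   # smallest rank keeping 95% energy

A, B = U[:, :r] * s[:r], Vt[:r]              # W ~ A @ B: two thin layers
print(f"rank {r}: {W.size} -> {A.size + B.size} parameters, "
      f"rel. error {np.linalg.norm(W - A @ B) / np.linalg.norm(W):.3f}")
```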


12:45 - 1:45pm | Lunch Break

1:45 - 2:45pm | Keynote 3: Pavlo Mozharovskyi (Télécom ParisTech)

Title (TBA)

Abstract:

Biography:


2:45 - 3:30pm | Sweet Break

3:30 - 5:00pm | Short contributed talks

Sonia MAZELET (Polytechnique)

Unsupervised Learning for Optimal Transport plan prediction between unbalanced graphs

Optimal transport between graphs, based on Gromov-Wasserstein and other extensions, is a powerful tool for comparing and aligning graph structures. However, solving the associated non-convex optimization problems is computationally expensive, which limits the scalability of these methods to large graphs. In this work, we present Unbalanced Learning of Optimal Transport (ULOT), a deep learning method that predicts optimal transport plans between two graphs. Our method is trained by minimizing the fused unbalanced Gromov-Wasserstein (FUGW) loss. We propose a novel neural architecture with cross-attention that is conditioned on the FUGW tradeoff hyperparameters. We evaluate ULOT on synthetic stochastic block model (SBM) graphs and on real cortical surface data obtained from fMRI. ULOT predicts transport plans with competitive loss up to two orders of magnitude faster than classical solvers. Furthermore, the predicted plan can be used as a warm start for classical solvers to accelerate their convergence. Finally, the predicted transport plan is fully differentiable with respect to the graph inputs and FUGW hyperparameters, enabling the optimization of functionals of the ULOT plan.
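For context on the classical baseline the abstract benchmarks against, the sketch below computes a fused Gromov-Wasserstein plan between two small random graphs with the POT library. Note that the talk's FUGW loss is the unbalanced variant, and ULOT replaces the solver with a trained network; graph sizes, features, and alpha here are illustrative assumptions.

```python
# Minimal sketch: a fused Gromov-Wasserstein plan between two random graphs
# using a classical solver (POT); ULOT learns to predict such plans instead.
import numpy as np
import ot  # POT: pip install pot

rng = np.random.default_rng(4)
n1, n2 = 12, 15
A1 = (rng.random((n1, n1)) < 0.3).astype(float); A1 = np.maximum(A1, A1.T)
A2 = (rng.random((n2, n2)) < 0.3).astype(float); A2 = np.maximum(A2, A2.T)
F1, F2 = rng.normal(size=(n1, 2)), rng.normal(size=(n2, 2))  # node features

M = ot.dist(F1, F2)                  # feature cost between nodes
p, q = ot.unif(n1), ot.unif(n2)      # uniform node weights
plan = ot.gromov.fused_gromov_wasserstein(
    M, A1, A2, p, q, loss_fun="square_loss", alpha=0.5)
print(plan.shape, plan.sum())        # (12, 15) coupling of total mass ~1
```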


Alexandre Chaussard (LPSM, Sorbonne Université)

Identifiability of VAEs

When studying ecosystems, hierarchical trees are often used to organize entities based on proximity criteria, such as the taxonomy in microbiology, social classes in geography, or product types in retail businesses, offering valuable insights into entity relationships. Despite their significance, current count-data models do not leverage this structured information. In particular, the widely used Poisson log-normal (PLN) model, known for its ability to model interactions between entities from count data, lacks the possibility to incorporate such hierarchical tree structures, limiting its applicability in domains characterized by such complexities. To address this matter, we introduce the PLN-Tree model as an extension of the PLN model, specifically designed for modeling hierarchical count data. By integrating structured deep variational inference techniques, we propose an adapted training procedure and establish identifiability results in the Poisson Log-Normal framework, enhancing both theoretical foundations and practical interpretability. Additionally, we present a proof-of-concept implication of identifiability by illustrating the practical benefits of using identifiable features for classification tasks.
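To make the base model concrete, the sketch below samples from a plain Poisson log-normal model: a latent Gaussian layer encodes interactions between entities, and the observed counts are conditionally Poisson. PLN-Tree extends this with a hierarchical tree structure and variational inference; the dimensions and covariance below are illustrative assumptions.

```python
# Minimal sketch: sampling from the Poisson log-normal (PLN) model that
# PLN-Tree extends (illustrative dimensions and covariance).
import numpy as np

rng = np.random.default_rng(5)
d, n = 4, 10_000
L = 0.3 * rng.normal(size=(d, d))
Sigma = L @ L.T + 0.1 * np.eye(d)     # latent covariance: entity interactions

Z = rng.multivariate_normal(np.zeros(d), Sigma, size=n)  # latent layer
Y = rng.poisson(np.exp(Z))                               # observed counts

# Dependence between counts is inherited from Sigma, not the Poisson layer:
print(np.corrcoef(Y, rowvar=False).round(2))
```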


Chuong LUONG (Université de Lorraine)

New Conditions for the Identifiability of Block-Term Tensor Decompositions

Tensor decompositions have become an important tool in machine learning and data analysis, as they can exploit the multidimensional structure of data. In particular, identifiability guarantees provide essential theoretical support to various latent variable modelling and source separation (e.g., unmixing) methods. However, for decompositions in block terms, which enjoy increased flexibility compared to the classical canonical polyadic decomposition since each component is a block of multilinear ranks (L_r, M_r, N_r), fewer results are available. In this ongoing work, we study the identifiability of general block-term decompositions of three-dimensional tensors from an algebraic-geometric viewpoint. Our current results provide new sufficient conditions for the identifiability of generic tensors based on the tensor dimensions, the shape of each block, and the number of components in the model (i.e., the tensor rank). Compared to previous results available in the literature, our conditions show that identifiability can hold for a larger number of components in certain regimes.
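The object under study can be written down directly: the sketch below assembles a third-order tensor as a sum of block terms, each a small core of multilinear ranks (L_r, M_r, N_r) contracted with three factor matrices. Identifiability asks when these factors are unique, up to trivial indeterminacies, given the tensor and the ranks; the sizes and ranks below are illustrative assumptions.

```python
# Minimal sketch: a third-order tensor built as a sum of R block terms with
# multilinear ranks (L_r, M_r, N_r) (illustrative sizes and ranks).
import numpy as np

rng = np.random.default_rng(6)
I, J, K = 8, 9, 10
ranks = [(2, 2, 3), (3, 2, 2)]          # (L_r, M_r, N_r) for each block r

T = np.zeros((I, J, K))
for (L, M, N) in ranks:
    G = rng.normal(size=(L, M, N))      # core tensor of block r
    A = rng.normal(size=(I, L))         # mode-1 factor
    B = rng.normal(size=(J, M))         # mode-2 factor
    C = rng.normal(size=(K, N))         # mode-3 factor
    T += np.einsum("lmn,il,jm,kn->ijk", G, A, B, C)  # G x1 A x2 B x3 C

print(T.shape)  # identifiability: when are {G, A, B, C} recoverable from T?
```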


Mélissa ABIDER (Université Paris-Saclay)

Between identifiability and explainability: a mathematical and empirical exploration of variational models

Deep generative models, such as variational autoencoders (VAEs), learn to represent complex data in a hidden latent space. However, this representation is not always unique: several internal structures can produce the same observed results. This identifiability problem raises fundamental questions about the understanding and interpretation of AI models. In this presentation, I will offer both a theoretical and visual exploration of this phenomenon. I will briefly review the probabilistic framework of VAEs, before showing, through a small experiment, how regularization (β) and data noise influence the shape and stability of the latent space. These observations illustrate the trade-off between model fidelity and clarity of internal representation. This work aims to link the mathematical aspects of identifiability to the challenges of explainability in AI, and to open the discussion on how these properties could guide the design of more interpretable models.
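Since the experiment hinges on the β-weighted objective, here is a minimal sketch of that loss with the closed-form Gaussian KL term; the encoder and decoder are left abstract, and all numbers are illustrative assumptions rather than the presenter's setup.

```python
# Minimal sketch: the beta-VAE objective (reconstruction + beta * KL),
# with the closed-form KL between a diagonal Gaussian and N(0, I).
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta):
    recon = 0.5 * np.sum((x - x_recon) ** 2)   # Gaussian decoder, unit variance
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return recon + beta * kl   # larger beta constrains the latent code harder

rng = np.random.default_rng(7)
x, x_recon = rng.normal(size=8), rng.normal(size=8)
mu, log_var = rng.normal(size=2), rng.normal(size=2)
for beta in (0.5, 1.0, 4.0):   # the regularization strength varied in the talk
    print(beta, round(beta_vae_loss(x, x_recon, mu, log_var, beta), 2))
```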