CELESTE Research team

mathematical statistics and learning

Team presentation

Data science (a vast field including statistics, machine learning, signal processing, data visualization and databases) has become front-page news, with a potentially major impact on society beyond the important role it has played in science for many decades. Within data science, the statistical community has long experience in inferring knowledge from data, built on strong mathematical foundations. The more recent field of machine learning has also produced major achievements by combining statistics with optimization and by bringing a fresh point of view from applications where prediction matters more than model building.

The CELESTE project-team is positioned at the interplay between statistics and learning. We are statisticians, members of a mathematics laboratory, with a strong mathematical background, and we are interested in the interactions between theory, algorithms and applications. Indeed, applications lead to the most interesting theoretical problems, while theory can play a key role in (i) understanding how and why successful statistical/learning algorithms work, and hence improving them, and (ii) building new algorithms upon mathematical statistics foundations.

Research themes

Our work involves analyzing popular statistical learning algorithms from a mathematical statistics point of view and developing new learning algorithms based upon our skill set. Our main methodological and theoretical research axes are:
- estimator selection
- the relationship between statistical accuracy and computational complexity
- robustness to outliers and heavy tails
- statistical inference: (multiple) tests and confidence regions.

A key ingredient in our research program is matching our theoretical/methodological results with numerous real-world situations. Indeed, CELESTE members work in many domains including – but not limited to – neglected tropical diseases, pharmacovigilance, high-dimensional transcriptomic analysis, and energy and the environment.

International and industrial relations

CELESTE has several ongoing collaborations with the R&D department of EDF, and collaborates with a number of other companies (for instance via CIFRE Ph.D. theses).

CELESTE has academic collaborations with researchers from many institutions around the world, including MPI Tübingen, University of Warwick, Cornell University, Brown University, University of Washington at Seattle, Princeton University and IMPA Rio.

Keywords: Mathematical statistics; statistical learning; estimator selection; computational and statistical trade-offs; robustness; hypothesis testing