Umut Simsekli, a researcher in the Sierra project team and leader of the ERC Dynasty project, was recently awarded a European Research Council (ERC) Starting Grant. His aim is to develop a mathematical theory to better understand the properties of deep-learning algorithms, one of the major challenges facing the AI community.
An original deep learning project
The turn of the year brought good news for Umut Simsekli, a researcher at the Inria Paris Centre in the joint project team Sierra (UMR 8548, Inria, CNRS, École Normale Supérieure). This young AI researcher and specialist in deep learning has received a prestigious and highly selective ERC Starting Grant, which enables young scientists (two to seven years after completing their PhD) to set up a research team around an original subject of major technological importance. Umut Simsekli will lead the ERC Dynasty project, which aims to develop a theoretical approach to deep learning to improve performance reliability.
However, this project might never have existed if Umut Simsekli had continued his early career as a professional musician! Initially a bass player in a pop/jazz group, he finally chose computer science and scientific research, working on advanced techniques for audio signal processing. “With this research, I found a way to combine my two passions: music and computer science”, he recalls. “After a few years as a postdoctoral researcher and lecturer at Télécom Paris, I joined Inria in 2020 within the Sierra team, a key European player in the field of machine learning, with which I had already collaborated”.
At the intersection of mathematics, statistics and computer science
The Sierra team is composed of five permanent researchers supported by around thirty PhD students, postdoctoral researchers and research engineers working at the intersection of applied mathematics, statistics and computer science. Their aim is to develop conceptual (mathematical theories and algorithms) and applied (code and computer programs) tools in deep learning.
This branch of artificial intelligence allows computers to learn to perform tasks without being explicitly programmed to do so. It has many well-known applications, such as in the transport industry (developing driverless navigation systems), medicine (helping to diagnose cancer) and banking (assessing a person's ability to repay a loan), as well as for certain generic problems in signal processing such as voice or facial recognition.
Simply put, machine learning relies on a large quantity of data and an optimisation algorithm to accomplish a task automatically.
For example, in facial recognition, the face of a person in a photo or video is recognised based on a collection of photographs. The computer that performs this task uses complex mathematical functions that are configured by the optimisation algorithm during a “learning” phase based on information in the training data.
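The “learning” phase described above can be made concrete with a deliberately minimal sketch. This is purely illustrative (it is not code from the Sierra team, and the model, data and learning rate are invented for the example): an optimisation algorithm, here plain gradient descent, adjusts a single parameter `w` so that the model fits the training data.

```python
# Illustrative sketch of a "learning" phase: gradient descent tunes one
# parameter w so that the model y = w * x fits the training data.
# The data, model and hyperparameters below are invented for this example.

def train(data, steps=200, lr=0.05):
    """Fit y ~ w * x by minimising the mean squared error on the data."""
    w = 0.0  # initial guess, before any learning
    for _ in range(steps):
        # gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad  # one optimisation step
    return w

# Synthetic "training data" generated from the hidden rule y = 3x
training_data = [(x, 3.0 * x) for x in [-2.0, -1.0, 0.5, 1.0, 2.0]]
w = train(training_data)
print(round(w, 3))  # the learned parameter approaches 3.0
```

A real facial-recognition system works on the same principle, except that the “model” has millions of parameters instead of one.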
A theoretical framework for understanding and improving algorithms
“Some machine learning methods work like black boxes: the researchers and engineers who design them observe their performance, which is often very impressive, when the requested tasks are completed automatically… but they don’t have the theoretical elements to understand this efficiency”, says Umut Simsekli. “This means that, in most cases, machine learning algorithms are developed by trial and error, with successive corrections, until they achieve the desired result. In some cases, we have theoretical results supporting these approaches, but there are often many discrepancies between those results and the algorithmic and computational reality”.
The Dynasty project’s aim is therefore to build a theoretical framework for understanding, and especially predicting, the properties of machine learning algorithms. What benefits would this bring to the community? Firstly, significantly improved accuracy in programs based on machine learning as well as reduced computing time, since the trial-and-error phases, which are very demanding in terms of computational resources, would also be reduced. Above all, however, it should provide a set of open-source codes for future applications.
My research will focus on a specific class of mathematical techniques used in deep learning (“non-convex optimisation”). I will be looking at optimisation algorithms whose properties are, in my view, still poorly understood by researchers.
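What makes non-convex optimisation hard to analyse can be shown with a toy example (an assumption for illustration only, not the class of problems studied in Dynasty): on a non-convex function with two valleys, the same gradient-descent algorithm ends up in a different minimum depending on where it starts, so its final behaviour cannot be read off from the function alone.

```python
# Toy illustration of non-convexity: f(x) = x**4 - 2*x**2 has two local
# minima, at x = -1 and x = +1. Gradient descent converges to one or the
# other depending only on its starting point.

def gradient_descent(x, steps=100, lr=0.01):
    for _ in range(steps):
        grad = 4 * x**3 - 4 * x  # f'(x) for f(x) = x**4 - 2*x**2
        x -= lr * grad
    return x

print(round(gradient_descent(-0.5), 3))  # -1.0: settles in the left valley
print(round(gradient_descent(0.5), 3))   #  1.0: settles in the right valley
```

On a convex function there is a single valley and this ambiguity disappears, which is precisely why the non-convex case demands new theory.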
Promising preliminary results
Although this is an innovative approach, it is not completely unfamiliar to the researcher, who will be drawing on research from the physics of “dynamical systems”, which describes the evolution of complex systems (ecosystems, climate, etc.). “Optimisation algorithms proceed in an iterative way, like a mountaineer advancing step by step to reach the summit. In this sense, these algorithms behave in a similar way to dynamical systems, and I believe that the results and approaches developed in this field will allow me to accomplish my objectives in mathematical formalisation”.
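The mountaineer analogy can be sketched as follows. This is an illustrative toy, not the project’s actual formalism: one step of gradient descent is just a map `F`, and running the algorithm traces the orbit of the discrete dynamical system x_{k+1} = F(x_k), which is what makes dynamical-systems tools applicable.

```python
# Illustrative sketch: an optimisation algorithm viewed as a dynamical
# system. One gradient-descent step on f(x) = (x - 2)**2 is a map F, and
# running the algorithm produces the orbit of x under repeated iteration.

def F(x, lr=0.1):
    """One gradient-descent step on f(x) = (x - 2)**2, seen as a map."""
    grad = 2 * (x - 2)
    return x - lr * grad

# The orbit of the starting point 5.0 under repeated application of F:
# each iterate moves closer to the minimum at x = 2.
orbit = [5.0]
for _ in range(5):
    orbit.append(F(orbit[-1]))

print([round(x, 2) for x in orbit])  # -> [5.0, 4.4, 3.92, 3.54, 3.23, 2.98]
```

Questions about the algorithm (does it converge? how fast? where?) then become questions about the long-term behaviour of this orbit.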
The researcher has already published promising preliminary results and has five years of work ahead of him in which to make a long-awaited contribution. No doubt that, like in the jazz he loves, he will be able to improvise, adapt to the vagaries of his research and produce a perfect score!
Umut Simsekli completed his university studies in Turkey, his country of origin, where he obtained a Master’s and a PhD in computer science at Boğaziçi University in Istanbul. His PhD thesis, which he defended in 2015, focused on machine learning techniques applied to signal processing. With his PhD in hand, Umut Simsekli moved to France, where he continued his research in computer science, exploring the theoretical aspects of deep learning. He worked as a postdoctoral researcher and then as a lecturer at Télécom Paris between 2015 and 2020, as a visiting researcher in the Department of Statistics at the University of Oxford in 2019, and finally as a researcher at Inria, which he joined in 2020.