Yves Robert receives the IEEE TCSC Award for his work on high-performance computing
Supercomputers allow the smooth running of programmes intended to solve major scientific challenges, whether they involve the discovery of new proteins or climate modelling. They are composed of thousands of processors working in parallel and are governed by algorithms which constantly change to adapt to their increasingly complex architectures. The algorithm researcher Yves Robert is one of the global specialists in this discipline. He is also the first European to receive the IEEE TCSC Award, which rewards the work of a researcher in the field of high-performance computing.
What are your main research topics?
My work mainly involves developing algorithms for high-performance computing (HPC) platforms. The most powerful supercomputers are built from thousands of processors, each with 8, 16 or even 64 cores. All these processors must work in parallel to get the most out of the available computing power. One of my research topics is designing algorithms that carry out scientific calculations in parallel, in particular linear algebra calculations, which by their very nature are highly sequential. This is a major challenge, as solving linear systems currently accounts for almost 80% of the computing time of scientific applications.
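To make the "highly sequential" nature of linear algebra concrete, here is a minimal Python sketch of forward substitution on a small lower-triangular system. It is an illustration chosen for this article, not one of Yves Robert's algorithms; the function name and the 3x3 example values are assumptions.

```python
def forward_substitution(L, b):
    """Solve L x = b where L is a lower-triangular matrix given as a list of rows."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # The sum below needs x[0] .. x[i-1]: row i cannot be finished
        # before all earlier rows are done. This is the sequential dependency.
        partial = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - partial) / L[i][i]
    return x


# Illustrative system whose exact solution is [1, 2, 3].
L = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, 5.0, 6.0]]
b = [2.0, 7.0, 32.0]
x = forward_substitution(L, b)
print(x)
```

Each unknown depends on all the previous ones, so the outer loop cannot simply be split across processors. Only the inner sum offers independent work, which is one reason why parallelising such calculations takes dedicated algorithms.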
You are also working on developing resilience techniques. What does that mean?
The more processors there are in a supercomputer, the greater the risk of one of them becoming faulty. If a processor stops working, a programme launched several hours earlier is compromised. To avoid this, we are developing algorithms intended to limit the effects of faults and failures. One example is taking checkpoints when the processors are being used the least, in order to limit the slowdown of the programme. The new supercomputers present us with an additional challenge. Their memories are subject to so-called "silent" errors, caused primarily by cosmic rays. These are difficult to detect and cause computing errors which corrupt the end result. The problem lies in developing algorithms capable of detecting precisely when the error occurred, so as to select a valid checkpoint from which to restart the computation.
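The idea of restarting from a valid checkpoint after a silent error can be sketched with a toy Python simulation. Everything here is an assumption made for illustration: the computation (a running sum), the `verify` detector (which works only because this toy result is known in closed form), and the parameter names. Real detection and checkpoint-placement strategies are considerably more involved.

```python
def verify(step, acc):
    # Hypothetical detector: the sum 1 + 2 + ... + step has a known closed form.
    return acc == step * (step + 1) // 2


def run(total_steps, checkpoint_every, verify_every, corrupt_at=None):
    """Sum 1..total_steps with periodic checkpoints and less frequent verifications."""
    step, acc = 0, 0
    checkpoints = [(0, 0)]          # the initial state is valid by construction
    rollbacks = 0
    while step < total_steps:
        step += 1
        acc += step
        if step == corrupt_at:
            acc += 1000             # inject one silent error; nothing notices yet
            corrupt_at = None       # the fault does not recur after rollback
        if step % checkpoint_every == 0:
            checkpoints.append((step, acc))   # may store a corrupted state
        if step % verify_every == 0 or step == total_steps:
            if not verify(step, acc):
                # Discard corrupted checkpoints until a valid one is found.
                while not verify(*checkpoints[-1]):
                    checkpoints.pop()
                step, acc = checkpoints[-1]
                rollbacks += 1
    return acc, rollbacks


clean = run(12, checkpoint_every=3, verify_every=6)
faulty = run(12, checkpoint_every=3, verify_every=6, corrupt_at=4)
print(clean, faulty)
```

In the faulty run, the error injected at step 4 is also saved into the checkpoint taken at step 6, so the simulation has to go back to the checkpoint from step 3. This shows why knowing when the error occurred matters for choosing the restart point.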
When did you begin your work on algorithms for HPC?
In 1982. But HPC platforms did not yet exist in their current form when I first took an interest in algorithms for solving linear systems. At the time, computers were based on shared memory. Most researchers did not believe in distributed-memory parallel processing, in other words machines in which each processor has its own memory and can communicate with all the others. We were lucky, because such supercomputers have since become the norm. Since then, we have been constantly following the development of technologies in order to propose the most suitable algorithms for the new architectures. My work on resilience is more recent: I have been working on it for 3 or 4 years now. In this field, I develop algorithms designed to meet the needs of future Exascale supercomputers, which could be here before the end of the decade.
You are the first European to receive the IEEE TCSC Award. What does this reward mean to you?
It is a great pleasure for me, as it is gratifying to be rewarded by my peers for my work and my service to the scientific community.
IEEE TCSC Award
Presented by the IEEE (Institute of Electrical and Electronics Engineers), the IEEE-TCSC Award for Excellence rewards a researcher for their significant and acknowledged contribution in the field of "scalable computing", as well as for the impact of their work on the scientific community. The winner receives a medal and a symbolic cheque for $1,000 during the IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. They are also invited to give a plenary talk. This year's event took place in Chicago from 26 to 29 May. This was the first time since the Award's creation that it was presented to a European.
- A former student of the Ecole Normale Supérieure in Cachan, mathematics agrégé (high-level French teaching qualification), 3rd level PhD student in Applied Mathematics at the Université Joseph Fourier in Grenoble and Doctor of Science in Computer Science at the Institut National Polytechnique in Grenoble, Yves Robert has been a professor at the Laboratoire d’Informatique of the ENS-Lyon since 1988. He is also a researcher in the ROMA project team. He has been a senior member of the Institut Universitaire de France since 2007 and a guest researcher at the University of Tennessee Knoxville since 2011. Yves Robert has been working on algorithms for some thirty years. His research focuses primarily on the optimisation of resources and the resilience of high-performance computing platforms.
- Yves Robert Web Page