Isabelle Bellin - 25/01/2013

Mathematicians discover organisational structure of galaxies by chance

Highlighting the filamentary structures of galaxy distributions

The methods for reconstructing three-dimensional shapes developed by Frédéric Chazal and David Cohen-Steiner, two researchers in the Geometrica team (based at Inria Saclay-Ile-de-France and Sophia Antipolis-Méditerranée), are proving to be of unexpected use to astrophysicists. The application came about by chance, but given how generic the methods are, it is not all that surprising.

At the end of 2009, with Quentin Mérigot, their PhD student, geometric computing researchers Frédéric Chazal and David Cohen-Steiner put the finishing touches to their new method for detecting geometric structures in large data sets. The method would, they hoped, make it possible to discover the way in which millions or even billions of raw data items were organised in multidimensional spaces, be they measurements of physical phenomena (temperature, pressure, etc.), point coordinates, biological data, sociological data, etc. The idea that a data set could reveal a particular organisational structure might seem absurd to a novice, but the researchers, for their part, were convinced. Their algorithms were designed to structure mountains of data in order to shed light on the underlying geometric shapes around which they were concentrated, shapes that only specialists in the field would be capable of using to understand the phenomena that generated them and to compare several observations.

The appeal of the method they were working on at that time was this very generic character. It remained to be verified experimentally... Then, one evening in early 2010, Frédéric Chazal decided, by chance, to run a simulation on data that an astrophysicist, Rien van de Weygaert from the University of Groningen (Netherlands), had sent him a few months earlier: the three-dimensional positions of millions of galaxies. After a few tests, he obtained some strange 3D images, resembling intertwined filaments. They seemed to show nothing particularly meaningful at first glance, but he sent them to the astrophysicist just in case and learned, to his surprise, that this was exactly what Rien van de Weygaert had been trying to show. Astrophysicists have long known that the positions of galaxies owe nothing to chance, with some areas of the universe being empty and others containing high concentrations of matter. They model these concentrations and try to recreate what they have observed. It is easy to imagine the usefulness of these images, which strongly resemble the galaxy structures the astrophysicists are looking for: they could be used to test the hypotheses and parameters of their models.

A meeting was arranged. The Dutch researcher came to spend a week in Saclay with one of his PhD students to test other data sets covering other portions of the universe and check that the method was valid. It was. The PhD student was given responsibility for exploring this avenue. Inria's researchers travelled to the Netherlands in 2011 to train him in their method. Together, they developed new tools, with sound mathematical foundations, to reveal the geometrical structures linked to the positions of galaxies while remaining robust to spurious data. Today, two years later, the four of them are writing an article together, which will soon be submitted for publication in an astrophysics journal.

Distribution of galaxies in a portion of the universe, each small sphere representing a galaxy. The left image shows the original data, the right image the data after processing with the team's algorithm.

Good fortune, or proof of the pertinence of the method? Only time will tell. One thing is for sure: the approach developed by the Geometrica team is based on a generic mathematical framework, a formalism common to dozens of algorithms developed over the last twenty or so years to find the geometry of an object using the coordinates of points measured on its surface. From this, they have deduced new, even more useful algorithms, suited to lower-quality, sometimes erroneous data, such as the astrophysicists' data. Of the many ways of organising this data, their method has been found to be a pertinent one, in line with the predictions of astrophysicists based on their theories on the evolution of the universe.
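To give a feel for how an algorithm can cope with "lower-quality, sometimes erroneous data", here is a minimal sketch of one noise-tolerant idea: score each point by its averaged distance to its k nearest neighbours, then keep only the points with the lowest scores. The article does not say which function the team actually uses, so this choice, the function names (`knn_distance`, `denoise`), the parameters and the toy data are all assumptions made for illustration, not the Geometrica code.

```python
import math
import random


def knn_distance(query, points, k):
    """Root mean squared distance from `query` to its k nearest points.

    Averaging over k neighbours makes the score far less sensitive to a
    few stray points than the plain nearest-neighbour distance.
    """
    squared = sorted(
        sum((q - p) ** 2 for q, p in zip(query, pt)) for pt in points
    )
    return math.sqrt(sum(squared[:k]) / k)


def denoise(points, k, keep_fraction):
    """Keep the fraction of points with the smallest k-NN distance."""
    ranked = sorted(points, key=lambda pt: knn_distance(pt, points, k))
    return ranked[: int(len(points) * keep_fraction)]


# Hypothetical toy data: 200 points scattered around a straight
# "filament", plus 100 background points spread uniformly in a unit cube.
random.seed(0)
filament = [
    (random.random(), 0.5 + random.gauss(0, 0.01), 0.5 + random.gauss(0, 0.01))
    for _ in range(200)
]
background = [
    (random.random(), random.random(), random.random()) for _ in range(100)
]
cloud = filament + background

# Keep the half of the cloud that sits in the densest regions.
kept = denoise(cloud, k=10, keep_fraction=0.5)
on_filament = sum(pt in set(filament) for pt in kept)
print(f"{on_filament} of {len(kept)} kept points lie on the filament")
```

On this toy cloud the retained points should concentrate along the filament, while isolated background points receive high scores and are discarded. The brute-force search is quadratic in the number of points, so a real galaxy catalogue would need a spatial index; the sketch only conveys the principle.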

And this is surely only the beginning: understanding the structure of the immense mass of data recorded in all fields of science, economics and everyday life, and being capable of extracting pertinent, strategic information from it, is one of the great challenges of the 21st century.

Testimony of Rien van de Weygaert, from the University of Groningen (The Netherlands)

Rien van de Weygaert contacted Frédéric Chazal and David Cohen-Steiner in 2009 to help him develop a toolset for understanding the “cosmic web”: intriguingly structured spatial patterns that exist in the Universe on scales from a few to more than 100 million light years (MLy) across. He had already been working since 2006 with another member of the Geometrica team, Monique Teillaud (of INRIA Sophia-Antipolis), who was developing new periodic boundary software that he and his mathematician colleague Gert Vegter could use for their computational geometry-based routines.

Rien is currently working with the CGAL library of computational geometry algorithms as well as the geometric inference code developed by Frédéric and David. One important aspect of the work has been to embed these algorithms in a range of codes that the researchers have developed to analyse different aspects of the spatial distribution of galaxies and matter on MLy scales.

“Our work is particularly important for analysing galaxy surveys because these observational datasets contain a lot of selection effects and other sources of error,” explains Rien. “The geometric inference tools we employ provide an efficient and clean way of tracing the underlying skeleton of cosmic mass distribution.”

Although the work is very much in the development stage, the products of the geometric inference technique (namely, the identification of celestial “filaments”, cluster nodes and how they are connected in space) will hopefully allow the researchers to quantitatively test physical theories of how the cosmic web formed.

“Thanks to our collaboration, my graduate student Pratyush Pranav and I have been obtaining maps of the cosmic web skeleton as traced by the SDSS galaxy redshift survey (a map of the spatial distribution of a million galaxies in the nearby Universe). The CGAL routine is also enabling us to efficiently calculate the dynamics of cosmic web evolution.”