Acoustic - Research

Jean-Michel Prima - 18/06/2013

Novel Algorithms for Audio and Signal Modeling

A new team at the Inria research center in Rennes, Panama positions itself at the crossroads of mathematical signal processing and audio modeling. Ultimately, its work may also lead to smarter exploitation of high-dimensional data, as team leader and ERC grantee Rémi Gribonval explains.

“Our goal? Developing efficient algorithmic techniques to model, acquire and process high-dimensional signals,” sums up Rémi Gribonval. “Acoustic data is our main focus. As a matter of fact, sound is the unifying theme among the researchers of this group,” which was created in the wake of the now-defunct Metiss team.

Three major research axes underpin the new project. First come sparse models and representations. “We have a good understanding of parsimony, in relation, for instance, to the solving of inverse problems. Our algorithms work fine, but they still encounter bottlenecks in large-scale problems. Therefore, we wish to address this issue first of all. Beyond that, and more importantly, we want to extend the reach of parsimony as a notion. Its historical definition is linked to the concept of a dictionary, i.e. basic bricks from which objects can be built. We are now looking for other ways to describe data sparsely, using a limited number of parameters.” It is precisely to investigate the virtues of parsimony further upstream that Rémi Gribonval was awarded a research grant by the European Research Council in 2011. This ERC Starting Grant made possible the recent hiring of three additional people in Panama.
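To make the dictionary idea concrete, here is a minimal sketch of matching pursuit, a classic greedy method for sparse approximation. It is only an illustration and not the team's own algorithm: the function name matching_pursuit, the toy dictionary and the toy signal are all assumptions made up for this example.

```python
import math

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation over unit-norm atoms (illustrative only)."""
    residual = list(signal)
    coeffs = {}  # atom index -> accumulated coefficient
    for _ in range(n_atoms):
        # Correlate the current residual with every atom of the dictionary.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        # Keep the atom that explains the most of the residual.
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        # Subtract that atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, dictionary[best])]
    return coeffs, residual

def energy(vector):
    """Sum of squared samples."""
    return sum(x * x for x in vector)

# Hypothetical toy dictionary: four spikes plus one constant atom, all unit norm.
n = 4
dictionary = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
dictionary.append([1.0 / math.sqrt(n)] * n)

# Toy signal built from only two "bricks": the constant atom and the spike at index 1.
signal = [2.0 * c + 3.0 * s for c, s in zip(dictionary[4], dictionary[1])]

coeffs, residual = matching_pursuit(signal, dictionary, n_atoms=2)
print(sorted(coeffs), energy(residual) / energy(signal))
```

On this toy signal the two selected atoms are the spike at index 1 and the constant atom, so two coefficients stand in for four samples. Because the atoms are not orthogonal, the greedy approximation leaves a small residual rather than recovering the signal exactly, which hints at why more refined algorithms are needed at scale.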

In order to fully exploit their new models, the scientists also plan to hone their machine-learning techniques. “It's all about learning model-specific characteristics and finely adjusting model parameters to the nature of the processed data. In that regard, many works rely on a mathematical formalism whereby a dictionary is defined from mathematical analysis. In recent years, we have sought out new methods whereby we infer it entirely from the data. In some situations, we also try to mix a priori knowledge with in-field knowledge. There are underlying laws of physics that we take into account, wave equations for instance.”

Source separation

The second research axis deals with the acquisition and processing of acoustic scenes. “It's a somewhat broadened way of approaching the topic of audio source separation.” Too few microphones for a concert recording. Too many instruments on the same channel. A monophonic soundtrack that has to go stereo. A voice that must be extracted from the background. There is no shortage of demand for separation software. “Our team has extensive know-how in the field. This has resulted in a number of technology transfers.” Findings have been implemented by partners such as the Maia recording studio or Audionamix, a Paris- and Hollywood-based company that provides sound editing services to the motion picture industry. The scientists also collaborate with Sonic Emotion, a Swiss company that brings sound spatialization systems into consumer audio products.

The third axis focuses on the identification of structures in large-scale audio content. “Within the team, some of our work relies on musicology and computer music. Musical pieces are often made of motifs, such as the chorus and verse pattern, but there are many different types of approximate repetitions at different time scales. Such structures must be described in a robust way, so that people listening to them would converge on the same description, the same annotation, the same cuts. The purpose here is to work toward definitions of music structure concepts before providing algorithms that automatically discover these structures in large audio streams.”

Biomedical Data Processing

The scientists also plan to step beyond audio and extend their activity to other domains such as multimedia indexing or biomedical data processing, with MRI and EEG to name but two. “We could be a supplier of methodologies and interact with Inria teams such as TexMex, Visages, Serpico or Hybrid, as well as Rennes 1 University's LTSI Signal and Image Processing Laboratory. For, at the end of the day, the nature of the signal doesn't matter much, as long as we can set it into a certain mathematical formalism.”

Keywords: INRIA Rennes - Bretagne Atlantique, Rémi Gribonval, PANAMA, Algorithm, Parsimony, Sound