Jean Ponce doesn't regret coming back to France in 2005 after 20 years of research in the US: he recently received an ERC grant of €2.5 million to continue his research on video. His objective is to automatically analyse video content, manipulate the elements it contains (for example, removing an unwanted object from a film) and restore old films. This ambitious research builds on his previous work on shape recognition in images, in which he reconstructed objects in 3D from several photos and automatically located particular objects (a bicycle, a car, etc.) in photographs. His work has been used by one of the world leaders in special effects, Industrial Light & Magic.
"It's possible to automatically analyse what's happening in the simplest situations, when the camera doesn't move and the background is consistent ," says Jean Ponce, who directs the WILLOW team, which looks at models for the visual recognition of objects and scenes. It is jointly run by Inria, the Ecole Normale Supérieure in Paris and the French National Centre for Scientific Research. "This is the case with football matches, which you can study thanks to wide-angle fixed cameras. However, as soon as the camera moves we can no longer do it, because the appearance of an object depends not only on the object itself, but also on the position and movement of the camera ." It is also difficult to remove an unwanted object from a scene or restore a damaged film. It has to be done manually, image by image. "Those who process these videos do it pixel by pixel," says the researcher. "We think it's possible to automate these activities by taking an artificial vision approach and looking at the meaning of a scene. " But how can the meaning of a scene be analysed when there is no initial indication of what is depicted within it? What level of detail is sufficient? What is a good model for interpretation? To answer the numerous questions raised by this novel approach, Jean Ponce plans to recruit five PhD students and two post-doctoral researchers over the next five years and purchase a cluster of powerful computers.
Automatically analysing videos will make it possible to classify and restore them, and to produce special effects.
Jean Ponce is pleased to have the ERC grant, having spent much of his time in the United States chasing contracts to pay his PhD students, as all team leaders there must. In France he has access to excellent students, many of whom already have a PhD grant. Another difference is that "in the US not much work is done as a team; research is carried out by a professor surrounded by his students", he observes. "With the teamwork that goes on in France I'm getting into new fields. It's really great." Accordingly, his team has spent the last few years researching statistical learning, which enables the automatic construction of the models it uses to recognise and process images and videos.
Although its goal is primarily fundamental, this research has numerous applications. The automatic archiving and indexing of videos, as is currently done with texts, will allow users to navigate more easily through the huge video archives now available, for example at the French National Audiovisual Institute, with which Jean Ponce's team is already collaborating. Similarly, the ability to remove signs of ageing (white spots and lines, for example) could be used to restore the old films held in these archives. Lastly, the ability to add objects to films or remove them is of interest to special effects professionals, such as Industrial Light & Magic.