Images for free viewpoint television
The television of the future will offer viewers the option to choose among many viewpoints and follow the action according to their preferences. Around the world, researchers are already devising algorithms that would make it possible to broadcast such video. But all run up against the same problem: a lack of raw material for testing. There are simply no recordings that offer multiple, simultaneous viewpoints of large-scale scenes. Before any development can happen, this body of images must be created. That is precisely the objective of Atep, a technology development initiative coordinated by researcher Thomas Maugey at the Inria Rennes – Bretagne Atlantique centre.
It’s the announcement of the ultimate channel-surfing. A dribble in front of the goal. The ball hesitates. So do the players. And millions of fans frantically reach for their remote controls. Some to get behind the goalkeeper. Others to choose a side angle. Still others prefer a frontal view. This is known as “free viewpoint television” (FTV). Each viewer decides in real time which camera will be used to view the scene, be it a sports event, a concert or a TV drama.
“This television does not exist yet, but it is in the works and raises a large number of scientific issues at every stage of the production chain,” explains Thomas Maugey, researcher on the Sirocco team. An example? “Compression. The data must be organised to satisfy multiple demands. But if there are 1,000 cameras around the stadium and the action is focused on one spot on the field, it’s not necessary to transmit all those signals. The difficulty lies in transmitting only what is necessary.”
Another challenge is on the viewer’s side: “How do you synthesise virtual viewpoints from the available data? It’s not possible to record every viewpoint imaginable. Only some of them are recorded. The user will generate virtual viewpoints between these existing ones.”
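To give a rough idea of what view synthesis means, here is a deliberately simplified sketch in Python: a virtual viewpoint is approximated by linearly blending the images from two neighbouring cameras. Real FTV synthesis relies on depth and camera geometry to warp pixels correctly; the function and the dummy data below are illustrative assumptions, not part of the Atep project.

```python
import numpy as np

def blend_views(left, right, alpha):
    """Crude virtual-view stand-in: linearly blend two neighbouring
    camera images. alpha=0 returns the left view, alpha=1 the right.
    Proper synthesis would warp pixels using depth and camera poses;
    this only illustrates interpolation between captured viewpoints."""
    return ((1.0 - alpha) * left + alpha * right).astype(left.dtype)

# Two dummy 4x4 grayscale "camera images"
cam_a = np.zeros((4, 4), dtype=np.float64)
cam_b = np.full((4, 4), 100.0)

# A virtual viewpoint a quarter of the way from camera A to camera B
virtual = blend_views(cam_a, cam_b, 0.25)
```

Pixel values in the blended image sit between those of the two source views, which is why intermediate viewpoints look plausible only when the cameras are close together.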
At the other end of the chain, the question arises of how to represent the data. “How do you describe the scene? Using images, or meshes, as in video games? There are several strategies, and one has to be chosen.” Which one, then?
“Probably neither meshes nor images alone, but a mix of both.” Further up the chain, it is the capture of the scene itself that raises questions. “Where should the cameras be placed? At what cost? Which model? Here, everything remains to be done. It’s wide open.” That is the context. “Everyone is trying to explore part of this chain, but these research projects are not advancing as we’d like.” Why? “Because the multiple-view recordings we could use to test our algorithms simply do not exist today.”
An R&D tool
The objective of the Atep technology development initiative for capture, processing and sharing is thus to “build these databases and produce videos so that we can pursue our research. This material could then be used by other scientists or companies for R&D. Basically, it’s a scientific tool that everyone needs in order to move forward.” How will the operation proceed? “The technology development initiative will enable us to hire an engineer for two years. We also have a budget of €40,000 from a young researcher grant awarded to me by Greater Rennes. This will be used to purchase cameras. We shall probably install between 20 and 40; it hasn’t been determined yet. This will constitute a fairly sizeable system. Imagine 40 positions, with all the viewpoints that can be generated, without even mentioning the synthesis of virtual views.”
The researchers are leaning towards one of the new omnidirectional cameras that make it possible to film in 360°. “The problem with a traditional camera is that it films from a single position and in a single direction. With an omnidirectional camera, you cover every angle of rotation from a single position.” There is one drawback, however: the fisheye lens produces a spherical image. “The viewer cannot watch it directly. An algorithm must be used to transform it into a 2D image.”
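One common way to flatten a spherical capture into a 2D image is the equirectangular projection, in which longitude and latitude on the sphere map linearly to pixel coordinates. The sketch below illustrates only that mapping; the article does not say which algorithm the project will actually use, so this is an assumption for illustration.

```python
import math

def sphere_to_equirect(lon, lat, width, height):
    """Map a viewing direction on the sphere (longitude in [-pi, pi],
    latitude in [-pi/2, pi/2]) to pixel coordinates in a 2D
    equirectangular image -- one standard way to unfold a 360-degree
    capture so it can be shown on a flat screen."""
    x = (lon + math.pi) / (2 * math.pi) * width   # longitude -> column
    y = (math.pi / 2 - lat) / math.pi * height    # latitude  -> row
    return x, y

# The direction straight ahead (lon=0, lat=0) lands at the image centre
print(sphere_to_equirect(0.0, 0.0, 2000, 1000))  # -> (1000.0, 500.0)
```

In practice a renderer inverts this mapping per output pixel and resamples the fisheye frame, but the coordinate relationship is the same.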
From multiple-view capture to sharing
Then comes the heart of the matter: installing this capture system around a dynamic scene. “Be it a football match or something else, we haven’t yet decided. The action has to move from one camera to another, and there must be moments when the user is more interested in one part of the scene than another, so that there is real browsing.”
To succeed at multiple-view capture, it will also be necessary “to calculate the positions of the cameras in relation to one another with great precision. Even with two omnidirectional cameras, little work has been done on this. So with 40... even less. In post-processing, the big challenge will be providing the calibration parameters.”
Which leads to the last phase of the project: sharing. “We’re planning a minute of video.” Not more? “No. We could do 30 minutes, but you have to take the size of the files into account, because we’re not going to compress the images. We have chosen to share them as raw data, a raw format that is very large. This will enable scientists and companies to test and compare their own compression algorithms before sending the result to the end user, in order to study how they interact with this multiple-view data. We should have initial results in about a year.”
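A back-of-the-envelope calculation shows why even one minute of raw multi-view video is hard to share. The resolution, frame rate and pixel format below are hypothetical assumptions chosen for illustration; the article specifies none of them.

```python
# Storage estimate for one minute of raw, uncompressed multi-view video.
# All parameters are assumptions -- the article gives only the camera
# count (up to 40) and the one-minute duration.
cameras = 40
seconds = 60
fps = 30                     # hypothetical frame rate
width, height = 3840, 1920   # hypothetical 360-degree frame size
bytes_per_pixel = 3          # 8-bit RGB, no compression

total_bytes = cameras * seconds * fps * width * height * bytes_per_pixel
print(f"{total_bytes / 1e12:.1f} TB")  # -> prints "1.6 TB"
```

Around 1.6 terabytes for a single minute under these assumptions, which explains the choice of a short clip over a 30-minute recording.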
Sirocco is an Inria research team, joint with the University of Rennes 1 and CNRS.