The scientific ambition of ROBOTLEARN is to train robots to look, listen, learn, move and speak in a socially acceptable manner. This will be achieved through a close interplay between scientific findings, the development of practical algorithms and associated software packages, and thorough experimental validation. The plan is to endow robotic platforms with the ability to carry out physically unconstrained, open-domain, multi-person interaction and communication. The roadmap of ROBOTLEARN is twofold: (i) to build on the recent achievements of the Perception team, in particular machine learning techniques for the temporal and spatial alignment of audio and visual data, variational Bayesian methods for unimodal and multimodal tracking of humans, and deep learning architectures for audio and audio-visual speech enhancement; and (ii) to explore novel research opportunities at the crossroads of discriminative and generative deep learning architectures, Bayesian learning and inference, computer vision, audio/speech signal processing, spoken dialog systems, and robotics. The main application domain of ROBOTLEARN is the development of multimodal and multi-party interactive methodologies and technologies for social (companion) robots.
Inria Centre at Université Grenoble Alpes
In partnership with
Université Grenoble Alpes