TAU Research team
TAckling the Underspecified
- Leader: Marc Schoenauer
- Type: team
- Research center(s): Saclay
- Field: Applied Mathematics, Computation and Simulation
- Theme: Optimization, machine learning and statistical methods
- Inria teams are typically groups of researchers who work on defining a common project and its objectives, with the goal of creating a project-team. Such project-teams may include other partners (universities or research institutions).
Building upon the expertise in machine learning (ML) and stochastic optimization of the former TAO project-team, the TAU team aims to tackle the vagueness of the goals assigned to Big Data.
Based on the claim that (sufficiently) big data can to some extent compensate for a lack of knowledge, Big Data is expected to deliver on all the promises of Artificial Intelligence.
This makes Big Data under-specified in three respects:
- A first source of under-specification is related to common sense, and the gap between observation and interpretation. The acquired data do not report on "obvious" issues; yet what is obvious to a human is not necessarily obvious to a computer. Providing the machine with common sense is a many-faceted challenge, as old as AI itself. A current challenge is to interpret the data and cope with its blind zones.
- A second source of under-specification regards the steering of a Big Data system. Such systems commonly require constant learning in order to deal with open environments and with users of diverse profiles, expertise and expectations. A Big Data system is thus a dynamic process, whose behavior will depend cumulatively on its future environment. The question is how to control a lifelong learning system.
- A third source of under-specification regards its social acceptability. There is little doubt that Big Data can pave the way for Big Brother, and ruin the social contract by modeling benefits and costs at the individual level. What are the fair trade-offs between safety, freedom and efficiency? We do not know the answers. A first practical and scientific challenge is to assess the fairness of a solution.
TAU currently tackles these under-specification issues along four core research dimensions, drawing inspiration and validation from four main application domains. The research dimensions are Causal Modelling (required to support prescriptive Big Data), Deep Learning (related to constructive representations and their compositionality), Optimization and Meta-Optimization (including sequential decision making and the categorization of problems), and Big-Data Driven Design. The application domains include the long-standing domains of Energy Management and High Energy Physics, the more recent focus of TAO/TAU on Computational Social and Economic Sciences, and, new this year, the Autonomous Vehicle and Population Genetics.
Research teams of the same theme:
- BONUS - Big Optimization aNd UncertaintieS
- GEOSTAT - Geometry and Statistics in acquisition data
- INOCS - Integrated Optimization with Complex Structure
- MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
- MODAL - MOdel for Data Analysis and Learning
- RANDOPT - Randomized Optimisation
- REALOPT - Reformulations based algorithms for Combinatorial Optimization
- SELECT - Model selection in statistical learning
- SEQUEL - Sequential Learning
- SIERRA - Statistical Machine Learning and Parsimony