VALDA Research team

Value from Data

Team presentation

Valda’s focus is on both the foundational and the systems aspects of complex data management, especially for human-centric data. The data we are interested in is typically heterogeneous, massively distributed, rapidly evolving, intensional, and often subjective, possibly erroneous, imprecise, or incomplete. In this setting, Valda is particularly concerned with optimizing the use of complex resources such as computing time and space, communication, and monetary and privacy budgets. The goal is to extract value from data, beyond simple query answering.

Research themes

  1. Foundations of data management. The systems we are interested in, i.e., those that manipulate heterogeneous, confidential, rapidly changing, and massively distributed data, are inherently error-prone. Moreover, because accessing intensional data is costly, it is important to optimize the resources needed to manipulate it. This can only be achieved with solid foundations for data management systems. These foundations are the ground for appropriate specifications (confidentiality rules, robustness properties, etc.), for formal and runtime verification of these specifications, for the design of appropriate query languages (with good expressive power and limited resource usage), for the design of good indexes (for optimized evaluation), and so on.

  2. Uncertainty and Provenance of Data. This research axis deals with the modeling and efficient management of data that comes with some uncertainty (probability distributions, logical incompleteness, etc.) and with provenance information (indicating where the data originates). Tools and foundations for uncertainty management and provenance management are often similar.

  3. Personal Information Management Systems. A PIMS is a system that allows a user to integrate her own data, e.g., emails and other kinds of messages, calendar, contacts, web search, social network, travel information, work projects, etc. Such information is commonly spread across different services. The goal is to give the user back control over her information, allowing her to formulate queries such as “What kind of interaction did I have recently with Alice B.?” or “Where were my last ten business trips, and who helped me plan them?”. The system has to orchestrate queries to the various services and integrate the information they return, e.g., align a GPS location of the user with a business address or a place mentioned in an email, or an event in a calendar with an event in a web search.

Keywords: Complex data, Theory, Systems, Uncertainty, Provenance, Personal data