OAK Research team

Database optimizations and architectures for complex large data

  • Leader: Ioana Manolescu
  • Research center(s): CRI Saclay - Île-de-France
  • Field: Perception, Cognition and Interaction
  • Theme: Data and Knowledge Representation and Processing
  • Partner(s): CNRS, Université Paris-Sud (Paris 11)

Team presentation

The goal of OAK is to devise expressive models and languages, and efficient algorithms, in order to support complex processing on large-scale complex data. The complex data formats and models we consider include tree and graph data, temporal data, data with complex semantics (as described, for instance, by an RDF Schema), data in very complex warehouses, and ad-hoc formats for emerging applications. In this context, OAK investigates database optimizations that enable scale-up to very large data volumes, specializing in algebraic optimization, storage optimization through indexes and materialized views, and static analysis of queries and updates.

Research themes

In pursuit of this goal, our work focuses on:

  1. Data with complex structure, such as structured documents or trees (in particular XML or JSON), graph-based data (typically RDF), and data described by complex schemas and semantics (expressed, for instance, by an XML Schema or an RDF Schema).
  2. Complex processing, understood as fine-granularity search, transformation and update of data. While XQuery and SPARQL frame most of our prior and current work, we are more generally interested in formats for structured complex data, typically represented by nested records or graphs of connected objects.
  3. Efficient algorithms for (i) analyzing the specification of a given processing task and identifying interesting equivalent specifications and/or decompositions of the original task into subtasks (corresponding to the traditional logical optimization step in a DBMS), and (ii) efficiently executing a given processing task, possibly with the help of specialized data structures. Distribution of the data and processing plays an important role here, in particular from the perspective of parallel evaluation in the cloud.

International and industrial relations

OAK researchers work closely with colleagues from UCSD (USA), Politecnico di Milano (Italy), U. Basilicata and U. Pisa (Italy), TU Berlin (Germany), and TU Delft (the Netherlands). Current industrial partners include DataPublica and SAP; in the past, we have also collaborated with Thales, EADS, Bongrain, Mandriva, and others.

Keywords: Large-scale databases Semantic Web data Web RDF XML