CEDAR Research team

Rich Data Exploration at Cloud Scale

Team presentation

In today's data-intensive applications, variety is the norm, and is likely to remain so for a while. This is because different applications are best served by different kinds of data: traditional commerce-oriented applications use relational databases, Web content management systems handle semistructured documents, sensors provide numerical streams, science applications manipulate arrays, highly heterogeneous data sets are often exported as RDF graphs, software system logs consist of structured text, etc.
At the scale and speed of consumption of today's Big Data, unifying data across such formats into a single architecture (an approach known as extract-transform-load in a data warehouse context) is no longer feasible. Instead, CEDAR aims at inventing expressive models and highly efficient data management tools, focused from the start on Big Data variety. Our tools are designed for deployment in the cloud, and validated at large scale.

Research themes

Our research can be viewed as pertaining to two broad areas.

Within the cloud (under the hood of the data processing system), our research aims at building efficient platforms for data analytics at very large scale. In this area, we are particularly interested in:

1. Scalable heterogeneous stores

2. Semantic query answering

Outside the cloud, at the interface between the data management system and its users, we seek to revisit the paradigms of interaction between users or applications and the system, by endowing users and applications with novel data access primitives that facilitate and enrich this interaction. We consider in particular the following axes:

3. Exploratory querying of semantic graphs

4. Representative semantic query answering

International and industrial relations

Outside France, we collaborate with UCSD (Alin Deutsch), AT&T (D. Srivastava), U. Madison Wisconsin (D. DeWitt), U. Berkeley (M. Franklin), TU Dresden (S. Rudolph) and U. Bolzano (D. Calvanese).

Industrial partners include Business & Décision (EOLAS), SemSoft and Le Monde.

Keywords: Big Data, Knowledge representation, Data analytics, Data exploration, Cloud, Databases, Semantic Web