Colloque Interfaces

What would you do with billions of source code files? News from Software Heritage

"Interfaces" is the Aquitaine scientific symposium for digital sciences. Several times a year, this series of talks hosts scientists from around the world, recognized for the quality of their work and the results they produce. Topics cover computer science and applied mathematics, and above all their intersections with other sciences and fields, such as medicine, the social sciences and humanities, and art.

  • Date : 7 June 2017
  • Place : Research Centre Inria - Bordeaux - Sud-Ouest, Room Ada Lovelace
  • Guest(s) : Roberto Di Cosmo
  • Organiser(s) : Inria Bordeaux - Sud-Ouest

For its next edition, Interfaces, the Aquitaine scientific symposium for digital sciences, welcomes:

Roberto Di Cosmo
Computer Science professor at University Paris Diderot

The talk will be held in English and will be followed by a drink.  

What would you do with billions of source code files? News from Software Heritage

Ten years of work analysing the characteristics of large open source software repositories have taught us some lessons about the key properties needed for large-scale software engineering studies of this kind. This led us to launch Software Heritage, the most ambitious project to date to build a universal knowledge base of software source code. The size of this archive is daunting, with billions of unique source code files coming from tens of millions of repositories. We will explain the mission of Software Heritage and highlight some of the new challenges, both organisational and scientific, that Software Heritage raises.

Roberto Di Cosmo holds a PhD in Computer Science and is currently Computer Science professor at University Paris Diderot, after teaching for almost a decade at École Normale Supérieure in Paris, and spending a few years at INRIA.

He has been actively involved in research in theoretical computer science, specifically in functional programming, parallel and distributed programming, the semantics of programming languages, type systems, rewriting and linear logic. He now focuses on new scientific problems posed by the general adoption of Free Software, with a particular focus on static analysis of large software collections, which was at the core of the European research project Mancoosi.

He follows the evolution of our society under the impact of IT with great interest and is a long-term Free Software advocate, contributing to its adoption since 1998 with the best-seller Hijacking the world, seminars, articles and software. He created the Free Software thematic group of Systematic in October 2007, and he has been director of IRILL, a research structure dedicated to Free and Open Source Software quality, since 2010.

In 2016, he co-founded Software Heritage, an initiative to build the universal archive of all publicly available source code, which he now directs.