How would you define Open Science?
L.R: The concept of open science emerged as people became aware that the technological progress made in the past 50 years was opening up the possibility of wider access to scientific results, not just for researchers, but also for businesses and society as a whole.
While initially open science was aimed at redressing the balance within scientific publishing, skewed in favour of the big private publishers, attention turned towards the possible benefits of scientific results being made more widely available, including in terms of reproducibility.
For an institution such as Inria, open science also entails greater control over content from an editorial and technological perspective, enhancing not only visibility and citability, but also longevity. In addition to the strictly scientific results produced by our researchers, we also factor in everything relating to education and science outreach, the aim being to disseminate the resources we produce as widely as possible.
What are the main current issues for open science? And what are the goals for an incentive policy in the short- to medium-term?
The National Plan for Open Science (PNSO - Plan National pour la Science Ouverte), the second edition of which was published by the MESRI in July 2021, is based on three pillars: publications, research data and the source code of research software. Although each of the three has its own history and specificities, we were able to identify two main areas of focus for a policy geared towards promoting openness. Firstly, it is necessary to have durable public infrastructure capable of hosting this content and ensuring long-term access to it. Here the emergence of private stakeholders in the field of publishing, such as ResearchGate or Academia, has caused real tension: they understood the benefits of appropriating scientific content for financial gain.
Secondly, open science must be integrated as much as possible into the work of researchers, without adding excess paperwork. One key element is integrating open science into the production of teams’ activity reports or end-of-project reports, so that publications filed in the HAL open archive are added to these reports automatically, without any extra effort from researchers.
Putting an emphasis on open science when it comes to education and science outreach also shows our willingness to bring scientists and ordinary citizens together.
What is Inria doing to promote the development of the concept of Open Science at a cultural and technological level?
Inria has always had a policy of participating in the development of open science platforms and supporting researchers.
Having decided at an early stage to deploy its own portal in the open publication archive HAL, the institute is now a partner of the support and research unit CCSD (Centre pour la Communication Scientifique Directe - Center for Direct Scientific Communication), in collaboration with the two other parent organisations, the CNRS and INRAE. At the same time, our Publishing & Publications Department has taken steps to help researchers file publications in HAL, to moderate content and, most importantly, to guarantee the quality of content in order to facilitate reuse.
When it comes to the source code of research software, Inria has supported the development of the Software Heritage initiative from the outset. Its aim is to inventory and archive the entire open source software heritage, both past and present. Software Heritage currently hosts close to 12 billion source code files, corresponding to 170 million software projects.
As for resources for education and science outreach, Inria has long placed an emphasis on the importance of sharing and distribution; examples include Interstices, used for exploring digital science, plus the different MOOCs produced by Inria Learning Lab, the results of which speak for themselves.
Inria remains a pioneer when it comes to open publishing, but what about research data?
Research data is far more heterogeneous and complex to manage than publications are: it can just as easily be primary data produced by sensors or simulations, or collected from online resources; data resulting from complex computer processing, as is the case with the parameters of a machine learning model; or targeted datasets used to illustrate a publication. Depending on the type of research, there can be significant variations in volume, and also in formats, which differ in their level of structure or standardisation. Finally, a number of constraints can apply to the data we’re working on when it involves information of a personal, medical or commercial nature, or information governed by copyright. What the institute must do is provide our research teams with secure resources for hosting, while facilitating open-access distribution when the corresponding conditions allow it.
The first stage for Inria will be to support researchers in reflecting on the issue of data management, particularly in the context of producing data management plans such as those imposed by the ANR (the French National Research Agency) or the European Union, or in the context of the recent decree on scientific integrity. Inria hopes to soon be able to draw upon Recherche Data Gouv, a national platform for hosting research data.