HAL: an exemplary infrastructure to contribute to open, public and sustainable scientific publishing

Updated on 02/12/2021
As the HAL platform celebrates its twentieth anniversary, Laurent Romary, a member of the ALMAnaCH team and a specialist in computational linguistics, talks to us about how it came into being and why, given its success and the growth of open science, its structure and capabilities now need to be strengthened.
Hands on a keyboard with screens in the background
© Inria / Photo C. Morel

Why was the HAL platform created, and how did it evolve?

Portrait Laurent Romary
© Inria / Photo J.-M. Ramès

The HAL open archive was set up by the CNRS to offer a preprint service independent of the arXiv platform in the United States, while maintaining a very strong mirror link with it. It gradually evolved to host more and more author manuscripts of articles published in traditional journals or conference proceedings, as researchers became aware of the issues at stake in what we would now call open science.

Since 2003, with the publication of the Berlin Declaration and its signing by the main French research institutions, including Inria on July 20, 2004, HAL has been seen as an ideal instrument for a more institutional policy of opening up scientific publications widely.

HAL in figures

971,000 documents referenced in 2021

million documents downloaded in 2020

74,000 active user accounts

What part did Inria play in the creation of HAL?

Inria had the technical capacity at the time to set up its own archive, but the institute chose, at the instigation of Jean-Pierre Verjus, then director of communication and scientific information, to join HAL and to open an Inria-branded portal as of April 2005. It was one of the very first institutions to take this step, which was backed from the outset by a strong commitment from the documentation staff, who took charge of moderating submissions and, above all, communicated widely with researchers to encourage them to deposit systematically in HAL.

Inria was also involved very early on, in collaboration with the CCSD, the CNRS service unit that develops HAL, in implementing tools that facilitate, in particular, the batch import of documents (X2HAL) and the automatic extraction of metadata from the source PDF (GROBID). Inria, which has since become one of the CCSD's partner institutions, has led the way in exemplary institutional commitment to open science by introducing a deposit mandate for all researchers in its teams. To date, more than 85% of the institute's full texts are available in HAL.

How is the platform a good example of open science?

Over the years, the HAL platform has integrated all the elements that now make up the core tenets of open science, some of which are captured by the fashionable term FAIR data. From a technical point of view, HAL offers persistent access to its contents, with version management and persistent identifiers. HAL entries are very well indexed by international search engines, and the platform offers a complete access API. For metadata management, HAL is coupled to a series of reference repositories (see the image below, in French) containing the reference entities used in the metadata associated with articles, which makes it possible to control affiliations. The platform also makes it possible to reference scientific journals and to link to the ANR or European projects that funded the corresponding research.
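To give a concrete idea of what such an access API allows, here is a minimal Python sketch. It is not taken from the interview: the endpoint URL, the query parameters (q, fl, rows, wt) and the field names (halId_s, title_s, uri_s) are assumptions based on the publicly documented, Solr-style HAL search API and should be checked against the current documentation. The sample payload and its identifier are hypothetical placeholders, and no network call is made.

```python
from urllib.parse import urlencode

# Assumed public search endpoint of the HAL API (not stated in the article).
HAL_SEARCH_URL = "https://api.archives-ouvertes.fr/search/"


def build_search_url(query, fields=("halId_s", "title_s", "uri_s"), rows=10):
    """Build a search URL; parameter and field names are assumptions."""
    params = {
        "q": query,               # free-text query
        "fl": ",".join(fields),   # fields to return for each entry
        "rows": rows,             # maximum number of entries
        "wt": "json",             # response format
    }
    return HAL_SEARCH_URL + "?" + urlencode(params)


def extract_entries(payload):
    """Return (identifier, title, uri) tuples from a Solr-style JSON payload."""
    docs = payload.get("response", {}).get("docs", [])
    entries = []
    for doc in docs:
        title = doc.get("title_s", "")
        # Multi-valued fields come back as lists; keep the first title.
        if isinstance(title, list):
            title = title[0] if title else ""
        entries.append((doc.get("halId_s", ""), title, doc.get("uri_s", "")))
    return entries


# Hypothetical payload shaped like a Solr response, for illustration only.
sample = {
    "response": {
        "numFound": 1,
        "docs": [
            {
                "halId_s": "hal-00000000",
                "title_s": ["A placeholder title"],
                "uri_s": "https://example.org/hal-00000000",
            }
        ],
    }
}

print(build_search_url("open science", rows=5))
print(extract_entries(sample))
```

In practice, the URL returned by build_search_url would be fetched with any HTTP client and the decoded JSON passed to extract_entries; keeping the two steps separate makes the parsing easy to check without network access.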

Both the contents and the reference repositories come with validation and editing mechanisms that help make the publications in HAL a high-quality scientific corpus. Finally, and above all, HAL offers a whole range of researcher-centred features (identifiers linked to major international databases such as ORCID, personal pages, collection management) that make it a genuine space for researchers to manage their own output.

What are the challenges that HAL will have to face in the coming years?

HAL has become a reference platform for many institutions that, like the CNRS recently, have adopted systematic deposit policies. One of its main challenges is to keep offering services that are close to the researchers themselves and to be integrated into the overall scientific publishing cycle. It is in this spirit that we support the deployment of Episciences, the platform for managing epi-journals, which enables peer review of preprints previously deposited in HAL (as well as in other open archives such as arXiv).

In addition, HAL must make it possible to link a publication closely to the associated datasets or software. We have thus contributed to the implementation of software deposit in HAL, referencing the source code archived in Software Heritage. Finally, as its content grows, HAL is becoming a credible alternative to the major international bibliometric databases (WoS, Scopus, etc.) for tracking French scientific output and analysing its strengths. To that end, we need to think about the next generation of tools and interfaces, which will in particular allow each researcher, team or institution to obtain publication lists for a web page or to analyse their own publications.