Linkage: an AI tool for the analysis of medical publications

Date: last modified on 28/05/2020
The Linkage platform analyzes and synthesizes the scientific literature on a given subject, for example "COVID-19". Thanks to recently added plugins for extracting and analyzing co-publication and co-citation networks for the PubMed and bioRxiv platforms, Linkage lets doctors and researchers follow publications on a subject while analyzing the research topics they address. It has proven extremely useful for synthesizing the mass of publications on COVID-19 in recent months (16,000 publications on PubMed in the past 4 months).
Charles Bouveyron and Pierre Latouche

Project lead: Charles Bouveyron, Chair & Deputy Scientific Director of the Institut 3IA Côte d’Azur, head of the Maasai research team, Inria Sophia Antipolis - Méditerranée, Université Côte d’Azur

Partners: Pierre Latouche, Professor, Université de Paris

#AI #medical publications #analysis #networks

What is the genesis of your project?

The Linkage technology is the result of a long collaboration with Pierre Latouche, Professor at the Université de Paris, on the statistical analysis of networks. After working for a long time on so-called “binary” networks, for which only the presence or absence of edges is considered, we turned from 2014 onward to more complex networks that mix several types of data (categorical edges, dynamics, texts, etc.). It may be surprising, but our first work on this subject was with colleagues who are medieval historians, with whom we analyzed a social network of bishops in the early Middle Ages [1].

We then realized the importance of handling networks whose edges are characterized by text. This covers a very large number of cases, ranging from social networks and communication networks (emails, SMS, etc.) to transactional data (for example the Panama Papers) and networks of scientific co-publications.
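To make the idea of a network with textual edges concrete, here is a minimal Python sketch of a co-publication network in which each edge between two authors carries the texts of their joint papers. It is only an illustration of the data structure, not Linkage's code or input format, and it does not implement the statistical model; the records, author names and the `build_text_network` helper are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records (assumption): real data would come from a
# bibliographic source such as a publication database export.
publications = [
    {"authors": ["Alice", "Bob"], "abstract": "viral transmission in hospitals"},
    {"authors": ["Bob", "Carol", "Alice"], "abstract": "vaccine trial design"},
    {"authors": ["Dave"], "abstract": "single-author review"},
]


def build_text_network(records):
    """Map each unordered pair of co-authors to the texts of their joint papers."""
    edges = defaultdict(list)
    for record in records:
        # Sorting gives every pair a canonical order, so (A, B) and (B, A)
        # end up on the same edge; set() drops duplicated author names.
        for pair in combinations(sorted(set(record["authors"])), 2):
            edges[pair].append(record["abstract"])
    return dict(edges)


network = build_text_network(publications)
for (first, second), texts in sorted(network.items()):
    print(f"{first} -- {second}: {len(texts)} shared document(s)")
```

In this representation a single-author paper creates no edge, and two authors who publish together several times share one edge that accumulates several texts. That accumulated text is the information a binary network would discard.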

In 2017, we proposed a model and its inference algorithm to analyze such data [2]. In 2018, we made a SaaS (software as a service) platform available to the various communities, which lets anyone analyze their own data or public data through an easy-to-use interface. This ease of use is possible because Linkage analyzes the data autonomously and requires practically no intervention from the user before the interpretation of the results. This is notably because Linkage is based on a statistical model that can be estimated and calibrated automatically from the data.

Thanks to the platform, the technology is used by a very wide audience, ranging from researchers to companies, and across a variety of topics. In addition to the obvious use for monitoring communication networks, Linkage is also a very good tool for analyzing and synthesizing scientific publications through co-publication and co-citation networks.

[1] C. Bouveyron, L. Jegou, Y. Jernite, S. Lamassé, P. Latouche and P. Rivera, The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul, The Annals of Applied Statistics, vol. 8(1), pp. 377-405, 2014.

[2] C. Bouveyron, P. Latouche and R. Zreik, The Stochastic Topic Block Model for the Clustering of Networks with Textual Edges, Statistics and Computing, vol. 28(1), pp. 11-31, 2017.

How is it developing today and what are its objectives?

The Covid-19 epidemic has been a unique use case for researchers and institutions in the health field, where the ability to monitor and synthesize scientific publications on a given theme has proven strategic.

Indeed, with more than 5000 publications and preprints per month on Covid-19, it has become essential for researchers and doctors to have tools that can synthesize publications on this subject by grouping them according to the research themes they address. Researchers and doctors who use Linkage can thus visualize who is publishing with whom about the virus and from what research angle. The figure below shows the result of analyzing with Linkage the Covid-19 publications available on PubMed.

Screenshot of search results with Linkage
© Linkage
Analysis with Linkage of the Covid-19 publications available on PubMed.

Following recent discussions with Inserm, INCa and the AP-HP, we used our own funds to develop new plugins for retrieving medical publication data. Linkage can now retrieve co-publication data from the bioRxiv and medRxiv servers, in addition to the existing PubMed, HAL and arXiv plugins. Data from all of these sources can now also be analyzed with a single query. We have also added the ability to analyze co-citation networks from the PubMed server.

Linkage therefore already offers a very broad spectrum of analysis for biomedical data and will, we hope, help public health researchers and doctors in their missions, against Covid-19 but also against other diseases.

How do you work with your partners?

Pierre Latouche and I are used to developing tools in close collaboration with the people who face data problems in their own practice.

This allows us to formalize a theoretical problem that stays very close to the observed one and to propose an artificial intelligence solution that can then be easily implemented to solve the initial problem.

This often requires interdisciplinary communication, which is not always simple, but it makes it possible to propose tools that are useful... and actually used!