Covid On The Web

Date: changed on 28/05/2020
We adapt and combine several tools to facilitate access to Covid-19 information and data. The ACTA platform is designed to help clinicians analyse clinical trials by automatically extracting arguments and displaying them in graphical form to support decision making. The CORESE and MGExplorer platforms allow knowledge graphs to be manipulated and visualized on the Web. The Covid On The Web project adapts and combines these tools and applies them to a corpus and a continuously enriched knowledge graph about Covid-19. The project draws on scientific expertise in knowledge representation, data and argument mining, information extraction from text, data querying and data visualization.

What is the origin of your project?

Project holder: Fabien Gandon, head of the Wimmics project-team

Partners: Université Côte d'Azur, CNRS, I3S

Assisting the use of scientific literature on coronaviruses

Scientists from all domains are harnessing their multidisciplinary expertise and resources to fight the Covid-19 pandemic. To contribute to this effort, the Wimmics team decided to use the confinement period to launch the Covid-on-the-Web project as a sprint to adapt and combine its methods, models and tools (ACTA, Corese, MGExplorer, Morph-xR2RML) in order to process, analyse and enrich the “Covid-19 Open Research Dataset” (CORD-19), which gathers 50,000+ full-text scientific articles related to coronaviruses.

How is it evolving today and what are its objectives?

Extracting, publishing and visualizing a knowledge graph about Covid-19

The goal is to make it easier for biomedical researchers to access, query and make sense of Covid-19 related literature. We designed a pipeline to continuously enrich a knowledge graph about Covid-19, along with software to exploit it, leveraging knowledge representation, text, data and argument mining, data visualization and exploration. The pipeline extracts the named entities mentioned in articles (linked to DBpedia, Wikidata and other Bioportal vocabularies) as well as argumentative graphs, which are meant to help clinicians analyse clinical trials and make decisions. On top of this knowledge graph, we developed, adapted and deployed several tools providing visualizations and exploration methods, as well as notebooks for data scientists.

ACTA processing pipeline
© CovidOnTheWeb - Wimmics

How do you work with your partners?

Addressing motivating scenarios and competency questions from biomedical institutions

Several biomedical institutions have shown interest in using our resources, whether as direct project partners (French Institute of Medical Research - Inserm, French National Cancer Institute - INCa) or indirect ones (e.g., Antibes Hospital, Nice Hospital). For now, these institutions act as potential users of the resources and as co-designers. Through active discussions with INCa and Inserm, we are ensuring that our approach is guided by and aligned with the actual needs of the biomedical community. Following a user-oriented approach, we are designing the tools and resources according to motivating scenarios identified through a needs analysis of the biomedical institutions. One of the very first example queries they suggested we work on was “find all articles that talk about both a type of cancer and a virus of the corona type”. We are constantly eliciting meaningful new queries from the potential users we interview, and these queries serve to specify and test our knowledge graph and services.
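To make the shape of such a competency question concrete, here is a minimal sketch in Python. It does not use the project's actual knowledge graph, vocabularies or query services: the annotation tuples, the category names ("cancer", "coronavirus") and the function `articles_mentioning_all` are all hypothetical stand-ins chosen for illustration. The sketch only shows the underlying logic, which is to keep the articles whose extracted entities cover every requested category.

```python
from collections import defaultdict

# Hypothetical toy annotations: (article id, entity label, entity category).
# The real knowledge graph links entities to DBpedia, Wikidata and Bioportal
# vocabularies; the flat categories used here are a simplifying assumption.
ANNOTATIONS = [
    ("article-1", "lung cancer", "cancer"),
    ("article-1", "SARS-CoV-2", "coronavirus"),
    ("article-2", "MERS-CoV", "coronavirus"),
    ("article-3", "breast cancer", "cancer"),
    ("article-4", "leukemia", "cancer"),
    ("article-4", "SARS-CoV-1", "coronavirus"),
    ("article-4", "MERS-CoV", "coronavirus"),
]


def articles_mentioning_all(annotations, required_categories):
    """Return the sorted ids of articles annotated with every required category."""
    # Group the categories of the entities found in each article.
    categories_by_article = defaultdict(set)
    for article_id, _label, category in annotations:
        categories_by_article[article_id].add(category)

    # Keep an article only if its categories include all required ones.
    required = set(required_categories)
    return sorted(
        article_id
        for article_id, categories in categories_by_article.items()
        if required <= categories
    )


matches = articles_mentioning_all(ANNOTATIONS, ["cancer", "coronavirus"])
print(matches)
```

On the toy data above, only the articles annotated with both a cancer entity and a coronavirus entity are returned. Against a real knowledge graph, the same intersection would more likely be expressed as a graph query over typed entities than as an in-memory scan.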

The SARS-CoV-2 outbreak is linked to a so-called emerging virus. Since its appearance in China in December 2019 and its spread on a global scale from January 2020, the effects of this virus, such as the broad spectrum of affected organs (ENT, lung, nervous system, skin, etc.), have been discovered gradually as the epidemic has progressed.

However, the links between SARS-CoV-2 infection (asymptomatic forms, severe forms or even possible reinfections) and cancer are not known. Moreover, the role of several viruses in the development of different types of cancer has been demonstrated (e.g. HPV, HBV, EBV, etc.) or is suspected to a greater or lesser extent (IARC monographs).

Thus, beyond the fate of cancer patients who are subsequently infected by SARS-CoV-2, it cannot be excluded that the virus plays a role, in the medium or long term, in the predisposition to cancer, or that it is involved in the evolution of a cancer or the appearance of a second cancer (e.g. pulmonary, ENT, brain, etc.). In addition, it would be relevant to study retrospectively the two earlier coronavirus epidemics, SARS-CoV-1 and MERS-CoV, which appeared in South-East Asia in 2002 and in the Middle East in 2012 respectively: their impact on the later development of cancer and, more broadly, their impact in relation to cancer.

It is in this context that the collaboration between INCa and the Wimmics team was born.

 

Karima Bourougaa, PhD, Scientific Affairs Manager, Research and Innovation Division, INCa

“Indeed, the Wimmics team's expertise in the Semantic Web proved necessary and indispensable for identifying potential links between cancer and coronaviruses. The team can translate informal exchanges or research hypotheses into specific queries in order to retrieve all the relevant data. This collaboration underscores both this complementarity and the need to develop advanced search tools that retrieve information of all kinds (not limited to peer-reviewed journals) in order to study the potential links between cancer and infection by one of the coronaviruses. This work will also make it possible to anticipate the possible impact on the development of cancer and to propose research programming adapted to the research questions that will be identified.”