Science day

Text mining in biology


The Bioinformatics PPF is organising a science day exploring text mining applications in biology. The programme will include data extraction in genomics, inference of genetic interactions and much more.

  • Date: 20/09/2011
  • Place: Amphitheatre of the IRI-IRCICA building, CNRS campus
  • Organiser(s): Bioinformatics PPF

The day will be held in the amphitheatre of the IRI-IRCICA building, from 9.45am to 5pm.

Mikaela Keller, a member of the Inria Mostrare project team and of the University of Lille 3, will give a talk on automated vocabulary discovery for geo-parsing online epidemic intelligence.


Automated vocabulary discovery for geo-parsing online epidemic intelligence  
Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a real-time basis, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of particular interest to national and international public health organizations and international travelers. A task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach which relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but focuses on learning the context in which geographic references appear. We show, in a set of experiments, that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon.
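
To give a rough feel for the idea in the abstract, here is a deliberately simplified Python sketch. It is not the method presented in the talk: it assumes a toy gazetteer, a handful of invented alert headlines, and a crude notion of "context" (the single word preceding a place name). The function names `learn_contexts` and `discover_candidates` and the `min_count` threshold are hypothetical, chosen only for illustration.

```python
from collections import Counter


def learn_contexts(sentences, gazetteer):
    """Count the words that immediately precede known place names."""
    contexts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if i > 0 and tok in gazetteer:
                contexts[tokens[i - 1].lower()] += 1
    return contexts


def discover_candidates(sentences, gazetteer, contexts, min_count=2):
    """Return capitalised tokens outside the gazetteer that follow
    a context word seen at least `min_count` times before known places."""
    found = set()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            # Skip sentence-initial tokens, known places and lowercase words.
            if i == 0 or tok in gazetteer or not tok[:1].isupper():
                continue
            if contexts[tokens[i - 1].lower()] >= min_count:
                found.add(tok)
    return found


# Toy data, invented for this example only.
gazetteer = {"Kenya", "Peru", "Laos"}
alerts = [
    "Cholera outbreak reported in Kenya".split(),
    "Dengue cases rising in Peru".split(),
    "Measles outbreak reported in Laos".split(),
    "Ebola cases confirmed in Gabon".split(),
    "Officials said Tuesday that cases fell".split(),
]

contexts = learn_contexts(alerts, gazetteer)
candidates = discover_candidates(alerts, gazetteer, contexts)
print(candidates)
```

On this toy data the word "in" precedes every gazetteer entry, so "Gabon" is proposed as a new location even though it is absent from the initial lexicon, while "Tuesday" is not. A real system would use much richer contextual features and a trained statistical model rather than raw counts.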

Participation is free but registration is required.

See the full programme


Keywords: Text mining, Biology, Bioinformatics PPF, Mikaela Keller, Mostrare project team, Inria Lille - Nord Europe research centre