Text mining in biology
The Bioinformatics PPF is organising a science day exploring text mining applications in biology. The programme will include data extraction in genomics, inference of genetic interactions and much more.
- Date: 20/09/2011
- Place: Amphitheatre of the IRI-IRCICA building, CNRS campus
- Organiser(s): PPF Bio-informatique
The day will be held in the amphitheatre of the IRI-IRCICA building, from 9.45am to 5pm.
Mikaela Keller, a member of the Inria Mostrare project team and of the University of Lille 3, will give a talk on automated vocabulary discovery for geo-parsing online epidemic intelligence.
Automated vocabulary discovery for geo-parsing online epidemic intelligence
Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, in real time, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest to national and international public health organizations and international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human. Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach which relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but focuses on learning the context in which geographic references appear. We show, in a set of experiments, that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon.
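To make the idea in the abstract concrete, here is a minimal sketch of gazetteer-seeded context learning. It is not the method presented in the talk: it swaps the learned contextual model for a simple count of the word preceding each known place name, and the gazetteer, the sample alerts and the function names (`learn_contexts`, `discover_places`) are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def learn_contexts(sentences, gazetteer):
    """Count the words that immediately precede known place names."""
    contexts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for prev, tok in zip(tokens, tokens[1:]):
            if tok.strip(".,") in gazetteer:
                contexts[prev.lower()] += 1
    return contexts


def discover_places(sentences, gazetteer, contexts, min_count=2):
    """Propose capitalised tokens outside the gazetteer that follow a
    context word seen at least `min_count` times before known places."""
    found = set()
    for sentence in sentences:
        tokens = sentence.split()
        for prev, tok in zip(tokens, tokens[1:]):
            word = tok.strip(".,")
            if (word[:1].isupper()
                    and word not in gazetteer
                    and contexts[prev.lower()] >= min_count):
                found.add(word)
    return found


# Hypothetical seed gazetteer and made-up alerts (not HealthMap data).
gazetteer = {"Paris", "Lima", "Cairo"}
alerts = [
    "Cholera outbreak reported in Lima yesterday.",
    "New measles cases confirmed in Paris.",
    "Officials in Cairo issued a dengue warning.",
    "Avian flu detected in Hanoi this week.",
]

contexts = learn_contexts(alerts, gazetteer)
new_places = discover_places(alerts, gazetteer, contexts)
print(new_places)
```

In this toy setting the word "in" becomes a learned context because it precedes all three gazetteer entries, so a capitalised token that follows it and is absent from the gazetteer is proposed as a candidate location. A realistic system would use richer context features and a statistical classifier rather than a single preceding word.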
Participation is free but registration is required.
To register, please send the following information to Guillemette Marot:
- Attendance at lunch
Text mining in biology
Text mining, sometimes alternately referred to as text data mining, refers to the process of deriving high-quality information from text. High-quality information is typically derived through the devising of patterns and trends through means such as statistical pattern learning. In practice this means converting a simplified model of linguistic theories into algorithms in learning and statistical information systems. Source: Wikipedia