B. Sagot, leader of the new ALMAnaCH team

The ALMAnaCH project team is based at the Inria Paris Research Centre and is headed by Benoît Sagot. ALMAnaCH is an Inria project team which follows on from the Alpage project team.

What does ALMAnaCH stand for? What are the main focuses of your research? 

The new ALMAnaCH team (standing for Automatic Language Modelling and Analysis & Computational Humanities) carries out research in the field of natural language processing (NLP). This is a very exciting field, which is part of the new momentum seen in AI. It calls for multidisciplinary skills, primarily in theoretical computer science, machine learning, and theoretical and descriptive linguistics.

ALMAnaCH follows on from where the ALPAGE project team left off, while extending the research themes. It takes up ALPAGE's research on syntactic and semantic parsing of natural language, including the analysis of noisy data coming from the web, using symbolic, statistical, neural and hybrid techniques. The team also intends to continue developing linguistic resources. One new aspect of our work will be a focus on modelling and taking context into account, meaning both linguistic context (for example, in a conversation with a chatbot) and non-linguistic context. Our research on computational linguistics (language modelling) will also explore diachronic variation (modelling how languages evolve over time). Lastly, the team is particularly interested in digital and computational humanities, for example, by adapting and applying NLP techniques to historical corpora, which will include ancient papyrus documents and documents of historical value dating from a few hundred years ago to a few decades ago. Applying this kind of approach to handwritten or printed documents raises a number of issues, ranging from handwriting recognition to extracting and mining data from ancient documents.

These new subjects of research are the reason for the collaboration between Inria and the École Pratique des Hautes Études (EPHE). Two EPHE research professors (one Computer Scientist and one Philologist, both specialising in Digital Humanities) are members of the team.

Is this fundamental or applied research? What applications might it have?

The scope of the team's research is vast, encompassing theoretical questions (formal languages, algorithmics, and formal linguistics) as well as industrial applications, operational prototypes (syntax analysers) and academic applications (computational language and language evolution modelling), and also medical applications.

Do you have any industrial or academic partnerships?

Among others, we are currently involved in five ANR projects, four H2020 projects, a French/US ANR-NSF project, as well as several networks such as the Empirical Foundations of Linguistics LabEx and the Huma-Num Very Large Facility. An Inria startup specialising in employee opinion polls has just been set up in partnership with the ALMAnaCH team. We have many contacts with the NLP and AI industry, primarily through projects for which we have funding or have applied for funding. At the Research Centre, we have particularly close ties with the Willow, Sierra and (soon) CoML teams.

