Project-team

ALMANACH

Automatic Language Modelling and Analysis & Computational Humanities

The ALMAnaCH team (Automatic Language Modelling and Analysis & Computational Humanities) focuses on Natural Language Processing (NLP), a key area within Artificial Intelligence (AI) and Digital Humanities (DH), at the crossroads of theoretical computer science, machine learning, and linguistics. The team’s work covers a wide variety of topics related to language variation, both historical and within contemporary language states (for example, developing robust NLP systems for noisy web content and dialectal varieties). Our interests also extend to the pre-training of neural networks (e.g. the CamemBERT model), the interpretability of neural approaches, language resource development (e.g. the OSCAR corpus, treebanks, parallel datasets, and lexicons, as well as historical corpora built using OCR and HTR applied to archives and other historical documents), evaluation, and information extraction and retrieval (especially from specialised corpora and historical documents).

Inria centre(s)
Inria Paris Centre

Members

Team leader

Meriem Guemair

Team assistant

News