Infrastructure software

Inria: helping to modernise income tax calculations with Mlang

Date:
Changed on 13/04/2021
After Etalab helped them to get things off the ground back in 2016, and through close collaboration with the Direction Générale des Finances Publiques (the French public finance department), three researchers, among them Denis Merigoux (from Inria Paris), were given the opportunity to access the source code for France’s income tax calculator. After nearly two years of research, they have overhauled the language M and its compiler, making tax calculations more reliable.
Mlang
© PuntoStudioFoto Lda - stock.adobe.com

 

Each year, close to 37 million households in France pay income tax. Calculated based on individual declarations made by taxpayers and information the tax authorities have at their disposal, it takes into account everything relating to people’s income and expenses, as well as their circumstances. This ensures that contributions are as fair as possible in terms of people’s individual circumstances, helping to avoid any rebates or penalties.

The DGFiP calculator: combining reliability with speed

The software architecture responsible for making calculations (called “the calculator”) is based on a dedicated language, M, which was developed in the 1990s by the SI-1E office of the DGFiP.

Recognised for its high performance levels, this calculator is currently capable of processing all income tax calculations for all homes in France in less than 16 hours (an average of 650 households a second). It has also successfully adapted to increasingly tight legislative requirements over the past thirty years. As a result of these, the codebase has to be updated annually in order to take into account any changes to income tax rates introduced as part of the budget.

A solution that is showing its limitations

However, the computing landscape has changed dramatically since the 1990s, and people with knowledge of the technology the calculator employs are thin on the ground. Similarly, the tax landscape has also changed significantly. In order to adapt to these changes, workarounds have had to be found in order to compensate for M’s limitations.

Formal methods is a field within IT that is centred around the mathematical analysis of programming languages and software, the goal being to make them safer, more efficient and more secure. Inria has a reputation as a leader in this sector at a global level, having earned recognition for its Coq proof assistant..

One solution to this was to overhaul this infrastructure using modern technology in order to develop a new compiler capable of supporting updates, in addition to opening up new opportunities, such as the calculator being able to work with new tools... which is exactly what two IT researchers specialising in formal methods, Denis Merigoux (Prosecco – Inria) and Raphaël Monat (APR - LIP6 - CNRS - Sorbonne University), have done. “My research speciality is viewing programmes as mathematical objects that you can work on in order to improve results. That's what I did with the tax code,” explains Denis Merigoux.

Denis Merigoux and Raphaël Monat were granted access to the source code for the calculator under the terms of a confidentiality agreement relating to the critical aspects of this code that was signed with the DGFiP. With support from experts from the SI-1E office, they explored M and its compiler before devising a way of overhauling it employing cutting-edge technology.

Mlang: a modern compiler for income tax calculations

This overhaul led to the writing of a new compiler, Mlang, which was created using Ocaml (developed by Inria) and published opensource. A research paper outlining how Mlang works was also written, with support from Jonathan Protzenko (Microsoft Research, a member of the Microsoft Research-Inria Joint Centre).

find out more about how Mlang works

Having been tested and approved by the DGFiP, Mlang was transferred to SI-1E, who decided to prolong the research in order to replace the original compiler for M. Work on this replacement is set to begin in March, and it is expected to be operational within the next two years. Long-term, it will make both the characteristics and the semantics of the language more reliable, in addition to adding new features to M through the use of modern compilation techniques. Eventually, the tax code could be accompanied by a correction proof.

Public research working hand-in-hand with the tax authorities

None of this would have been possible without the code for the income tax calculation algorithm being made opensource. First published in 2016 by Etalab, it was this that helped the researchers to get their investigations up and running.

It is only fitting, therefore, that Mlang is also to be distributed opensource. This will make it possible to fully replicate the original tax calculation found on people’s assessment notices which, it is hoped, will help to build trust in income tax calculations by allowing people to access the algorithm and understand the results.