Cambium unveils a new version of OCaml programming language

Changed on 21/12/2022

OCaml 5, a major new version of the open source programming language OCaml, has just been released by the Cambium project team researchers. Thanks to its compiler, which translates code into the language used by each machine, OCaml 5 offers developers numerous new features, including support for concurrent programming on multicore systems.

Recipes for machines to follow

Java, C++, Python, PHP, OCaml, Pharo... There are many computer programming languages. Each offers specific traits and advantages, and they have a tendency to multiply which shows no signs of stopping. "A program can be compared to a recipe, or actually – to give a better sense of size – a cookbook. Each recipe (for example, "How to make a white roux") is broken down into a series of simple instructions that the machine must follow step by step in order to create the desired dish for dinner", explains researcher François Pottier, leader of the Cambium, joint research team with Collège de France (Cambium took over from Gallium in 2019).

"A high-level language (such as Java, Python or Javascript) can be used to write an instruction that groups together several steps, without explaining each one in detail: for example, 'melt 100g of butter', he explains. By contrast, with a low-level language (such as C, Ada or Fortran) every basic step has to be explained. For example, 'switch on the scales; add pieces of butter to the scales until the machine indicates 100g or more; put these butter pieces in a pan; place the pan on the cooker; turn on the heat beneath the pan,' etc." Nothing can remain implicit. The Cambium project team leader emphasises that "we have to think about the equipment that will make this program work, and specify the details of all the characteristics of the utensils and methods to be used."

An ultra-reliable programming language

Over the last 40 years, OCaml has asserted itself as one of the most secure and fast languages for critical systems and symbolic computation applications, which perform calculations based on mathematical expressions. For example, it is used by Meta (parent company of Facebook) to develop the program analyser "Infer"; also by AbsInt for the "Astrée" program analyser (which is used by Airbus), and by a number of Inria teams, notably to develop the Coq proof assistant.

More recently, part of the software used by Jane Street, an American quantitative trading firm, and by LexiFi, a French developer of software for analysis and management of financial products, were written in OCaml, as was Tezos, a French-designed blockchain that specialises in managing "tokens" (digital financial assets). Why? "As a programming language, OCaml meets the reliability requirements of these companies, whose software is used to handle millions, if not billions of Euros," says Florian Angeletti, research engineer on the Cambium project team, who works on both the compiler (machine language translator) and the type system (error detection) in OCaml. "Companies want to be certain that the commands executed by the programs match their intentions."

37 years of open source developments

From the outset, OCaml was designed to meet a high level of requirements. The initial versions of the language were created in the 1980s, following on from the work of Robin Milner, a researcher at the University of Edinburgh, on ML ("Meta-Language"). "At that time, researchers on Inria's ‘Formel’ project team studied this innovative language and contributed to its development, before going on to create a version called CAML (pronounced 'camel') in 1987," François Pottier recalls.

Starting in 1990, the Inria researcher Xavier Leroy went back to the drawing board, and developed Caml Light, then Caml Special Light. Subsequently in 1996, research by Didier Rémy, who is still in the Cambium team, and Jérôme Vouillon, nowadays a research fellow at France’s CNRS, yielded the first version of Objective Caml (renamed OCaml in 2011). Its specificity is that it is object-oriented, which means that programs written in OCaml include objects that interact with each other using messages.

Like its older siblings, this version is also published as open source software by Inria, and therefore enriched by an international community of contributors, both professional and academic. As an example of this, the compiler, which translates the programmes into machine language, was recently enhanced by innovations applied by Tarides (a company that merged with OCaml Labs at the start of 2022) and by researchers from the University of Cambridge and the Indian Institute of Technology (New Delhi and Madras), etc.

Inria is still the main contributor to this language, which "is of strategic importance," in François Pottier's opinion. "In addition to Coq, about thirty software packages developed by the Institute's project teams are implemented in OCaml." Another we could mention is the start-up OCamlPro, an offshoot of Inria that commercializes a professional development environment for the OCaml user community.

Make way for version 5

A new level was reached at the end of 2022. Indeed, ten years after the previous major release (OCaml 4) in July 2012, a new version of OCaml has just been published: OCaml 5. Florian Angeletti says :

The major new feature in OCaml 5 is its support for multicore architectures (i.e. microprocessors that have multiple physical cores).

"Most machines nowadays are equipped with this kind of architecture, which can run several tasks concurrently. This meant that our language had to be able to handle this way of working on several cores at the same time." This mission has been accomplished, thanks to the addition of features that allow language users to program tasks that will run on multiple cores, and to specify any desired interactions. The compiler has also been rewritten so that new instructions, issued by the programmes developed in OCaml 5, can be translated into machine language.

New memory manager and effect handler

Another important innovation: the memory manager has been redeveloped from scratch for OCaml 5. "The memory is like a piece of paper on which we write information," François Pottier explains. "However, unlike with a piece of paper, we don't want to throw this memory away afterwards. Quite the opposite; we wish to reuse it for other tasks, so we need to be able to determine when a piece of information becomes unneeded and the space that it occupies can be reused. "

Whereas previously the memory manager only interacted with one task, the version in OCaml 5 can now communicate with several tasks concurrently in either private or shared memory zones. It has also been designed to be compatible with the memory systems of today's machines, in which each processor has access to different memory zones, some close at hand, others further away. Because of this, information does not circulate instantaneously, so the processors do not all have the exact same picture of the memory contents at a given moment: this is called weakly consistent memory. Florian Angeletti explains:

We were aiming to reduce the complexity that this weak consistency creates for programmers, by making the memory a unified whole in which several tasks could run simultaneously with clearly defined interaction rules.

OCaml 5.0 – available for download from the OCaml.org website – also features effect handlers. With these handlers, tasks become objects that the developers can manipulate. "This means the developers can specify precisely when tasks can be paused and when they can restart," says François Pottier.

This should deliver even more flexibility when it comes to programming actions that may entail resource-intensive processing. Especially as the Cambium project team – wholly dedicated to designing, formalising and implementing programming languages and systems – has no plans to stop there... Evolutions that are already planned include future optimisation of the ergonomics of the new effect managers.

Find out more

For all readers

Why create new programming languages?, Inria blog "Interstices" (in French), 28/01/2019.
Xavier Leroy: Inria – French Académie des Sciences Grand Prize. Among other achievements, Xavier was the architect of the OCaml programming language, Inria, 29/10/2018.

For specialists

Bounding data races in space and time, article on the "memory model" adopted by OCaml 5, University of Cambridge, 18/06/2018.
Retrofitting effect handlers onto OCaml, IIT Madras et al., article on OCaml's effect handlers, 20/06/2021.
Cosmo: A Concurrent Separation Logic for Multicore OCaml (video), presentation by Glen Mével, during his time as a PhD student at Inria (ICFP 2020), ACM SIGPLAN, 8/12/2020.
A Separation Logic for Effect Handlers, Inria, article by François Pottier and Paulo Emílio de Vilhena on program verification involving effect handlers (ACM POPL 2021), 01/2021.