Jean-Michel Prima - 24/06/2014

Hunting viruses in binary code

Code © Inria / Photo Kaksonen

A few malicious instructions concealed in the code of a software application are all it takes to compromise the security of an information system. At the Inria Rennes-Bretagne Atlantique centre, a team of research scientists is studying how formal procedures can be used to analyse binary code to make it more legible and thus simplify the detection of malicious programs that might be lurking within. This work is being carried out in partnership with the DGA Maîtrise de l’information facility (French Defence Procurement Agency Information Security and Control) in Rennes, France.

They ravage hard drives. They sneak in unnoticed. They spy. They steal bank codes and card numbers. They appropriate passwords, or slyly alter the rotation speed of industrial centrifuges. In a hyper-connected world, worms, viruses, Trojan horses and other malware represent a threat that is anything but virtual. At the Inria centre in Rennes, Thomas Jensen directs the Celtique project team, which specialises in software security. For a number of years now, these scientists have been working closely with DGA Maîtrise de l’information. “This partnership is an important component in the wider context of a framework agreement between Inria and the DGA. Different teams focus on different issues, such as cryptology. Our work is about the analysis of binary code. This is very low-level code which runs on all computers.”

Fragmented code

Viruses like to hole up in these strings of zeros and ones, making themselves as invisible as possible. “By its nature, binary code is difficult to read. It is made up of small, relatively unstructured instructions. Each instruction performs a single tiny action. The notion of a loop, for example, needs to be reconstructed. It is very hard to detect a few malicious instructions concealed in the midst of code that is so fragmented.”
For research scientists, the challenge is how to use formal procedures to analyse this code and devise a way of representing it at a higher level. Eventually the aim is to be able to automate the analysis process. “But we aren't there yet. We have a lot of barriers to overcome first. There is still research to do upstream. We have identified an issue here which is of interest to both Inria and DGA Maîtrise de l’information and we are pooling our resources to develop our understanding of the subject.”
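To give a feel for what "reconstructing a loop" can mean in practice, here is a minimal sketch in Python. It is not the Celtique team's method: the instruction set, the PROGRAM table and the function names are all hypothetical, invented for illustration. The sketch assumes the jump targets are already known, builds the successors of each instruction, and reports an edge as a loop when a depth-first search reaches an instruction that is still being explored (a "back edge").

```python
# Toy program in a made-up instruction set (an assumption for illustration,
# not a real architecture). Each entry: address -> (mnemonic, jump target).
PROGRAM = {
    0: ("mov", None),  # counter = 0
    1: ("cmp", None),  # compare counter with a limit
    2: ("jge", 6),     # conditional jump: leave when counter >= limit
    3: ("add", None),  # loop body
    4: ("inc", None),  # counter += 1
    5: ("jmp", 1),     # unconditional jump back to the comparison
    6: ("ret", None),  # end of the routine
}


def successors(addr):
    """Return the addresses that may execute right after `addr`."""
    mnemonic, target = PROGRAM[addr]
    if mnemonic == "ret":
        return []
    if mnemonic == "jmp":
        return [target]
    succs = [addr + 1] if addr + 1 in PROGRAM else []
    if target is not None:  # conditional jump: fall through or jump
        succs.append(target)
    return succs


def find_back_edges(entry=0):
    """Depth-first search over the control flow.

    An edge that reaches an instruction still on the search stack
    closes a cycle, which is how a loop shows up in flat code.
    """
    back_edges = []
    state = {}  # addr -> "active" (on the stack) or "done"

    def visit(addr):
        state[addr] = "active"
        for nxt in successors(addr):
            if state.get(nxt) == "active":
                back_edges.append((addr, nxt))
            elif nxt not in state:
                visit(nxt)
        state[addr] = "done"

    visit(entry)
    return back_edges


loops = find_back_edges()
```

On this toy input the only back edge is the jump from address 5 to address 1. Real binaries are much harder: jump targets may be computed at run time, and code and data can be interleaved, which is part of why the analysis is still a research topic.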

A seminar on formal procedures and security

This partnership is reinforced by a twice-monthly seminar on the theme of formal procedures and security, held at Inria's premises. “These meetings are supported by DGA Maîtrise de l’information, which means we get to invite major scientific figures to give presentations on subjects of interest to us. These seminars can be seen as a form of technology watch.”
DGA Maîtrise de l’information is also co-funding a number of theses that are being developed at the Inria centre in Rennes. Additionally, it has assigned one of its own scientists to work with the Celtique team one day a week as an external colleague. “Our research is of interest to anybody who is involved with software security,” explains Colas Le Guernic. This could be, for example, a program user who wants to check certain properties by analysing the binary code directly, or a designer who is looking to integrate an external component and wants to be sure this can be done without risk.


Another aspect of the team's work concerns code obfuscation techniques. In other words, the thousand and one ways used to make a program truly impenetrable to a reader. “These methods are used by the creators of viruses, but they are also used to protect legitimate software against malware. Video game publishers, for example, use obfuscation to prevent pirates from counterfeiting their products. In other fields such as telecommunications, manufacturers also use obfuscation to protect the industrial secrets of proprietary technologies.” In such cases, it is important to preserve the characteristics of the software while making it as unintelligible as possible.
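As a small illustration of "same behaviour, harder to read", the Python sketch below rewrites an addition using a well-known bitwise identity. The function names are hypothetical and this is only one textbook-style transformation, not a technique attributed to the team or to any particular product.

```python
def add_plain(x, y):
    """The readable version: a single addition."""
    return x + y


def add_obfuscated(x, y):
    """Same result, disguised.

    Relies on the identity x + y == (x ^ y) + 2 * (x & y):
    the XOR adds the bits without carries, the AND collects the
    carries, and the shift moves them one position to the left.
    """
    return (x ^ y) + ((x & y) << 1)


# Check that both versions agree on a small range of inputs.
same_behaviour = all(
    add_plain(a, b) == add_obfuscated(a, b)
    for a in range(-50, 50)
    for b in range(-50, 50)
)
```

A single substitution like this is easy to undo. Practical obfuscators combine many such rewrites, which is what makes the resulting binary code so hard to analyse.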
Part of this work also takes place in the context of a research project on binary security that has been funded by the ANR (French National Research Agency) since the start of 2013. Its goals are to provide tools for the security industry, improve cyber defence capability and develop instruments for the protection of digital infrastructures.

Keywords: Code INRIA Rennes - Bretagne Atlantique Celtique Virus DGA