Artificial intelligence

Local image recognition on phones

Date:
Changed on 12/04/2022
We use our smartphones every day to scan a multitude of objects and access a wide range of services thanks to image recognition technology based on artificial intelligence. The problem is that these applications need access to the mobile network to connect to the cloud and obtain the classification result from remote servers. Three companies (IMATAG, QUAI DES APPS and ARIADNEXT) have joined forces with the Inria Rennes – Bretagne Atlantique research centre to resolve this problem. The collaboration has led to the optimisation of deep learning algorithms for execution on embedded platforms while maintaining a good level of performance.
Photo of a person using a mobile phone
© Inria / Photo M. Magnin

From Cloud to local

“Currently, in most approaches used to classify images or read identity documents from a smartphone, the device is only used to capture the image. The data is sent to cloud servers for analysis, a process which is often computationally intensive. The result is then sent back to the user’s phone. The technique works well so long as network coverage is available. However, it does not work in areas where there is no coverage or where access is limited. To solve this problem, the MobileAI research project aimed to incorporate artificial intelligence technology into the smartphone while maintaining its robustness and ability to operate in real time,” explains Montaser Awal, head of the artificial intelligence research team at ARIADNEXT, a company specialising in the remote verification of ID documents.

“The project was born from informal discussions among a group of people working at three different companies where visual content identification plays a central role”, says Mathieu Desoubeaux, co-founder of IMATAG, a company created with the support of Inria that specialises in robust watermarks for copyright-protected content. “The subject was first mentioned by our friends at QUAI DES APPS, a company that works in the field of augmented reality. Their aim was to achieve image recognition on a mobile phone without network coverage. Yannis Avrithis, a researcher from the Linkmedia team at the Inria Rennes Centre, also played a very active role in these discussions. That was how the four of us decided to set up an R&D collaboration to try to solve the problem”.

Launched in September 2018 and completed in 2021, the MobileAI project was funded by the BPI, Rennes Métropole and the regions of Brittany and Pays de la Loire through a call for projects launched by Images & Réseaux.

Convolutional neural networks

Illustration of a mobile phone scanning an identity document

At the heart of the matter is a family of particularly powerful deep-learning algorithms called convolutional neural networks (CNNs). “These are excellent candidates for mobile image recognition”, explains Montaser Awal. “But for our purposes, we had to modify their architecture and optimise them to make them executable on mobile devices while maintaining a performance level similar to that of cloud-based server systems”.

Mission accomplished. The project has advanced the state of the art and resulted in ten scientific publications and five prototype applications. The new algorithms for image classification and text recognition from a photograph of an ID document were immediately integrated into IDcheck.io, ARIADNEXT's flagship product for ID document authentication. “Acquiring cutting-edge expertise in deep learning for image recognition is also an important asset for future developments”, the company explains.

In-store image recognition

The project has allowed QUAI DES APPS to improve Blinkl, its augmented narration web app. The service allows customers in shops to photograph products on the shelves and obtain more information about them. Until now, the image recognition process was executed on remote servers. The disadvantages were the computational load on these machines and latency during peak periods, such as sales or product launches. In addition, a bottleneck in the image search limited the size of the database to 1,000 products. By moving image recognition onto the mobile device and improving the image descriptors, the company made a game-changing move and can now handle databases of 100,000 images. These capacities will allow QUAI DES APPS to meet the needs of the retail industry, whether for images of products on the shelf or in a catalogue.
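The article does not say which descriptors or search method Blinkl uses. As a rough illustration only, the sketch below shows how a photo's descriptor can be matched against a product catalogue stored on the device, here with plain cosine similarity. The product names, the four-dimensional vectors and the function names are all hypothetical. In a real system the descriptors would come from a CNN and have many more dimensions.

```python
import math


def l2_normalise(vec):
    """Scale a descriptor to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def build_index(catalogue):
    """Pre-normalise every catalogue descriptor once, ahead of any query."""
    return {product_id: l2_normalise(desc) for product_id, desc in catalogue.items()}


def search(index, query, top_k=1):
    """Return the top_k (product_id, similarity) pairs, best match first."""
    q = l2_normalise(query)
    scored = [
        (sum(a * b for a, b in zip(q, desc)), product_id)
        for product_id, desc in index.items()
    ]
    scored.sort(reverse=True)
    return [(product_id, score) for score, product_id in scored[:top_k]]


# Hypothetical toy catalogue: a real descriptor would be produced by a CNN.
catalogue = {
    "shampoo": [0.9, 0.1, 0.0, 0.2],
    "cereal": [0.1, 0.8, 0.3, 0.0],
    "coffee": [0.0, 0.2, 0.9, 0.4],
}
index = build_index(catalogue)

# Descriptor of the photo taken in the shop (also hypothetical).
best = search(index, [0.05, 0.25, 0.85, 0.35], top_k=2)
print(best)
```

This brute-force scan compares the query with every entry. For a catalogue of the size mentioned above, a production system would more likely rely on compact descriptors and an approximate nearest-neighbour index, which this sketch does not attempt to reproduce.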

For IMATAG, this R&D project improved the image search technology used in its monitoring solution for copyright infringement. It also opens up prospects for new product lines.

Collaborative project

Portrait photo of Mathieu Desoubeaux, co-founder of IMATAG

“The three industrial partners have acquired valuable expertise in deep learning solutions for mobile image recognition in general. The project has allowed each of us to strengthen our positions in a strategic area and improve solutions that contribute to the majority of our turnover.”

Mathieu Desoubeaux, IMATAG co-founder