Redesigning the Cloud
In reality, most of the world's Cloud infrastructure is made up of gigantic data centres concentrated in a surprisingly small number of locations. This concentration creates risks of its own, particularly with regard to reliability: if just one of these giant sites fails, the consequences could be significant. Moreover, transporting data between end-users and platforms that may sit on the other side of the world consumes considerable resources, and the distance introduces latency. Inria, in association with Orange and Renater*, is studying an alternative model that would spread computing and storage capacity along the whole IP network, in particular in premises already managed by Internet service providers.
Quincy: 6,000 people close to the Columbia river in the backlands of the state of Washington. A peaceful little town until 2007, when someone remembered the Grand Coulee dam. Attracted by its 20 TWh of cheap hydroelectric energy per year, data centres began to spring up like mushrooms, including those of Microsoft, Yahoo and Dell. "When you connect to book a table in your favourite pizzeria, it is highly likely that the data is actually processed over there. Or in Dublin, or another of the rare places that now host the vast majority of websites. In other words, the Cloud is, in reality, made up of a handful of giant installations," states the researcher Adrien Lebre, coordinator of the Inria Project Lab Discovery, an initiative that aims to redefine the very concept of the Cloud.
As he explains, this concentration brings its own set of drawbacks. "First of all, hosting services that are in fact local in nature on these very distant platforms makes little sense in terms of network exchanges, or even energy consumption. What is the point, other than an economic one, of hosting a local web-TV service a great distance away, or even in a different country from its end-users? Furthermore, relying on a very small number of very large data centres is an inherent security risk, not to mention all the legal issues of data sovereignty that arise when a Cloud service is located abroad." However, after three years of research, the scientists came to think that the main problem was perhaps of a different kind: "latency due to the great distance between the end-user and the data centres. The Internet carries more and more potentially mobile devices that must take this latency into account. Mobile phones are one example: their flash storage wears out as data is written to it. To minimise writes, it would be better to store that data in real time on resources provided by the Cloud, but latency makes this model impractical. Hence the need to reduce the distance between users and data centres." In other words: move the resources to better match the geographical distribution of the users.
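To give a rough sense of why distance matters, here is a minimal back-of-the-envelope sketch (the figures and function are illustrative, not from the article): light in optical fibre travels at roughly 200,000 km/s, which puts a hard lower bound on round-trip time regardless of how fast the servers are.

```python
# Hypothetical illustration: the round-trip propagation delay imposed by
# distance alone, ignoring routing, queueing and processing delays.
SPEED_IN_FIBRE_KM_S = 200_000  # light travels at roughly 2/3 c in fibre

def min_rtt_ms(distance_km: float) -> float:
    """Lower bound on round-trip time, in milliseconds."""
    return 2 * distance_km / SPEED_IN_FIBRE_KM_S * 1000

# Paris -> Quincy (Washington state) is roughly 8,000 km as the crow flies
print(round(min_rtt_ms(8000)))   # ~80 ms before any processing happens
# Paris -> a point of presence 50 km away
print(min_rtt_ms(50))            # 0.5 ms
```

Real paths are longer than great-circle distance and add switching delays, so actual latencies are higher still; the point is that no engineering effort at the data centre can remove the propagation floor, whereas moving resources closer can.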
Locating resources close to the users
Instead of today's gigantic installations, the researchers envisage rolling out a multitude of micro and nano data centres at local level. "A study by Microsoft showed that this is a viable alternative." Another question immediately arose, however: where could these countless arrays of energy-hungry servers be installed? "On the Internet's network structure itself," Adrien Lebre replies. "This network already includes numerous infrastructures managed by access providers, which we call points of presence. These physical installations host the servers, routers and other equipment that keep the Internet running. Because they are close to the users, these pre-existing infrastructures are the obvious place to host computing and storage resources to best meet local requirements. Installing, at these network points, a number of servers hosting virtual machines and data storage services would natively limit network exchanges to the minimum necessary. This would reduce both latency and the quantity of data circulating over the networks, opening the door to new strategies for minimising energy impact."
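The placement idea above can be sketched in a few lines. This is a toy illustration, not Discovery code: the point-of-presence names, coordinates and the greedy nearest-site rule are all hypothetical, and a real system would weigh load, capacity and network topology rather than plain distance.

```python
# A minimal sketch of placing a service at the point of presence (PoP)
# closest to a user. PoP data and the placement rule are hypothetical.
import math

POINTS_OF_PRESENCE = {          # hypothetical PoPs with (lat, lon)
    "paris":  (48.85, 2.35),
    "lyon":   (45.76, 4.84),
    "nantes": (47.22, -1.55),
}

def nearest_pop(user_lat: float, user_lon: float) -> str:
    """Greedy placement: return the PoP nearest to the user
    (Euclidean distance on lat/lon is enough for a sketch)."""
    return min(
        POINTS_OF_PRESENCE,
        key=lambda p: math.dist((user_lat, user_lon), POINTS_OF_PRESENCE[p]),
    )

print(nearest_pop(47.0, -1.0))  # a user near Nantes -> "nantes"
```

Serving that user from the Nantes PoP keeps the traffic regional instead of hauling it to a distant centralised site, which is exactly the network-exchange saving the researchers describe.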
Incidentally, this utility computing model could also use premises located at the foot of relay masts. In this vein, "our partner Orange is carrying out complementary research into the potential of device-to-device (D2D) communication. Imagine a goal scored during a football match: everybody wants to see the replay. Instead of thousands of people downloading the same file from a server, the video could be transmitted directly between mobile phones within the same geographical area." The research carried out as part of the Project Lab Discovery goes in the same direction, "but we stop at the network's periphery. For us, it is a question of ensuring that this video (or any other Cloudified service) is available at the bottom of the mast."
Relying on OpenStack
However, operating utility computing platforms that are so widely distributed, over network points of presence and other local infrastructures, requires an operating system for resource management, and no such system currently exists. One of the aims of the Inria Project Lab Discovery is therefore to propose one. Rather than building a new OS from scratch, the scientists decided, once again, to build on what already exists: they are going to modify one of the tools initially developed to roll out and manage IaaS infrastructures on centralised sites. "We chose OpenStack. This open-source software is attracting increasing interest from industry, including major players such as IBM, Google and, more recently, the French provider OVH."
One problem: in the current version of this tool, the main components rely on centralised mechanisms, such as MySQL databases. "Discovery aims to replace all of these centralised elements with peer-to-peer (P2P) and self-* solutions, which will allow OpenStack to expand over multiple sites. To begin with, we have already replaced the MySQL component with a standard NoSQL solution; this is a well-known approach for avoiding the scalability issues inherent in SQL technologies." The project is raising "a lot of enthusiastic reactions within the OpenStack community. We were warmly welcomed by the OpenStack Foundation, which is getting ready to create a dedicated internal research group. A whole dynamic is getting under way."
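The shape of that substitution can be sketched abstractly. The toy class below is not OpenStack code and the record format is invented; it only illustrates the idea that state which a centralised SQL table would hold (instance records, for example) can instead sit behind a flat key-value interface, the access pattern that distributed NoSQL stores provide natively and can replicate across sites.

```python
# Toy sketch: a key-value interface standing in for a centralised SQL table.
# Class name, keys and record fields are hypothetical.
import json

class KeyValueBackend:
    """Stand-in for a distributed key-value store."""
    def __init__(self):
        self._data = {}            # in a real deployment: a NoSQL cluster

    def put(self, key: str, value: dict) -> None:
        self._data[key] = json.dumps(value)

    def get(self, key: str) -> dict:
        return json.loads(self._data[key])

# Instead of: INSERT INTO instances (id, host, state) VALUES (...)
backend = KeyValueBackend()
backend.put("instance/42", {"host": "pop-nantes-1", "state": "ACTIVE"})
print(backend.get("instance/42")["state"])  # ACTIVE
```

Because every record is addressed by an opaque key, the store can partition and replicate records across sites without the global coordination that a single SQL database implies, which is what makes the multi-site expansion the researchers describe tractable.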