A threat to website electronic signatures
© Inria / Photo Kaksonen
An international team of cryptanalysts (Pierre Karpman, of the Grace project-team at Inria Saclay - Ile-de-France; Marc Stevens, of CWI, Netherlands; and Thomas Peyrin, of NTU, Singapore) are today urging the industry to deprecate the SHA-1 Internet security standard earlier than planned. Research carried out by the team has revealed a freestart collision for SHA-1, that is, a collision in the function's internal compression function.
In order to understand the implications of this result, we talked to Daniel Augot, manager of the Grace project-team at Inria Saclay – Ile-de-France Research Centre, of which Pierre Karpman is a member. He explains:
What is SHA-1?
SHA-1 is a cryptographic hash function adopted as a standard by NIST in 1995. The input to the function is a digital document of any size, and the output is a fingerprint that is always 160 bits long. Among other uses, the function is widely deployed in the certificates that authenticate websites. This industry standard also underpins the electronic signatures that secure payment card transactions, online banking and software distribution.
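The fixed output size is easy to observe in practice. The short sketch below uses Python's standard `hashlib` module; the helper name `sha1_fingerprint` and the two sample documents are illustrative choices, not part of the interview.

```python
import hashlib

def sha1_fingerprint(document: bytes) -> str:
    """Return the SHA-1 fingerprint of a document as a hexadecimal string."""
    return hashlib.sha1(document).hexdigest()

short_doc = b"abc"
long_doc = b"a" * 1_000_000  # one million bytes

for doc in (short_doc, long_doc):
    digest = sha1_fingerprint(doc)
    # Each hex character encodes 4 bits, so 40 characters = 160 bits.
    print(len(doc), "bytes ->", len(digest) * 4, "bits:", digest)
```

Whatever the size of the input, the fingerprint printed is 160 bits long.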
What is a hash function?
A hash function is a cryptographic algorithm that transforms any electronic document into a short fixed-length fingerprint. Hash functions are the Swiss army knife of cryptography with many uses, the most important of which is in electronic signatures. The electronic signature for a document is built in two stages, using two distinct algorithms. Firstly, the hash function is applied to the document to create a short fingerprint. The signature algorithm itself is then applied to this fingerprint. Electronic signatures are mainly used to authenticate websites to which we wish to connect securely. These electronic signatures are issued by a number of certification authorities.
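The two-stage construction can be sketched in a few lines of Python. This is a toy illustration only: the signature stage is unpadded textbook RSA with a small key built from two known Mersenne primes, which is an assumption made for readability and is not how certification authorities sign in practice. The function names are hypothetical.

```python
import hashlib

# Toy RSA key (illustration only). Real keys use much larger random
# primes together with a padding scheme; neither is shown here.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus, larger than any 160-bit fingerprint
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)

def fingerprint(document: bytes) -> int:
    """Stage 1: hash the document down to a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(document).digest(), "big")

def sign(document: bytes) -> int:
    """Stage 2: apply the signature algorithm to the fingerprint only."""
    return pow(fingerprint(document), D, N)

def verify(document: bytes, signature: int) -> bool:
    """Recompute the fingerprint and compare it with the one recovered from the signature."""
    return pow(signature, E, N) == fingerprint(document)

contract = b"Pay 10 euros to Alice"
signature = sign(contract)
print(verify(contract, signature))                   # True
print(verify(b"Pay 10000 euros to Eve", signature))  # False
```

Note that `sign` never sees the document itself, only its fingerprint. That design keeps signing fast for documents of any size, but it also means the signature is only as trustworthy as the hash function.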
Why is a collision such a threat?
A collision in a given hash function occurs when two different documents generate the same fingerprint. One of the documents may have been signed legitimately by its owner. An attacker who holds a second document with the same fingerprint could substitute it for the first. As the signature is derived only from the fingerprint, the signature would validate the second document as genuine. The attacker has effectively produced a counterfeit. Even if the signature algorithm itself is secure, combining it with a hash function that has been weakened in this way results in electronic signatures that cannot be relied upon.
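To make the idea concrete without attacking SHA-1 itself, the sketch below deliberately weakens the fingerprint to the first 24 bits of SHA-1 and searches for two documents that collide on it. The 24-bit truncation, the function names and the generated document contents are all assumptions chosen so that the search finishes in a fraction of a second; finding a collision on the full 160 bits is vastly harder.

```python
import hashlib
from itertools import count
from typing import Dict, Tuple

def weak_fingerprint(document: bytes, bits: int = 24) -> int:
    """Deliberately weakened fingerprint: only the first `bits` bits of SHA-1."""
    digest = hashlib.sha1(document).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)

def find_collision(bits: int = 24) -> Tuple[bytes, bytes]:
    """Hash numbered documents until two of them share a weak fingerprint."""
    seen: Dict[int, bytes] = {}
    for i in count():
        doc = f"document #{i}".encode()
        fp = weak_fingerprint(doc, bits)
        if fp in seen:
            return seen[fp], doc
        seen[fp] = doc

doc_a, doc_b = find_collision()
print(doc_a, doc_b)
print(weak_fingerprint(doc_a) == weak_fingerprint(doc_b))  # True
```

Any signature computed from this weak fingerprint would be identical for `doc_a` and `doc_b`, so a verifier could not tell which of the two documents had actually been signed.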
What is the result achieved by Stevens, Karpman and Peyrin?
A hash function such as SHA-1 makes use of an internal compression function that takes fixed-length inputs and converts them into shorter outputs. By iterating this function over the successive fixed-length blocks of a message, it is possible to construct a complete algorithm capable of accepting messages of any length. The collision discovered by Stevens, Karpman and Peyrin occurs in the compression function, and not in the SHA-1 hash function itself. In a freestart collision, the attacker is also free to choose the initial value fed to the compression function, a freedom that the full hash function does not allow.
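The iteration described here can be sketched as follows. The compression function below is a stand-in built from truncated SHA-256, not SHA-1's real compression function, and `FIXED_IV` is a placeholder for the constant that the SHA-1 standard defines; only the block size, state size and padding layout are modelled on SHA-1.

```python
import hashlib

BLOCK_SIZE = 64   # bytes per message block, as in SHA-1
STATE_SIZE = 20   # bytes of internal state (160 bits), as in SHA-1
FIXED_IV = bytes(STATE_SIZE)  # placeholder; the real standard fixes its own constant

def toy_compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function: (20-byte state, 64-byte block) -> 20-byte state."""
    assert len(state) == STATE_SIZE and len(block) == BLOCK_SIZE
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the message length in bits as 8 bytes."""
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % BLOCK_SIZE)
    return padded + (8 * len(message)).to_bytes(8, "big")

def toy_hash(message: bytes, iv: bytes = FIXED_IV) -> bytes:
    """Iterate the compression function over the padded message, block by block."""
    state = iv
    data = pad(message)
    for offset in range(0, len(data), BLOCK_SIZE):
        state = toy_compress(state, data[offset:offset + BLOCK_SIZE])
    return state

print(toy_hash(b"abc").hex())
```

In normal use the chain always starts from the fixed initial value. A freestart attacker works on `toy_compress` directly and picks the starting state as well as the block, which is why a collision found that way does not immediately give two messages with the same `toy_hash` output.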
What is the danger?
The production of a counterfeit certificate would allow an attacker to create a substitute for a genuine e-commerce site and set up secure connections with users of the site. These users would believe that they were dealing with the genuine site but, in reality, any data they enter would immediately fall into the hands of the attacker. However, in order to generate a counterfeit certificate, an attacker would need to be able to calculate the collisions using a chosen prefix, and this is not what the researchers have done.
Still two more steps to a dangerous collision.
A collision in the compression function does not automatically result in a collision in the hash function. Significantly more cryptanalytic work still needs to be done. Moreover, a collision in the hash function would normally only result in two documents full of random characters with no meaning. Much more cryptanalytic work would be needed to generate two intelligible documents with identical signatures. This is known as a chosen prefix collision.
Why then does it matter?
Because of lessons from history. The most commonly used function prior to SHA-1 was MD5. Weaknesses in the MD5 compression function were discovered in the mid-1990s, and cryptologists were quick to recommend its deprecation. However, this advice was largely ignored by the software industry. MD5 was finally broken in 2004, with the demonstration of the first collision: two different messages producing the same MD5 fingerprint. A chosen prefix attack with two intelligible messages was achieved in 2007. By December 2008, a counterfeit certificate had been generated, demonstrating the real consequences of the weaknesses in MD5. The demonstration of a real attack with severe implications finally convinced the software industry to abandon MD5. This did not happen, however, before the FLAME virus had used MD5 collisions to construct a counterfeit Microsoft certificate and used it to install malware.
The computer security expert Bruce Schneier predicts a collision in SHA-1 in 2018, and Google has announced that it will gradually make certificates using SHA-1 obsolete before 2017. Extrapolating from the results obtained by Karpman, Peyrin and Stevens indicates that these timescales are too long. Finding this freestart collision required ten days of processing time on a cluster of 64 GPUs, which suggests that a complete collision could be achieved in 78 days using 512 GPUs.
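The two workloads can be compared with a little arithmetic. The sketch below takes the quoted figures at face value and assumes that cost scales linearly with GPU-days, which is a simplification made here and not a claim from the researchers.

```python
# Figures quoted in the article.
freestart_days, freestart_gpus = 10, 64
full_days, full_gpus = 78, 512

freestart_gpu_days = freestart_days * freestart_gpus  # 640 GPU-days
full_gpu_days = full_days * full_gpus                 # 39936 GPU-days

# Implied ratio between the two efforts, under the linear-scaling assumption.
ratio = full_gpu_days / freestart_gpu_days

print(freestart_gpu_days)  # 640
print(full_gpu_days)       # 39936
print(f"{ratio:.1f}")      # 62.4
```

On these numbers, a complete collision is estimated to cost roughly 62 times the computation of the freestart collision, an amount of hardware time that a well-funded attacker could plausibly rent.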
What are the alternatives to SHA-1?
The current plan is for web browsers to flag signatures based on SHA-1 as at risk from January 2017, and for SHA-1 to be replaced by its successor, SHA-2. Marc Stevens, Pierre Karpman and Thomas Peyrin now estimate that the creation of counterfeit digital signatures will become possible well before that date.
The team of cryptanalysts that obtained this result recommend that SHA-1 should be deprecated sooner than planned, and that websites should migrate to SHA-2 as soon as possible.