Spoofing certificates with MD5 collisions
I attended a presentation at the Crypto and Privacy village by Tomer Peled and Yoni Rozenshein from Akamai. They reverse engineered a Windows update to crypt32.dll to find out what was behind CVE-2022-34689. A truncated MD5 was used as an index into a hash table that caches whether a certificate has been validated successfully. Only the MD5 was compared when the entry was found in that cache. By using MD5 collisions, they found that crypt32.dll would validate a malicious certificate after an honest certificate was validated.
This talk summary is part of my DEF CON 31 series. The talks this year have sufficient depth to be shared independently and are separated for easier consumption.
This next talk commenced at the Crypto and Privacy village. I went in knowing that MD5 has not been used in certificates for a long time now and for a good reason.
The presentation
The National Security Agency (NSA) published a Common Vulnerabilities and Exposures (CVE) entry for a Windows CryptoAPI Spoofing Vulnerability - CVE-2022-34689. The speaker mentioned that the NSA and GCHQ (the United Kingdom equivalent) release very interesting CVEs with few public details. To find out what it was, the researchers had to reverse engineer the changes to crypt32.dll. Thankfully, those changes were easy to identify: a binary comparison with the prior release pointed straight to the fix.
Windows uses crypt32.dll to validate certificates for applications, drivers, and websites. Inside is a cache so that when a certificate is validated a second time, it can skip the expensive cryptographic operations like RSA signature verification. This cache used a hash table to store the verification results. In essence, the hash table was keyed by the MD5 hash of the certificate bytes, and each value was an object holding the certificate and its validation status.
When the MD5 hash is the same and the certificate is not, we have a hash collision to exploit. In short, crypt32.dll could be tricked into validating a legitimate certificate first (poisoning the cache) and later concluding that a malicious certificate with the same MD5 hash is also valid. This enables a man-in-the-middle (MITM) attack against applications that retry TLS connections and use crypt32.dll as the certificate verification method.
The researchers demonstrated this successfully using an old version of Chrome from 2015. Since then, Google Chrome has switched to another cryptographic backend, BoringSSL. Their MITM proxy first sends the target's certificate, which is public, to the client and poisons the crypt32.dll cache. Then the request fails and the client tries again. This time, it receives the malicious certificate, and crypt32.dll sees that a certificate with the same MD5 hash was valid, without comparing the certificate saved in the cache with the input certificate. Finally, the TLS handshake succeeds and the client is served a malicious page while the security symbol shows that everything checks out.
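The poisoning sequence described above can be sketched in a few lines of Python. This is a minimal illustration, not the crypt32.dll implementation: the class names, the `weak_key` helper, and the `verify` callback are all hypothetical, and the cache key is truncated to 2 bytes of MD5 only so that a colliding input can be brute-forced instantly. The real attack needs a genuine MD5 collision. `FixedCache` shows one plausible mitigation, comparing the cached certificate bytes before trusting the cached verdict.

```python
import hashlib
from itertools import count


def weak_key(cert: bytes) -> bytes:
    """Stand-in for the MD5-based cache key (assumption: truncated to
    2 bytes so this demo can brute-force a colliding input quickly)."""
    return hashlib.md5(cert).digest()[:2]


class VulnerableCache:
    """Caches validation verdicts keyed only by the weak digest."""

    def __init__(self, verify):
        self._verify = verify  # the expensive check, e.g. signature validation
        self._entries = {}     # key -> (certificate bytes, verdict)

    def is_valid(self, cert: bytes) -> bool:
        key = weak_key(cert)
        if key in self._entries:
            # Bug: the cached verdict is trusted on a key match alone.
            return self._entries[key][1]
        verdict = self._verify(cert)
        self._entries[key] = (cert, verdict)
        return verdict


class FixedCache(VulnerableCache):
    """Same cache, but a hit also requires identical certificate bytes."""

    def is_valid(self, cert: bytes) -> bool:
        key = weak_key(cert)
        entry = self._entries.get(key)
        if entry is not None and entry[0] == cert:
            return entry[1]
        verdict = self._verify(cert)
        self._entries[key] = (cert, verdict)
        return verdict


def find_colliding(target: bytes) -> bytes:
    """Brute-force a different input with the same weak key as target."""
    wanted = weak_key(target)
    for n in count():
        forged = b"malicious certificate #%d" % n
        if forged != target and weak_key(forged) == wanted:
            return forged


if __name__ == "__main__":
    honest = b"honest certificate"
    forged = find_colliding(honest)

    def verify(cert: bytes) -> bool:
        # Hypothetical stand-in for real chain and signature validation.
        return cert == honest

    for cache in (VulnerableCache(verify), FixedCache(verify)):
        cache.is_valid(honest)  # first connection poisons the cache
        print(type(cache).__name__, "accepts forged:", cache.is_valid(forged))
```

The order of the two calls matters: the honest certificate has to be validated first so that its verdict is what the colliding certificate finds in the cache.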
MD5 collisions can be found in two ways. The first is to have a common prefix and two different suffixes (an identical-prefix collision). This one can be found in seconds on a modern computer. The second is to have two different prefixes and two different suffixes (a chosen-prefix collision). That one takes much longer to find, though it is not expensive for a threat actor to do. Once a collision is found, appending the same extra suffix to both colliding inputs produces two longer inputs whose digests still collide.
The second approach is the only option for this use case. The malicious certificate must be a valid ASN.1 certificate encoded with DER. Existing MD5 collision techniques work by mangling a binary block of data. Given the difficulty of finding one collision, it would be too expensive to find acceptable collisions if the block of data were constrained to fit a certain pattern, protocol, or encoding. The researchers needed somewhere to insert arbitrary binary data after their public key so that the certificate is still functionally accepted while its MD5 state matches the honest certificate's at that point in the middle. Then, by appending the rest of the certificate, it appears with the same final MD5 hash to crypt32.dll.
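The suffix property relied on here comes from MD5's block-by-block (Merkle-Damgård) construction: once two equal-length, block-aligned inputs reach the same internal state, any identical data appended to both keeps the digests equal. The sketch below shows this on a toy hash, since real MD5 collisions are impractical to generate inline. Everything in it is an assumption made for illustration: `toy_hash` uses a 2-byte state and 8-byte blocks, with MD5 borrowed only as a convenient compression function, and `find_collision` searches for the simpler common-prefix kind of collision, not the chosen-prefix kind the attack requires.

```python
import hashlib
from itertools import count

BLOCK = 8              # toy block size in bytes (MD5 uses 64)
STATE = 2              # toy chaining state in bytes (MD5 uses 16)
IV = b"\x00" * STATE   # toy initial state


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: mixes one block into the chaining state."""
    return hashlib.md5(state + block).digest()[:STATE]


def absorb(state: bytes, data: bytes) -> bytes:
    """Run the compression function over block-aligned data."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def toy_hash(message: bytes) -> bytes:
    """Merkle-Damgard hash: pad, append the message length, absorb."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    padded += len(message).to_bytes(BLOCK, "big")
    return absorb(IV, padded)


def find_collision(prefix: bytes):
    """Find two different blocks that collide after a shared prefix.

    With a 2-byte state, a repeat is guaranteed within 65537 attempts.
    """
    state = absorb(IV, prefix)
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return prefix + seen[out], prefix + block
        seen[out] = block


if __name__ == "__main__":
    a, b = find_collision(b"SAMEHEAD")
    suffix = b"rest of the certificate"
    print(a != b, toy_hash(a) == toy_hash(b))
    print(toy_hash(a + suffix) == toy_hash(b + suffix))
```

The same reasoning is why the remainder of the certificate can be appended after the colliding region without disturbing the final digest.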
The next challenge is where to put the arbitrary data after the public key.
It so happens that there is a really convenient location. Section 6 of RFC 5912, New ASN.1 Modules for the Public Key Infrastructure Using X.509 (PKIX), specifies the semantic ASN.1 structure: the pk-rsa object is an ordered structure that has the RSAPublicKey followed by a PARAMS value. This value must be null, and must be checked to be null.
RSAPublicKey ::= SEQUENCE {
    modulus         INTEGER, -- n
    publicExponent  INTEGER  -- e
}

rsaEncryption OBJECT IDENTIFIER ::= {
    iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-1(1) 1 }

pk-rsa PUBLIC-KEY ::= {
    IDENTIFIER rsaEncryption
    KEY RSAPublicKey
    PARAMS TYPE NULL ARE absent
    -- Private key format not in this module --
    CERT-KEY-USAGE {digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment, keyCertSign, cRLSign}
}
What if a certificate had PARAMS set to an arbitrary bit or byte string? And what if the certificate validation code did not check that PARAMS is always null? Then we have a perfect place to insert arbitrary data to mangle the MD5 digest as we desire. In practice, cryptographic libraries omit this check because a tampered certificate would fail CA signature verification, and trusted CAs would not sign a public key without proof of ownership of the domain.
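To make the missing check concrete, here is a minimal sketch of what validating PARAMS could look like for a DER-encoded AlgorithmIdentifier. It is not taken from crypt32.dll or any real library: the helper names `der`, `read_tlv`, and `rsa_params_are_null` are hypothetical, and the parser only handles single-byte tags and definite lengths, which is enough for this structure.

```python
# DER content bytes of the rsaEncryption OID 1.2.840.113549.1.1.1
RSA_ENCRYPTION_OID = bytes.fromhex("2a864886f70d010101")

TAG_SEQUENCE = 0x30
TAG_OID = 0x06
TAG_NULL = 0x05
TAG_OCTET_STRING = 0x04


def der(tag: int, value: bytes) -> bytes:
    """Encode one DER tag-length-value with a definite length."""
    n = len(value)
    if n < 0x80:
        length = bytes([n])
    else:
        raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(raw)]) + raw
    return bytes([tag]) + length + value


def read_tlv(data: bytes, offset: int = 0):
    """Decode one TLV; return (tag, value, offset of the next TLV)."""
    tag = data[offset]
    first = data[offset + 1]
    offset += 2
    if first < 0x80:
        length = first
    else:
        size = first & 0x7F
        length = int.from_bytes(data[offset:offset + size], "big")
        offset += size
    end = offset + length
    if end > len(data):
        raise ValueError("truncated DER value")
    return tag, data[offset:end], end


def rsa_params_are_null(algorithm_identifier: bytes) -> bool:
    """True only for SEQUENCE { rsaEncryption OID, NULL } and nothing else."""
    tag, body, end = read_tlv(algorithm_identifier)
    if tag != TAG_SEQUENCE or end != len(algorithm_identifier):
        return False
    oid_tag, oid, pos = read_tlv(body)
    if oid_tag != TAG_OID or oid != RSA_ENCRYPTION_OID:
        return False
    # Whatever follows the OID must be exactly an empty NULL.
    return body[pos:] == der(TAG_NULL, b"")


if __name__ == "__main__":
    oid = der(TAG_OID, RSA_ENCRYPTION_OID)
    honest = der(TAG_SEQUENCE, oid + der(TAG_NULL, b""))
    # Arbitrary bytes smuggled in where NULL belongs, as in the attack.
    stuffed = der(TAG_SEQUENCE, oid + der(TAG_OCTET_STRING, b"\x00" * 200))
    print(rsa_params_are_null(honest), rsa_params_are_null(stuffed))
```

A validator that ran a check like this before consulting any cache would reject a certificate that hides collision blocks in the parameters field.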
Demonstration
The researchers also published the demo through a tweet.
More details can be found in the official write-up, Exploiting a Critical Spoofing Vulnerability in Windows CryptoAPI.