CERIAS - Center for Education and Research in Information Assurance and Security

Purdue University

A Serious Threat to Online Trust


There are several news stories now appearing (e.g., Security News) about a serious flaw in how certificates used in online authentication are validated. Ed Felten gives a nice summary of how this affects online WWW site authentication in his Freedom to Tinker blog posting. Brian Krebs also has his usual readable coverage of the problem in his Washington Post article. Steve Bellovin has some interesting commentary, too, about the legal climate.

Is there cause to be concerned? Yes, but not necessarily about what is being covered in the media. There are other lessons to be learned from this.

Short tutorial

First, for the non-geek reader, I’ll briefly explain certificates.

Think about how, online, I can assure myself that the party at the other end of a link is really who they claim to be. What proof can they offer, considering that I don’t have a direct link? Remember that an attacker can send any bits down the wire to me and may have access to faster computers than I do.

I can’t base my decision on how the WWW pages appear, or embedded images. Phishing, for instance, succeeds because the phishers set up sites with names and graphics that look like the real banks and merchants, and users trust the visual appearance. This is a standard difficulty for people—understanding the difference between identity (claiming who I am) and authentication (proving who I am).

In the physical world, we do this by using identity tokens that are issued by trusted third parties. Drivers licenses and passports are two of the most common examples. To get one, we need to produce sufficient proof of identity to a third party to meet its standards of proof. Then, the third party issues a document that is very difficult to forge (almost nothing constructed is impossible to forge or duplicate—but some things require so much time and expenditure it isn’t worthwhile). Because the criteria for proof of identity and strength of construction of the document are known, various other parties will accept the document as “proof” of identity. Of course, other problems occur that I’m not going to address—this USACM whitepaper (of which I was principal author) touches on many of them.

Now, in the online world we cannot issue or see physical documents. Instead, we use certificates. We do this by putting together an electronic document that gives the information we want some entity to certify as true about us. The format of this certificate is generally fixed by standards, the most common one being the X.509 suite. This document is sent to an organization known as a Certificate Authority (CA), usually along with a fee. The certificate authority is presumably well-known, and performs a check (to its own standards) that the information in the document is correct and that it has the right form. The CA then calculates a digital hash value of the data, and creates a digital signature of that hash value. This is then added to the certificate and sent back to the user. This is the equivalent of putting a signature on a license and then sealing it in plastic. Any alteration of the data will change the digital hash, and a third party will find that the new hash and the hash value signed with the key of the CA don’t match. The reason this works is that the hash function and encryption algorithm used are presumed to be so computationally difficult to defeat that forgery is basically not possible.
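The sign-then-verify cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any real CA operates: it uses textbook RSA with a deliberately tiny key and no padding, SHA-256 as the hash, and made-up certificate contents. The names ca_sign, verify, and digest_as_int are hypothetical.

```python
import hashlib

# Toy RSA key for a hypothetical CA. The modulus is far too small for real
# use, and textbook RSA without padding is insecure; illustration only.
P, Q = 2**31 - 1, 2**61 - 1          # two known (Mersenne) primes
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce it into the range the toy key can sign."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def ca_sign(cert_body: bytes) -> int:
    """The CA signs the hash of the certificate body with its private key."""
    return pow(digest_as_int(cert_body), D, N)


def verify(cert_body: bytes, signature: int) -> bool:
    """Anyone holding the CA's public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(cert_body)


# Made-up certificate contents, standing in for a real X.509 structure.
cert_body = b"subject=www.example.org; public_key=(placeholder)"
signature = ca_sign(cert_body)
tampered = cert_body.replace(b"example.org", b"example.net")

print(verify(cert_body, signature))  # the untouched certificate checks out
print(verify(tampered, signature))   # an altered body no longer matches
```

Note that the signature covers only the hash, never the document itself. Everything therefore rests on the assumption that nobody can produce two different documents with the same hash.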

As an example of a certificate, if you visit “https://www.cerias.purdue.edu” you can click on the little padlock icon that appears somewhere in the browser window frame (this is browser dependent) to view details of the CERIAS SSL certificate.

You can get more details on all this by reading the referenced Wikipedia pages, and by reading chapters 5 & 7 in Web Security, Privacy and Commerce.

Back to the hack

In summary, some CAs have been negligent about updating their certificate signing mechanisms in the wake of news that MD5 is weak, published back in 2004. The result is that malicious parties can generate and obtain a certificate “authenticating” them as someone else. What makes it worse is that the root certificates of most of these CAs are “built in” to browser and application trust lists to simplify look-up of new certificates. Thus, most people using standard WWW browsers can be fooled into thinking they have connected to real, valid sites, even though they are connecting to rogue sites.

The approach is simple enough: a party constructs two certificates. One is for the false identity she wishes to claim, and the other is real. She crafts the contents of the certificates so that the MD5 hash of the two, in canonical format, is the same. She submits the real identity certificate to the authority, which verifies her bona fides and returns the certificate with the MD5 hash signed with the CA private key. Our protagonist then copies that signature to the false certificate, which has the same MD5 hash value and thus the same digital signature, and proceeds with her impersonation!
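To see why a copied signature still verifies, here is a minimal sketch with loud assumptions. The actual attack exploited structural weaknesses in MD5; this sketch instead brute-forces collisions in a deliberately weak 16-bit stand-in hash, and signs with a toy textbook RSA key. All names (weak_hash, find_colliding_pair, ca_sign, verify) and the certificate contents are hypothetical.

```python
import hashlib
from itertools import count

# Toy RSA key for a hypothetical CA (far too small for real use).
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def weak_hash(data: bytes) -> bytes:
    """Deliberately weak stand-in hash: the first 2 bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:2]


def ca_sign(doc: bytes) -> int:
    """The CA signs only the (weak) hash of the document."""
    return pow(int.from_bytes(weak_hash(doc), "big"), D, N)


def verify(doc: bytes, signature: int) -> bool:
    return pow(signature, E, N) == int.from_bytes(weak_hash(doc), "big")


def find_colliding_pair(real_prefix: bytes, rogue_prefix: bytes):
    """Vary a filler field in both documents until their weak hashes match."""
    seen = {}  # weak hash -> a "real" document that produces it
    for i in count():
        real = real_prefix + b"; filler=%d" % i
        seen.setdefault(weak_hash(real), real)
        rogue = rogue_prefix + b"; filler=%d" % i
        match = seen.get(weak_hash(rogue))
        if match is not None:
            return match, rogue


real, rogue = find_colliding_pair(
    b"subject=alice.example", b"subject=bank.example; is_ca=true"
)
signature = ca_sign(real)        # the CA only ever sees the real request
print(verify(rogue, signature))  # the copied signature verifies on the rogue
```

The CA never sees the rogue document, yet its signature validates it, because verification compares hashes rather than contents.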

What makes this worse is that the false key she crafts is for a secondary certificate authority. She can publish this in appropriate places, and is now able to mint as many false keys as she wishes—and they will all have signatures that verify in the chain of trust back to the issuer! She can even issue these new certificates using a stronger hash algorithm than MD5!

What makes this even worse is that it has been known for years that MD5 is weak, yet some CAs have continued to use it! Particularly unfortunate is the realization that Lenstra, Wang and de Weger described how this could be done back in 2005. Methinks that may be grounds for some negligence lawsuits if anyone gets really burned by this….

And adding to the complexity of all this is the issue of certificates in use for other purposes. For example, certificates are used with encrypted S/MIME email to digitally sign messages. Certificates are used to sign ActiveX controls for Microsoft software. Certificates are used to verify the information on many identity cards, including (I believe) government-issued Common Access Cards (CAC). Certificates also provide identification for secured instant messaging sessions (e.g., iChat). There may be many other sensitive uses because certificates are a “known” mechanism. Cloud computing services, software updates, and more may be based on these same assumptions. Some of these services may accept and/or use certificates issued by these deficient CAs.


Fixing this is not trivial. Certainly, all CAs need to start issuing certificates based on other message digests, such as SHA-1. However, this will take time and effort, and may not take effect before this problem can be exploited by attackers. Responsible vendors will cease to issue certificates until they get this fixed, but that has an economic impact some may not wish to incur.

We can try to educate end-users about this, but the problem is so complicated with technical details that the average person won’t know how to actually make a determination about valid certificates. It might even cause more harm by leading people to distrust valid certificates by mistake!

It is not possible to simply say that all existing applications will no longer accept certificates rooted at those CAs, or will not accept certificates based on MD5: there are too many extant, valid certificates in place to do that. Eventually, those certificates will expire and be replaced. That will take care of the problem, perhaps within the space of the next 18 months or so (most certificates are issued for only a year at a time, in part for reasons such as this).

Vendors of applications, and especially WWW browsers, need to give careful thought about updates to their software to flag MD5-based certificates as deserving of special attention. This may or may not be a worthwhile approach, for the reason given above: even with a warning, too few people will be able to know what to do.
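As a rough sketch of what such flagging might look like, the snippet below checks the signature algorithm name recorded on each certificate in a chain. It assumes names in the common "<digest>With<Algorithm>" form (for example md5WithRSAEncryption); the function names and the chain contents are hypothetical, and a real browser would read these fields from parsed certificates rather than from a hand-written list.

```python
# Digest names treated as weak. Only MD5 is discussed in this post; the set
# is an assumption that a vendor would extend as needed.
WEAK_DIGESTS = {"md5"}


def digest_of(signature_algorithm: str) -> str:
    """Extract the digest part of a name such as 'md5WithRSAEncryption'.

    Assumes the '<digest>With<Algorithm>' naming pattern; other naming
    schemes would need their own handling.
    """
    return signature_algorithm.lower().split("with")[0]


def needs_special_attention(signature_algorithm: str) -> bool:
    """True when the certificate's signature relies on a weak digest."""
    return digest_of(signature_algorithm) in WEAK_DIGESTS


# Hypothetical chain: (certificate label, signature algorithm name).
chain = [
    ("www.example.org", "sha1WithRSAEncryption"),
    ("Example Intermediate CA", "md5WithRSAEncryption"),
]

flagged = [label for label, alg in chain if needs_special_attention(alg)]
print(flagged)
```

Detecting the weak digest is the easy part. What the software should then tell the user is the hard part.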

Bigger issue

We base a huge amount of trust on certificates and encryption. History has shown how easy it is to get implementations and details wrong. History has also shown how quickly things can be destabilized with advances in technology.

In particular, too many people and organizations take for granted the assumptions on which this vast certificate system is based. For instance, we assume that the hash/digest functions in use are computationally difficult to reverse or to find collisions for. We also assume that certain mathematical functions underlying public/private key encryption are too difficult to reverse or “brute force.” However, all it takes is some new insight or analysis, or maybe new, affordable technology (e.g., practical quantum computing, or massively parallel computing) to violate those assumptions.

If you look at the way our systems are constructed, too little thought is given to what happens to existing infrastructure when something breaks. Designs can include compensating and recovery code, but doing so requires some cost in space or time. However, all too often people are willing to avoid the investment by putting off the danger to “if and when that happens.” Thus, we see instances such as the Y2K problems and the issues here with potentially rogue CAs.

(I’ll note as an aside, that when I designed the original version of Tripwire and what became the Signacert product, I specifically included simultaneous use of several different message digest functions in different families for this very reason. I knew it was a matter of time before one or two were broken. I still believe that it is beyond reason to find files that will match multiple, different algorithms simultaneously.)
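The idea of checking several digests at once can be sketched as follows. This is a minimal illustration, not the Tripwire implementation: the choice of SHA-256, SHA3-256, and BLAKE2b (three differently constructed functions available in Python's hashlib) is an assumption, as are the function names.

```python
import hashlib

# Digests from different design families; this selection is an assumption
# made for illustration, not the set any particular product uses.
ALGORITHMS = ("sha256", "sha3_256", "blake2b")


def fingerprint(data: bytes) -> dict:
    """Compute every configured digest over the same data."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def unchanged(data: bytes, baseline: dict) -> bool:
    """The data passes only if ALL digests match the recorded baseline."""
    return fingerprint(data) == baseline


original = b"contents of a monitored file"
baseline = fingerprint(original)

print(unchanged(original, baseline))         # identical data passes
print(unchanged(original + b"!", baseline))  # any change fails the check
```

With this design, a break in any single function is not enough: a forged file would have to collide under every listed algorithm at the same time.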

Another issue is the whole question of who we trust, and for what. As noted in the USACM whitepaper, authentication is always relative to a third party. How much do we trust those third parties? How much trust have we invested in the companies and agencies issuing certificates? Are they properly verifying identities? How good is their internal security? How do we know, and how much is at risk from our trust in those entities?

Let me leave you with a final thought. How do we know that this problem has not already been quietly exploited? The basic concept has been in the open literature for years. The general nature of this attack on certificates has been known for well over a decade, if not two. Given the technical and infrastructure resources available to national agencies and organized criminals, and given the motivation to use this hack selectively and quietly, how can we know that it is not already being used?

[Added 12/31/2008]: A follow-up post to this one is available in the blog.



Posted by Laura Raderman
on Tuesday, December 30, 2008 at 02:33 PM

One of the pluses to this is that in order to be cross-certified with the federal bridge (all Fed agencies either are, or are required to use a CA that is), you have to meet the Federal Bridge CP, which does not allow MD5 as a hashing algorithm. 

Having the browser (or OS) flag certificates that are using MD5 is a great idea, and why no one’s added that feature is a little astounding.  As you pointed out, we’ve known since 2005 that it is possible to construct two X.509 certificates whose hashes collide, yet CAs *still* use MD5 as a hashing algorithm.

On the bigger issue you point out - it’s a hard problem to solve, and we do what’s “good enough”.  Imagine trying to explain PKI to someone who’s not even computer savvy, much less understands enough about cryptography to realize what’s going on “under the hood” (and I know quite a few CS majors who don’t).  It’s easier to say “look for the little lock, then it’s OK”.

Posted by Jason
on Tuesday, December 30, 2008 at 03:25 PM

Dr. Spafford,

I noticed that the certificate for https://www.cerias.purdue.edu was issued by Thawte.  While Cerias’s site is a SHA-1 signature, the CA that vouches for it has a MD5 signature.

Now, I’m not throwing stones, as I too depend on 3rd parties that I trust to do things properly.  However, how much research should be done before purchasing things like this?

Or, put another way, are we the customers to blame for not forcing CAs to switch off of MD5?  After all, if it wasn’t important enough for us, why would it be for them?

Are there other sleeping issues like this that we should start looking at from our vendors?

Posted by Gene Spafford
on Tuesday, December 30, 2008 at 10:23 PM

Thanks for the feedback, Jason.

Yes, Thawte used MD5 on their own certificate, but if you analyze the case, there is no significant danger to them from doing so.  Eventually, they will reissue their own master certificate, and as they seem to be issuing all new certs with SHA-1, it looks like they are operating responsibly.

As to our duty to be informed consumers, yes, that exists.  I’ve written about that in prior postings, and it is complicated by issues of cost and understanding complexity.  For instance, if people really acquired software on the basis of overall security and reliability no one would be using Internet Explorer any more!

As to lurking problems, well, there are way too many.  And that too has been an ongoing topic of my posts, and of many of my colleagues.  Unfortunately, our warnings are often labeled as “unduly alarmist” and ignored….

Thanks for the feedback!

Posted by Mark
on Wednesday, December 31, 2008 at 11:02 AM

Insider threat is a major weakness of the certificate system. Even if we assume cryptographic safety, an insider could issue a fake certificate by penetrating the work process of the CA.

Spaf replies:
Excellent point, Mark.  That is what I was trying to get at with my comments about “assumptions” that underlie the certificate system.

Posted by Mitja G
on Thursday, January 1, 2009 at 07:29 AM

Interesting post, good work.

Posted by Vin McLellan
on Friday, January 2, 2009 at 03:12 PM

Hi Spaf,

Upgrading applications to SHA2 is a big task, but blocking this type of attack on MD5-based CAs may not be as difficult as you suggest. Five of the six CAs listed, by Sotirov et al, as potentially vulnerable to this MD5 CA-forgery attack all seem to be under the operational control of VeriSign.

The sixth CA listed, RSA Data Security’s “Secure Server Certification Authority,” was legally transferred from RSADSI to VeriSign in 1995, when RSA spun off VeriSign as an independent entity. (Sotirov’s team said they saw certs created by that particular CA in 2008, but that CA was supposedly only operational from November, 1994, to December, 1999.)

When VeriSign VP Tim Callan announced that VeriSign, within hours, had blocked this particular attack against any “SSL certificate that VeriSign sells under any brand,” half the problem seemed solved. When he added that VeriSign will “discontinue MD5 in all end entity certificates by the end of January, 2009,” did not that take care of most of the rest?

Assuming no one successfully cloned a rogue CA before this exploit was announced, MD5-based certs in everyday use should be fine. The trust placed in these digital certificates relies upon the security of the RSA public key cryptosystem used to sign it, not the message digest algorithm used to prepare a fingerprint of the cert for a signature, right?

Your concern about the procedural integrity of the certificate chain, and the amount of blind trust laymen invest in a mechanism that is inevitably somewhat flawed, is thought-provoking. Your concern about the responsiveness of the industry to emerging risks—risks associated with incremental or catastrophic breakthroughs in math, or the inevitable march of time that brings more and more computational power to bear on these cryptosystems and hashes—is also timely and appropriate.

RSA, now a division of EMC, has been a leading US vendor of commercial crypto for decades. (I’ve been associated with the company for almost that long.)  RSA, of course, originally sponsored and promoted the use of MD5, invented by MIT Prof Ron Rivest (the R in RSA) in 1991. Five years later, in Nov. 1996, RSA Labs—with the concurrence of Rivest—announced that the security of MD5 was suspect and urged the industry to move from MD5 to SHA1. RSA, and Rivest, often repeated that recommendation in the years that followed.

Sad, 12 years later, to see us confront MD5 as a threat to our infrastructure. Yet it is perhaps heartening to see Rivest, in 2009, again offering a new hash, MD6, as a candidate for a US national standard. I delight in watching, on the NIST mailing list, the best and the brightest jab and chat about their respective candidates for SHA3 status. It’s fun and somehow, like this hard-edged research project, reassuring. The cutting edge will probably take care of itself, but it is a damn good thing that we have savvy skeptics like Sotirov and Company to keep an eye on the lagging edge.


