CERIAS - Center for Education and Research in Information Assurance and Security


What is Secure Software Engineering?


A popular saying is that “Reliable software does what it is supposed to do.  Secure software does that and nothing else” (Ivan Arce).  But how do we get there, and can we claim to have achieved the practice of an engineering science?  The plethora of vulnerabilities found every year (thousands, counting only software that matters or is publicly known) suggests not.  Does that mean we don’t know how, or that the knowledge simply isn’t put into practice, for reasons of ignorance, education, costs, market pressures, or something else?

The distinction between artisanal work and engineering work is well expressed in the SEI (Software Engineering Institute) work on capability maturity models.  Levels of maturity range from 1 to 5: 

  1. Ad hoc, individual efforts and heroics
  2. Repeatable
  3. Defined
  4. Managed
  5. Optimizing (Science)

Artisanal work is individual work, entirely dependent on the (unique) skills of the individual and on a personal level of organization.  Engineering work aims to be objective, independent of any one individual’s perception, and does not require unique skills.  It should be reproducible, predictable, and systematic.

In this context, it occurred to me that the security community often suggests methods with artisanal characteristics.  We are also somewhat hypocritical (in the academic sense of the term: not deceitful, just not thinking things through critically enough).  The methods suggested to increase security actually rely on practices we decry.  What am I talking about?  I am talking about black lists.

A common design error is to create a list of “bad” inputs, bad characters, or other undesirable things.  This is a black list; it often fails because the enumeration is incomplete, or because the removal of bad characters from the input can produce another bad input that is not caught (and so on, recursively).  More often than not, there is a way to circumvent or fool the black list mechanism.  Black lists also fail because they are based on previous experience, and only enumerate *known* bad input.  The recommended practice is to create white lists, which enumerate known good input.  Everything else is rejected.
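The recursive-removal failure is easy to demonstrate.  Below is a minimal Python sketch (the function names and the known-good alphabet are my own illustrative choices, not from any particular product): a black-list sanitizer that strips one known-bad substring, and a white-list validator that accepts only inputs built from known-good characters.

```python
import re

# Black list: strip a known-bad substring.  Removing it once can
# *produce* the bad input again, so the filter is bypassable.
def blacklist_sanitize(s: str) -> str:
    return s.replace("<script>", "")

# White list: accept only inputs made entirely of known-good
# characters (here, alphanumerics and underscore); reject all else.
def whitelist_validate(s: str) -> bool:
    return re.fullmatch(r"[A-Za-z0-9_]+", s) is not None
```

Feeding `"<scr<script>ipt>"` to `blacklist_sanitize` strips the inner occurrence and leaves exactly `"<script>"` behind, the very string the black list was meant to remove; the white-list validator simply rejects any input containing `<` or `>`, with no enumeration of bad cases at all.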

When I teach secure programming, I go through frequently repeated mistakes and show students how to avoid them.  Books on secure programming show lists upon lists of “sins” and errors to avoid.  Those are black lists that we are in effect creating in the minds of readers and students!  It doesn’t stop there.  Recommended development methods (solutions for repeated mistakes) also often take the form of black lists.  For example, risk assessment and threat modeling require expert artisans to imagine, based on past experience, the likely avenues of attack, the possible damage, and other consequences.  The results of those activities depend on unique skill sets, are irreproducible (ask different people and you will get different answers), and attempt to enumerate known bad things.  They build black lists into the design of software development projects.

Risk assessment and threat modeling are appropriate for insurance purposes in the physical world, because the laws of physics and gravity on Earth aren’t going to change tomorrow.  The experience has been collected at geographical, geological, and national levels, tabulated, and analyzed for decades.  In software engineering, however, black lists are doomed to failure, because they are based on past experience yet must face intelligent attackers inventing new attacks.  How good can that be for the future of secure software engineering?

Precious few people emphasize development and software configuration methods that result (with guarantees) in the creation of provably correct code.  This of course leads into formal methods (and languages like SPARK and the correctness-by-construction approach), but not necessarily so.  For example, I was recently educated on the existence of a software solution called AppArmor (SUSE Linux, Crispin Cowan et al.).  This solution is based on fairly fine-grained capabilities: an application is granted only the capabilities it is known to require, and all others are denied.  This corresponds to building a white list of what an application is allowed to do; the developers even say that it can contain and limit a process running as root.  It may still be possible for some malicious activity to take place within the limits of the granted capabilities (if an application were compromised), but its scope is greatly limited.  The white list can be developed simply by exercising an application through its normal states and functions, in normal usage.  The list of capabilities is then frozen and provides protection against unexpected conditions.
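The two-phase idea (learn the required operations during normal exercise, freeze the list, then default-deny) can be caricatured in a few lines of Python.  This is a toy illustration of the general mechanism only; the class and method names are hypothetical and bear no relation to AppArmor’s actual profile format or implementation.

```python
class CapabilityWhitelist:
    """Toy white list: learn required operations, then default-deny."""

    def __init__(self):
        self.allowed = set()
        self.learning = True

    def request(self, op: str) -> bool:
        if self.learning:
            # Profile-building phase: record and allow every operation
            # the application performs during normal usage.
            self.allowed.add(op)
            return True
        # Enforcement phase: anything not previously recorded is denied.
        return op in self.allowed

    def freeze(self):
        # Freeze the list; unexpected operations are denied from now on.
        self.learning = False
```

Exercising the application through its normal functions populates `allowed`; after `freeze()`, a compromised process asking for anything outside that frozen set is refused, no matter what privileges it nominally runs with.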

We need to come up with more white list methods for both development and configuration, and move away from black lists.  This is the only way that secure software development will become secure software engineering.

Edit (4/16/06): Someone pointed out the site http://blogs.msdn.com/threatmodeling/ to me.  It is interesting because it shows awareness of the challenge of getting from an art to a science.  It also attempts to abstract the expert knowledge into an “attack library”, which makes its black list nature explicit.  However, they don’t openly acknowledge the limitations of black lists.  While we don’t currently have a white list design methodology that can replace threat modeling (it is useful!), it’s regrettable that the best everyone can come up with is a black list.

Also, it has occurred to me since writing this post that AppArmor isn’t quite a pure white list methodology, strictly speaking.  Instead of being a list of known *safe* capabilities, it is a list of *required* capabilities.  The difference is that the list of required capabilities, due to the granularity of capabilities and the complexity emerging from composing different capabilities together, is a superset of what it is safe for the application to be able to do.  What to call it, then?  I am thinking of “permissive white list” for a white list that allows more than necessary, versus a “restrictive white list” for one that may prevent some safe actions, and an “exact white list” for one that matches exactly what is safe to do, no more and no less.

