Posts by pmeunier
What is Secure Software Engineering?
A popular saying is that “Reliable software does what it is supposed to do. Secure software does that and nothing else” (Ivan Arce). However, how do we get there, and can we claim that we have achieved the practice of an engineering science? The plethora of vulnerabilities found every year (thousands, and that’s just in software that matters or is publicly known) suggests not. Does that mean that we don’t know how, or that it is just not put into practice for reasons of ignorance, education, costs, market pressures, or something else?
The distinction between artisanal work and engineering work is well expressed in the SEI (Software Engineering Institute) work on capability maturity models. Levels of maturity range from 1 to 5:
- Ad-hoc, individual efforts and heroics
- Repeatable
- Defined
- Managed
- Optimizing (Science)
Artisanal work is individual work, entirely dependent on the (unique) skills of the individual and personal level of organization. Engineering work aims to be objective, independent from one individual’s perception and does not require unique skills. It should be reproducible, predictable and systematic.
In this context, it occurred to me that the security community often suggests using methods that have artisanal characteristics. We are also somewhat hypocritical (in the academic sense of the term, not deceitful, just not thinking through critically enough). The methods that are suggested to increase security actually rely on practices we decry. What am I talking about? I am talking about black lists.
A common design error is to create a list of “bad” inputs, bad characters, or other undesirable things. This is a black list; it often fails because the enumeration is incomplete, or because the removal of bad characters from the input can result in the production of another bad input which is not caught (and so on recursively). It turns out more often than not that there is a way to circumvent or fool the black list mechanism. Black lists fail also because they are based on previous experience, and only enumerate *known* bad input. The recommended practice is the creation of white lists, that enumerate known good input. Everything else is rejected.
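To make the failure mode concrete, here is a minimal sketch. The filter functions, the single forbidden token, and the allowed character set are all made up for illustration and are not taken from any real product. It shows how removing a "bad" string in one pass can itself produce the bad string, whereas a white list simply rejects anything not known to be good.

```python
import re


def blacklist_strip(text: str) -> str:
    """Black list (hypothetical): remove one known-bad token, in a single pass."""
    return text.replace("<script>", "")


# White list (hypothetical): accept only short strings made of known-good characters.
ALLOWED = re.compile(r"[A-Za-z0-9 _.-]{1,64}")


def whitelist_accept(text: str) -> bool:
    """Return True only if the whole input matches the allowed pattern."""
    return ALLOWED.fullmatch(text) is not None


crafted = "<scr<script>ipt>alert(1)"

# Removing the inner token splices the outer fragments into a new bad token.
print(blacklist_strip(crafted))             # -> <script>alert(1)

# The white list does not need to know about the attack; it rejects the input.
print(whitelist_accept(crafted))            # -> False
print(whitelist_accept("report-2006.txt"))  # -> True
```

Repeating the removal until nothing changes would fix this particular case, but the enumeration of bad tokens would still be incomplete.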
When I teach secure programming, I go through often repeated mistakes, and show students how to avoid them. Books on secure programming show lists upon lists of “sins” and errors to avoid. Those are blacklists that we are in effect creating in the minds of readers and students! It doesn’t stop there. Recommended development methods (solutions for repeated mistakes) also often take the form of black lists. For example, risk assessment and threat modeling require expert artisans to imagine, based on past experience, what are likely avenues of attack, and possible damage and other consequences. The results of those activities are dependent upon unique skill sets, are irreproducible (ask different people and you will get different answers), and attempt to enumerate known bad things. They build black lists into the design of software development projects.
Risk assessment and threat modeling are appropriate for insurance purposes in the physical world, because the laws of physics and gravity on earth aren’t going to change tomorrow. The experience is collected at geographical, geological and national levels, tabulated and analyzed for decades. However, in software engineering, black lists are doomed to failure, because they are based on past experience, and need to face intelligent attackers inventing new attacks. How good can that be for the future of secure software engineering?
Precious few people emphasize development and software configuration methods that result (with guarantees) in the creation of provably correct code. This of course leads into formal methods (and languages like SPARK and the correctness by construction approach), but not necessarily so. For example, I was recently educated on the existence of a software solution called AppArmor (SUSE Linux, Crispin Cowan et al.). This solution is based on fairly fine-grained capabilities, and on granting to an application only the capabilities it is known to require; all the others are denied. This corresponds to building a white list of what an application is allowed to do; the developers even say that it can contain and limit a process running as root. Now, it may still be possible for some malicious activity to take place within the limits of the granted capabilities (if an application was compromised), but its scope is greatly limited. The white list can be developed simply by exercising an application throughout its normal states and functions, in normal usage. Then the list of capabilities is frozen and provides protection against unexpected conditions.
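The "exercise the application, then freeze the list" idea can be sketched in a few lines. This is only an illustration of the learn-then-enforce pattern; the class, its method names, and the operation and resource strings are hypothetical and are not AppArmor's profile syntax or tooling.

```python
class LearnedProfile:
    """Hypothetical learn-then-enforce white list of (operation, resource) pairs."""

    def __init__(self) -> None:
        self._allowed = set()
        self._frozen = False

    def observe(self, operation: str, resource: str) -> None:
        """Learning phase: record what the application does in normal usage."""
        if self._frozen:
            raise RuntimeError("profile is frozen; no further learning")
        self._allowed.add((operation, resource))

    def freeze(self) -> None:
        """End the learning phase; the list can no longer grow."""
        self._allowed = frozenset(self._allowed)
        self._frozen = True

    def permits(self, operation: str, resource: str) -> bool:
        """Enforcement phase: anything not observed during learning is denied."""
        return self._frozen and (operation, resource) in self._allowed


profile = LearnedProfile()
profile.observe("read", "/etc/app.conf")
profile.observe("write", "/var/log/app.log")
profile.freeze()

print(profile.permits("read", "/etc/app.conf"))  # -> True
print(profile.permits("write", "/etc/passwd"))   # -> False
```

The quality of such a list depends entirely on how thoroughly the normal states and functions were exercised before freezing.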
We need to come up with more white list methods for both development and configuration, and move away from black lists. This is the only way that secure software development will become secure software engineering.
Edit (4/16/06): Someone pointed out the site http://blogs.msdn.com/threatmodeling/ to me. It is interesting because it shows awareness of the challenge of getting from an art to a science. It also attempts to abstract the expert knowledge into an “attack library”, which makes explicit its black list nature. However, they don’t openly acknowledge the limitations of black lists. Whereas we don’t currently have a white list design methodology that can replace threat modeling (it is useful!), it’s regrettable that the best everyone can come up with is a black list.
Also, it occurred to me since writing this post that AppArmor isn’t quite a pure white list methodology, strictly speaking. Instead of being a list of known *safe* capabilities, it is a list of *required* capabilities. The difference is that the list of required capabilities, due to the granularity of capabilities and the complexity emerging from composing different capabilities together, is a superset of what is safe for the application to be able to do. What to call it then? I am thinking of “permissive white list” for a white list that allows more than necessary, vs a “restrictive white list” for a white list that possibly prevents some safe actions, and an “exact white list” when the white list matches exactly what is safe to do, no more and no less.
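The three terms can be stated as set relations between what the list allows and what is actually safe. The sketch below assumes, unrealistically, that the safe set is known; the capability names are invented, and the "mixed" case (a list that both allows unsafe actions and blocks safe ones) is my addition for completeness.

```python
def classify_white_list(allowed: set, safe: set) -> str:
    """Compare a white list against a (hypothetically known) set of safe actions."""
    if allowed == safe:
        return "exact"        # matches what is safe, no more and no less
    if allowed > safe:
        return "permissive"   # strict superset: allows more than is safe
    if allowed < safe:
        return "restrictive"  # strict subset: prevents some safe actions
    return "mixed"            # neither contains the other


safe = {"read_config", "write_log"}

print(classify_white_list({"read_config", "write_log"}, safe))              # -> exact
print(classify_white_list({"read_config", "write_log", "net_bind"}, safe))  # -> permissive
print(classify_white_list({"read_config"}, safe))                           # -> restrictive
print(classify_white_list({"read_config", "net_bind"}, safe))               # -> mixed
```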
Using Virtual Machines to Defend Against Security and Trust Failures
According to the National Vulnerability Database (http://nvd.nist.gov), the number of vulnerabilities found every year increases: 1253 in 2003, 2343 in 2004, and 4734 in 2005. We take security risks not only by choosing a specific operating system, but also by installing applications and services. We take risks by browsing the web, because web sites insist on running code on our systems: JavaScript, Flash (ActionScript), Java, ActiveX, VBScript, QuickTime, and all the plug-ins and browser extensions imaginable. Applications we pay for want to probe the network to make sure there isn’t another copy running on another computer, creating a vector by which malicious replies could attack us.
Games refuse to install in unprivileged accounts, so they can run their own integrity checkers with spyware qualities with full privileges (e.g., WoW, but others do the same, e.g., Lineage II), that in turn can even deny you the capability to terminate (kill) the game if it hangs (e.g., Lineage II). This is done supposedly to prevent cheating, but allows the game companies full access and control of your machine, which is objectionable. On top of that those games are networked applications, meaning that any vulnerability in them could result in a complete (i.e., root, LocalSystem) compromise.
It is common knowledge that if a worm like MyTob compromises your system, you need to wipe the drive and reinstall everything. This is in part because these worms are so hard to remove, as they attack security software and will prevent firewalls and virus scanners from functioning properly. However there is also a trust issue—a rootkit could have been installed, so you can’t trust that computer anymore. So, if you do any sensitive work or are just afraid of losing your work in progress, you need a dedicated gaming or internet PC. Or do you?
VMWare offers on its web site the free download of VMWare player, as well as a “browser appliance” based on Firefox and Ubuntu Linux. The advantages are that you don’t need to install and *trust* Firefox. Moreover, you don’t need to trust Internet Explorer or any other browser anymore. If a worm compromises Firefox, or malicious JavaScript changes settings and takes control of Firefox, you may simply trash the browser appliance and download a new copy. I can’t overemphasize how much less work this is compared to reinstalling Windows XP for the nth time, possibly having to call the license validation phone line, and frantically trying to find a recent backup that works and isn’t infected too. As long as VMWare player can contain the infection, your installation is preserved. Also hosted on the VMWare site are various community-created images allowing you to test various software at essentially no risk, and no configuration work!
After experiencing this, I am left to wonder, why aren’t all applications like a VMWare “appliance” image, and the operating system like VMWare player? They should be. Efforts to engineer software security have obviously failed to contain the growth of vulnerabilities and security problems. Applying the same solutions to the same problems will keep resulting in failures. I’m not giving up on secure programming and secure software engineering, as I can see promising languages, development methods and technologies appearing, but at the same time I can’t trust my personal computers, and I need to compartmentalize by buying separate machines. This is expensive and inconvenient. Virtual machines provide us with an alternative. In the past, storing entire images of operating systems for each application was unthinkable. Nowadays, storage is so cheap and abundant that the size of “appliance” images is no longer an issue. It is time to virtualize the entire machine; all I now require from the base operating system is to manage a file system and be able to launch VMWare player, with at least a browser appliance to bootstrap… Well, not quite. Isolated appliances are not so useful; I want to be able to transfer documents from appliance to appliance. This is easily accomplished with a USB memory stick, or perhaps a virtual drive that I can mount when needed. This shared storage could become a new propagation vector for viruses, but it would be very limited in scope.
Virtual machine appliances, anyone?
Note (March 13, 2006): Virtual machines can’t defend against cross-site scripting vulnerabilities (XSS), so they are not a solution for all security problems.
CIRDB 3.06 released
In this version, the CIRDB has been updated to import XML data from the National Vulnerability Database, and a few bug fixes were made (a library file was missing in the previous distribution to support RSS feeds of the status of each domain over SSL).
An open source command-line Cassandra!
I am pleased to announce the availability of the first beta of my_cassandra.php, which can be downloaded from my home page (change the extension from phps to php after you download it).
Because you get the source code and the custody of your profiles, this version of Cassandra should not generate the privacy concerns that the online version did. As it is under your control you can also run it at the intervals you choose. It is made available under an open source license so you can modify it. It runs under PHP so it should run on almost any platform by changing the path to PHP (from “#!/usr/bin/php -q” for MacOS X).
Enjoy!
P.S.: I already received a patch from Benjamin Lewis from Purdue ITSP, improving robustness while reading a profile. Thanks Ben!
Managing Web Browser risks with the NoScript extension
It is very risky to enable all client-side scripting technologies when browsing the web (plugins, ActiveX, JavaScript, Flash, etc.). I installed the “NoScript” extension for Firefox, which allows JavaScript to run only on some whitelisted sites. It is a wonderful idea, except that it comes with a list of pre-enabled sites with some that you can’t delete (the arrogance of dictating unerasable sites!), and the defaults are to not block Flash and other plugins. Moreover, it’s only as secure as DNS, unless you require the “full addresses” option through which I presume you could require an https (SSL) url. Unfortunately there is no way to enable “base 2nd level domains” *and* require SSL, to say for example that I want to trust all *.purdue.edu sites that I contact through SSL and that have valid SSL certificates. It is better than nothing, but needs SSL support to be really useful. Most people don’t understand the limitations and vulnerabilities of DNS, and the need for SSL, and will therefore have an unwarranted feeling of security while using this extension.
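The policy I wish I could express looks roughly like the sketch below. It is not NoScript code or configuration; the function name and the trusted-domain set are hypothetical. It only checks the URL scheme and host name, and assumes that certificate validation is done separately by the browser's SSL layer.

```python
from urllib.parse import urlsplit

# Hypothetical policy: base domains trusted for scripting, but only over https.
TRUSTED_BASE_DOMAINS = {"purdue.edu"}


def script_allowed(url: str) -> bool:
    """Allow scripts only for https URLs whose host is under a trusted base domain."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False  # plain http is only as trustworthy as DNS
    host = (parts.hostname or "").lower()
    # Match the base domain itself or a true subdomain, not a look-alike suffix.
    return any(
        host == base or host.endswith("." + base)
        for base in TRUSTED_BASE_DOMAINS
    )


print(script_allowed("https://www.cerias.purdue.edu/"))    # -> True
print(script_allowed("http://www.purdue.edu/"))            # -> False
print(script_allowed("https://purdue.edu.evil.example/"))  # -> False
```

Matching on "." plus the base domain matters: a bare suffix test would also accept a host such as notpurdue.edu.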
About: Backtracking Intrusions
King and Chen (2005) write about their BackTracker software. The idea is interesting: let’s log everything needed to relate a sequence of events leading to an intrusion. Everything in this case is processes, files, and filenames. It can generate dependency graphs, once an anomalous process or event has been identified. That is, something else must raise an alert, and then BackTracker helps find the cause. It’s an interesting representation of an attack.
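As a rough illustration of the idea (not BackTracker's actual implementation), the sketch below walks a log of "source affected target" events backward from a detection point. The event log and object names are invented, and the walk ignores event timing, which a real tool would have to take into account.

```python
from collections import defaultdict, deque

# Hypothetical event log: (source, target) means "source affected target".
EVENTS = [
    ("proc:sshd", "proc:bash"),
    ("proc:bash", "file:/tmp/x"),
    ("file:/tmp/x", "proc:backdoor"),
    ("proc:cron", "file:/var/log/cron"),
    ("proc:backdoor", "file:/etc/passwd"),
]


def backtrack(events, detection_point):
    """Return every object that could have influenced the detection point."""
    incoming = defaultdict(set)
    for source, target in events:
        incoming[target].add(source)

    seen = {detection_point}
    queue = deque([detection_point])
    while queue:
        node = queue.popleft()
        for source in incoming.get(node, ()):
            if source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


# Something else flagged the modified password file; trace it back.
for obj in sorted(backtrack(EVENTS, "file:/etc/passwd")):
    print(obj)
```

Unrelated activity (here, the cron process and its log file) drops out of the result, which is what makes the graph readable.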
Taken one step further than they do, perhaps these dependency graphs could be used for intrusion detection?
About: Secure Program Execution via Dynamic Information Flow Tracking
Suh et al. (2004) propose a wonderful method for tracking taintedness, and denying dangerous operations. It’s elegant, easy to understand, cheap in terms of performance hit, and effective. The only problem is… it would require re-designing the hardware (CPUs) to support it.
I wish it would happen, but I’m not holding my breath. Perhaps virtual machines could help until it happens, and even make it happen?
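For readers unfamiliar with the concept, tracking taintedness and denying dangerous operations can be mimicked in software, at a much higher cost than the hardware proposal. The sketch below is only a toy model: the class and function names are hypothetical, and it is not the mechanism described by Suh et al.

```python
from dataclasses import dataclass


class TaintViolation(Exception):
    """Raised when tainted data is used in a dangerous operation."""


@dataclass(frozen=True)
class Value:
    """A number that carries a taint bit along with its data."""
    data: int
    tainted: bool = False

    def __add__(self, other: "Value") -> "Value":
        # Taint propagates: the result is tainted if either operand is.
        return Value(self.data + other.data, self.tainted or other.tainted)


def read_untrusted(data: int) -> Value:
    """Anything arriving from outside the program starts out tainted."""
    return Value(data, tainted=True)


def jump_to(target: Value) -> int:
    """Dangerous operation: refuse to use tainted data as a jump target."""
    if target.tainted:
        raise TaintViolation("jump target derived from untrusted input")
    return target.data


base = Value(0x1000)
offset = read_untrusted(8)

print(jump_to(base))            # -> 4096
print((base + offset).tainted)  # -> True
try:
    jump_to(base + offset)
except TaintViolation as err:
    print("denied:", err)
```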
Review: Secure Execution via Program Shepherding
Kiriansky et al. (2002) wrote an interesting paper on what they call “program shepherding”. The basic idea is to control how the program counter changes and where it points to. The PC should not point to data areas (this is somewhat similar in concept to non-executable stacks or memory pages). The PC should enter library code through approved entry points only. It would in principle be capable of enforcing that the return target of a function is the instruction located right after the call.
Their solution keeps track of “code origins”, which resembles multi-level taint tracking. The authors argue that this is better than execute flags on memory pages, because those could be “inadvertently or maliciously changed” (and they have three states instead of only two). I thought those flags were managed by the kernel and could not be changed in user space? If the kernel is compromised, then program shepherding will be compromised too. The mechanism tracking code origins heavily uses write-protected memory pages, so the question that comes to mind is why couldn’t those also be “inadvertently or maliciously changed” if we have to worry about that for execute flags? I must be missing something.
The potential versatility of this technology is impressive. The authors test only one policy. Policies have to be written, tested and approved; it is not clear to me why that policy was chosen and the compromises it implies.
The crux of the whole system is code interpretation, which, despite the use of advanced optimizations, slows the execution. It would be interesting to see how it would fare inside the framework of a virtual machine (e.g., VMWare). Enterprises are already embracing VMWare and virtual machine solutions for their easier management of hardware, software, and disaster recovery. With a price already paid for sandboxing, using this new sandboxing technology may not be so expensive after all. Whereas it may not be as appealing as some solutions requiring hardware support, it may be easier to deploy.
Elisa’s dead
No, not our esteemed director of research. I turned off my ELISA project, Enterprise-Level Information Security Assurance, due to lack of interest from the public at large. The idea for this web application was to keep track of patches, basically supporting NIST’s recommendation to use such a system for managing patches. I believe this indicates that the process was too heavy; people don’t like to spend so much effort and money managing patches.

