CERIAS Weblogs » Pascal Meunier

Pascal Meunier

Pascal Meunier is a research scientist at the Center for Education and Research in Information Assurance and Security (CERIAS) at Purdue University. He is the author of the Cassandra system and the CIRDB, and the PI for the ReAssure project. He also teaches secure programming and publishes a set of slides in 3 parts on the subject.

Notes about the Faculty Workshop on Secure Software Development

On April 13-15, I attended the “Faculty Workshop on Secure Software Development” (alternatively called “Secure Coding Faculty Workshop” by SANS), paid for by NSF (no grant number yet) and organized by Bill Chu, Matt Bishop and SANS. There were presentations from a number of faculty involved in secure coding or software engineering, as well as some companies. My presentation focused on secure programming, and so was somewhat off the mark due to my confusion about the name and objectives of the workshop. The workshop was more about software engineering and introducing good security practices in the CS/SE curriculum than about secure coding itself. In hindsight, the objectives appeared to be:

  • Share content. There was some sharing at the workshop, with an attempt to gather relevant material from attendees and combine it into a repository. I seemed to surprise people because I didn’t bring my laptop (I wanted to avoid the temptation of a distraction and give all my attention to the workshop, and to avoid lugging it around and getting it through the TSA screenings), so I ended up giving URLs for my secure programming material. The difference between this repository and others “that failed” (Sam Redwine pointed out the low success rate of educational material repositories) would be that SANS and industry would be “beating down doors” of universities and industry for its adoption. I would have preferred that we discuss and devise a mechanism by which we could leverage existing sources, discuss duplication of efforts, make a general appeal for relevant material from all sources instead of only those at the meeting, and think about the consolidation, organization and vetting of this material in a consistent and usable manner, not to mention identify sources of funding to do so. Input from a librarian would have helped. Besides correctness, organization is a key difference between expert knowledge and ordinary knowledge. This is a big problem that requires a lot of work to do correctly. Despite seeing on a slide earlier the quote “success is foreseeing failure”, participants did not discuss very seriously how this effort could fail. No amount of beating down doors will make people adopt content that is poorly organized and has little usability, yet despite awareness of this problem at the workshop, I believe that this hasn’t been addressed properly. This is not to say that it’s doomed; but let’s think about why we really need it, what we really need, how it can fail and what we need to do to make it successful not just in the next few months, but as a dependable resource with lasting success.
  • Improve the content of training materials, whether these are professional training books or reference books for university classes. A problem is insecure code examples that are later used “as is” in production systems. Security is often sacrificed to keep code examples succinct, but sometimes a more secure version wouldn’t be any longer in terms of number of lines of code. Sometimes, authors use the excuse that “it was never intended for production use”, but most students don’t know any better than what teachers show them… Educators aren’t fully aware of their responsibility in that regard, or choose to ignore it for one reason or another. One goal of the workshop was to initiate the creation of exercises that could be used to supplement or replace insecure code examples. In my opinion, too much emphasis was put on trying to come up with some during the workshop, instead of devising a systematic way of creating them and ensuring the identification and correction of the relevant online material.
  • Network, keep contact and keep working on it…
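
To illustrate the point above that a more secure example need not be longer, here is a small sketch in Python. It is my own illustration, not material from the workshop; the table, column and function names are hypothetical.

```python
import sqlite3

def find_user_insecure(conn, name):
    # Insecure: the input is pasted into the SQL text, so it can rewrite the query.
    return conn.execute("SELECT id FROM users WHERE name = '" + name + "'").fetchall()

def find_user_safer(conn, name):
    # Same length: the input is passed as a bound parameter and stays data.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()

# Hypothetical in-memory table, only to make the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
```

Both functions are two lines long; only the second treats the name strictly as a value.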

Some people were pleasantly surprised by the usefulness and portability of the SEED labs. I have been using some of these in CS 390S this semester and recommend them. The Fedora Linux VMware image that I created for CS 390S is available for download on the ReAssure public downloads page. If some of you created more images suitable for use with the SEED labs, please upload them in ReAssure as Kevin (Syracuse) doesn’t have the bandwidth to host them.

I was personally impressed by the secure software engineering program at Leuven as described by Wouter Joosen, but disappointed when I quickly hit pages that displayed “This information is not available in English. Consult the Dutch pages”. I guess I’ll have to resort to Babel Fish translation and the like.

My final concern is that, without funding commitments, this effort will rely on personal heroics. I found it ironic that we were discussing how to improve software engineering while our effort would only classify as CMM level 1. Open source and community efforts are nice and can deliver useful things, but they can also deliver lots of wiki stubs that nobody seemingly has the time or inclination to fill and complete, and they come with vetting and other management problems. The workshop resulted in great interactions, and clearly it was intended to just start the ball rolling. My point is that we preach that solving problems at design time is 100 times less costly than at production time, yet it seemed that we were rushing to production. Perhaps I just didn’t “get” the vision. Nevertheless, I’m glad that I attended; there certainly were a lot of things to think about.

New Record for the Largest CVE Entry

Last week my script that processes and logs daily CVE changes broke. It truncated inputs larger than 16000 bytes, because I believed that no CVE entry should ever be that large, so an entry of that size would indicate some sort of trouble. Guess what… The entry for CVE-2006-4339 reached 16941 bytes, with 352 references. This is an OpenSSL issue, and highlights how much we depend on it. It’s impressive work from MITRE’s CVE team in locating and keeping track of all these references.
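
One way to keep the sanity check without losing data is to warn on oversized entries instead of truncating them. The sketch below is hypothetical: the script itself is not shown here, and the function and variable names are mine. Only the 16000-byte threshold comes from the text.

```python
SIZE_WARNING_BYTES = 16000  # threshold from the post, used as a warning rather than a hard limit

def check_entry(cve_id, text, warnings):
    """Return the entry unchanged; record a warning when it is unusually large."""
    size = len(text.encode("utf-8"))
    if size > SIZE_WARNING_BYTES:
        warnings.append(f"{cve_id}: {size} bytes exceeds {SIZE_WARNING_BYTES}")
    return text
```

The entry is still processed in full, and the warning list gives a human a reason to look at it.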

A Look at MITRE’s OVAL Schemas: A Weak Proof of Compliance

If you are intrigued by OVAL, the Open Vulnerability and Assessment Language, and are considering developing rules in it or just want to understand the technical details of how it works, you will need to look at its schemas. The OVAL Design Document, Version 5.0 states that there are 3 schemas, for “representing configuration information of systems for testing; analyzing the system for the presence of the specified machine state (vulnerability, configuration, patch state, etc.); and reporting the results of this assessment.” However, when looking at the downloads available (here’s version 5.3) you’ll notice that there are 28 pdf documents. Which ones do you need?

The document “Introduction to the OVAL Language, Version 5.0” contains this figure:
Conceptual Breakdown of the OVAL Language
-----------------------------------------
OVAL Definition Schema
|
|--> Core Schema
|
|--> Independent Schema (family_test, variable_test, xmlfilecontent_test, etc.)
|
|--> UNIX Schema (file_test, process_test, uname_test, etc.)
| |
| |--> Solaris Schema
| |--> HP-UX Schema
| |--> MacOS Schema
| |
| |--> Linux Schema (dpkg_test, rpminfo_test, etc.)
| |
| |--> RedHat Schema
| |--> Debian Schema
|
|--> Windows Schema (file_test, wmi_test, etc.)
|
|--> Apache Schema

This isn’t a 1-to-1 match with the pdfs available for download, but presents the idea. One difference is that the downloads are organized by rows for “Core”, others for “Common”. One of the downloads is titled inside as “Core Common” — which row would you guess it’s in? What’s the difference between Core, Common and Independent?

Note that “Core Common” is present in all of the schemas, from “OVAL Definition Schema” to “OVAL Variables Schema” (hmm, that’s a 4th schema!?). So, “Core Common” is what is common to the core of all schemas. The “core” of a schema is therefore composed of the “Core Common” and the part of the “Core” that is specific to that schema. Then, the “Independent” part of the schema describes things that are common to all operating systems, and serves to avoid duplication between the documents specific to OSes. So, the difference between “Independent” and “Core” is that the OS-specific parts of the schema have dependencies on the “Core” but not on the “Independent” part.

To get a complete schema, you assemble it from “Core Common”, the “Core” that is specific to that schema, the “Independent” part of the schema, and then the parts specific to your system. So, if you are interested in Solaris definitions, you would need to download the following parts: “Core Common”, “Core”, “Independent”, UNIX and Solaris.
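
The assembly rule can be sketched as a small lookup. The parent table below is my reading of the figure above (RedHat and Debian under Linux, Linux and the other UNIX variants under UNIX); the names `PARENT` and `parts_needed` are hypothetical and not part of OVAL.

```python
# Parent of each platform-specific part, as read from the figure (an assumption).
PARENT = {
    "UNIX": None,
    "Windows": None,
    "Apache": None,
    "Solaris": "UNIX",
    "HP-UX": "UNIX",
    "MacOS": "UNIX",
    "Linux": "UNIX",
    "RedHat": "Linux",
    "Debian": "Linux",
}

def parts_needed(platform):
    """List the schema parts to download for one platform, most general first."""
    chain = []
    node = platform
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return ["Core Common", "Core", "Independent"] + chain[::-1]
```

For Solaris this yields the same five parts listed above; for Debian it adds both the UNIX and Linux parts.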

As a more concrete example, the test to find if Solaris is running in 64-bit mode uses the “isainfo_test”, “isainfo_object”, and “isainfo_state” elements defined in the Solaris part of the definitions schema:


<isainfo_test id="oval:org.mitre.oval:tst:3884" version="1"
comment="system is running in 64-bit mode"
check_existence="at_least_one_exists" check="at least one">
<object object_ref="oval:org.mitre.oval:obj:2704"/>
<state state_ref="oval:org.mitre.oval:ste:3528"/>
</isainfo_test>

uses this object and this state:


<isainfo_object id="oval:org.mitre.oval:obj:2704" version="1"/>
<isainfo_state id="oval:org.mitre.oval:ste:3528" version="1">
<bits>64</bits>
</isainfo_state>

From what I can tell, the OVAL interpreter is required to know that given an “isainfo_object” and an “isainfo_state” with a “bits” child, it needs to run “isainfo -b”. Information about a scanned system can be stored in an “isainfo_item” which would also have a “bits” child (defined in the Solaris System Characteristics document).

To check if a service is running you would use the element “process_test” (defined in the UNIX Definition schema), as in:


<process_test id="oval:org.mitre.oval:tst:1334" version="1"
check="at least one" comment="The Xorg X server is running"
check_existence="at_least_one_exists"
xmlns="http://oval.mitre.org/XMLSchema/oval-definitions-5#unix">
<object object_ref="oval:org.mitre.oval:obj:923"/>
</process_test>

which would do a pattern matching operation on the output of “ps”:

<process_object id="oval:org.mitre.oval:obj:923" version="1"
xmlns="http://oval.mitre.org/XMLSchema/oval-definitions-5#unix">
<command operation="pattern match">.*Xorg\b.*</command>
</process_object>

So again the OVAL interpreter has to know that it needs to run “ps” for this test, and to store the result in a “process_item” element (predictably defined in the Unix System Characteristics schema).
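
As a rough sketch of what an interpreter has to do with this object, the Python below parses the “process_object” shown above and applies its pattern to a list of command lines. This is not MITRE's interpreter: the function names are hypothetical, the list of command lines stands in for the output of “ps”, and only the “pattern match” operation is handled.

```python
import re
import xml.etree.ElementTree as ET

UNIX_NS = "http://oval.mitre.org/XMLSchema/oval-definitions-5#unix"

# The process_object from the post, unchanged.
PROCESS_OBJECT = r"""
<process_object id="oval:org.mitre.oval:obj:923" version="1"
xmlns="http://oval.mitre.org/XMLSchema/oval-definitions-5#unix">
<command operation="pattern match">.*Xorg\b.*</command>
</process_object>
"""

def matching_commands(object_xml, commands):
    """Return the command lines that satisfy the object's pattern."""
    root = ET.fromstring(object_xml.strip())
    command = root.find(f"{{{UNIX_NS}}}command")
    if command is None or command.get("operation") != "pattern match":
        raise ValueError("this sketch only handles 'pattern match'")
    pattern = re.compile(command.text)
    return [line for line in commands if pattern.search(line)]

def at_least_one(object_xml, commands):
    """Mimic check_existence="at_least_one_exists": true if any line matches."""
    return len(matching_commands(object_xml, commands)) >= 1
```

Note that the function believes whatever list of command lines it is handed; it has no way to tell a real process list from a fabricated one.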

This is nifty but I still feel uncomfortable that all this gathered information is falsifiable in a difficult-to-detect manner. An attacker who compromised the system and wants to evade detection, and therefore might avoid doing suspicious or more difficult things such as replacing a closed-source OVAL interpreter, could write something resembling a rootkit that would fetch up-to-date OVAL definitions and return “correct” answers when the OVAL tool runs. A rogue administrator that just wants to pass some tests while doing other things and operating under configurations that aren’t compliant could write a fake “ps” or “inetd” to pass one or two specific rules; that would be much easier than to mess with the OVAL interpreter itself (in their simplest form the fake programs could simply print a constant string). An attacker could want to indicate that all patches have been applied to avoid the possibility of a patch breaking a “modified” application or closing a backdoor. It could be done by simply modifying or creating the appropriate registry keys.

Regardless of the likelihood of these scenarios, my point is that the evaluation is “subjective” in the sense that it is done by the subject of the evaluation. Anyone wanting to evade compliance checks (SCAP and XCCDF currently rely on OVAL, even though they are independent from it in theory) could do so easily, in concept anyway. The software doesn’t necessarily need to fool the administrator, just the OVAL tool; it doesn’t need to be a full rootkit. OVAL therefore provides a weak proof of compliance. Using external tests whenever possible, conducted from a trusted machine, would be much stronger, but I see no evidence of efforts in that direction. I realize that not all tests can be conducted externally, but emphasizing possible external tests would strengthen OVAL. I communicated those concerns to MITRE years ago, but it seems that the easy “practical” way is still favored. One could argue that perhaps it’s “good enough”, and provides an opportunity for vendors to deliver value with more robust auditing methods. But if the government is the main market for OVAL products, will it buy those better products or the cheap ones? I understand that the government “needs yesterday” a manageable solution to handle compliance issues. However, I’m afraid that once the weaker standard of proof is in place, it will be very difficult to upgrade.

The open source Purdue Vulnerability Scanning Cluster based on Nessus is an example of the external approach to the evaluation of compliance (disclosure: I contributed to the design of the VSC). A mixed solution, using external scanning when possible, as well as internal, would be very powerful, especially if discrepancies were flagged.
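
Flagging discrepancies in such a mixed solution could look like the sketch below. It is a hypothetical illustration, not the VSC's or any OVAL tool's actual logic: it assumes each source reports a dictionary mapping a check name to True (compliant) or False.

```python
def flag_discrepancies(internal, external):
    """Compare self-reported results with results from a trusted external scan.

    Both arguments map a check name to True (compliant) or False.
    Only checks present in both are compared, since not every test
    can be conducted externally.
    """
    flagged = {}
    for check in sorted(internal.keys() & external.keys()):
        if internal[check] != external[check]:
            flagged[check] = {"internal": internal[check], "external": external[check]}
    return flagged
```

A host that claims “telnet disabled” internally while an external scan finds the port open would show up in the result, which is exactly the case worth a closer look.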

Virtualization Is Successful Because Operating Systems Are Weak

It occurred to me that virtual machine monitors (VMMs) provide similar functionality to that of operating systems. Virtualization supports functions such as these:

  1. Availability
    • Minimized downtime for patching OSes and applications
    • Restart a crashed OS or server
  2. Scalability
    • More or different images as demand changes
  3. Isolation and compartmentalization
  4. Better hardware utilization
  5. Hardware abstraction for OSes
    • Support legacy platforms

Compare it to the list of operating system duties:

  1. Availability
    • Minimized downtime for patching applications
    • Restart crashed applications
  2. Scalability
    • More or different processes as demand changes
  3. Isolation and compartmentalization
    • Protected memory
    • Accounts, capabilities
  4. Better hardware utilization (with processes)
  5. Hardware abstraction for applications

The similarity suggests that virtualization solutions compete with operating systems. I now believe that a part of their success must be because operating systems do not satisfy these needs well enough, not taking into account the capability to run legacy operating systems or entirely different operating systems simultaneously. Typical operating systems lack security, reliability and ease of maintenance. They have drivers in kernel space; Windows Vista thankfully now has them in user space, and Linux is moving in that direction. The complexity is staggering. This is reflected in the security guidance; hardening guides and “benchmarks” (essentially an evaluation of configuration settings) are long and complex. The attempt to solve the federal IT maintenance and compliance problem created the SCAP and XCCDF standards, which are currently ambiguously specified, buggy and very complex. The result of all this is intensive, stressful and inefficient maintenance in an environment of numerous and unending vulnerability advisories and patches.

What it looks like is that we have sinking boats, so we’re putting them inside a bigger, more powerful boat, virtualization. In reality, virtualization typically depends on another, full-blown operating system.
VMware ESX Server runs its own OS with drivers. Xen and offerings based on it have a full, general purpose OS in domain 0, in control and command of the VMM (notwithstanding disaggregation). Microsoft’s “Hyper-V” requires a full-blown Windows operating system to run it. So what we’re doing is really exchanging an untrusted OS for another, that we should trust more for some reason. This other OS also needs patches, configuration and maintenance. Now we have multiple OSes to maintain! What did we gain? We don’t trust OSes but we trust “virtualization” that depends on more OSes? At least ESX is “only” 50 MB, simpler and smaller than the others, but the number of defects/MB of binary code as measured by patches issued is not convincing.

I’m now not convinced that a virtualization solution + guest OS is significantly more secure or functional than just one well-designed OS could be, in theory. Defense in depth is good, but the extent of the spread of virtualization may be an admission that we don’t trust operating systems enough to let them stand on their own. The practice of wiping and reinstalling an OS after an application or an account is compromised, or deploying a new image by default suggests that there is little trust in the depth provided by current OSes.

As for ease of management and availability vs patching, I don’t see why operating systems would be unable to be managed in a smart manner just like ESX is, migrating applications as necessary. ESX is an operating system anyway… I believe that all the special things that a virtualization solution does for functionality and security, as well as the “new” opportunities being researched, could be done as well by a trustworthy, properly designed OS; there may be a thesis or two in figuring out how to implement them back in an operating system.

What virtualization vendors are really doing is a clever way to smoothly replace one operating system with another. This may be how an OS monopoly could be dislodged, and perhaps would explain the virtualization-unfriendly clauses in the licensing options for Vista: virtualization could become a threat to the dominance of Windows, if application developers started coding for the underlying OS instead of the guest. Of course, even with a better OS we’d still need virtualization for testbeds like ReAssure, and for legacy applications. Perhaps ReAssure could help test new, better operating systems.
(This text is the essence of my presentation in the panel on virtualization at the 2008 CERIAS symposium).

Related reading:
Heiser G et al. (2007) Towards trustworthy computing systems: Taking microkernels to the next level. ACM Operating Systems Review, 41
Tanenbaum AS, Herder JN and Bos H (2006) Can we make operating systems reliable and secure? Computer, 39

Open Source Outclassing Home Router Vendor’s Firmware

I’ve had an interesting new experience these last few months. I was faced with having to return a home wireless router again and trying a different model or brand, or try an open source firmware replacement. If one is to believe reviews on sites like Amazon and Newegg, all home wireless routers have significant flaws, so the return and exchange game could have kept going on for a while. The second Linksys device I bought (the most expensive on the display!) had the QoS features I wanted but crashed every day and had to be rebooted, even with the latest vendor-provided firmware. It was hardly better than the Verizon-provided Westell modem, which had to be rebooted sometimes several times per day despite having simpler firmware. That was an indication of poor code quality, and quite likely security problems (beyond the obvious availability issues).

I then heard about DD-WRT, an alternative firmware released under the GPL. There are other alternative firmwares as well, but I chose this one simply because it supported the Linksys router; I’m not sure which of the alternatives is the best. For several months now, not only has the device demonstrated 100% availability with v.24 (RC5), but it supports more advanced security features and is more polished. I expected difficulties because it is beta software, but had none. Neither CERIAS nor I endorse DD-WRT, and I don’t care if my home router is running vendor-provided or open source firmware, as long as it is a trustworthy and reliable implementation of the features I want. Yet, I am amazed that open source firmware has outclassed firmware for an expensive (for a home router) model of a recognized and trusted brand. Perhaps home router vendors should give up their proprietary, low-quality development efforts, and fund or contribute somehow to projects like DD-WRT and install that as default. A similar suggestion can be made if the software development is already outsourced. I believe that it might save their customers a lot of grief, and lower the return rates on their products.

Firefox’s Super Cookies

Given all the noise that was made about cookies and programs that look for “spy cookies”, the silence about DOM storage is a little surprising. DOM storage allows web sites to store all kinds of information in a persistent manner on your computer, much like cookies but with a greater capacity and efficiency. Another way that web sites store information about you is Adobe’s Flash local storage; this seems to be a highly popular option (e.g., youtube stores statistics about you that way), and it’s better known. Web applications such as pandora.com will even deny you access if you turn it off at the Flash management page. If you’re curious, see the contents in “~/.macromedia/Flash_Player/#SharedObjects/”, but most of it is not human readable.
I wonder why DOM storage isn’t used much after being available for a whole year; I haven’t been able to find any web site or web application making use of it so far, besides a proof of concept for taking notes. Yet, it probably will be (ab)used, given enough time. There is no user interface in Firefox for viewing this information, deleting it, or managing it in a meaningful way. All you can do is turn it on or off by going to the “about:config” URL, typing “storage” in the filter and setting the preference to true or false. Compare this to what you can do about cookies… I’m not suggesting that anyone worry about it, but I think that we should have more control over what is stored and how, and the curious or paranoid should be able to view and audit the contents without needing the tricks below. Flash local storage should also be auditable, but I haven’t found a way to do it easily.

Auditing DOM storage. To find out what information web sites store on your computer using DOM storage (if any), you need to find where your Firefox profile is stored. In Linux, this would be “~/.mozilla/firefox/”. You should find a file named “webappsstore.sqlite”. To view the contents in human readable form, install sqlite3; in Ubuntu you can use Synaptic to search for sqlite3 and get it installed. Then, the command:
echo 'select * from webappsstore;' | sqlite3 webappsstore.sqlite

will print contents such as (warning, there could potentially be a lot of data stored):
cerias.purdue.edu|test|asdfasdf|0|homes.cerias.purdue.edu

Other SQL commands can be used to delete specific entries or change them, or even add new ones. If you are a programmer, you should know better than to trust these values! They are not any more secure than cookies.
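
The same audit can be scripted. This is a minimal sketch using Python's standard sqlite3 module; the function name is mine, and it assumes only what the command above shows, namely a table called “webappsstore” in that file.

```python
import sqlite3

def dump_dom_storage(db_path):
    """Return every row of the webappsstore table as a list of tuples."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT * FROM webappsstore").fetchall()
    finally:
        conn.close()
```

Calling `dump_dom_storage("webappsstore.sqlite")` from inside the profile directory returns the same rows the sqlite3 command prints, one tuple per stored entry. It is safest to run it on a copy of the file while Firefox is closed.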

Speculations on Teaching Secure Programming

I have taught secure programming for several years, and along the way I developed a world view of how teaching it is different from teaching other subject matters. Some of the following are inferences from uncontrolled observations, others are simply opinions or mere speculation. I expose this world view here, hoping that it will generate some discussions and that flaws in it will be corrected.

As in other fields, software security can be studied from several different aspects, such as secure software engineering, secure coding at a technical level, architecture, procurement, configuration and deployment. As in other fields, effective software security teaching depends on the audience: its needs, its current state and capabilities, and its potential for learning. Learning techniques such as repetition are useful, and students can ultimately benefit from organized, abstracted thought on the subject. However, teaching software security is different from teaching other subjects because it is not just teaching facts (data), “how to” (skills) and theories and models (knowledge), but also a mindset and the capability to repeatably derive and achieve a form of wisdom in various, even new situations. It’s not just a question of the technologies used or the degree of technological acumen, but of behavioral psychology, economy, motivation and humor.

Behavioral Psychology — Security is somewhat of a habit, an attitude, a way of thinking and life. You won’t become a secure programmer just because you learned of a new vulnerability, exploit or security trick today, although it may help and have a cumulative effect. Attacking requires opportunistic, lateral, experimental thinking with exciting rewards upon success. It somewhat resembles the capability to create humor by taking something out of the context for which it was created and subjecting it to new, unexpected conditions. I am also surprised sometimes by the amount of perseverance and dedication attackers demonstrate. Defending requires vigilance and a systematic, careful, most often tedious labor and thought, which are rewarded slowly by “uptime” or long-term peace. They are different, yet understanding one is a great advantage to the other. To excel at both simultaneously is difficult, requires practice and is probably not achievable by everyone. I note that undergraduate computer science rewards passing tests, including sometimes provided software tests for assignments, which are closer to immediate rewards upon success or immediate failure, with no long-term consequences or requirements. On top of that, assignments are most often evaluated solely on achieving functionality, and not on preventing unintended side-effects or not allowing other things to happen. I suspect that this produces graduates with learned behaviors unfavorable to security. The problem with behaviors is that you may know better than what you’re doing, but you do it anyways. Economy may provide some limited justification.

Economy — Many people know that doing things securely is “better”, and that they ought to, but it costs. People are “naturally optimizing” (lazy): they won’t do something if there’s no perceived need for it, or if they can delay paying the costs or ultimately pay only the necessary ones (“late security” as in “late binding”). This is where patches stand; vulnerability disclosures and patches are remotely possible costs to be weighed against the perceived opportunity costs of delays and additional production expenses. Isolated occurrences of exploits and vulnerability disclosures may be dismissed as bad luck, accidents or something that happens to other projects. An intense scrutiny of some works may be necessary to demonstrate to a product’s team that their software engineering methods and security results are flawed. There is plenty of evidence that these attempts at evading costs don’t work well and often backfire.
Even if change is desired, students can graduate with negligible knowledge of the best practices presented in the SOAR on Software Security Assurance 2007. Computer science programs are strained by the large amount of knowledge that needs to be taught; perhaps software engineering should be spun off, just like electrical engineering was spun off from physics. Companies that need software engineers, and ultimately our economy, would be better served by that than by getting students that were just told to “go and create a program that does this and that”. While I was revising these thoughts, “Crosstalk” published some opinions on the use of Java for teaching computer science, but the title laments “where are the software engineers of tomorrow?” I think that there is just not enough teaching time to educate people to become both good computer scientists and software engineers, and the result is something that satisfies the need for neither. Even if new departments aren’t created, two different degrees should probably be offered.

Motivation — For many, trying to teach software security will be in one ear, out the other unless consequences are demonstrated. Most people need to be shown the exploits that a flaw enables, to believe that it is a serious flaw. This resembles how a kid may ignore warnings about burns and hot things until a burn is experienced. Even as teenagers and adults, every summer some people have to re-learn how sunscreen is needed, and the possibility of skin cancer is too remote a consideration for others. So, security teaching needs to contain a lot of anecdotes and examples of bad things that happened. I like to show real code in class and analyze the mistakes that were made; that approach seems to get the interest of undergraduates. At a later stage, this will evolve from “security prevents bad things” to “with security you can do this safely”. Actualizing secure programming can make it even more interesting and exciting, by discussing current events in class.

Repetition — Repeated experiences reinforce learning. Security-focused code scanners repeat and reinforce good coding practice, as long as the warnings are not allowed to be ignored. Code audits reinforce the message, this time coming from peers, and so result in peer pressure and the risk of shame. They are great in a company, but I am ambivalent about using code audits by other students, due to the risk of humiliation — humiliation is not appropriate while learning, for many reasons. Also, the students doing the audit may not be competent yet, by definition, and I’m not sure how I would grade the activity. Code audits by the teacher do not scale well. This leaves scanners. I have been looking into it and I tried some commercial code scanners, but what I’ve seen are systems that are unmanageable for classroom use and don’t catch some of the flaws I wish they would.

Organization and abstraction — Whereas showing exploits and attacks is good for the beginner, more advanced students will want to move away from black lists of things not to do (e.g., “Deadly Sins”) to good practices, assurance, and formal methods. I made a presentation on the subject almost two years ago.

In conclusion, teaching secure programming differs from typical subject matters because of how the knowledge is utilized; it needs to change behaviors and attitudes; and it benefits from different tools and activities. It is interesting in how it connects with morality. Whereas these characteristics aren’t unique in the entire body of human knowledge, they present interesting challenges.

Confusion of Separation of Privilege and Least Privilege

Least privilege is the idea of giving a subject or process only the privileges it needs to complete a task. Compartmentalization is a technique to separate code into parts on which least privilege can be applied, so that if one part is compromised, the attacker does not gain full access. Why does this get confused all the time with separation of privilege? Separation of privilege is breaking up a *single* privilege amongst multiple, independent components or people, so that multiple agreement or collusion is necessary to perform an action (e.g., dual signature checks). So, if an authentication system has various biometric components, a component that evaluates a token, and another component that evaluates some knowledge or capability, and all have to agree for authentication to occur, then that is separation of privilege. It is essentially an “AND” logical operation; in its simplest form, a system would check several conditions before granting approval for an operation. Bishop uses the example of “su” or “sudo”; a user (or attacker of a compromised process) needs to know the appropriate password, and the user needs to be in a special group. A related but not identical concept is that of majority voting systems. Redundant systems have to agree, hopefully outvoting a defective system. If there were no voting, i.e., if all of the systems always had to agree, it would be separation of privilege. OpenSSH’s UsePrivilegeSeparation option is *not* an implementation of separation of privilege by that definition; it simply runs compartmentalized code, applying least privilege to each compartment.
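
The difference between the “AND” of separation of privilege and a majority vote fits in a few lines. This is only an illustrative sketch with hypothetical function names; each element of `approvals` is the True/False verdict of one independent component.

```python
def separation_of_privilege(approvals):
    """Logical AND: every independent component must agree."""
    return len(approvals) > 0 and all(approvals)

def majority_vote(approvals):
    """Redundant systems: more than half must agree, so one dissenter is outvoted."""
    return sum(approvals) > len(approvals) / 2

def may_su(knows_password, in_special_group):
    """The su/sudo example: both conditions are required."""
    return separation_of_privilege([knows_password, in_special_group])
```

With verdicts of True, True and False, the majority vote grants the operation while separation of privilege refuses it.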

ReAssure Version 1.01 Released

As the saying goes, version 1.0 always has bugs, and ReAssure was no exception. Version 1.01 is a bug-fix release for broken links and the like; there were no security issues. Download the source code in Ruby here, or try it there. ReAssure is the virtualization (VMware and UML) experimental testbed built for containment and networking security experiments. There are two computers for creating and updating images, and of course you can use VMware appliances. The other 19 computers are hooked to a Gbit switch configured on-the-fly according to the network topology you specified, with images being transferred, set up and started automatically. Remote access is through ssh for the host OS, and through NX (think VNC) or the VMware console for the guest OS.

Looking for Trustworthy Alternatives to Adobe PDFs

There was a day when PDFs were the safe, portable alternative to Microsoft Word documents. There was no chance of macro-virus infections, and emails to Spaf with PDFs didn’t bounce back as they did if you sent him a Word document. It has since become clear that PDFs adopted mixed loyalties by locking features down and phoning home. Embedded content caused security issues in PDF viewers (CVE-2007-0047, CVE-2007-0046, CVE-2007-0045, CVE-2005-1306, CVE-2004-1598, CVE-2004-0194, CVE-2003-0434) including a virus using JavaScript as a distribution vector (CVE-2003-0284). Can you call a document viewer safe when it stands in the company of Skype, Mozilla Firefox, Thunderbird, Netscape Navigator, Microsoft Outlook, and Microsoft Outlook Express [1] with a CVSS score above 9 (CVE-2007-5020)? How about PDFs that can dynamically retrieve Yahoo ads over the internet [2], when Yahoo has recently been tricked into distributing trojans in advertisements [3]? Fully functional PDF viewers are now about as safe and loyal (under your control) as your web browser with full scripting enabled. That may be good enough for some people, but clearly falls short for risk-averse industries. It is not enough to fix vulnerabilities quickly; people saying that there’s no bug-free software are also missing the point. The point is that it is desirable to have a conservative but functional enough document viewer that does not have a bullseye painted on it by attempting to do too much and be everything to everyone. This can be stated succinctly as “avoid unnecessary complexity” and “be loyal to the computer owner”.

Whereas it might be possible to use a PDF viewer with limited functionality that does not support these attack vectors, the format has become tainted: in the future more and more people will require you to be able to read their flashy PDF, just as some webmasters now deny you access if you don’t have JavaScript enabled. Adobe has patents on PDF and is intent on keeping control and conformance to specifications; Apple’s MacOS X PDF viewer (“Preview”) initially allowed printing of secured PDFs to unsecured PDFs [4]. That was quickly fixed, for obvious reasons. This is as it should be, but it highlights that you are not free to make just any application that manipulates PDFs.

Last year Adobe forced Microsoft to pull PDF creation support from Office 2007 under the threat of a lawsuit while asking them to “charge more” for Office [5]. What stops Adobe from interfering with OpenOffice? In January 2007 Adobe released the full PDF (Portable Document Format) specification in order to make PDF an ISO standard [6]. People believe: “Anyone may create applications that read and write PDF files without having to pay royalties to Adobe Systems”, but that’s not quite true. These applications must conform to the specification as decided by Adobe. Applications that are too permissive or somehow irk Adobe could possibly be made illegal, including open source ones, at least in the US. It is unclear how much control Adobe still has (obviously enough for the Yahoo deal) and will still have when and if it becomes an ISO standard. Being an ISO standard does not necessarily make PDFs compatible with free software. If part of the point of free software is to be able to change it so that it is fully loyal to you, then isn’t it a contradiction for free software to implement standards that mandate and enforce mixed loyalties?

Finally, my purchase of the full version of Adobe Acrobat for MacOS X was a usability disaster; you’ll need to apply duress to make me use Acrobat again. I say it’s time to move on to safer ground, from security, legal, and code quality perspectives, ISO standard or not.

How then can we safely transmit and receive documents that are more than plain text? HTML, PostScript, and Rich Text Format (RTF) are alternatives that have fallen into disuse in favor of PDF for various reasons which I will not analyze here. Two alternatives seemed promising: DVI files and Microsoft XPS, but a bit of research shows that they both have significant shortcomings.

TeX (DVI): TeX is a typesetting system, used to produce DVI (device-independent file format) files. TeX is used mostly in academia, by computer scientists, mathematicians or UNIX enthusiasts. There are many TeX editors with various levels of sophistication; for example, OpenOffice can export documents to .tex files, so you can use even a common WYSIWYG word processor. TeX files can be created and managed on Windows [7], MacOS X and Linux. TeX files do not include images but have tags referencing them as separate files; you have to manage them separately. Windows has DVI viewers, such as YAP and DVIWIN.

However, in my tests OpenOffice lost references to embedded images, producing TeX tags containing errors (“[Warning: Image not found]”). The PDF export on the same file worked perfectly. Even if the TeX export worked, you would still have a bunch of files instead of a single document to send. You then need to produce a DVI file in a second step, using some other program.

Even if OpenOffice’s support of DVI was better, there are other problems. I have found many downloadable DVI documents that could not be displayed in Ubuntu, using “evince”; they produced the error “Unable to open document — DVI document has incorrect format”. After installing the “advi” program (which may have installed some fonts as well), some became viewable both using evince and advi. DVI files do not support embedded fonts; if the end user does not have the correct fonts your document will not be displayed properly.

Another issue is that of orphaned images. Images are missing from DVI downloads such as this one; at some point they were available as a separate download, but aren’t anymore. This is a significant shortcoming, which is side-stepped by converting DVI documents to PDF; however, this defeats our purpose.
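
Because the images live outside the document, it is easy to check for orphans before sending or publishing a set of files. The following is a minimal sketch in Python, assuming the source uses LaTeX’s \includegraphics command from the graphicx package; other TeX dialects and exporters may reference images differently. The function names are my own, and the check takes the referenced name literally, so it will report an image that LaTeX would have found by guessing a file extension.

```python
import re
from pathlib import Path

# Matches \includegraphics{name} and \includegraphics[options]{name}.
# Assumption: images are included via LaTeX's graphicx package.
INCLUDE_RE = re.compile(r"\\includegraphics\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}")


def referenced_images(tex_source: str) -> list[str]:
    """Return the image names referenced in a LaTeX source string, in order."""
    return [name.strip() for name in INCLUDE_RE.findall(tex_source)]


def missing_images(tex_source: str, base_dir: Path) -> list[str]:
    """Return referenced images that do not exist as files under base_dir."""
    return [
        name
        for name in referenced_images(tex_source)
        if not (base_dir / name).is_file()
    ]


if __name__ == "__main__":
    source = r"""
    \documentclass{article}
    \usepackage{graphicx}
    \begin{document}
    \includegraphics[width=5cm]{figure1.png}
    \includegraphics{plots/figure2.eps}
    \end{document}
    """
    # Checks the current directory; every name printed is an orphan candidate.
    for name in missing_images(source, Path(".")):
        print("missing:", name)
```

A check like this only detects the problem; it does not remove the need to ship the images alongside the document.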

Microsoft XPS: XPS (XML Paper Specification) documents embed all the fonts used, so XPS documents will behave more predictably than DVI ones. XPS also has the advantage that

“it is a safe format. Unlike Word documents and PDF files, which can contain macros and JavaScript respectively, XPS files are fixed and do not support any embedded code. The inability to make documents that can literally change their own content makes this a preferable archive format for industries where regulation and compliance is a way of life” [8].

Although XPS is an open specification, there is no support for it yet in Linux. Visiting Microsoft’s XPS web site and clicking on the “get an XPS viewer” link results in the message “This OS is not supported”.

It seems, however, that Microsoft may be just as intent on keeping control of XPS as Adobe is of PDF; the “community promise for XPS” contains an implicit threat should your software not comply “with all of the required parts of the mandatory provisions of the XPS Document Format” [9]. These attached strings negate some advantages that XPS might have had over PDFs.

XPS must become supported on alternative operating systems such as Linux and the BSDs for it to become competitive. This may not happen simply because Microsoft is actively antagonizing Linux and open source developers with vague and threatening patent claims, as well as people interested in open standards with shady lobbying moves and “voting operations” [10] at standards organizations (Microsoft: you need public support and goodwill for XPS to “win” this one). The advantages of XPS may also not be evident to users comfortable in a world of TeX, PostScript, and no-charge PDF tools. The confusion about open formats vs open standards, and about exactly how much control Adobe still has and will still have when and if PDF becomes an ISO standard, does not help. Companies offering XPS products are also limiting their possibilities by not offering Linux versions, at least of the viewers, even without support.

In conclusion, PDF viewers have become risky examples of mixed loyalty software. It is my personal opinion that risk-averse industries and free software enthusiasts should steer clear of the PDF standard, but there are currently no practical replacements. XPS faces extreme adoption problems, not simply due to the PDF installed base, but also due to the ill will generated by Microsoft’s tactics. I wish that DVI were enhanced with included fonts and images, better portability, and better integration within tools like OpenOffice, and that this became an often requested feature for the OpenOffice folks. I don’t expect DVI handlers to be absolutely perfect (e.g., CVE-2002-0836), but the reduced feature set and absence of certain attack vectors should mean less complexity, fewer risks and greater loyalty to the computer owner.

1. ISS, Multiple vendor products URI handling command execution, October 2007. http://www.iss.net/threats/276.html

2. Robert Daniel, Adobe-Yahoo plan places ads on PDF documents, November 2007. http://www.marketwatch.com/news/story/adobe-yahoo-partner-place-ads/story.aspx?guid=%7B903F1845-0B05-4741-8633-C6D72EE11F9A%7D

3. Bogdan Popa, Yahoo Infects Users’ Computers with Trojans - Using a simple advert distributed by Right Media, September 2007. http://news.softpedia.com/news/Yahoo-Infects-Users-039-Computers-With-Trojans-65202.shtml

4. Kurt Foss, Web site editor illustrates how Mac OS X can circumvent PDF security, March 2002. http://www.planetpdf.com/mainpage.asp?webpageid=1976

5. Nate Mook, Microsoft to Drop PDF Support in Office, June 2006. http://www.betanews.com/article/Microsoft_to_Drop_PDF_Support_in_Office/1149284222

6. Adobe Press release, Adobe to Release PDF for Industry Standardization, January 2007. http://www.adobe.com/aboutadobe/pressroom/pressreleases/200701/012907OpenPDFAIIM.html

7. Eric Schechter, Free TeX software available for Windows computers, November 2007. http://www.math.vanderbilt.edu/~schectex/wincd/list_tex.htm

8. Jonathan Allen, The wide ranging impact of the XML Paper Specification, November 2006. http://www.infoq.com/news/2006/11/XPS-Released

9. Microsoft, Community Promise for XPS, January 2007. http://www.microsoft.com/whdc/xps/xpscommunitypromise.mspx

10. Kim Haverblad, Microsoft buys the Swedish vote on OOXML, August 2007. http://www.os2world.com/content/view/14868/1/