Posts tagged CERIAS

Page Content

Symposium Transcript: Complexity vs. Security—Choosing the Right Curve, Morning Keynote Address

Teaching as an adjunct faculty member, even at George Mason is a labor of love: the pay is not great, but the job definitely has rewards. One of these rewards is to see how bad a job we are doing as an industry in teaching people about security awareness. I teach a course on how to develop secure software. In my opinion, this is a “101” level course. At George Mason, it is listed as 600. The quality of the code at the beginning of each semester is interesting. Some code is very well structured and most of the code is not that bad, but the security awareness is not there. When you teach security awareness to the students, the quality of the code improves. Especially when you give them the techniques on what to watch for, and the methods and the mechanisms that can be used to avoid those challenges. This is not what this talk is about.

This talk is about the end result, and what happens after you do that [give them the tools and techniques], even when we spend this time, a full semester, 15 weeks of concentrated effort. By the way, they write a ton of code, which is one of the challenges about the class—let me just describe this one problem, since this talk is likely to run somewhat shortly anyway:

Right now, I have students writing programs. I give them a specification and they write a program back to me given that specification. And the only way I can think of to grade it is to actually do a code analysis. To actually look at it and do the testing myself. Then I come back and identify the vulnerabilities at the code level, because if I don’t do it at the code level, how could they possibly learn what they messed up ? And that’s “killing me”. grin I spend at least 40 hours every time a program is turned in by the class. And I guarantee I am not hitting every vulnerability. If anybody has a way of structuring these assignments so that I can get the same points across, I am all ears. We will leave this as an aside.

At the end of the class, even with the folks that I consider to be good coders, there are still vulnerabilities that are creeping in. Especially into the more interesting and full featured assignements that we give out.

The students spend the second half of the semester writing something big. This year they had two choices for the final assignment. The first choice was a network enabled game, which had to be peer-to-peer, between two machines, running across the network. They could choose the game. I am always interested when people choose really bad games.

Quick thing: in any game, the idea is that there is an adversary actively trying to compromise the game, and the software has to defend against that. Don’t choose random ! Random is hard in that environment. You want to choose chess. Chess has a defined state: there is no randomness in chess. If people choose games like poker, it is a bad idea: who controls the deck ? There are ways to work around that though.

The second choice of assignment was a remote backup service which, surprisingly enough, about half the class chose to do. I think conceptually they can get their head around it. I thought it would be a harder assignment.

These projects turn out to be non-trivial: there is a lot of code written. I think the most lines of code we saw was on the order of 10,000 lines of code, which isn’t huge by the standards of industry, but certainly is pretty big for a classroom project. Some of the submissions were much smaller. Some people try to abstract away the difficulty by using whatever Microsoft libraries they can find to implement the project functionalities, which caused its own difficulties. Most of these projects, by the way, were the worst from a security perspective. Not because the Microsoft technology wasn’t good; actually Microsoft has done some pretty cool stuff when it comes to security APIs, but the students did not understand what the APIs had to offer, so they called them in ways that were truly insecure. This is another challenge that we get: understand the underlying base technology that we rely on is a major challenge.

So, this talk —back to this talk…

This talk is about what we do after we’ve already done some level of training and started focusing on security, to keep our products more secure over time. Because, frankly, our economy (this is a bad time to say this of course) isn’t a big smoking hole; some people would say it is. The impact of software vulnerabilities definitely impacts on the economy. There is a great book called Geekonomics by David Rice. The major thesis of the book is that the cost of vulnerabilities in code is actually a tax that we all pay. So when you buy, say, Microsoft Office or Adobe Illustrator, whatever tool suite you’re on, all the vulnerabilities and bugs in that are paid by every user and every person that gains service through the use of these tools. We are paying a huge cost. Because all of us are being taxed, we don’t see it. It is a very big number. Every once in a while it becomes specific. I have worked for the banking industry. You have to love these guys, because you do not have to make an ROI argument with them when it comes to cybercrime. They’re either experiencing fraud or they’re not. If they’re experiencing fraud, they know how much. In the case of fraud, the numbers are high enough that, even as a CEO used to seeing big numbers, you would pay attention to them. By the way those banks are still profitable, some of them.

The thing is, if all that fraud is happening, who is paying for that ? Eventually, the cost of fraud has to be passed on to the consumer for the bank to remain profitable. So, again, the point I want to make is that we are all paying this price.

Some of the pre-conceptions I had before preparing this talk turned out to be false.

The main question of this talk is “How are we doing as an industry?” Is software trending towards a better place or a worse one? My position when I started preparing this talk is that we are going towards a worse place. The thought around that is that we are trending towards more complex systems. Why is that happening ? I’ll argue with numbers later; these number show that software is getting at least larger if not more complex.

Question: “How do you sell version 2?” Actually, “How do you sell version 9?” (since nobody has version 2 anymore). So how do you sell the next version of a software? You are not going to sell version (n+1) with less features than version (n) in this market. You have to add something. But most of the time we add something, that means more code. There is also all that engineering work that goes into maintaining that code over time. There is no economic incentive to buy version (n+1) if it does not provide something new besides a new pretty GUI. I guess this did not work too well with Vista. In theory it works. Actually, have you seen the Windows 7 interface ? That is actually pretty cool. I am currently being blown away by how the interface on my Mac increased my productivity, after just a few weeks. I think 7 might give Microsoft a leg up in that area. But I digress.

So, software is getting bigger because the vendors who produce the software for us, by and large, need to add features to the products to sell it to us. That’s one reason. A couple of other things though:

  • If we are getting bigger, are we getting more complex. And if we are getting more complex, does that actually affect the security question? That was the underlying thought behind this talk.

To the left, this is the only code I can write and guarantee it comes out bug-free. [points to “Hello, World”, with character encoding issues]. There is a joke there…

But that’s not like average code. Average code looks more like the code on the right [points to cryptic code: deeply nested, and dense]. By the way, in fairness, I “borrowed” from the recent winners of the C obfuscation project. But I have been handed code that is at least that ugly. When you start trying to understand that code, you are going to find out that it is nearly impossible to do so. It really matters how much you focus on the readability and legibility of your code. Architecture matters here. But as you get bigger, chances are that even with good controls, you are still going to wind up with increasing complexity.

How many people think they can pick up bugs in the right hand side given an hour to do it? How confident would you be?

I read a lot of source code. By the way, one of the reasons I teach the class is to keep touching source code. Even with the course I teach and all the time I spend doing it, I am not going to find it. Now, there are tricks you can do from a vulnerability perspective. Following the inputs is always a good idea. I do not necessarily have to read every line to find the most tasty vulnerabilities in a piece of code. Nonetheless, that is kind of ugly [obfuscated code on the right] and there is a lot of ugly in the products we use every day. So we have to account for that.

Here is what a few bright guys in the industry have to say about this. [list of a few quotes on the slide including one from Dr. Gene Spafford]

This is a general acknowledgment that complexity is the driving force behind some of the problems that we have. What do you all think ? Is it real, is it fake ? There are pretty outrageous claims being made here. I like the second one: “Every time the number of lines is doubled, the company experiences four times as many security problems.”

Do you buy that ?

If that is the case, pardon the language, but we are screwed. We can’t survive that because the level of changes that we are experiencing in the code base that we are sitting on are following something like Moore’s law. The period at which code bases are doubling is much longer than a year and a half [Moore’s law], but it is occurring. So, if we project at five years, that would be a fairly reasonable time to expect that we will be sitting on something twice as big as what we’re sitting on today, unless it all comes down and crashing and we decide we have to get rid of legacy code.

If we are increasing by four times the rate of security problems, we are in serious trouble. So the questions are:

  1. “Is that true?”
  2. “If it is true, or even partially true, what do we do about it ?”

There is research that tries to correlate complexity and security faults. Surprisingly, this is an underexplored area of research. There was not as much literature in this area as I was expecting,but there is some. Some of it comes from Industry and some of it comes from Academia.

The first one comes from academia and it does find some correlation between complexity and security. One of the interesting things about that paper is that it actually went in and looked at a number of different complexity metrics. A lot of people use source lines of code but these guys actually looked at several different ones. Some perform much better than others at predicting vulnerabilities in the experiment that was run, which is a relatively small experiment.

The next one is an industry line. These guys sell a tool that looks for, in an automated sense, for software vulnerabilities. By the way, I really wish that these tools were much cheaper, because they do work ! They actually are cool products. Do they find every bug? Absolutely not: they have very little concept of the context. So I find vulnerabilities in the code of students who run Fortify and Coverity and Klocwork against their code all the time. The thing is, these tools do find vulnerabilities, so they reduce the total count of vulnerabilities. And everytime we close one of these vulnerabilities, we close a window to the attack community. So I wish these tools were cheaper and therefore more commonly used.

So, these guys [next reference] run their experiment over a fairly large codebase. They pulled a number of open source products and ran their tools against them. They found a very large correlation in their experiments between lines of code and not just the number but the rate of software vulnerabilities that were being discovered.

So there are some folks on the pro- side.

We have some folks on the con side. Very interesting paper out of MIT where they went through multiple versions of OpenBSD and looked at what happened to the vulnerability rates over time. And they showed pretty much the opposite. They showed a trending of vulnerabilities down over time. I have a theory for why this is. This is one of the surprising things as I got into this paper I was trying to figure out what was going on there, because that seems counter to my intuition.

There are some other folks at Microsoft who would say “Look at XP and look at Vista, it’s night and day. We got religion, we got our security development lifecycle going on.” By the way, I believe they deserve a huge amount of credit for doing that. You could say it was a response to a market demand, and that’s probably true. But they did it and they have been a pretty strong force in the industry for pushing this idea of secure development forward. I’m not in any way, shape, or form an anti-Microsoft guy, although I really love my Mac. They’re showing a system (Vista) which is larger but has fewer vulnerabilities.

This is interesting: we have people looking at this problem hard and they are coming up with entirely different results. That got me thinking, as I was preparing this talk, “What is actually going on here?” I think I have a few ideas as to why we are getting these divergences.

One thing is that source lines of code is probably a very poor measure of complexity. If I have a program that never branches and is a thousand lines long, chances are it is going to be pretty easy to analyze. But if I have a program that has a bunch of recursive loops in it, and they are chosen based on random values and interact with threads that have to synchronize themselves over a shared database. Someway, it is going to be more difficult to get your head around that. So there are lots of different ways to measure complexity. Examples include:

  1. Size
  2. Cyclomatic
  3. Halstead metrics
  4. McCabe metrics

I think this is an area that ought to receive more focused attention in research. We ought to spend more time trying to understand what metrics should be applied, and in what context should they be applied to understand when we’re getting past our ability to successfully control quality in a software project. I think this is a very interesting research question. I do not know how you feel about that, if that is something that people should chew on. I spent a lot of time crawling through IEEE and ACM in preparation for this talk. Frankly, i was a little bit underwhelmed by the number of papers that were out there on the subject. So, to anybody working on their doctorate, anytime you find a white space, this is a good thing. And this is a very focused research area. It is very easy in one sense to run an experiment. The hard part is actually in understanding the difference between discovering a vulnerability and understanding how many vulnerabilities are likely in a particular population of code.

So, let’s talk a little bit about size. Again, source lines of code are probably not the best measure. And by the way some vendors publish the source lines of code for their products and some don’t. So some of these numbers are better than others; some of them are estimates. Microsoft has started publishing the numbers for Windows. So we have: * 40 million LOC (MLOC) in XP, depending on which version we are talking about * 60 MLOC in Vista, although some people would say higher.

These are pretty large numbers. If you told me my car had 60 MLOC on it, I probably wouldn’t drive it. As an interesting aside, has anyone driven a BMW 7 series? Nice car, right? They had huge quality problems when they were first shipping, mainly the German version, before it came over. Anyone know the story behind this? The cars would crash. I don’t mean they would hit something. The computers in the car would crash. And at that point you could not re-enable them. They were dead; they had to be towed off the road and the computers replaced.

Prof. Gene Spafford: “I had the first series and the windows system onboard, while the car was sitting in the garage, thought it had been in an accident and disabled the car and actually blew the explosive bolt on the battery on the starter cable. So I had the distinction of having the first BMW 7 series in the Midwest towed to the dealership”

Dr. Ron Ritchey: “Don’t you find it ironic that somebody in you position, with your standing in industry, is the guy that has the BMW that blows itself up because of a software fault ?”

Prof. Gene Spafford: “Somehow, it seems very appropriate… to most of my students”

Dr. Ron Ritchey: “I think that’s brilliant.”

Someone in the audience: “Was it actually a security fault ? Wasn’t it actually one of the students who didn’t want you to come to class ?”

Dr. Ron Ritchey: “Well, a failure mode is a failure mode.”

[more joking]

So, how fast are these things growing ? Between versions, quite a bit. Windows, if we go way way back to NT4, some numbers show around 8MLOC; the number I got for this talk was 11, so I went with it. XP, 40MLOC. That’s twice twice. [discusses also Mac OS X, Linux, Internet Explorer]

The point is that a lot of different products are getting a lot more complex. I shouldn’t say that. They are getting larger. But there is strong evidence to suggest that structurally, if you do not focus on it, larger does actually equal more complex. I do need to be a little more careful here. One of the pieces of belief I’m attempting to prove is that the reason why SLOC as a metric are showing that sometimes things get better and sometimes things get worse, is probably because SLOC is not the measure we should be focusing on. Unfortunately, it’s very hard for an outside researcher, unless you work for Microsoft, or you work for Sun, or you work for one of these companies that produces these large code blocks, to have access to the source code in order to run these different metrics and track the performance of these metrics over time. I think that it would help a lot if we could get these metrics published and used better, to correlate these tools.

This is a simple experiment that we ran for the talk. We basically took a number of different versions of Windows and we went into the national vulnerability database NVD, which is a wonderful program run by NIST. They track all vulnerabilities they can get their hands on. They do analysis of each vulnerability, and they put them in this database that is accessible to anybody (to you, to anyone), to do research. This is really useful data. So we pulled the data from NVD.

This slide shows vulnerability discovery over time and each one of the different lines is a different product. [# of vulnerabilities vertical per year, years in horizontal; one curve per product, with the product’s release year as the origin of the curve]. You need to take a second and look a this to realize what is going on. The bottom one where the line jumps up and then kind of bounces at the bottom a little bit, that’s 95. With 95, you saw a little bit of bump when the product was released and then a flat line. What’s going on there? Is Windows 95 a really solid secure product? No. But keep that question in your mind for a little bit. When we start getting to more modern operating systems and look at the curve for XP for instance, it goes up. Notice that it spikes up initially and then it continues to spike over time. In my opinion that is an unusual curve, because that was when Microsoft was starting to develop an appreciation for security being important. If you look at Vista, unfortunately this hasn’t been out that long, we have a huge spike at the beginning. If we look at NT, we see a more classic curve: a lot of spikes in the early lifecycle of the product and then it dwindles down. That dwindle could happen for two reasons. One, nobody cares about it because they are not running it, so vulnerability researchers go away, which I think is a factor to keep in mind. But I also think that the code base that these products are sitting on is being mined out.

For the vulnerabilities that are being discovered, there is this concept of foundational code versus new code that I’ll get into. It’s really interesting that in the first two years we are seeing this spike in vulnerabilities. The count at two years for Vista was around 75 vulnerabilities. In fairness, I did not distinguish between root-level compromises and more minor issues. This is just raw numbers. You might get a much different analysis if the only thing you cared about was root level compromise. This is not an argument that there aren’t techniques to reduce your risk. Fortunately, there are. This is an argument about our ability to control the number of faults that we write into our software, that result in vulnerabilities over time.

[graph showing the cumulative number of vulnerabilities (vertical axis) for Windows products over time (horizontal axis)]

This is an interesting graph: it represents the cumulative number of vulnerabilities over time. The fact that you have numbers going up left to right in this graph isn’t really that scary. But what you want to see is the rate of change over time [the rate of discovery of vulnerabilities] decreasing. You want to see these curves flattening as they reach the right. That’s a good thing. When you see stuff that is accelerating forward, when the curve of your graph is going positive, that’s not a good thing. What that says is that you are not managing complexity well, you are not managing the security well. Vista hasn’t been long enough to figure out which way it is going to curve. XP had a pretty bad history, as did 2K, though you can see it starting to flatten out. And NT is following the curve that we would like to see: it is starting to flatten out as we go to the right.

The last thing that we want to show is cumulative vulnerabilities versus complexity. So this is the slide that attempts to answer the question are we getting worse or better. And, again, because I do not have better data, I have to use source lines of code. As we get to more source lines of code, what is happening to the rate of vulnerabilities that we’re experiencing? They are pretty clearly going up, and the rate at which they are going up is not linear in many cases. There is a curve occurring here. Whether when we got to 120 MLOC it would flatten out or not, we don’t know. What this data tends to suggest is that as complexity goes up, the total count of security vulnerabilities goes up at a rate greater than the linear extrapolation of the number of source lines of code.

I am not going to try to make a proof out of this; it’s a thought exercise. The thought is “Hey, this might be true.” The sum of the evidence seems to suggest that’s true.

So the result suggests that complexity does matter. And actually it contradicts some of the data that has been coming out by industry saying that “We’ve got religion, SDL [software development lifecycle] is working”, which, by the way I think is probably a true statement, but it’s not true enough to completely eliminate this problem. Interestingly enough, why are some of the numbers different ? Let me get into that.

There is this notion of foundational code. Foundational code is the code that is shipped in the initial release or at least exists in a particular version of a product. As that product moves forward and more things are added to it, that new code, those patches, the things that we change, those aren’t in the foundational code. Those are in the new areas. We need to be careful figuring out what is going on when we have foundational code versus new code. We can have two or three different scenarios.

In the first scenario, you have foundational code, say 10 MLOC. Version (n+1) comes out and you have 20 MLOC, and the new code is just a superset of the old code [diagram with two concentric circular areas]. In other words, all of the foundational code still exists in the product.

The other extreme you could have is when new code merely replaces foundational code, which is no longer being used in the product [diagram with a small and a large circle that barly overlap, small being the foundational code]. The first case is much more likely to happen, because once you have a library written … I don’t know if you maintain your own personal libraries of code; I probably have code that I wrote when I was a senior in high school, because it just works. So foundational code tends to be sticky. There are probably lines of code running in Windows 7 that Gates wrote back in his graduate days.

The interesting thing with foundational code is that when we find problems in it we fix them. Sometimes when we fix them we break other things, but over time the quality of foundational code improves. It has to: we’re not that bad. In fact, once we find a problem, if it’s a class of problems, we often go looking for other examples of that same class of problems in our code. So one fault in one section of the code can actually make larger chunks of the code better. So this notion of foundational code is somehing that we need to keep in our heads. As we go out we figure, “Hey, I’m gonna ship version (n+1)”, I might want to ask myself: “Did I only change my code at the margins that make the user experience a little bit better? But most of the code that I’m shipping is actually code that has really gotten through the rigors and the experience of a lot of peer review, a lot of time to stress the code” Or, “did I throw out everything else and start from scratch because maybe my foundational code was so bad that I needed to do that?” Maybe there are other marketing reasons to do that.

But I have to have a notion of how much of the product I am shipping is foundational code versus new, because that is going to affect the vulnerability rate that it is going to experience over time.

Another issue that we need to worry about, say that we are about to ship version (n+1), how many vulnerabilities are in it? 2, 12, 2,000? What you have to worry about is how many people care, how many people are looking. If you have a non-popular product, a niche product, the number of people that will focus on it is probably directly correlated to the number of people that really don’t like you. What I mean by that is that people need to have a reason to target the smaller set of groups that are using that tool.

An unpopular product is not going to experience the same level of scrutiny.

Now, if you are Internet Explorer, there will be a lot of folks focused on you. And this skews our data horribly. If you are trying to compare the fault rate between IE and Opera, that’s a difficult thing to do because, while people care about the security of Opera, people in the attack community are business people. [the day before, the “fireside” conversation touched on the “professionalization of the attack community”]. They are business people that like not to follow the rules. (Of course, some of these people live in countries where the rules are written differently.) They invest in what they can get a return on; they are very good at this. And, if you are going to choose where you are going to invest your money, as an attacker, what do you choose? Do you choose IE or do you choose Opera? Well, up and until the point that Opera has a market share that is similar to IE, you are going to choose IE, because you get a bigger bang for your buck. So, this is something that we need to keep in mind.

Another thing that we need to worry about is that having the vulnerability reporting, itself, turns out to be really difficult to get right. One problem that we have now, is that the vulnerabilities are not being published. There are commercial organizations and a black market that are perfectly happy to pay you to release your vulnerability only to them, and it does not get into tools like the NVD [national vulnerability database]. This is one of the challenges that we have. A lot of the really good vulnerabilities that are being discovered are being sat on. So, we are not learning about them at the same rate that we used to. It used to be that tools like metasploit would be up to date very quickly with new vulnerabilities, but it’s slowing. It’s slowing because you can hand that vulnerability over to the black market and get 10.000 dollars, sometimes more depending on what product is being targeted and how “cool” the attack is —cool being an operational term.

There are other problems: it’s very hard sometimes to [identify] two different bug reports, two different fault reports, that actually are the same vulnerability but they were expressed in differet ways. So, the data that we collect, itself, can be somewhat challenged. Also, we don’t always know when these things are truly closed up. Sometimes the fix dates are published because, when you get to “Patch Tuesday” and your box reboots, they might have told you about all of the vulnerabilities they just patched; they might not. The same thing happens with all of the folks. That’s an operational security decision. So, again, not all of the data is getting to us, so you have to take some of this with a grain of salt. But, again, I am making an argument about the large numbers; not trying to make sure that every single fact that I have is 100% correct. It;s more of a trend that I think demonstrates the concept.

There are some industries that are much more careful about their development process than others. And when you look at the fault rates, their fault rates are actually much lower than our experience by industry as a whole. In this case I am refering to safety critical [faults], and aerospace is a good example of this. Now, there was an incident with one of fly-by-wire airplanes that recently crashed in London. [Prof. Gene Spafford helped clarify that it was an Airbus 340. It is actually a Boeing 777, flight BA038. So, there have been accidents caused by software faults in the aviation industry. But, by and large, their fault rate is pretty darn low.

In fact, their is a project that my academic advisor was involved in, which was TCAS II. TCAS II is a system where, if two airplanes are on a collision path, there are transponders in both aircrafts that negociate with each other and give the airplane directions to deconflict themselves. So, the TCAS alert will go off and say “This airplane, dive, go left”, and the other will say “Climb, go right”. The thing is, you have to make sure that both instructions are right. There has been one accident, where the TCAS system went off and announced directions, and air-traffic control said “No, the TCAS box is wrong, do what I said”. The pilot chose to follow air-traffic control and, indeed, collided. It was a pretty bad accident. The TCAS system was actually correct; it came with the right decision. There has never been a case where TCAS has given the wrong answer in an operational setting. So, the point is that if we focus on this, and train for it, and actually run complexity metrics against our code and measure it today, we can get better. Unfortunately, it is costly: there is a price to be paid. And can we, as a society, afford to pay that cost for every product that we rely on. The answer is probably no. But, I get back to the point I made earlier: are we truly calculating the cost of faults in the products that we rely upon today ?

There are lots of different reasons why managing complexity is hard. I’ve just listed a few of the examples. [slide would help here].

You can’t manage complexity in an ad-hoc development environment. Just throwing code together without any idea of how you are going to collaborate, how you are going to compare, how you are going to define interfaces, how you are going to architect your code, is … madness. It does not work. That’s another thing I learned from my class. Most of the assignments are written as individuals, and the quality of that code is directly tied to the quality of that individual developer. As soon as they get into project teams, things get a lot more interesting, because the faults start showing up at the integration layer, as opposed to the unit level. So, when we bring development shops together, especially large development shops, if you do not have a lot of good process behind your development, you are not going to manage complexity well.

Here is the point that I wanted to get to: if you really focus on security and in general software faults management, you can actually write code that is of higher complexity, without necessarily injecting a lot of security faults into it. But the problem is, you have to know how good you are before you do that. If you have really good development processes, if you have smart developers that thave been trained, they’re security-aware, they’re quality-aware, they’re writing code. [the slide presents curves with different slopes and asymptotic behaviors] You might be able to have —the bottom axis here is complexity, driving up in complexity— [the vertical axis is the number of faults] …

The bottom line is the mystical beast: you write larger code and actually get higher quality. [the curve decreases from left to right]. I would say that if you could be on these first two lines [constant curve and slow linear increase], you are doing very very well. The idea here is that you want to measure the complexity of your code and you want to measure the fault rate that you are experiencing once you ship the product. This is going to take years to get good numbers for any particular organization. But once you know where you are at, from a maturity standpoint, and once you have decided what you are willing to pay for in terms of your development processes, you can place yourself somewhere on this graph. Based on that you are going to have to make a decision of how far along on that complexity curve you are willing to tolerate [faults increase with the complexity], and your customers are willing to tolerate from a vulnerability perspective. If you are that top line [at least quadratic growth] or worse, you don’t want to be doubling the complexity of your code in the next five years, because youre vulnerabilities number is not going to double but to quadruple. You are going to be that “four times” number that we saw a couple of slides ago. If you’re instead on one of these slow-sloped lines, maybe you can double the complexity of your code and still successfully manage the vulnerabilities that you’re exposing. But unless you measure that, unless you’ve looked at it, it’s all ad-hoc. It is faith versus true.

So I am making, I hope, a plea here. As you go out, as you start looking and working on real world software projects that are going to translate into code that large numbers of people run, that you know where you are on this graph.

So, conclusions:

One thing is that the initial code release, the foundational code, that’s where most of the bugs are introduced. The lesson here is, as you’re releasing large chunks of new, that’s when you need to focus really tightly on this question. When you’re just doing the maintenance-side of things, that’s a little easier. But that initial chunk, all the numbers show —the research is in fact pretty consistent in this area— that large chunks of foundational code is usually where most of the action is from a vulnerability perspective.

Complexity does impact vulnerability. I strongly believe that. I don’t believe that source lines of code (SLOC) are a way to go. I believe that we need better complexity measures that correlate better to security faults. I do believe complexity drives security.

This is an important one: foundational code is increasing in size rather rapidly. So, again, not Moore’s law, but about every five years it appears to be doubling, at least with the large products that we are using every day. And our ability to prevent vulnerabilities from increasing at the same rate is directly related to our ability to put ourselves in the right curve [referring to the graph on previous slide] and then potentially make some hard choices. Because, I have two large options.

If I discover that I am on one of these high-slope curves, one approach is that I can invest more heavily in my software development processes. I could invest more heavily in training my developers. I could invest more heavily in measuring the complexity of code as I am doing it, and maintaining it over time. I could walk closer to the kind of standards that are used in the safety critical world. So I could put myself on a slower slope, and I could say “OK, my customers need more complexity because that’s what is going to sell. But to do that I am going to need to invest more heavily in my software development process.”

The other option is: I have to choose to release products that have the same complexity as version (n). And really what that means is that I am going to choose to give up some functionnality that i would normally like to release. And that’s a very hard decision to make. But, as consumers, we have some ability to control that as well. By the way, one of the things that the software industry has been doing over the last year and a half to two years is moving away from charging you for software. Software as a service: big move. You are not being charged for software; you are being charged for the use of the software. I actually think this is going to be a good thing in the long term for security. Why ? Because there is a profit motive for maintaining that software and keeping it up and reliable over time.

Same token: anybody bought Oracle or any of these large products recently ? They’ll add: you want some new Oracle feature ? Fine, they are almost not going to charge you for it, but they are going to charge you for the maintenance. 25 percent a year. That might be annoying to you but it is changing the economics behind software development pretty dramatically. I think it’s in the right way: a way that starts shifting the focus to what the customers actually need as opposed to adding features just so that you will buy the next version. (somewhat of an aside).

The last option which, I’m afraid, is the default one, is that you don’t care. You go ahead and you release the product that has the increased complexity. You probably don’t know what part of the curve you’re on. And because of that, we are moving towards a world where the complexity [vulnerability?] rates are increasing as opposed to decreasing. Frankly I think that some of the reason the curve of the vulnerability rate flattens out is because there are only so many vulnerabilities that you need as an attacker to get where you want to be. So there are a numbers of reasons why that curve might flatten, which are beyond bugs simply disappearing. So with that, I would open up to thoughts and questions.


Symposium Summary: Complexity vs. Security—Choosing the Right Curve, Morning Keynote Address

A keynote summary by Gaspar Modelo-Howard.

Dr. Ronald W. Ritchey, Booz, Allen and Hamilton

Ronald Ritchey is a principal at Booz Allen Hamilton, a strategy and technology consulting firm, and chief scientist for IATAC, a DoD initiative to provide authoritative cyber assurance information to the defense community. He spoke about software complexity and its relation to the number of vulnerabilities found in software.

Ritchey opened the talk sharing his experience as a lecturer for a secure software development course he gives at George Mason University. The objective of the course is to allow students to understand why emphasis on secure programming is so important. Using the course dynamics, he provided several examples on why secure programming is not easy to achieve: much of the code analysis to grade his course projects includes manual evaluation which makes the whole process long, even students with good development skills usually have vulnerabilities in their code, and some students insert vulnerabilities by calling secure-sounded libraries in insecure ways. All these examples allowed Ritchey to formulate the following question: How hard can it be to write good/secure software?

Ritchey then moved on to discuss software complexity. He presented the following statement: software products tend toward increasing complexity over time. The reason is that to sell the next version of a program, market is expecting to receive more features, compared to previous version. To add more features, more code is needed. Software is getting bigger, therefore more complex. So in light of this scenario: Does complexity correlate to software faults? Can we manage complexity for large development projects? And, should development teams explicitly limit complexity to what they have demonstrated they can manage?

Several security experts suggest that complexity increases security problems in software. Quoting Dan Geer, “Complexity is the enemy”. But Ritchey mentioned that researchers are divided on the subject. Some agree that complexity is a source of vulnerabilities in code. The latest Scan Open Source Report[1] found strong linear correlation between source lines of code (SLOC) and number of faults, after the analysis of 55 million SLOC from 250 open source projects. Shin & Williams[2] suggest that vulnerable code is more complex than faulty code after analyzing the Mozilla JavaScript engine.

Some researchers suggest there is no clear correlation. Ozment and Schechter[3] found no correlation after analysis of the OpenBSD operating system which is known for its developers’ focus on security. Also, Michael Howard of Microsoft Corp. pointed out that even though Windows Vista’s SLOC is higher than XP, Vista is experiencing a 50% reduction in its vulnerability count and this is attributed to their secure development practices.

Regardless of the relationship between complexity and security, Ritchey mentioned it is likely that SLOC is a weak metric for complexity and suggested potential replacements in terms of code structure (cyclomatic complexity, depth of inheritance), computational requirements (space, time), and code architecture (number of methods per class, lack of cohesion of methods).

Looking at different popular programs, it is clear that all are becoming larger as new versions are released. MacOS X v10.4 included 86M SLOC and Ubuntu Linux has 121M. Browser applications also follow this trend, with Internet Explorer v6 included 7M SLOC and Firefox v3 has 5M. A considerable percentage of these products doubled their sizes between versions: Windows NT4 has more than 11M SLOC and its later version XP has 40M, Debian v3 has 104M and v4 jumped to 283M.

In light of the different opinions and studies presented, Ritchey analyzed the Microsoft Windows operating system by counting the vulnerabilities listed on the National Vulnerabilities Database[4] for different versions of this popular system. No distinction was made between the root level compromise and other levels. From the results presented, a large number of vulnerabilities were found after the initial release of the different Windows versions. Such trend represents the initial interest shown by researchers to find vulnerabilities who later moved to newer versions or different products. Ritchey also commented on the impact of the foundational (initial release) code, which seems to have a higher vulnerability rate than later added code from updates. From the cumulative vulnerability count vs. complexity (SLOC) graph shown, lines go up so it might be true that complexity impacts security. He alerted though on need to be careful on how to judge these numbers since factors such as quantity and quality of resources available to development team, popularity of software, and operational and economic incentives might impact these numbers.

Throughout his talk, Ritchey emphasized that managing complexity is difficult. It requires a conscious cultural paradigm shift from the software development team to avoid and remove faults that lead to security vulnerabilities. And as a key point from the talk, a development team should know at a minimum how much complexity can be handled.

Ritchey then concluded that complexity does impact security and the complexity found in code is increasing, at a plausible rate of 2x every 5 to 8 years. The foundational code usually contributes to the majority of vulnerabilities reported. The ability to prevent vulnerability rates from increasing is tied to the ability to either limit the complexity or improve how we handle it. The speaker (calls himself an optimist and) believes that shift from software as a product to software as a service is good for security since it will promote sound software maintenance and move industry away from adding features just to sell new versions.


  1. Coverity, Inc. Scan Open Source Report 2008. Available at
  2. Shin, Y. and Williams, L.: Is complexity really the enemy of software security? In: 4th ACM workshop on Quality of protection, pp. 47—50. ACM, New York, NY, USA.
  3. Ozment, A. and Schechter, S.: Milk or Wine: Does Software Security Improve with Age? In: 15th USENIX Security Symposium, pp. 93—104. Usenix, Berkeley, CA, USA.
  4. National Institute of Standards and Technology. National Vulnerability Database. Available at

Symposium Summary: Unsecured Economies Panel

A panel summary by Kripa Shankar.

Panel Members:

  • Karthik Kannan, Krannert School of Management, Purdue University
  • Jackie Rees, Krannert School of Management, Purdue University
  • Dmitri Alperovitch, McAfee
  • Paul Doyle, ProofSpace
  • Kevin Morgan, Arxan Technologies

Adding a new dimension to the CERIAS 10th Annual Security Symposium, five of the panelists with varied background came together on the final day to share their work and experiences on “Unsecured Economies: Protecting Vital IP”.

Setting the platform for this discussion was this report. “Together with McAfee, an international team of data protection and intellectual property experts undertook extensive research and surveyed more than 1,000 senior IT decision makers in the US, UK, Japan, China, India, Brazil and the Middle East regarding how they currently protect their companies digital data assets and intellectual property. A distributed network of unsecured economies has emerged with the globalization of many organizations, leaving informational assets even more at risk to theft and misuse. This report investigates the cybercrime risks in various global economies, and the need for organizations to take a more holistic approach to vulnerability management and risk mitigation in this ever-evolving global business climate.”

Karthik Kannan, Assistant Professor of Management Information Systems, CERIAS, Krannert School of Management, Purdue University was the first to start the proceedings. He gave a brief overview of the above report, which was the product of the collaborative research done by him, Dr. Jackie Rees and Prof. Eugene Spafford as well. The motivation behind this work, was that more and more information was becoming digital and traditional geographic boundaries were blurring. Information was being outsourced to faraway lands and as a result protecting leaks was becoming harder and harder. Kannan, put forth questions like: “How do perceptions and practices vary across economies and cultures?”, and sighted an example from India where salary was not personal information, and was shared and discussed informally. To get answers for more such questions, a survey was devised. This survey was targeted at senior IT decision makers, Chief Information Officers and directors of various firms across the globe. US, UK, Germany, Brazil, China and India were among the countries chosen, giving the survey the cultural diversity element that it needed. Adding more value to the survey was the variety of sectors: Defense, Retail, Product Development, Manufacturing and Financial Services. According to results of the survey, a majority of the intellectual property (47%) originates from North America and Western Europe, and on an average firms lost $4.6 million worth of IP last year. Kannan went on to explain how security was being perceived in developing countries, and also discussed how respondents reacted to security investment during the downturn. Statistics like: 42% of the respondents saying laid-off employees are the biggest threat caused by the economic downturn, showed that insider threats were on the rise. The study put forth many case studies to show that data thefts from insiders tend to have greater financial impact given the high level of data access, and an even greater financial risk to corporations.

Jackie Rees, also an Assistant Professor of Management Information Systems, CERIAS, Krannert School of Management, Purdue University took it up from where Kannan had left and brought to light some of the stories that did not go into the report. Rees explained the reasons behind the various sectors storing information outside the home country. While Finance sector viewed it as being safer to store data elsewhere; the IT , Product Development and Manufacturing sectors found it to be more efficient for the supply chain; and the Retail and Defense sector felt better expertise was available elsewhere. Looking at the perspective on the amount that these sectors were spending on security, 67% of the Finance industry said it was “just right”, while “30%” of Retail felt it was “too little”. The other results seemed varied but consistent with our intuitions, however all sectors seemed to agree that the major threat to deal with was “its own employees”. The worst impact of a breach was on the reputation of the organization. Moving on to the global scene where geopolitical perceptions have become a reality in information security policies, Rees shared that certain countries are emerging as clear sources of threats to sensitive data. She added that Pakistan is seen as big threat by most industries according to respondents while China and Russia are in the mix. Poor law enforcement, corruption and lack of cooperation in these economies were sighted as a few reasons for them to emerge as threats.

Dmitri Alperovitch, Vice President of Threat Research, McAfee Corporation began by expressing his concern over the fact that Cybercrime is one of the headwinds hitting our economy. He pointed out that the economic downturn has resulted in less spending on security, and as a result increased vulnerabilities and laid of employees were now the serious threats. Elucidating, he added that most of the vulnerabilities are used by insiders who not only know what is valuable, but also know how to get it. Looking back at the days when a worm such as Melissa that was named after the attacker’s favorite stripper seems to be having a much lesser malicious intent that those of today, where virtually all threats now are financially motivated and more to do with money laundering. Sighting examples, Alperovitch told us stories of an organization in Turkey that was recently caught for credit and identity theft, of members of law enforcement being kidnapped, and of how Al-Qaeda and other terrorist groups were using such tools to finance terrorist groups and activities. Alperovitch vehemently stressed on the problem that this threat model was not understood by the industry and hence the industry is not well protected.

Paul Doyle, Founder Chairman & CEO, Proofspace began by thanking CERIAS and congratulating the researchers at McAfee for their contributions. Adding a new perspective of thinking to the discussion, Doyle proposed that there has not been enough control over the data. Data moves over supply chain, but “Control” does not move. Referring to yesterday’s discussion on cloud computing, where it was pointed out that availability is a freebie, Doyle said the big challenge here was that of handling integrity of data. Stressing on the point he added that integrity of data is the least common divisor, and that it was the least understood area in security as well. How do we understand when a change has occurred? In the legal industry, we have a threat factor in the form of a cross-examining attorney. What gives us certainty in other industries? We have not architected our systems to handle the legal threat vector. Systems lack the controls and audit ability needed for provenance and ensured integrity. Trust Anchor of Time has to be explored. “How do we establish the trust anchor of time and how confidentiality tools can help in increasing reliabilities?” are important areas to work on.

Kevin Morgan, Vice President of Engineering, Arxan Technologies began with an insight on how crime evolves in perfect synchrony with the socio-economic system. Every single business record is accessible in the world of global networking, and access enables crime. Sealing enterprise perimeters has failed, as there is no perimeter any more. Thousands and thousands of nodes execute business activity, and most of the nodes (like laptops and smart phones) are mobile, which in turn means that data is mobile and perimeter-less. Boundary protection is not the answer. We have to assume that criminals have access to enterprise data and applications. Assets, data and applications must be intrinsically secure and the keys protecting them must be secure too. Technology can help a great deal in increasing the bar for criminals and the recent trends are really encouraging.

After the highly informative presentations, the panel opened up for questions for the next hour. A glimpse of the session can be found in the transcript of the Q&A session below.

Q&A Session: A transcript snapshot

Q: We are in the Mid-West, no one is going to come after us. What should I as a security manager consider doing? How do you change the perception that organizations in “remote” locations are also subject to attack?

  • Alperovitch: You are cyber and if you have valuable information you will be targeted. Data manipulation is what one has to worry about the most.
  • Morgan: Form Red teams, perform penetration tests and share the results with the company.
  • Doyle: Employ allies and make sure you are litigation ready. Build a ROI model and lower total cost of litigation.

Q: CEOs consider cutting costs. They cut bodies. One of the biggest threats to security is letting the people go. It’s a paradox. How do we handle this?

  • Kannan: We have not been able to put a dollar value to loss of information. Lawrence Livermore National Lab has a paper on this issue which might be of interest to you.
  • Rees: Try to turn it into a way where you can manage information better by adding more controls.

Q: How do we stress our stand on why compliance is important?

  • Doyle: One of our flaws as a professional committee is that we are bad in formulating business cases. We have to take a leaf out of Kevin’s (of Cisco) books who formulates security challenges into business proposals. Quoting an analogy, at the end of the day it is the brakes and suspensions are the ones that decide the maximum speed of the automobile, not the engine or the aerodynamics. The question is: How fast we can go safely? Hence compliance becomes important.

Q: Where do we go from here to find out how data is actually being protected?

  • Kannan: Economics and behavioral issues are more important for information security. We need to define these into information security models.
  • Rees: Governance structure of information must also be studied.
  • Alperovitch: The study has put forth those who may be impacted by the economy. We need to expose them to the problem. Besides we also need to help law enforcement get information from the private sector as the laws are not in place. We also need to figure out a way to motivate companies to share security information and threats with the community.
  • Doyle: Stop thinking about security and start thinking about risk and risk management. Model return-reward proposition in terms of risk.
  • Morgan: We need to step up as both developers and consumers.

Q: The $4.6 million estimate. How was it estimated?

  • Rees: We did a rolling average across the respondents, keeping in mind the assumption that people underestimate problems.

Q: Was IP integral to the business model of a company that there was a total loss causing the company to go bust?

  • Rees: We did not come across any direct examples of firms that tanked and fell because of IP loss.

Q: Could you suggest new processes to enforce security of data?

  • Doyle: We need to find ways from the other side. If we cannot stop them, how do we restrict and penalize them using the law?

Q: Infrastructure in Purdue and US has been there for long and we have adapted and evolved to newer technologies. However other old organization and developing countries are still backward, and it actually seems to be helping them, as they need to be less bothered with the new-age threats. What’s your take on that?

  • Kannan: True. We spoke to the CISO of a company in India. His issues were much less as it was a company with legacy systems.
  • Alperovitch: There is a paradigm shift in the industry. Security is now becoming a business enabler.

Symposium Summary: Distinguished Lecture

A summary written by Nabeel Mohamed.

The main focus of the talk was to highlight the need for “information-centric security” over existing infrastructure centric security. It was an interesting talk since John was instrumental in providing real statistics to augment his thesis.

Following are some of the trends he pointed out from their research:

  • Explosive growth of information: Digital content in organization grows by about 50% every year.
  • Most of the confidential/sensitive information or trade secrets of companies are in the form of unstructured data such as emails, messages, blogs, etc.
  • The growth of malicious code in the market place out-paces that of legitimate code.
  • Attackers have found ways to get around network protection and get at the sensitive/confidential information leaving hardly any trace most of the time. Attackers have changed their motivation; they no longer seek big press and they want to hide every possible trace regarding the evidence of attacks.
  • Threat landscape has changed markedly over the last ten years. Ten years ago there were only about five viruses/malicious attacks a day, but now it’s about staggering 15,000 a day.
  • The research conducted by the Pondemon Group asked laid-off employees if they left with something from the company and 60% said yes. John thinks that the figure could be still higher as there may be employees who are not willing to disclose it.

These statistics show that data is becoming increasingly important than ever before. Due to the above trends, he argued that protecting infrastructure alone is not sufficient and a shift in the paradigm of computing and security is essential. We need to change the focus from infrastructure to information.

He identified three elements in the new paradigm:

  1. It should be risk-based.
  2. It should be information centric.
  3. It should be managed well over a well-managed infrastructure.

John advocated to adopt a risk-based/policy-based approach to manage data. A current typical organization has strong policies on how we want to manage the infrastructure, but we don’t have a stronger set of policies to manage the information that is so critical to the business itself. He pointed out that it is high time that organizations assess the risk of loosing/leaking different types information they have and devise policies accordingly. We need to quantify the risk and protect those data that could cause high damage if compromised. Identifying what we want to protect most is important as we cannot protect all adequately.

While the risk assessment should be information-centric, one may not achieve security only by using encryption. Encryption can certainly help protect data, but what organizations need to take is a holistic approach where management (for example: data, keys, configurations, patches, etc.) is a critical aspect.

He argued that it is impossible to secure without having knowledge about the content and without having good policies on which to base organizational decisions. He reiterated that “you cannot secure what you do not manage”. To reinforce the claim, he pointed out that 90% of attacks could have been prevented had the systems came under attack been managed well (for example, Slammer attack). The management involves having proper configurations and applying critical updates which most of the vulnerable organizations failed to perform. In short, well-managed systems could mitigate many of the attacks.

Towards the end of his talk, he shared his views for better security in the future. He predicted that “reputation-based security” solutions to mitigate threats would augment current signature-based anti-virus mechanisms. In his opinion, reputation-based security produces a much more trusted environment by knowing users’ past actions. He argued that this approach would not create privacy issues if we change how we define privacy and what is sensitive in an appropriate way.

He raised the interesting question: “Do we have a society that is sensitive to and understands what security is all about?” He insisted that unless we address societal and social issues related to security, the technology alone is not sufficient to protect our systems. We need to create a society aware of security and create an environment for students to learn computing “safely”. This will lead us to embed safe computing into day- to-day life. He called for action to have national approach to security and law enforcement. He cited that it is utterly inappropriate to have data breach notification on a state-by- state basis. He also called for action to create an information-based economy where all entities share information about attacks and to have information-centric approach for security. He mentioned that Symantec is already sharing threat information with other companies, but federal agencies are hardly sharing any threat information. We need greater collaboration between public and private partnerships.

Symposium Summary: Fireside Chat

A panel summary by Utsav Mittal.

Panel Members:

  • Eugene H. Spafford, CERIAS
  • Ron Ritchey, IATAC
  • John Thompson, Symantec

It’s an enlightening experience to listen to some of the infosec industry’s most respected and seasoned professionals sitting around a table to discuss information security.

This time it was Eugene Spafford , John Thompson and Ron Ritchey. The venue was Lawson computer science building. The event was a fireside chat as a part of the CERIAS 10th Annual Security Symposium.

Eugene Spafford started the talk by stating that security is a continuous process not a goal. He compared security with naval patrolling. Spaf said that security is all about managing and reducing risks on a continuous basis. According to him, nowadays a lot of stress is placed on data leakage. This is undoubtedly one of the major concerns today, but it should not be the only concern. When people are focused more on data leakage instead of addressing the core of the problem, which is in the insecure design of the systems, they get attacked which gives rises to an array of problems. He further added that the amount of losses in cyber attacks are equal to losses incurred in hurricane Katrina. Not much is being done to address this problem. This is partly due to the fact that losses in cyber attacks, except a few major ones, occur in small amounts which aggregate to a huge sum.

With regards to the recent economic downturn, Spaf commented that many companies are cutting down on the budget of security, which is a huge mistake. According to Spaf, security is an invisible but vital function, whose real presence and importance is not felt unless an attack occurs and the assets are not protected.

Ron Ritchey stressed upon the issues of data and information theft. He said that the American economy is more of a design-based economy. Many cutting edge products are researched and designed in the US by American companies. These products are then manufactured in China, India and other countries. The fact that the US is a design economy further encompasses the importance of information security for US companies and the need to protect their intellectual property and other information assets. He said that attacks are getting more sophisticated and targeted. Malware is getting carefully social engineered. He also pointed out there is a need to move from signature-based malware detection to behavior-based detection.

John Thomson arrived late as his jet was not allowed to land at the Purdue airport due to high winds. John introduced himself in a cool way as a CEO of a ‘little’ company named Symantec in Cupertino. Symantec is a global leader in providing security, storage and systems management solutions; it is one of the world’s largest software companies with more than 17,500 employees in more than 40 countries.

John gave some very interesting statistics about the information security and attack scene these days. John said that about 10 years ago when he joined Symantec, Symantec received about five new attack signatures each day. Currently, this number stands about 15000 new signatures each day with an average attack affecting only 15 machines. He further added that the attack vectors change every 18-24 months, and new techniques and technologies are being used extensively by criminals to come out with new and challenging attacks. He mentioned that attacks today are highly targeted, intelligently socially engineered, are more focused on covertly stealing information from a victim’s computer, and silently covering its tracks. He admitted that due to increasing sophistication and complexity of attacks, it is getting more difficult to rely solely on signature-based attack detection. He stressed the importance on behavior-based detection techniques. With regards to the preparedness of government and law enforcement, he said that law enforcement is not skilled enough to deal with these kind of cyber attacks. He said that in the physical world people have natural instincts against dangers. This instinct needs to be developed for the cyber world, which can be just as dangerous if not more so.

Symposium Summary: Security in the Cloud Panel

A panel summary by Ashish Kamra.

Panel Members:

  • Lorenzo D. Martino, College of Technology, Purdue University
  • Keith Watson, CERIAS, Purdue University
  • Dennis R. Moreau, Configuresoft Inc.
  • Christoph Schuba, Sun Microsystems

The panel discussion included a 5-10 minute presentation from each of the panelists followed by a question and answer session with the audience.

The first presentation was from Lorenzo Martino. Lorenzo defined cloud computing as ‘computing on demand’ with the prominent manifestations being high-performance computing, and use of virtualization techniques. He also included other forms of pervasive computing such as body-nets, nano-sensors, intelligent energy grid, and so forth as cloud computing. The main security challenge identified by Lorenzo was coping with the complexity of the cloud environment due to the increasing scale of nodes. Another security issue identified by him was the decreasing knowledge of the locality of nodes as well as their trustworthiness with increasing scale of nodes. This in turn, introduces issues related to accountability and reliability. He outlined two main issues to be resolved in the context of a cloud computing environment. First issue is to strike the right balance between network security and end-point security. Second issue is the lack of clarity on attack/risk/trust models in a cloud environment.

According to Keith, the best way to explain cloud computing to a layman is to think of cloud computing services as utilities such as heat and electricity that we use and pay for as needed. His main security concern in the cloud environment was on the legal ramifications of the locality of data. For example, a company in the European Union (EU) might want to use Amazon cloud services. But it may not be legal if the data at the backend is stored in United States (US) as the data privacy laws of EU and US differ. Another concern raised by Keith was the lack of standards among cloud computing services. Because of insufficient standardization, verification of cloud services for compliance with regulations is very difficult. Finally, according to Keith, despite all the current security challenges in an cloud environment, the cloud will ultimately be widely adopted due to the lower costs associated with it.

Christoph reiterated that cloud services will become mainstream because of theirs flexibility (don’t have to worry about under-provisioning or over provisioning of resources) and lower total cost of ownership. But the key point to understand is that the cloud computing paradigm is not for everyone and everything as cloud computing means different things to different people based on their requirements. Citing the example of grid computing, Christoph highlighted the fact that there is inherent lack of demand for security in the cloud. For example in grid computing, more than 95% of the customers opt out of the grid security services due to various reasons. According to Christoph, the main challenge in cloud security is that increased abstraction and complexity of cloud computing technology introduces potential security problems as there are many unknown failure modes and unfamiliar tools and processes. He also added that the cloud security mechanisms are not unlike traditional security mechanisms, but they must be applied to all components in the cloud architecture.

Security issues arising out of the sheer complexity of the cloud environment were also raised by Dennis. There are layers of abstraction in the cloud environment with multiple layers of technologies each with their own configuration parameters, vulnerabilities and attack surface. There is also lots of resource sharing, and the hosting environment is more tightly coupled. All such factors make articulating sound security policies for such environment a potential nightmare. The questions that seem most difficult to answer in a cloud environment are how to achieve compliance, how is the trust shared across different regulatory domains and how much of the resources are shared or coupled. There is an inherent lack of visibility inside a cloud environment. A potential solution is to expose configuration visibility in to the cloud for performing root cause analysis of problems. Also, maintaining smaller virtualization kernels as close as possible to the hardware so that they can be more trustworthy and verifiable will help address some of the security risks.

Question and Answer Session with the Audience

A recurring theme during the question/answer session was related to trust management in the cloud environment. The first question was that how do ‘trust anchors’ in the traditional computing environment apply to the cloud? Dennis replied that the same trust anchors such as TPM etc. apply but the interplay between them is different in the cloud. There are some efforts to create virtual TPMs so that each virtual operating system gets its own vie of trust. But how that affects integrity is still being worked upon. Keith however was more skeptical of the ongoing work in trust anchors solutions because of the complexity of the provider stacks. Another question related to trust was that why should a normal user trust his private information to a cloud provider. Dennis replied that users will be driven to use the cloud services based on cost, and use such utilities because of the savings passed to them. The onus is thus on the cloud service providers to take of security and privacy issues. Christoph added that with respect to trust, the cloud is no different than other computing paradigms such as web services or grid computing. Companies providing cloud services will need to answer trust issues for their own customers. The next question was on the legal/compliance hang-ups of the cloud with respect to location of the data. Specifically, there were concerns on the chain of custody data and accountability. Dennis replied that the legal and compliance requirements haven’t and can’t (yet) keep up with change in technology. But this also provides an incentive for the providers to differentiate themselves from the rest on the basis of auditing and legal support offered with their services. Christoph suggested that such concerns should be addressed in the Service Level Agreements (SLAs) which then become binding on the service providers. Lorenzo pointed out that coming up with new regulations for the cloud environment will be a very difficult task as even the current regulations such as HIPPA have gray areas.

Another interesting question was whether outsourcing IT services to a cloud provider is a win in security because of the expertise at the cloud provider? Dennis agreed that at least with respect to managing configuration complexity, it is better off for the organizations to outsource their services to a cloud provider. Christoph pointed out that availability as a security issue is a big win in the cloud environment as it almost comes for “free” with any cloud provider. But maintaining in-house 24x7 availability is a very resource consuming task for most organizations.

There was a comment that there is a perception among customers that if the data is outside the organization, it is not safe. In reply to this comment, Christoph reiterated that cloud computing is not for everyone. If data is sensitive such that it cannot be outsourced then probably cloud computing is not the right answer. Strong SLAs can help mitigate many concerns, but still cloud computing may not be for everyone. Some related queries were on the safety of applications and on the enhanced insider threat concerns in the cloud environment. Dennis echoed similar concerns and mentioned that people in working in cloud environment are audited very carefully as they have a lot of leverage. Christoph added that there are techniques beyond auditing to mitigate insider threats such as application firewalls, no sniffing, scanning the data out of the hosted images, intrusion detection of hosted images, and so forth. Such controls can be articulated in the SLAs as well. Dennis gave an example that ESX with OVF-I can associate security controls in the models for each hosted instance which also puts less burden on the application developers.

Next question was whether it is possible to but keep applications local but still get the benefits of cloud security? Dennis replied that it is possible but the organizations need to be careful about applying security requirements consistently when using cloud only for some things. It is also important to understand the regulations and the risks involved before doing so.

To the query as to what data can be put into cloud without worries about security (public data), Keith replied that organizations needs to have a classification system in place to figure out what is acceptable for public access.

Many questions that followed at this stage were related to general cloud computing requirements such as on sustainability of cloud computing, applications of cloud computing, distinction between a hosting provider and cloud provider, and so forth. This was expected as the audience primarily consisted of people from academia who are yet to come to grips with the aggressive adoption of cloud services in the industry.

Finally, at the end there were two very interesting questions related to cloud security. First question was with regard to integrity of data in a cloud environment; how can the data integrity be preserved and also legally proved in a cloud environment? Christoph pointed out that data integrity is guaranteed as part of the SLAs. Dennis pointed out that many of the distributed systems concepts such as 2-phase commit protocol are applicable to the cloud environment. Integrity issues may be temporal as replication of data may not be immediate. But surely long term integrity of data is preserved. The second comment was that whether the clouds are now a more appealing target for attackers in the sense that does it increases reward over risk? Dennis agreed that putting all our eggs in one basket is a big risk, and also that such convergence will lead to a greater attack surface and will give much greater leverage to the attackers. But the underlying premise is that the benefits of the cloud technology are amazing, so we have to work towards mitigating the associated risks.

Symposium Summary: Transitive Security & Standards Adoption Panel

A panel summary by Jason Ortiz.

Panel Members:

  • Pascal Meunier, CERIAS, Purdue University
  • Tim Grance, NIST
  • Shimon Modi, Biometrics Standards, Performance and Assurance Laboratory, Purdue University
  • Rao Vasireddy, Alcatel-Lucent

There has been a lot of discussion recently surrounding the issue of standards and standard adoption. Many questions have been posed and openly debated in an attempt to find the correct formula for standards. When can a standard be considered a “good” standard, and when should that standard be adopted?

According to Dr. Pascal Meunier of Purdue University CERIAS, standard adoption should be based on what he calls transitive trust. Transitive trust indicates that an evaluation of the standard using criteria appropriate to the adopters has been done by an outside source. This ensures the standard applies to the adopter and that it has been evaluated or tested. Dr. Meunier says this allows for sound justification that a standard is appropriate. Unfortunately, most adoption and creation of standards are focused on assumptive trust, or simply knowing someone, somewhere did an evaluation.

Another concern surrounding the creation and adoption of standards raised during the panel discussion was, when standards interfere with economical development or technological progress, should they be adopted, even if they are well-tested, “good” standards? Tim Grance from NIST responded by saying as of right now, standards are mostly voluntary recommendations and they must be in accordance with economical and technological desires of industry in order for them to be widely adopted and widely accepted. There are very few punishments for not following standards and thus there must exist other motivation for industries to spend time and money implementing these standards.

Along with this, the audience posed a question surrounding the practical use of a standard. Even if a partner does decide to comply with a standard there is no easy method of ensuring they actually understand the standard or have the same interpretation of the standard as other partners. Simply establishing a mutual understanding of a standard within an industry poses another obstacle that requires time and resources.

As a result of this, “good” standards may never be used in practice if they are too costly to implement. Therefore, currently used standards may be out of date, flawed, or simply untested. This discussion lends itself to the question of which is better, a standard which is known to be flawed or no standard at all? There is no clear answer to this question, as there exists sufficient evidence supporting both sides.

An argument for the idea that a standard is better than no standard (even if it is a flawed or insecure standard) is that in this scenario, at least the flaw will be know, recognized and consistent throughout the industry. However, others point to the idea that this would actually be detrimental, as now any entity which has adopted the standard becomes vulnerable to the standard’s flaws as opposed to only a small number of industries.

It is clear that industries need standards to follow in many scenarios. However, the difficult questions include when a standard is needed, when a specific standard should be adopted versus when it could reasonably be adopted, and whether or not a flawed standard is better than no standard at all.

CERIAS Symposium on Twitter

A quick note as we’re getting started: we’ll be live-tweeting events at the 10th Annual CERIAS Information Security Symposium on the @cerias account. If you’ll be tweeting along with us, we encourage you to use the #cerias tag. Thanks!

Unsecured Economies, and Overly-secured Reports

The Report

Over the last few months, CERIAS faculty members Jackie Rees and Karthik Kannan have been busy analyzing data collected from IT executives around the world, and have been interviewing a variety of experts in cybercrime and corporate strategy. The results of their labors were published yesterday by the McAfee Corporation (a CERIAS Tier II partner) as the report Unsecured Economies: Protecting Vital Information.

The conclusions of the report are somewhat pessimistic about prospects for cyber security in the coming few years. The combination of economic pressures, weak efforts at law enforcement, international differences in perceptions of privacy and security, and the continuing challenges of providing secured computing are combining to place vast amounts of valuable intellectual property (IP) at risk. The report presents estimates that IP worth billions of dollars (US) was stolen or damaged last year, and we can only expect the losses to increase.

Additionally, the report details five general conclusions derived from the data:

  • The recession will put intellectual property at risk
  • There is considerable international variation in the commitment (management and resources) to protect cyber
  • Intellectual property is now an "international currency" that is as much a target as actual currency
  • Employees steal intellectual property for financial gain and competitive advantage
  • Geopolitical aspects present differing risk profiles for information stored "offshore" from "home" countries.

None of these should be a big surprise to anyone who has been watching the field or listening to those of us who are working in it. What is interesting about the report is the presented magnitude and distribution of the issues. This is the first truely global study of these issues, and thus provides an important step forward in understanding the scope of these issues.

I will repeat here some of what I wrote for the conclusion of the report; I have been saying these same things for many years, and the report simply underscores the importance of this advice:

“Information security has transformed from simply ’preventing bad things from happening ’into a fundamental business component.' C-level executives must recognize this change. This includes viewing cybersecurity as a critical business enabler rather than as a simple cost center that can be trimmed without obvious impact on the corporate bottom line; not all of the impact will be immediately and directly noticeable. In some cases, the only impact of degraded cybersecurity will be going from ‘Doing okay’ to ‘Completely ruined’ with no warning before the change.

Cybersecurity fills multiple roles in a company, and all are important for organizational health.

  • First, cybersecurity provides positive control over resources that provide the company a competitive advantage: intellectual property, customer information, trends and projections,financial and personnel records and so on. Poor security puts these resources at risk.
  • Second, good security provides executives with confidence that the data they are seeing is accurate and true, thus leading to sound decisions and appropriate compliance with regulation and policy
  • Third, strong cybersecurity supports businesses taking new risks and entering new markets with confidence in their ability to respond appropriately to change
  • And fourth, good cybersecurity is necessary to build and maintain a reputation for reliability and sound behavior, which in turn are necessary to attract and retain customers and partners.
  • This study clearly shows that some customers are unwilling to do business with entities they consider poorly secured. Given massive market failures, significant fraud and increasing threats of government oversight and regulation, companies with strong controls, transparent recordkeeping, agile infrastructures and sterling reputations are clearly at an advantage -- and strong cybersecurity is a fundamental component of all four. Executives who understand this will be able to employ cybersecurity as an organic element of company (and government) survival -- and growth.“

We are grateful to McAfee, Inc. for their support and assistance in putting this report together.

Getting the Report

Update: You can now download the report sans-registration from CERIAS.

Report cover The report is available at no charge and the PDF can be downloaded (click on the image of the report cover to the left, or here). Note that to download the report requires registration.

Some of you may be opposed to providing your contact information to obtain the report, especially as that information may be used in marketing. Personally, I believe that the registration should be optional. However, the McAfee corporation paid for the report, and they control the distribution.

As such, those of us at CERIAS will honor their decision.

However, I will observe that many other people object to these kinds of registration requirements (the NY Times is another notable example of a registration-required site). As a result, they have developed WWW applications, such as BugMeNot, which are freely available for others to use to bypass these requirements. Others respond to these requests by identifying company personnel from information on corporate sites and then using that information to register -- both to avoid giving out their own information and to add some noise to the data being collected.

None of us here at CERIAS are suggesting that you use one of the above-described methods. We do, however, encourage you to get the report, and to do so in an appropriate manner. We hope you will find it informative.

Barack Obama, National Security and Me, Take II

Over the last month or so, many people who read my first post on Senator Obama’s “security summit” at Purdue have asked me about followup, I’ve been asked “Did you ever hear back from the Senator?”, “Has the McCain campaign contacted you?”, and “What do you think about the candidates?” I’ve also been asked by a couple of my colleagues (really!) “Why would they bother to contact you?”

So, let me respond to these, with the last one first.

Why would someone talk with you about policy?

So, I haven’t been elected or served in a cabinet-level position in DC. I haven’t won a Nobel prize (there isn’t one in IT), I’m not in the National Academies (and unlikely to be—few non-crypto security people are), and I don’t have a faculty appointment in a policy program (Purdue doesn’t have one). I also don’t write a lot of policy papers—or any other papers, anymore: I have a persistent RSI problem that has limited my written output for years. However, those aren’t the only indicators that someone has something of value to say.

As I’ve noted in an earlier post, I’ve had some involvement in cyber security policy issues at the Federal level. There’s more than my involvement with the origins of the SfS and Cyber Trust, certainly. I’ve been in an advising role (technology and policy) for nearly 20 years with a wide range of agencies, including the FBI, Air Force, GAO, NSA, NSF, DOE, OSTP, ODNI and more. I’ve served on the PITAC. I’ve testified before Congressional committees a half-dozen times, and met with staff (officially and unofficially) of the Senate and House many times more than that. Most people seem to think I have some good insight into Federal policy in cyber, but additionally, in more general issues of science and technology, and in defense and intelligence.

From another angle, I’ve also been deeply involved in policy. I served on the CRA Board of Directors for 9 years, and have been involved with its government affairs committee for a decade. I’ve been chair or co-chair of the ACM’s US Public Policy committee for a dozen years. From these vantage points I have gained additional insights into technology policy and challenges in a broad array of issues related to cyber, education, and technology.

And I continue to read a lot about these topics and more, including material in a number of the other sciences. And I’ve been involved in the practice and study of cyber security for over 30 years.

I can, without stretching things, say that I probably know more about policy in these areas than about 99.995% of the US population, with some people claiming that I’m in the top 10 or so with respect to broad issues of cyber security policy. That may be why I keep being asked to serve in advisory positions. A lot of people tend to ask me things, and seem to value the advice.

One would hope that at least some of the candidates would be interested in such advice, even if not all of my colleagues (or my family grin are interested in what I have to say.

Have any of the other candidates contacted you?

Simply put—no. I have gotten a lot of mailings from the Republican (and Democratic) campaigns asking me to donate money, but that’s it.

I’m registered as an independent, so that may or may not have played a role. For instance, I can’t volunteer to serve as a poll worker in Indiana because I’m not registered in one of the two main parties! I don’t show up in most of the databases (and that may be a blessing of sorts).

To digress a moment…. I don’t believe either party has a lock on the best ideas—or the worst. I’m not one of those people who votes a straight-ticket no matter what happens. I have friends who would vote for anyone so long as the candidate got the endorsement of “their” party. It reminds me of the drunken football fans with their shirts off in -20F weather cheering insanely for “their” team and willing to fight with a stranger who is wearing the wrong color. Sad. Having read the Constitution and taken the oath to defend it, I don’t recall any mention of political parties or red vs. blue….

That said, I would be happy to talk with any serious candidate (or elected official) about the issues around cyber, security, education, and the IT industry. They are important, and impact the future of our country…and of much of the world.

So, has anyone with the Obama campaign contacted you since his appearance at Purdue?

Well, the answer to this is “yes and no.”

I was told, twice, by a campaign worker that “Someone will call you—we definitely want more advice.” I never got that phone call. No message or explanation why. Nothing.

A few weeks after the second call I did get a strange email message. It was from someone associated with the campaign, welcoming me to some mailing list (that I had not asked to join) and including several Microsoft Word format documents. As my correspondents know, I view sending email with Word documents to be a bad thing. I also view being added to mailing lists without my permission to be a hostile act. I responded to the maintainer of the list and his reply was (paraphrased) “I don’t know why you were added. Someone must have had a reason. I’ll check and get back to you.” Well, I have received no more email from the list, and I never got any followup from that person.

So, in summary, I never got any follow-up from the campaign. I don’t think it is an issue with the Senator (who wouldn’t have been the one to contact me anyhow) but a decision by his staff.

So, depending your level of cynicism, the mentions of my name, of CERIAS, and of follow-up was either (a) a blown opportunity caused by an oversight, or (b) a cynical political ploy to curry local favor.

(My daughter suggested that they are waiting until after the election to appoint me to a lofty position in government. Uh, yeah. That probably explains why I haven’t gotten that MacArthur “genius grant” yet and why Adriana Lima hasn’t called asking me to run away with her—the timing just isn’t right yet. grin

What are your opinions on the Presidential candidates?

I’m not allowed to be partisan in official Purdue outlets. So, in some further posts here over the next week or two I will provide some analysis of both major candidates (NB. Yes, I know there are over 300 candidates for President on the ballots across the country. However, I don’t think there is much chance of Baldwin, Barr, McKinney, Nader, Paul or the rest getting into office. So, I’ll limit my comments to the two main candidates.

If you really want to know who I’m probably voting for, you can see my Facebook page or send me email.

[Update 10/16: After this was published I sent a link to this entry to several people associated with the Obama campaign. Only one responded, and it was clear from his email that there had been a mixup in getting back to me—but no interest in rectifying it.]