Reports and Papers Archive

Page Content


Browse All Papers »       Submit A Paper »

Learning from past queries for resource selection

luo si

Federated text search provides a unified search interface for multiple search engines of distributed text information sources. Resource selection is an important component for federated text search, which selects a small number of information sources that contain the largest number of relevant documents for a user query. Most prior research of resource selection focused on selecting information sources by analyzing static information of available information sources that is sampled in the offline manner. On the other hand, most prior research ignored a large amount of valuable information like the results from past queries.

This paper proposes a new resource selection technique (which is called qSim) that utilizes the search results of past queries for estimating the utilities of available information sources for a specific user query. Experiment results demonstrate the effectiveness of the new resource selection algorithm.

Added 2010-04-26

Graph-based Signatures for Kernel Data Structures

CERIAS TR 2010-04
Zhiqiang Lin, Junghwan Rhee, Xiangyu Zhang, Dongyan Xu, Xuxian Jiang
Download: PDF
Added 2010-04-21

Broadcast Group Key Management with Access Control Vectors

CERIAS TR 2010-03
Ning Shang, Mohamed Nabeel, Elisa Bertino, Xukai Zou
Download: PDF

Secure collaborative applications currently enabled by the Internet need flexible and efficient mechanisms for managing and distributing group keys.  The secure transmission of information among collaborating users should be efficient as well as flexible in order to support access control models with different granularity levels for different kinds of applications such as secure group communication, secure dynamic conferencing,  and selective/hierarchical access control disseminated information. In this paper, we propose the first provably secure broadcast Group Key Management (BGKM) scheme where each user in a group shares a secret with the trusted key server and the subsequent rekeying for join or departure of users requires only one broadcast message. Our scheme satisfies all the requirements laid down for an effective GKM scheme and requires no change to secret shares existing users possess. We analyze the security of our BGKM scheme and compare it with the existing BGKM schemes which are mostly ad-hoc.

Added 2010-04-05

LiveDM: Temporal Mapping of Dynamic Kernel Memory for Dynamic Kernel Malware Analysis and Debugging

CERIAS TR 2010-02
Junghwan Rhee, Dongyan Xu
Download: PDF

Dynamic kernel memory is difficult to analyze due to its volatile status; numerous kernel objects are frequently allocated or freed in a kernel’s heap, and their data types are missing in the memory systems of current commodity operating systems. Since the majority of kernel data is stored dynamically, this memory has been a favorite target of many malicious software and kernel bugs. In order to analyze dynamic kernel memory, a global technique that systematically translates a given memory address into a data type is essential. Previous approaches had a limited focus in the analysis of either a malware’s execution or a snapshot of kernel memory. We present here a new memory interpretation system called LiveDM that can automatically translate dynamic kernel memory addresses into data types. This system enables the accurate memory analysis of the entire kernel execution, ranging from malware activity to legitimate kernel code execution, over a period of time beyond the instant of a snapshot by using these two novel techniques. (1) The system identifies an individual dynamic kernel object with its systematically-determined runtime identifier that points to the code where the object is allocated. (2) The data type then can be automatically extracted from the code using static code analysis offline. We have implemented a prototype of LiveDM that supports three Linux kernels where LiveDM dynamically tracks tens of thousands of dynamic kernel memory objects that can be accurately translated into data types in the offline process. We have evaluated and validated its general applicability and effectiveness in extensive case studies of kernel malware analysis and kernel debugging.

Added 2010-02-18

Secure and Robust Communication in Wireless Mesh Networks

CERIAS TR 2009-30
Jing Dong
Download: PDF

Wireless mesh networks (WMNs) have become the focus of research in recent years, owing to their great promise in realizing numerous next-generation wireless services. Driven by the demand for rich and high-speed content access, recent research has focused on developing high performance communication protocols, while the security of the proposed protocols has received relatively little attention.  However, given the wireless and multi-hop nature of the communication, WMNs are subject to a wide range of security threats. In this dissertation, we study the security of two main design methodologies that emerged from recent research for achieving high performance data delivery in WMNs, namely, dynamic topology-aware adaptation and network coding.  In addition, we also study the principles of designing efficient application layer security protocols for WMNs.

Dynamic topology-aware adaption presents an important design principle that underlies many high performance network layer protocols proposed for WMNs. We study the unique security threats that exploit the cooperative nature of such protocols. The identified attacks can allow even only a few attacker nodes to distort the path selection process in the entire network and to gain control on a large portion of the traffic in the network. Our proposed defense mechanism relies on passive measurements for detecting attacks and cooperative accusation for identifying and isolating attacker nodes. Through both analysis and experimental evaluations, we show that our defense protocol is effective and incurs low overhead.

Network coding is a major performance improvement technique for WMNs that has emerged in recent years. Numerous practical systems have been designed and demonstrated that network coding is able to achieve significantly improved performance over the traditional packet forwarding approach.  We focus on studying the security aspects of applying network coding on WMNs. We first perform a systematic security analysis on existing network coding systems and uncover numerous security threats on various system components. We then focus on addressing a severe and generic attack against network coding systems, known as packet pollution attack. We propose the first practical defense mechanisms to pollution attacks for both of the two major wireless network coding approaches, intra-flow network coding and inter-flow network coding. Our defense uses efficiently computable random linear checksums and an efficient traceback mechanism to filter polluted packets and identify attacker nodes. The experimental results show that the proposed mechanisms can effectively filter out polluted packets and quickly identify and isolate attacker nodes while incurring small computation and bandwidth overhead.

On the application layer, we demonstrate the unique challenges and opportunities in designing efficient security protocols. We focus on the problem of providing data confidentiality for group communication on WMNs, and present a protocol framework designed specifically for WMNs. Our design employs decentralized group membership, promotes localized communication, and exploits the nature of wireless broadcast. Through both analytical and experimental evaluations, we demonstrate the importance of the design principles for WMNs for the efficiency and performance of the application layer protocols.

Added 2009-12-16

Reuse-Oriented Camouflaging Attack: Vulnerability Detection and Attack Construction

CERIAS TR 2009-29
Zhiqiang Lin, Xiangyu Zhang, Dongyan Xu
Download: PDF

We introduce a reuse-oriented camouflaging attack – a new threat to legal software binaries. To perform a malicious action, such an attack will identify and reuse an existing function in a legal binary program instead of implementing the function itself. Furthermore, the attack is stealthy in that the malicious invocation of a targeted function usually takes place in a location where it is legal to do so, closely mimicking a legal invocation. At the network level, the victim binary can still follow its communication protocol without exhibiting any anomalous behavior. Meanwhile, many close-source shareware binaries are rich in functions that can be maliciously “reused,” making them attractive targets of this type of attack. In this paper, we present a framework to determine if a given binary program is vulnerable to this attack and to construct a concrete attack if so. Our experiments with a number of real-world software binaries demonstrate that the reuse-oriented camouflaging attacks are real and vulnerabilities in the binaries can be effectively revealed and confirmed.

Added 2009-11-11

Unsecured Economies Report

Karthik Kannan and Jackie Rees and Eugene H. Spafford
Added 2009-11-02

The Association between the Disclosure and the Realization of Information Security Risk Factors

CERIAS TR 2009-28
Tawei Wang and Jackie Rees and Karthik Kannan
Download: PDF

Firms often disclose information security risk factors in public filings such as 10-K reports.  The internal information associated with disclosures may be positive or negative.  In this paper, we are interested in evaluating how the nature of security risk factors disclosed, which is believed to represent the internal information regarding information security, is associated with future breach announcements.  For this purpose, we build a decision tree model, which classifies the occurrence of future security breaches based on the textual contents of the disclosed security risk factors.  The model is able to accurately associate disclosure characteristics with breach announcements about 77% of the time.  We further explore the contents of the security risk factors using text mining techniques to provide a richer interpretation of the results.  The results show that the security risk factors with action-oriented terms and phrases are less likely to be related to future incidents.  We also conduct a cross-sectional analysis to study how the market interprets the nature of information security risk factors in annual reports at different time points.  We find that the market reaction following the security breach announcement is different depending on the nature of disclosure.  Thus, our paper contributes to the literature in information security and sheds light on how market participants can better interpret security risk factors disclosed in financial reports at the time when financial reports are released.

Added 2009-11-02

A Privacy-Preserving Approach to Policy-Based Content Dissemination

CERIAS TR 2009-27
Ning Shang, Mohamed Nabeel, Federica Paci, Elisa Bertino
Download: PDF

We propose a novel scheme for selective distribution of content, encoded as documents, that preserves the privacy of the users to whom the documents are delivered and is based on an efficient and novel group key management scheme. Our document broadcasting approach is based on access control policies specifying which users can access which documents, or subdocuments. Based on such policies, a broadcast document is segmented into multiple subdocuments, each encrypted with a different key. In line with modern attribute-based access control, policies are specified against identity attributes of users. However our broadcasting approach is privacy-preserving in that users are granted access to a specific document, or subdocument, according to the policies without the need of providing in clear information about their identity attributes to the document publisher. Under our approach, not only does the document publisher not learn the values of the identity attributes of users, but it also does not learn which policy conditions are verified by which users, thus inferences about the values of identity attributes are prevented. Moreover, our key management scheme on which the proposed broadcasting approach is based is efficient in that it does not require to send the decryption keys to the users along with the encrypted document. Users are able to reconstruct the keys to decrypt the authorized portions of a document based on subscription information they have received from the document publisher. The scheme also efficiently handles new subscription of users and revocation of subscriptions. Please note that this is an improved and extended version of our previous report [1].

Added 2009-10-13

Annual Report 2008

Southwest Research Institute
Added 2009-09-22

Application of VMware Anti-Detection Methods on the ReAssure Testbed

CERIAS TR 2009-26
Daryel Wisely and Pascal Meunier

We reviewed common methods for detecting a VMware guest OS, with a focus on Linux OSes. We ported relevant Windows code, and measured the performance impact of trying to evade detection. We discuss the applicability of those evasion techniques to testbeds such as the Purdue CERIAS ReAssure testbed. This work was funded under the NSF Research Experience for Undergraduates program.

Added 2009-09-16

Essays on information security from an economic perspective

CERIAS TR 2009-24
Ta-Wei Wang
Download: PDF
Added 2009-09-14

ACHIEVING HIGH SURVIVABILITY IN DISTRIBUTED SYSTEMS THROUGH AUTOMATED RESPONSE

CERIAS TR 2009-22
Yu-Sung Wu
Download: PDF

We propose a new model for automated response in distributed systems. We formalize the process of providing automated responses and the criterion for asserting global optimality of the selection of responses. We show that reaching the globally optimal solution is an NP-hard problem. Therefore we design a genetic algorithm framework for searching for good selections of responses in the runtime. Our system constantly adapts itself to the changing environment based on short-term history and also tracks the patterns of attacks in a long-term history.  Unknown security attacks, or zero-day attacks, exploit unknown or undisclosed vulnerabilities and can cause devastating damage. The escalation pattern, commonly represented as an attack graph, is not known a priori for a zero-day attack. Hence, a typical response system provides ineffective or drastic responses. Our system �conceptualizes� nodes in an attack graph, whereby they are generalized based on the object-oriented hierarchy for components and alerts. This is done based on our insight that high level manifestations of unknown attacks may bear similarity with those of previously seen attacks. This allows the use of history such as effectiveness of each response from past attacks to assist responses to the unknown attack.  This thesis lays down three distinct claims and validates them empirically. The claims are: (i) For automated response, consider a baseline mechanism that has a static mapping from the local detector symptom to a local response. This corresponds to the state-of-the-art in deployed response systems. Now consider our proposed model which takes into account global optimality from choosing a set of responses and also does a dynamic computation of the response combination from the set of detectors and other system parameters (inferences about the accuracy of the attack diagnosis, response effectiveness, etc.). The survivability of the application system with our proposed model is an upper bound of the survivability achievable through the baseline model. (ii) In some practical situations, the proposed model gives higher survivability than the baseline model. (iii) The survivability with our proposed model is improved when the system takes into account history from prior similar attacks. This kind of history is particularly important when the system deals with zero-day attacks.

Added 2009-09-11