The Center for Education and Research in Information Assurance and Security (CERIAS)

Reports and Papers Archive



Challenge benchmarks for verification of real-time programs

CERIAS TR 2010-21
Jan Vitek

Real-time systems, and in particular safety-critical systems, are a rich source of challenges for the program verification community, as software errors can have catastrophic consequences. Unfortunately, it is nearly impossible to find representative safety-critical programs in the public domain. This has been a significant impediment to research in the field, as it is very difficult to validate new ideas or techniques experimentally. This paper presents open challenges for the verification of real-time systems in the context of the Real-time Specification for Java. Our main contribution, however, is a family of programs, called CDx, which we present as an open source benchmark for the verification community.

Added 2010-10-07

An analysis of the dynamic behavior of JavaScript programs

CERIAS TR 2010-20
Jan Vitek

The JavaScript programming language is widely used for web programming and, increasingly, for general purpose computing. As such, improving the correctness, security and performance of JavaScript applications has been the driving force for research in type systems, static analysis and compiler techniques for this language. Many of these techniques aim to rein in some of the most dynamic features of the language, yet little seems to be known about how programmers actually utilize the language or these features. In this paper we perform an empirical study of the dynamic behavior of a corpus of widely-used JavaScript programs, and analyze how and why the dynamic features are used. We report on the degree of dynamism that is exhibited by these JavaScript programs and compare that with assumptions commonly made in the literature and accepted industry benchmark suites.

Added 2010-10-07

Implicit ownership types for memory management

CERIAS TR 2008-32
Jan Vitek
Download: PDF

The Real-time Specification for Java (RTSJ) introduced a range of language features for explicit memory management. While the RTSJ gives programmers fine control over memory use and allows linear allocation and constant-time deallocation, the RTSJ relies upon dynamic runtime checks for safety, making it unsuitable for safety critical applications. We introduce ScopeJ, a statically-typed, multi-threaded, object calculus in which scopes are first class constructs. Scopes reify allocation contexts and provide a safe alternative to automatic memory management. Safety follows from the use of an ownership type system that enforces a topology on run-time patterns of references. ScopeJ’s type system is novel in that ownership annotations are implicit. This substantially reduces the burden for developers and increases the likelihood of adoption. The notion of implicit ownership is particularly appealing when combined with pluggable type systems, as one can apply different type constraints to different components of an application depending on the requirements without changing the source language. In related work we have demonstrated the usefulness of our approach in the context of highly-responsive systems and stream processing.

Added 2010-10-07

Timetraveler: exploiting acyclic races for optimizing memory race recording

CERIAS TR 2010-19
T. N. Vijaykumar

As chip multiprocessors emerge as the prevalent microprocessor architecture, support for debugging shared-memory parallel programs becomes important. A key difficulty is the programs’ nondeterministic semantics due to which replay runs of a buggy program may not reproduce the bug. The non-determinism stems from memory races where accesses from two threads, at least one of which is a write, go to the same memory location. Previous hardware schemes for memory race recording log the predecessor-successor thread ordering at memory races and enforce the same orderings in the replay run to achieve deterministic replay. To reduce the log size, the schemes exploit transitivity in the orderings to avoid recording redundant orderings. To reduce the log size further while requiring minimal hardware, we propose Timetraveler which for the first time exploits acyclicity of races based on the key observation that an acyclic race need not be recorded even if the race is not covered already by transitivity. Timetraveler employs a novel and elegant mechanism called post-dating which both ensures that acyclic races, including those through the L2, are eventually ordered correctly, and identifies cyclic races. To address false cycles through the L2, Timetraveler employs another novel mechanism called time-delay buffer which delays the advancement of the L2 banks’ timestamps and thereby reduces the false cycles. Using simulations, we show that Timetraveler reduces the log size for commercial workloads by 88% over the best previous approach while using only a 696-byte time-delay buffer.

Added 2010-10-07

Remote reprogramming of wireless sensor networks

CERIAS TR 2010-18
Rajesh Krishna Panta
Download: PDF

In recent years, advances in hardware and software tools have led to many real-world sensor network deployments. Management of already deployed sensor networks is a very important issue. One of the crucial management tasks is software reconfiguration. During the lifetime of a sensor network, the software running on the sensor nodes may need to be changed for various reasons, such as correcting software bugs, modifying the application to meet changing environmental conditions in which the network is deployed, or adapting to evolving user requirements. The frequency of software updates is very high in sensor networks for various reasons — harsh and unpredictable environments in which the sensor nodes are deployed, the time-varying nature of the wireless channel, lack of robust tools for developing software, interference in the unlicensed ISM band, topology changes caused by node mobility, battery outages, etc. Since a sensor network may consist of hundreds or even thousands of nodes, which may be situated at places that are difficult or, sometimes, impossible to access physically, remote reprogramming of sensor networks is essential. This thesis presents energy-efficient and fast reprogramming services for wireless sensor networks. Sensor networks are often battery-powered and need to operate unattended for long periods of time. Radio transmission is often the most energy-expensive operation in sensor networks. Since energy is a very scarce resource, this thesis focuses on conserving energy during reprogramming. Also, since the performance of the network may be degraded, or even reduced to zero, during the software update process, the reprogramming techniques proposed here reduce reprogramming time significantly compared to existing reprogramming protocols.

Added 2010-10-07

Analytical characterization of internet security attacks

CERIAS TR 2010-17
Sarah Sellke

Internet security attacks have drawn significant attention due to their enormously adverse impact. These attacks include malware (viruses, worms, Trojan horses), denial of service, packet sniffing, and password attacks. There is an increasing need to provide adequate defense mechanisms against these attacks. My thesis proposal deals with analytical aspects of Internet security attacks, as well as practical solutions based on our analysis.

First, we focus on modeling and containment of Internet worms. We present a branching process model for the propagation of worms. Our model leads to the development of automatic worm containment strategies, which effectively contain both uniform scanning worms and local preference scanning worms. Incremental deployment of our scheme provides worm containment for local networks when combined with traditional firewalls.
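
The branching-process view can be sketched in a few lines (illustrative only; the function and parameter names are ours, not the thesis's). Each infected host probes a bounded number of random addresses per generation, and the process dies out whenever the expected number of new infections per host drops below one — which is exactly the lever a scan-limiting containment scheme pulls.

```python
import random

def simulate_worm(vuln_hosts, addr_space, scans_per_gen, generations, seed=1):
    """Branching-process sketch of a uniform-scanning worm.

    Each infected host probes `scans_per_gen` random addresses per
    generation; a probe hits a not-yet-infected vulnerable host with
    probability susceptible / addr_space.  Illustrative only: the
    discrete-generation model and all names are our simplification.
    """
    rng = random.Random(seed)
    infected, susceptible = 1, vuln_hosts - 1
    history = [infected]
    for _ in range(generations):
        new = 0
        for _ in range(infected * scans_per_gen):
            if rng.random() < susceptible / addr_space:
                new += 1
                susceptible -= 1
        infected += new
        history.append(infected)
    return history

# Containment lever: if the expected offspring per host per generation,
# scans_per_gen * vuln_hosts / addr_space, stays below 1, the worm dies out.
```

With, say, 10^5 vulnerable hosts in a 2^32 address space, limiting each host to 100 scans per generation keeps the expected offspring near 0.002, so the epidemic cannot take off.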

Next, we study the capacity of Bounded Service Timing Channels. We derive an upper bound and two lower bounds on the capacity of such timing channels. We show that when the length of the support interval is small, the uniform BSTC has the smallest capacity among all BSTCs. Based on our analysis, we design and implement a covert timing channel over TCP/IP networks. We are able to quantify the achievable data rate (or leak rate) of such a covert channel. Moreover, by sacrificing data rate, we are able to mimic normal traffic patterns, which makes detecting such communication virtually impossible.
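
As a rough illustration of how a covert timing channel encodes bits in inter-packet delays (a textbook construction, not the thesis's actual channel; the gap values and threshold are made up):

```python
def encode_delays(bits, short_gap=0.01, long_gap=0.05):
    """Map each covert bit to an inter-packet delay in seconds:
    a long gap carries a 1, a short gap a 0."""
    return [long_gap if b else short_gap for b in bits]

def decode_delays(delays, threshold=0.03):
    """Recover bits from observed gaps; jitter smaller than the
    margin around the threshold does not flip a bit."""
    return [1 if d > threshold else 0 for d in delays]
```

In a real deployment the sender would sleep for each delay before emitting the next TCP packet, and mimicking normal traffic, as the thesis does, would mean drawing the two gap populations from recorded legitimate traffic rather than from two fixed constants, trading data rate for stealth.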

Added 2010-10-06

Memory balancing for large-scale network simulation in power-law networks

CERIAS TR 2008-31
Hyojeong Kim

Large-scale network simulation has grown in importance due to a rapid increase in Internet size and the availability of Internet measurement topologies with applications to computer networks and network security. A key obstacle to large-scale network simulation over PC clusters is the memory balancing problem, where a memory-overloaded machine can slow down a distributed simulation due to disk I/O overhead. Network partitioning methods for parallel and distributed simulation are insufficiently equipped to handle new challenges brought on by memory balancing due to their focus on CPU and communication balancing.

This dissertation studies memory balancing for large-scale network simulation in power-law networks over PC clusters. First, we design and implement a measurement subsystem for dynamically tracking memory consumption in DaSSFNet, a distributed network simulator. Accurate monitoring of memory consumption is difficult due to complex protocol interactions through which message-related events are created and destroyed inside and outside a simulation kernel. Second, we achieve efficient memory cost monitoring by tackling the problem of estimating the peak memory consumption of a group of simulated network nodes in power-law topologies during network partitioning. In contrast to CPU balancing, where the processing cost of a group of nodes is proportional to their sum, in memory balancing this closure property need not hold. Power-law connectivity injects additional complications due to skews in resource consumption across network nodes. Third, we show that the maximum memory cost metric outperforms the total cost metric for memory balancing under multilevel recursive partitioning, but the opposite holds for CPU balancing. We show that the trade-off can be overcome through joint memory-CPU balancing (in general not feasible due to constraint conflicts), which is enabled by network simulation's tendency to induce correlation between memory and CPU costs. Fourth, we evaluate memory balancing in the presence of virtual memory (VM) management, which allows larger problem instances to be run over limited physical memory. VM introduces complex memory management dependencies that make understanding and evaluating simulation performance difficult. We provide a performance evaluation framework that incorporates the impact of memory thrashing in distributed network simulation and enables quantitative performance comparison and diagnosis.
Fifth, we show that improved memory balancing under the maximum cost metric in the presence of VM manifests as faster completion time compared to the total cost metric despite the CPU balancing advantage of the latter. In the cases where the CPU balancing advantage of the total cost metric is strong, we show that joint memory-CPU balancing can achieve the best of both worlds.
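
The failure of the closure property for memory can be seen in a tiny sketch (illustrative; the usage traces and function names are made up): when the peaks of co-located nodes occur at different times, the group's true peak is far below the sum of the individual peaks.

```python
def peak_of_group(usage_traces):
    """True peak of co-located nodes: max over time of their summed usage.
    Traces are per-node memory samples at aligned time steps."""
    return max(sum(step) for step in zip(*usage_traces))

def sum_of_peaks(usage_traces):
    """Pessimistic estimate that treats each node's peak as additive."""
    return sum(max(trace) for trace in usage_traces)
```

For traces [4, 1, 0] and [1, 4, 0] the true group peak is 5 while the sum of individual peaks is 8: unlike CPU cost, memory cost is not closed under addition, which is what makes the choice of balancing metric non-trivial.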

We carry out performance evaluation using benchmark applications with varying traffic characteristics: BGP routing, worm propagation under local and global scanning, and distributed client/server system. We use a testbed of 32 Intel x86 machines running a measurement-enhanced DaSSFNet over Linux.

Added 2010-10-06

Toward privacy-preserving database management systems --- Access control and data anonymization

CERIAS TR 2007-104
Ji-won Byun

In this thesis, we identify basic requirements for privacy-preserving DBMS and focus on two core techniques, namely purpose-based access control and data anonymization, that are essential to address some of the requirements. Specifically, purpose-based access control enables DBMS to tightly control data access with respect to privacy requirements and preferences, and data anonymization provides a way to guarantee privacy protection in data itself even if the control of access is not feasible. We present formal models and develop mechanisms for realizing such models. In addition, we introduce two conceptual models, micro-view and integrity-control, which are designed to enhance data utility and integrity, respectively.

Added 2010-10-06

Hybrid Data and Text System for Downgrading Sensitive Documents

CERIAS TR 2001-154
M. J. Atallah and V. Raskin (with C. F. Hempelmann and D. Mohamed)
Download: PDF

An application of ontological semantics to the problems of declassification, sanitization, or downgrading of natural language text by automatically accessing and representing the full meaning of the texts.

Added 2010-09-27

Effects of Anonymity, Pre-Employment Integrity and Antisocial Behavior on Self-Reported Cyber Crime Engagement: An Exploratory Study

CERIAS TR 2009-31
Ibrahim M. Baggili
Download: PDF

A key issue facing today’s society is the increase in cyber crimes. Cyber crimes pose threats to nations, organizations and individuals across the globe. Much of the research in cyber crime has risen from computer science-centric programs and little experimental research has been performed on the psychology of cyber crime. This has caused a knowledge gap in the study of cyber crime. To this end, this dissertation focuses on understanding psychological concepts related to cyber crime. Through an experimental design, participants were randomly assigned to three groups with varying degrees of anonymity. After each treatment, participants were asked to self-report their cyber crime engagement, antisocial behavior and pre-employment integrity. Results indicated that the anonymity manipulation had a main effect on self-reported cyber crime engagement. The results also showed that there is a statistically significant positive relationship between self-reported antisocial behaviors and cyber crime engagement, and a statistically significant negative relationship between self-reported cyber crime engagement and pre-employment integrity. Suggestions for future research are also discussed.

Added 2010-08-30

An Evaluation of Template Splitting to Prevent Sample Reconstruction from Fingerprint Templates

CERIAS TR 2010-10
Ashwin Mohan
Download: PDF

Current research in fingerprint recognition systems has shown that, given a fingerprint template, an approximation of the original fingerprint sample can be created. In this thesis, the capability of template splitting to prevent sample reconstruction from fingerprint templates is evaluated. An attack simulation was formulated as part of this thesis to test template splitting, within a fingerprint verification setup, for its ability to prevent sample reconstruction. The False Non-Match Rate (FNMR) was used as the performance metric. Statistical analysis of the FNMR showed that the use of template splitting significantly decreases the capability of approximate fingerprint samples to be matched within the fingerprint system.

Added 2010-07-12

Authorship attribution of SMS messages using an N-grams approach

CERIAS TR 2010-11
Ashwin Mohan, Ibrahim M. Baggili, Marcus K. Rogers
Download: PDF

The pervasive use of SMS is increasing the amount of digital evidence available on cellular phones. Consequently, it has become important to identify the authors of SMS messages, a post-hoc analysis technique useful in criminal prosecution cases. This paper investigates an N-grams based approach for determining the authorship of SMS messages. Despite the scarcity of words in SMS messages and the differences between SMS language and natural language characteristics, the chosen method shows encouraging results in identification of authors. In this paper the effects of the gram size and the similarity scoring technique on the prediction of SMS message authors are also examined.
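
A minimal sketch of character n-gram authorship attribution (illustrative; the trigram size and the cosine score are our choices — the paper itself examines several gram sizes and scoring techniques):

```python
import math
from collections import Counter

def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams in a text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams) or 1
    return {g: c / total for g, c in Counter(grams).items()}

def cosine(p, q):
    """Cosine similarity between two frequency profiles."""
    num = sum(p[g] * q[g] for g in set(p) & set(q))
    den = math.sqrt(sum(v * v for v in p.values())) \
        * math.sqrt(sum(v * v for v in q.values()))
    return num / den if den else 0.0

def attribute(message, corpora, n=3):
    """Return the candidate author whose known messages best match."""
    profile = ngram_profile(message, n)
    return max(corpora, key=lambda a: cosine(profile, ngram_profile(corpora[a], n)))
```

Character n-grams suit SMS text because they capture author-specific shorthand ("l8r", "gr8") that word-level features would miss in such short messages.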

Added 2010-07-12

Assessing the Trustworthiness of Streaming Data

CERIAS TR 2010-09
Hyo-Sang Lim, Yang-Sae Moon, Elisa Bertino
Download: PDF

The notion of a confidence policy exploits the trustworthiness of data items in data management and query processing. In this paper we address the problem of enforcing confidence policies in data stream management systems (DSMSs), which is crucial for supporting users with different access rights, processing confidence-aware continuous queries, and protecting secure streaming data. We first propose a DSMS-based framework for confidence policy management and then present a systematic approach for estimating the trustworthiness of data items. Our approach uses the provenance of data items as well as their values. We introduce two types of data provenance: the physical provenance, which represents the delivery history of each data item, and the logical provenance, which describes the semantic meaning of each data item. The logical provenance is used for grouping data items into semantic events with the same meaning or purpose. By contrast, the tree-shaped physical provenance is used in computing trust scores, that is, quantitative measures of trustworthiness. To obtain trust scores, we propose a cyclic framework which reflects the inter-dependency property: the trust scores of data items affect the trust scores of network nodes, and vice versa. The trust scores of data items are computed from their value similarity and provenance similarity. The value similarity comes from the principle that “the more similar the values for the same event, the higher the trust scores,” and we compute it under the assumption of a normal distribution. The provenance similarity is based on the principle that “the more different the physical provenances with similar values, the higher the trust scores,” and we compute it using tree similarity. Since new data items continuously arrive in DSMSs, we need to evolve (i.e., recompute) trust scores to reflect those new items. As evolution schemes, we propose a batch mode, which recomputes scores periodically or aperiodically, in addition to an immediate mode. To the best of our knowledge, our approach is the first to support the enforcement of confidence policies in DSMSs. Experimental results show that our approach is very efficient.
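
The value-similarity principle can be sketched as follows (our simplification, not the paper's exact formula): readings for the same event are assumed normally distributed, and each reading is scored by its two-sided tail probability, so values near the consensus score near 1 and outliers score near 0.

```python
import math
import statistics

def value_similarity_scores(readings):
    """Trust-style score per reading for one event, assuming readings
    are normally distributed around the true value (our simplification).
    A reading at the mean scores ~1.0; outliers approach 0."""
    mu = statistics.fmean(readings)
    sigma = statistics.pstdev(readings) or 1e-9
    # erfc(z / sqrt(2)) is the two-sided tail probability of z-score z
    return [math.erfc(abs(x - mu) / (sigma * math.sqrt(2))) for x in readings]
```

Given readings [10.0, 10.1, 9.9, 30.0] for one event, the three clustered values receive markedly higher scores than the outlier, matching the principle that agreement among sources raises trust.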

Added 2010-07-10

Privacy-aware Role-Based Access Control

Qun Ni

Current proposals for access control languages cannot specify policies required by specific application scenarios (e.g., a database system enforcing privacy regulations), may contain design flaws, and are mutually incompatible. In this dissertation, we extend RBAC with new components to meet the requirements of privacy-aware access control, which is needed to enforce privacy laws and regulations in organizational computing environments.

We propose an access control language for provenance access control, which requires aggregating access decisions from different sources and controlling access to different sections of provenance information. We investigate various problems in risk-based access control. Risk-based access control is particularly useful for making access decisions in an emergency: subjects without sufficient privilege in an emergency have to be given authorization to access sensitive information in different ways, based on their risk estimations. We also identify design flaws in representative proposals, e.g., XACML, and present corresponding solutions.

We finally propose an extensible functional access control language that combines the benefits of XACML and RBAC without their drawbacks. The language is attribute-based and context-centric and supports sophisticated error handling and flexible decision-aggregation methods. We also show that the language meets the requirements of all the specific application domains discussed in this dissertation.

Added 2010-06-29

Structural Signatures: How to Authenticate Trees Without Leaking

CERIAS TR 2010-08
Ashish Kundu, Elisa Bertino

Data sharing over a third-party distribution framework such as the cloud computing paradigm requires that both data authenticity and confidentiality be assured. One of the most widely used data organization structures is the tree structure. When such structures encode sensitive information (such as in XML documents), it is crucial that authenticity and confidentiality be assured not only for the content, but also for the structure. There is a plethora of work on data authentication in the literature; however, none of it addresses the problem of leakage-free authentication of tree-structured data. The most widely used technique for trees is the Merkle hash technique (MHT), which is known to be “not hiding,” i.e., it leads to leakage of information. Most existing data authentication techniques are based on the MHT and thus suffer from information leakage. In this paper, we propose the first leakage-free authentication scheme for tree data structures, which is also efficient. Our scheme, referred to as the “structural authentication scheme,” is based on the structure of the tree as defined by tree traversals, and on aggregate signatures. In addition to formally defining the technique, we prove that it protects against violations of content and structural integrity as well as against information leakage. Complexity analysis shows that our scheme incurs comparable cost for signing and user-side authentication, and less communication overhead, while providing stronger security properties. We also show how our scheme can handle leakage-free authentication of dynamic trees. Two applications of the proposed scheme are presented: (1) automatic correction and recovery from structural errors, and (2) structure-based routing for secure publish/subscribe of XML documents.
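
For contrast, a textbook Merkle hash tree sketch (not the paper's structural signature scheme) makes the leakage concrete: the verification path for a single leaf reveals sibling hashes, the tree height, and the leaf's position.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Root hash of a Merkle hash tree over the given leaf values."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])          # duplicate last node to pad
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes needed to verify leaves[index] against the root.
    Note what a verifier learns: the proof length reveals the tree's
    height, and the left/right pattern reveals the leaf's position --
    the structural leakage the paper's scheme avoids."""
    proof, level = [], [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append((level[index ^ 1], index % 2))  # (sibling, am-I-right-child)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf, proof, root):
    cur = h(leaf)
    for sibling, is_right in proof:
        cur = h(sibling + cur) if is_right else h(cur + sibling)
    return cur == root
```

Even this honest verification leaks: with four leaves every proof has exactly two siblings, so the verifier learns the tree has four leaves and where the authenticated leaf sits, which is precisely the kind of structural information a leakage-free scheme must hide.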

Added 2010-06-26