Priority inversion occurs in concurrent programs when a low-priority thread holds a shared resource needed by a high-priority thread, causing the high-priority thread to block indefinitely. Shared resources are usually guarded by low-level synchronization primitives such as mutual-exclusion locks, semaphores, or monitors. There are two existing approaches to priority inversion. The first, establishing high-level scheduling invariants over synchronization primitives to eliminate priority inversion a priori, is difficult in practice and undecidable in general. The second, run-time avoidance mechanisms such as priority inheritance, still forces high-priority threads to wait until the desired resources are released. We describe a novel compiler and run-time solution to the problem of priority inversion, along with an experimental evaluation of its effectiveness. Our approach allows any thread holding a resource needed by a higher-priority thread to be preempted, forcing it to release its claim on the resource, roll back its execution to the point at which the shared resource was first acquired, and discard any updates made in the interim. The compiler inserts code at synchronization points to permit rollback of thread execution and efficient revocation of interim updates. Our design and implementation are realized in the context of IBM’s Jikes RVM, a high-quality compiler and runtime system for Java. Our performance results show that the throughput of high-priority threads under our scheme improves by 30% to 100% compared with a classical scheduler that does not address priority inversion.
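The mechanism itself lives in the JVM's compiler and scheduler, but the roll-back-and-retry discipline can be sketched at the language level. The following minimal Python sketch is an analogy only, not the paper's system; all names (RevocableRegion, preempt, Revoked) are hypothetical, and the explicit checkpoint call stands in for the checks the compiler would insert at synchronization points.

```python
import copy
import threading

class Revoked(Exception):
    """Signals that the current holder must abandon the critical section."""

class RevocableRegion:
    def __init__(self, shared_state):
        self._lock = threading.Lock()
        self._state = shared_state          # the shared resource
        self._revoke = threading.Event()

    def preempt(self):
        # A higher-priority thread asks the current holder to roll back
        # and release instead of waiting for it to finish.
        self._revoke.set()

    def run(self, critical_section):
        while True:
            with self._lock:
                self._revoke.clear()
                draft = copy.deepcopy(self._state)   # interim updates go to a private copy
                try:
                    critical_section(draft, self._checkpoint)
                except Revoked:
                    continue                         # discard the draft, retry from the start
                self._state = draft                  # commit only on successful completion
                return

    def _checkpoint(self):
        # In the paper such checks are compiler-inserted; here the critical
        # section must call the callback it is given.
        if self._revoke.is_set():
            raise Revoked()
```

In the actual system the scheduler also guarantees that the preempting high-priority thread acquires the resource next; the sketch only illustrates rolling execution back to the acquisition point and discarding interim updates.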
This paper presents an efficient framework for error-bounded compression of high-dimensional discrete attributed datasets. Such datasets, which frequently arise in a wide variety of applications, pose some of the most significant challenges in data analysis. Subsampling and compression are two key technologies for analyzing them. PROXIMUS provides a technique for reducing large datasets into a much smaller set of representative patterns, on which traditional (expensive) analysis algorithms can be applied with minimal loss of accuracy. We show that PROXIMUS has desirable properties in terms of runtime, scalability to large datasets, and ability to represent data in a compact form. We also demonstrate applications of PROXIMUS to association rule mining, establishing it as a tool for preprocessing data before applying computationally expensive algorithms, or for directly extracting correlated patterns. Our experimental results show that using the compressed data for association rule mining yields excellent precision and recall (near 100%) across a range of support thresholds, while drastically reducing the time required for mining.
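To make the style of reduction concrete, the sketch below implements the alternating rank-one heuristic at the heart of this kind of binary decomposition: a binary pattern y over the columns and a membership vector x over the rows are refined in turn so that outer(x, y) approximates the data in Hamming error. It is a simplified illustration only; PROXIMUS itself adds careful initialization strategies and a recursive partitioning of the rows, and the function name here is ours.

```python
import numpy as np

def rank_one_binary(A, max_iter=50):
    """One rank-one step of a non-orthogonal binary decomposition (sketch).

    A is a 0/1 matrix (e.g., transactions x items).  Returns x (which rows
    exhibit the pattern) and y (the pattern itself), chosen by alternating
    updates that locally minimize the Hamming error of outer(x, y) vs. A.
    """
    A = np.asarray(A, dtype=int)
    y = A[np.argmax(A.sum(axis=1))].copy()        # seed: densest row (one simple choice)
    for _ in range(max_iter):
        # Fix y: row i joins the pattern iff doing so lowers its Hamming error,
        # which works out to A[i] . y > |y| / 2.
        x = (A @ y > y.sum() / 2).astype(int)
        if x.sum() == 0:
            break
        # Fix x: each pattern bit is set by majority vote over the selected rows.
        y_new = (A[x == 1].sum(axis=0) > x.sum() / 2).astype(int)
        if np.array_equal(y_new, y):
            break
        y = y_new
    return x, y
```

A full decomposition would, roughly, recurse on the rows with x = 1 and x = 0 separately until a per-row error bound is met, accumulating the representative patterns on which the downstream mining runs.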
Software systems often undergo many revisions during their lifetime as new features are added, bugs are repaired, abstractions are simplified and refactored, and performance is improved. When a revision occurs, even a minor one, the changes it induces must be tested to ensure that invariants assumed in the original version are not violated unintentionally. To avoid testing components that are unchanged across revisions, impact analysis is often used to identify the code blocks or functions that are affected by a change. In this paper, we present a novel solution to this general problem that uses dynamic programming on instrumented traces of different program binaries to identify longest common subsequences in the strings generated by these traces. Our formulation allows us to perform impact analysis and also to detect the smallest set of locations within the affected functions where the effect of the changes actually manifests itself. These ideas are incorporated in a tool called Sieve. Sieve is unobtrusive, requiring no programmer or compiler intervention to guide its behavior. Our experiments on multiple versions of open-source C programs show that Sieve is an effective and scalable tool for identifying impact sets and locating the regions in the affected functions where the changes manifest. These results lead us to conclude that Sieve can play a beneficial role in program testing and software maintenance.
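The core dynamic program is the classical longest-common-subsequence computation over the two instrumented traces: positions of the new trace that fall outside an LCS with the old trace are candidate points where the change manifests. The sketch below shows that step only; it omits Sieve's instrumentation and function-level summarization, and the identifiers are illustrative.

```python
def divergence_in_new_trace(old_trace, new_trace):
    """Positions in new_trace not covered by a longest common subsequence
    with old_trace (a simplified view of the trace comparison)."""
    m, n = len(old_trace), len(new_trace)
    # dp[i][j] = LCS length of old_trace[:i] and new_trace[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if old_trace[i - 1] == new_trace[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Backtrack to mark new-trace positions that belong to the LCS.
    matched = set()
    i, j = m, n
    while i > 0 and j > 0:
        if old_trace[i - 1] == new_trace[j - 1]:
            matched.add(j - 1)
            i, j = i - 1, j - 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return [j for j in range(n) if j not in matched]
```

For example, divergence_in_new_trace("abcab", "abxab") returns [2], the position at which the two traces diverge.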
This article presents the design and implementation of a software tool, PROXIMUS, for error-bounded approximation of high-dimensional binary attributed datasets based on nonorthogonal decomposition of binary matrices. The tool can be used to analyze data arising in a variety of domains ranging from commercial to scientific applications. Using a combination of innovative algorithms, novel data structures, and efficient implementation, PROXIMUS demonstrates excellent accuracy, performance, and scalability to large datasets. We demonstrate these properties experimentally on diverse applications in association rule mining and DNA microarray analysis. In limited beta release, PROXIMUS currently has over 300 installations in over 10 countries.
We study the problem of detecting and eliminating redundancy in a sensor network with a view to improving energy efficiency while preserving the network’s coverage. We also examine the impact of redundancy elimination on the related problem of coverage-boundary detection. We reduce both problems to the computation of Voronoi diagrams, prove lower bounds on their solution, and present efficient distributed algorithms that achieve these bounds while computing and maintaining solutions in the presence of sensor failures or insertion of new sensors. We prove the correctness and termination properties of our distributed algorithms, and analytically characterize the time complexity and the traffic they generate. Using detailed simulations, we also quantify the impact of system parameters such as sensor density, transmission range, and failure rate on network traffic.
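As a concrete, centralized rendering of the geometric test (the paper's algorithms are distributed and handle boundary effects and failures, which this sketch does not), a sensor whose entire Voronoi cell is already covered by other sensors can be switched off without reducing coverage. Below, "the cell is covered" is approximated by checking the cell's vertices; the function name and the assumption of a common sensing range are ours.

```python
import numpy as np
from scipy.spatial import Voronoi

def redundant_sensors(positions, sensing_range):
    """Indices of sensors whose bounded Voronoi cell appears covered by other
    sensors (vertex-based approximation; assumes enough sensors for a 2-D
    Voronoi diagram and a common sensing range)."""
    positions = np.asarray(positions, dtype=float)
    vor = Voronoi(positions)
    redundant = []
    for i, region_idx in enumerate(vor.point_region):
        region = vor.regions[region_idx]
        if not region or -1 in region:                  # unbounded cell: keep the sensor
            continue
        cell_vertices = vor.vertices[region]
        others = np.delete(positions, i, axis=0)
        # every cell vertex must lie within range of some *other* sensor
        dists = np.linalg.norm(cell_vertices[:, None, :] - others[None, :, :], axis=2)
        if np.all(dists.min(axis=1) <= sensing_range):
            redundant.append(i)
    return redundant
```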
The goal of this SciDAC project is to develop a scalable parallel and distributed computational framework consisting of methods, algorithms, and integrated software tools for: 1) multi-tera-to-petascale simulations with quantum-level accuracy; 2) multimillion-to-multibillion-to-trillion-atom molecular dynamics (MD) simulations based on density functional theory (DFT) and temperature-dependent model generalized pseudopotential theory; 3) the quasicontinuum (QC) method embedded with classical atomistic and quantum simulations based on DFT; and 4) accelerated molecular dynamics (AMD) coupled with hierarchical atomistic/QC simulations to reach macroscopic length and time scales relevant to SCC. Scalability beyond 10^5 processors is being achieved through linear-scaling algorithms and performance-optimization techniques. We employ automated model transitioning to embed higher-fidelity simulations concurrently inside coarser simulations only when and where they are required. Each scale and model has well-defined error bounds, with controlled error propagation across scales and models to estimate uncertainty in predictions.
Security is an important consideration for many applications of ad-hoc networks. While security aspects of the routing layer have been addressed extensively, there is relatively little work on establishing a viable public key infrastructure, which is the basis for most security protocols. We present a distributed algorithm for validating the association between the network identifier of a host and its public key without relying on a priori shared secrets or a trusted certification authority.
The past few years have seen tremendous advances in distributed storage infrastructure. Unstructured and structured overlay networks have been successfully used in a variety of applications, ranging from file-sharing to scientific data repositories. While unstructured networks benefit from low maintenance overhead, the associated search costs are high. On the other hand, structured networks have higher maintenance overheads, but facilitate bounded-time search of installed keywords. When dealing with typical data sets, though, it is infeasible to install every possible search term as a keyword in the structured overlay.
State-of-the-art semantic indexing techniques have been successfully integrated into peer-to-peer (P2P) systems using semantic overlays. However, existing approaches are based on the premise that the fundamental ingredient of semantic indexing, a semantic basis for the underlying data, is globally available, which is unlikely to be the case in practice. Therefore, developing techniques to efficiently compute basis vectors for data distributed across peers is important for large-scale deployment of semantic indexing in P2P systems.
In this paper, we present a novel structured overlay that integrates aspects of semantic indexing based on non-orthogonal matrix decompositions with the hash structure of the overlay. We adapt PROXIMUS, a recursive decomposition method for computing concise representations of binary data sets, to locally identify latent patterns in data distributed across peers. To enable efficient consolidation of patterns, we rely on distributed hash tables (DHTs), which are commonly used in P2P applications. The discrete nature of non-orthogonal matrix decomposition is well suited to the binary key structure of DHTs, resulting in an indexing method, PMINER, that enables the network to deliver efficient and accurate responses to semantic queries. We present the algorithmic underpinnings of PMINER and demonstrate its excellent performance characteristics on real as well as synthetic data sets.
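One way to picture why a discrete decomposition meshes well with a binary-keyed DHT: the pattern vectors produced locally are already bit strings, so they can be mapped onto the key space directly rather than through a hash, letting peers that discover similar patterns publish them under nearby keys for consolidation. The sketch below is purely illustrative of that point; PMINER's actual key construction and consolidation protocol may differ, and the function name is ours.

```python
def pattern_to_key(pattern_bits, key_bits=32):
    """Map a binary pattern vector onto a fixed-width DHT key (illustrative).

    The leading key_bits entries of the pattern are packed into an integer,
    padding with zeros if the pattern is shorter; patterns that agree on
    their leading bits therefore map to nearby keys.
    """
    bits = [int(bool(b)) for b in list(pattern_bits)[:key_bits]]
    bits += [0] * (key_bits - len(bits))
    key = 0
    for b in bits:
        key = (key << 1) | b
    return key

# e.g. pattern_to_key([1, 0, 1, 1], key_bits=8) == 0b10110000 == 176
```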
Test case prioritization for regression testing can be performed using different metrics (e.g., statement coverage, path coverage) depending on the application context. Employing different metrics requires different prioritization schemes (e.g., maximum coverage, dissimilar paths covered), which introduces significant algorithmic and implementation complexity into the testing process. In this paper, we present a novel approach to the test case prioritization problem that addresses this limitation. Our framework, Phalanx, separates two distinct aspects of the problem: metrics that define ordering relations among test cases, and mechanisms that implement these metrics on test suites. We abstract the metric information into a test-case dissimilarity graph: a weighted graph whose nodes represent test cases and whose weighted edges carry user-defined dissimilarity measures between test cases. We argue that a declustered linearization of the nodes of this graph yields a desirable prioritization of test cases, since it ensures that dissimilar test cases are applied first. We explore two mechanisms for declustering the test-case dissimilarity graph: Fiedler (spectral) ordering and a greedy approach. We implement both orderings in Phalanx, a highly flexible and customizable testbed, and demonstrate excellent performance for test-case prioritization. Our experiments on test suites from the Subject Infrastructure Repository (SIR) show that a variety of user-defined metrics can be easily incorporated into Phalanx.
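A minimal sketch of the spectral mechanism, assuming a connected graph whose symmetric weight matrix encodes pairwise dissimilarity and using dense linear algebra rather than whatever solver Phalanx employs: sorting test cases by the Fiedler vector of the graph Laplacian places heavily weighted (strongly dissimilar) pairs close together in the linear order, so the tests executed early are mutually dissimilar. The function name is ours.

```python
import numpy as np

def fiedler_order(W):
    """Spectral declustering of a test-case dissimilarity graph (sketch).

    W is a symmetric matrix of pairwise dissimilarity weights.  The Fiedler
    vector (eigenvector of the second-smallest Laplacian eigenvalue) gives a
    one-dimensional placement in which heavily weighted pairs sit near each
    other; reading test cases off in that order runs strongly dissimilar
    tests early and close together.  Assumes the graph is connected.
    """
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W           # combinatorial graph Laplacian
    _, eigvecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]
    return np.argsort(fiedler)               # test-case indices in priority order
```

The greedy alternative mentioned above would instead repeatedly pick the test case farthest (most dissimilar) from those already scheduled; both mechanisms consume the same dissimilarity graph.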
Dense operators for preconditioning sparse linear systems have traditionally been considered infeasible due to their excessive computational and memory requirements. With the emergence of techniques such as block low-rank approximations and hierarchical multipole approximations, the cost of computing and storing these preconditioners has dropped dramatically. In our prior work [15], we demonstrated the use of multipole-based techniques as effective parallel preconditioners for sparse linear systems. At one extreme, multipole-based preconditioners behave as dense (bounded-interaction) matrices (multipole degree 0); at the other extreme, they are represented entirely as series expansions. In this paper, we show that: (i) merely truncating the kernel of the integral operator generating the preconditioner leads to poor convergence; (ii) far-field interactions, in the form of multipoles, are critical for rapid convergence; (iii) the importance and required accuracy of far-field interactions vary with the complexity of the problem; and (iv) a preconditioner built from a judicious mix of near- and far-field interactions yields excellent convergence and parallelization properties. We illustrate our experimental results on the Poisson problem and the generalized Stokes problem arising in incompressible fluid flow simulations.
Trie-based data structures for implementing IP lookups have attracted considerable research attention. Techniques such as path compression, level compression, generalized level compression, and controlled prefix expansion are commonly used to implement lookup tables. In this paper, we present a fundamentally new technique that relies on directed acyclic graphs (DAGs) and, when coupled with generalized level compression, yields significantly better performance than existing techniques. Current implementations of trie-based lookup tables use a route validation table in addition to the trie to enable fixed-length subprefix resolution in support of path compression. This route validation enables us to merge different, partially filled subtrees to form full subtrees; the resulting DAGs introduce spurious routes, which are eliminated in the validation phase. When combined with level compression (and generalized level compression), this structure yields considerably shorter paths than existing approaches. We describe the transformation of tries into DAGs, present algorithms for packing subtrees, and profile the performance of these algorithms and the resulting improvements in lookup time. Specifically, on actual lookup tables, we demonstrate performance gains of up to 34% over LC tries with minimal memory overhead (a little over 1%). Considering that an LC trie is already a highly optimized structure, these gains are remarkable.
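A toy rendering of the merge idea (not the paper's data structure; the stride, the representation, and fallback handling are all simplified, and the function names are ours): two partially filled fixed-stride subtrees can share one fuller table as long as they never populate the same slot, and the route-validation step, which compares the stored prefix against the destination address, rejects the spurious hits that this sharing introduces.

```python
def merge_subtrees(t1, t2):
    """Merge two partially filled fixed-stride subtrees, given as equal-length
    lists whose entries are None or (prefix, prefix_len, next_hop) tuples.
    Returns the shared, fuller subtree, or None if any slot is populated in
    both (toy compatibility rule)."""
    if len(t1) != len(t2):
        return None
    merged = []
    for a, b in zip(t1, t2):
        if a is not None and b is not None:
            return None                      # conflicting slots: cannot share
        merged.append(a if a is not None else b)
    return merged

def prefix_matches(prefix, prefix_len, addr, addr_bits=32):
    """True if the top prefix_len bits of addr equal those of prefix."""
    shift = addr_bits - prefix_len
    return (addr >> shift) == (prefix >> shift)

def lookup(merged, slot, dest_addr):
    """Index the shared subtree, then validate the hit: an entry whose stored
    prefix does not actually match dest_addr is a spurious route introduced by
    sharing and is rejected (fallback to shorter prefixes omitted)."""
    entry = merged[slot]
    if entry is None:
        return None
    prefix, prefix_len, next_hop = entry
    return next_hop if prefix_matches(prefix, prefix_len, dest_addr) else None
```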