Most automated packages for intrusion detection focus on determining whether a collection of audit data is suspicious. Package developers assume that the System Security Officer (SSO) will combine the results of their tools with a careful inspection of the logs to determine whether there is indeed evidence of intrusive activity. In practice, most administrators rely exclusively on the conclusions generated by such packages. As a result, very few methods have been developed for browsing the raw audit trails. This thesis presents a new approach to this problem. By treating the conceptual entities in an audit trail as objects, a framework for observing how entities interact can be developed. All of the records of interest are first scanned to determine the objects and actions of interest. During this initial scanning phase, the objects are interconnected according to how each affects the others, much like a directed graph: the vertices represent the objects and the edges represent the actions. Then, by focusing initially on one object of interest, an SSO can quickly determine how that object affected or was affected by any other object by noting the direction and type of the edge connecting the nodes. Suppose, for example, that a process with limited privilege was able to create a new process with unlimited privileges by executing a single action. The two processes are represented by vertices, and the action of gaining privilege can be represented by a directed edge from the first process to the second. By then focusing on these new objects, the SSO can determine how other nodes were directly or indirectly affected by the first object simply by following the next set of edges. An initial prototype program, focused on the UNIX operating system model, was fairly successful in following entities through the audit trail. Later efforts attempted to extrapolate the model to more general computational systems. Of course, the SSO must still possess technical knowledge of any system to fully analyze the data and realize the implications of the actions therein: there is no substitute for such expertise. This thesis presents a new methodology for browsing such data.
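As an illustrative sketch (not the thesis's actual prototype, and with a hypothetical record format), the object/action model amounts to building a directed graph from parsed audit records and walking it outward from an object of interest:

    from collections import defaultdict, deque

    # Hypothetical pre-parsed audit records: (source object, action, target object).
    records = [
        ("proc:1042", "fork", "proc:1043"),
        ("proc:1043", "exec-setuid", "proc:1044"),   # limited -> unlimited privilege
        ("proc:1044", "write", "file:/etc/passwd"),
    ]

    # Build the directed graph: vertices are objects, labeled edges are actions.
    edges = defaultdict(list)
    for src, action, dst in records:
        edges[src].append((action, dst))

    def affected_by(start):
        """Breadth-first walk listing every object directly or indirectly
        affected by `start`, with the chain of actions that reaches it."""
        seen, queue, reached = {start}, deque([(start, [])]), []
        while queue:
            obj, path = queue.popleft()
            for action, nxt in edges[obj]:
                if nxt not in seen:
                    seen.add(nxt)
                    reached.append((nxt, path + [action]))
                    queue.append((nxt, path + [action]))
        return reached

    for obj, path in affected_by("proc:1042"):
        print(obj, "via", " -> ".join(path))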
This document was prepared by the Institute for Defense Analyses (IDA) under the task order, Federal Criteria Development, and fulfills the objective of extending the Federal Criteria to support distributed operating systems. The study was sponsored by the National Security Agency (NSA) with the joint involvement of the National Institute of Standards and Technology. The study was initiated as a separate, parallel effort to the development of the international Common Criteria, with the intent of making this study’s material available at an appropriate time for ultimate inclusion in the Common Criteria.
This report is an adaptation of a proposal that was submitted to the National Science Foundation’s Coordinated Experimental Research (CER) Program in September 1986. In September 1980 the University of Washington’s Eden Project received the first award in the CER program. Eden was a five-year effort that explored a specific approach to building “integrated distributed” computer systems. In September 1985 the University of Washington’s Heterogeneous Computer Systems (HCS) Project received a two-year CER award to study strategies for interconnecting heterogeneous computer systems in a research environment. In both the Eden Project and the HCS Project, a relatively small subset of the department joined together to pursue a specific, unified experimental research objective. The proposal from which the current technical report is adapted adheres to a different “model”. A broad cross-section of our department is currently involved in research into various aspects of parallel computing, supported by a collection of “traditional” grants from NSF and other agencies. The proposal seeks experimental infrastructure (equipment, maintenance, staffing) to support this existing work, and to propel it in new directions. Thus, the value of this report is that it presents a coherent review (through a “parallel computing” filter) of the broad research activities of our department. The reader desiring a quick overview should direct his or her attention to Section B of the proposal (the “Executive Summary”), and the “Addendum” (a separate 9-page document prepared in January 1987 and appearing at the end of this report).
This report consists of brief synopses of a number of research projects underway in the University of Washington’s Department of Computer Science. Beyond indicating the breadth and depth of our research activities, we hope to stimulate readers to request further information on projects of specific interest.
This report gathers together and reprints six technical papers produced by the University of Washington’s Heterogeneous Computer Systems (HCS) Project. Our objective is to make the major contributions of the project available in a single place.
In the transition axiom method, safety properties of a concurrent system can be specified by programs; liveness properties are specified by assertions in a simple temporal logic. The method is described with some simple examples, and its logical foundation is informally explored through a careful examination of what it means to implement a specification. Language issues and other practical details are largely ignored.
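For concreteness, a liveness assertion in such a temporal logic typically takes an “always/eventually” form; the following formula is a generic illustration, not one drawn from the report:

    \Box(\mathit{requested} \Rightarrow \Diamond\,\mathit{granted})

read informally as: at every point in an execution, if the resource has been requested, then it is eventually granted.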
Multiple threads (program counters executing in the same address space) make it easier to write programs that deal with related asynchronous activities and that execute faster on shared-memory multiprocessors. Supporting multiple threads places new constraints on the design of operating system interfaces. Part I of this report presents guidelines for designing (or redesigning) interfaces for multithreaded clients. We show how these guidelines were used to design an interface to UNIX-compatible file and process management facilities in the Topaz operating system. Two implementations of this interface are in everyday use: a native one for the Firefly multiprocessor, and a layered one running within a UNIX process. Part II is the actual programmer’s manual for the interface discussed in Part I.
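As a generic illustration of the kind of constraint multiple threads impose (not a guideline quoted from this report), an interface that reports its results through shared per-process state is awkward for multithreaded clients, whereas one that returns or raises everything explicitly is not. A minimal sketch, with hypothetical names and a simulated operation:

    import threading

    # Awkward for multithreaded clients: the outcome of the last call lives in
    # shared state, so concurrent callers can observe each other's status.
    last_status = None
    def remove_file_v1(name):
        global last_status
        last_status = "ok" if name else "bad name"    # simulated operation

    # Friendlier design: everything the caller needs is returned (or raised),
    # so threads cannot interfere through hidden state.
    def remove_file_v2(name):
        if not name:
            raise ValueError("bad name")
        return "ok"                                   # simulated operation

    def worker(n):
        print(remove_file_v2(f"tmp-{n}"))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()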
Spinning, or busy waiting, is commonly employed in parallel processors when threads of execution must wait for some event, such as synchronization with another thread. Because spinning is purely overhead, it is detrimental to both user response time and system throughput. In this paper we study how spinning is affected by two environmental factors, multiprogramming and data-dependent execution times, and how the choice of scheduling discipline can be used to reduce the amount of spinning in each case. Both environmental factors increase the variation in the delay until the awaited event occurs. In the case of multiprogramming, the thread that will eventually generate the event must compete with an uncertain number of other threads for processors. In the case of data-dependent behavior, the delay until the event occurs may depend strongly on the data presented at runtime. We are interested both in the extent to which spin times increase over a simple baseline environment when multiprogramming or data dependency is introduced, and in how this increase can be reduced through scheduling. The scheduling disciplines we examine are independent of the semantics of the parallel applications. They are thus applicable to the parallel solution of a wide variety of problems, and to alternative approaches to solving any single problem, independently of the algorithm employed. Our analysis is conducted using simple, abstract models of parallel hardware and software. We investigate two software models, one representing the archetypal construct of fork/join rendezvous and the other representing mutual exclusion using a lock. Our hardware model is most naturally applied to shared memory multiprocessors, since we assume that each thread is capable of running on many different processors during its execution with negligible penalty. Both simulation and exact analytic techniques are used to obtain performance results.
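A toy calculation (not one of the paper's models; all parameters are invented) of how multiprogramming inflates spin time under pure spinning: if the thread that will generate the awaited event is sometimes descheduled, every cycle of the extra delay is spun away:

    import random
    random.seed(1)

    def mean_spin(n, service=10.0, quantum=1000.0, p_descheduled=0.0):
        """Average cycles a thread spins waiting for an event whose producer
        needs `service` cycles of CPU, but may first be descheduled for a
        whole scheduling quantum (a toy model of multiprogramming)."""
        total = 0.0
        for _ in range(n):
            delay = random.expovariate(1.0 / service)
            if random.random() < p_descheduled:
                delay += quantum          # producer must wait for a processor
            total += delay                # pure spinning: all of the delay is wasted
        return total / n

    print("dedicated machine :", mean_spin(100000))
    print("multiprogrammed   :", mean_spin(100000, p_descheduled=0.2))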
‘Threads’ are the vehicle for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The goal is to make the expression and control of parallelism sufficiently cheap that the “natural parallel decomposition” of an application can be exploited by the programmer or compiler with acceptable overhead, even when this decomposition is relatively fine-grained. Threads can be supported either by the operating system kernel or by user-level library code in the application address space. Neither approach has been fully satisfactory. The performance of kernel-level threads, although at least an order of magnitude better than that of traditional processes, has been at least an order of magnitude ‘worse’ than that of user-level threads. User-level threads, on the other hand, have suffered from a lack of system integration, leading to poor performance or even incorrect behavior in the presence of “real world” factors such as multiprogramming, I/O, and page faults. Thus the parallel programmer has been faced with a difficult dilemma: employ kernel-level threads, which “work right” but perform poorly, or employ user-level threads, which typically perform well but are sometimes seriously deficient. This paper addresses this dilemma. First, we argue that the performance of kernel-level threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; we thus argue that managing parallelism at the user level is essential to high-performance parallel computing. Next, we argue that the lack of system integration exhibited by user-level threads is a consequence of inadequate kernel support in contemporary multiprocessor operating systems, not an inherent shortcoming of the user-level approach; we thus argue that kernel-level threads or processes, as currently conceived, are the ‘wrong abstraction’ on which to support user-level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel-level threads without compromising the performance and flexibility advantages of managing parallelism at the user level, within each application’s address space.
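To make the distinction concrete (purely an illustrative toy, not the system described in the paper), a user-level thread package switches threads with an ordinary procedure-call-sized operation and no kernel involvement; here cooperative “threads” are mimicked with Python generators as a deliberately simplified stand-in:

    from collections import deque

    def thread(name, steps):
        """A cooperative user-level 'thread': each yield is a voluntary switch."""
        for i in range(steps):
            print(f"{name} step {i}")
            yield                      # user-level context switch: no kernel crossing

    ready = deque([thread("A", 3), thread("B", 3)])

    # A minimal round-robin user-level scheduler running entirely in the
    # application's address space.
    while ready:
        t = ready.popleft()
        try:
            next(t)
            ready.append(t)
        except StopIteration:
            pass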
We propose and evaluate empirically the performance of a dynamic processor scheduling policy for multiprogrammed, shared memory multiprocessors. The policy is dynamic in that it reallocates processors from one parallel job to another based on the currently realized parallelism of those jobs. The policy is suitable for implementation in production systems in that:
- it interacts well with very efficient user-level thread packages, leaving to them many low-level thread operations that do not require kernel intervention;
- it deals with thread blocking due to user I/O and page faults;
- it ensures fairness in delivering resources to jobs;
- its performance, measured in terms of average job response time, is superior to that of previously proposed schedulers, including those implemented in existing systems;
- it provides good performance to very short, sequential (e.g., interactive) requests.
We have evaluated our scheduler and compared it to alternatives using a set of prototype implementations running on a Sequent Symmetry multiprocessor. Using a number of parallel applications with distinct qualitative behaviors, we have both evaluated the policies according to the major criterion of overall performance and examined a number of more general policy issues, including the advantage of “space sharing” over “time sharing” the processors of a multiprocessor, the importance of cooperation between the kernel and the applications in reallocating processors between jobs, and the impact of scheduling policy on an application’s cache behavior. We have also compared the policies according to other criteria important in real implementations: fairness, resiliency to countermeasures, and response time to short, sequential requests. We conclude that a combination of performance and implementation considerations makes a compelling case for our dynamic scheduling policy.
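The core reallocation idea can be sketched hypothetically (this is not the policy's actual implementation): give each job no more processors than the parallelism it is currently realizing, and divide the rest as evenly as possible:

    def allocate(processors, demands):
        """Toy dynamic allocation: each job gets at most the parallelism it can
        currently use (its demand), and processors are otherwise spread as
        evenly as possible, so none sit idle while some job could use more.
        `demands` maps job id -> currently realized parallelism."""
        alloc = {j: 0 for j in demands}
        remaining = processors
        while remaining > 0:
            hungry = [j for j in demands if alloc[j] < demands[j]]
            if not hungry:
                break
            for j in sorted(hungry, key=lambda j: alloc[j])[:remaining]:
                alloc[j] += 1
                remaining -= 1
        return alloc

    # 12 processors, three jobs whose realized parallelism just changed.
    print(allocate(12, {"make": 8, "simulate": 16, "edit": 1}))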
Scientific computations are the ‘raison d’etre’ of supercomputers. These computations have very large resource requirements, not only in terms of the number of calculations they must perform but also in terms of the amount of storage they must use. The primary goal of a supercomputer system is to provide these resources as efficiently as possible. Besides servicing large-scale production computations, a supercomputer must also accommodate the steady stream of interactive jobs that arises from program development and the initial debugging of new applications on test data sets. While the resource requirements of these jobs are relatively small, by their nature they demand reasonable response times. Because the batch and interactive workloads compete for resources, one of the objectives of a supercomputer scheduler is to balance the high resource consumption rate of the batch workload against the response time requirements of the interactive workload. In doing this the scheduler must consider both the processor and memory demands of the jobs. In this paper we analyze three approaches to scheduling mixed batch and interactive workloads on a supercomputer: (i) Fixed Partition, in which memory resources are statically allocated between the workloads; (ii) No Partition, in which the interactive workload preempts resources as needed from the batch workload; and (iii) No Partition With Grouped Admission, in which the interactive workload preempts resources only when the number of waiting interactive jobs reaches a threshold value. We also investigate the potential benefits of using virtual memory to make memory appear instantaneously available to arriving interactive jobs. Using analytic tools, we compare the different policies according to the average speedup achieved by the batch workload, given that a mean interactive job response time objective must be met by each. We show that, under a wide variety of conditions, Fixed Partition performs better than the other policies.
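The difference between the three policies can be caricatured (hypothetically; this is not the paper's formulation) by when arriving interactive work is allowed to take resources away from the batch workload:

    def interactive_may_preempt(policy, waiting_interactive, threshold=4):
        """Toy decision rule: should waiting interactive jobs take memory
        away from the batch workload right now?"""
        if policy == "fixed_partition":
            return False                            # interactive work stays in its own static partition
        if policy == "no_partition":
            return waiting_interactive > 0          # preempt as soon as anything is waiting
        if policy == "grouped_admission":
            return waiting_interactive >= threshold # wait until a group has accumulated
        raise ValueError(policy)

    for p in ("fixed_partition", "no_partition", "grouped_admission"):
        print(p, interactive_may_preempt(p, waiting_interactive=2))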
Threads (“lightweight” processes) have become a common element of new languages and operating systems. This paper examines the performance implications of several data structure and algorithm alternatives for thread management in shared-memory multiprocessors. Both experimental measurements and analytical model projections are presented. For applications with fine-grained parallelism, small differences in thread management are shown to have significant performance impact, often posing a tradeoff between throughput and latency. Per-processor data structures can be used to improve throughput, and in some circumstances to avoid locking, improving latency as well. The method used by processors to queue for locks is also shown to affect performance significantly. Normal methods of critical resource waiting can substantially degrade performance with moderate numbers of waiting processors. We present an Ethernet-style backoff algorithm that largely eliminates this effect.
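The flavor of backing off while waiting for a lock can be sketched as follows (a simplified illustration, not the paper's algorithm or measurements): doubling the delay, with jitter, after each failed attempt keeps waiting processors from overwhelming the lock:

    import random, time, threading

    lock_held = False
    guard = threading.Lock()     # stands in for an atomic test-and-set

    def acquire_with_backoff(max_delay=0.01):
        """Try the lock repeatedly, but back off exponentially (with jitter)
        after each failed attempt, in the spirit of Ethernet collision handling."""
        global lock_held
        delay = 1e-6
        while True:
            with guard:
                if not lock_held:
                    lock_held = True
                    return
            time.sleep(random.uniform(0, delay))   # back off before retrying
            delay = min(delay * 2, max_delay)

    def release():
        global lock_held
        with guard:
            lock_held = False

    def worker(i):
        acquire_with_backoff()
        release()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
    for t in threads: t.start()
    for t in threads: t.join()
    print("all workers acquired and released the lock")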
A database transaction is usually required to have an “all-or-none” property; that is, either all of its changes are installed in the database or none of them are. This property can be violated by a hardware or software failure occurring in the middle of the execution of a transaction, or even by a decision to abort a transaction to avoid an inconsistent (non-serializable) schedule. In such a situation, we must recover from the failure by restoring the database to a state in which the failed transaction never executed. The idea of “recoverable schedules” was defined in [H83]. In a recoverable schedule, a transaction does not commit until there is no possibility that it will be rolled back. If the underlying hardware or software is unreliable, this must not occur until all transactions which have written values read by the transaction have themselves committed.
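Using only the definition just given, recoverability can be checked mechanically: a schedule is recoverable exactly when no transaction commits before every transaction whose writes it has read has committed. A minimal sketch, assuming a hypothetical event format:

    def is_recoverable(schedule):
        """schedule: list of events ('w', txn, item), ('r', txn, item), or ('c', txn).
        Returns True iff every transaction commits only after all transactions
        whose writes it has read have themselves committed."""
        last_writer = {}          # item -> most recent writer
        reads_from = {}           # txn  -> set of txns it read values from
        committed = set()
        for event in schedule:
            if event[0] == 'w':
                _, txn, item = event
                last_writer[item] = txn
            elif event[0] == 'r':
                _, txn, item = event
                writer = last_writer.get(item)
                if writer is not None and writer != txn:
                    reads_from.setdefault(txn, set()).add(writer)
            else:                 # ('c', txn)
                _, txn = event
                if not reads_from.get(txn, set()) <= committed:
                    return False
                committed.add(txn)
        return True

    # T2 reads T1's write but commits first: not recoverable.
    print(is_recoverable([('w', 'T1', 'x'), ('r', 'T2', 'x'), ('c', 'T2'), ('c', 'T1')]))
    # Committing in the safe order is recoverable.
    print(is_recoverable([('w', 'T1', 'x'), ('r', 'T2', 'x'), ('c', 'T1'), ('c', 'T2')]))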
We propose a new family of hash functions based on computations over a finite field of characteristic 2. These functions can be computed quickly, detect small modifications of the input text, and their security is equivalent to a precise mathematical problem. They rely on the arithmetic of the group of matrices SL2, and improve upon previous functions based on the same strategy.
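The general strategy (though not the specific functions proposed here) can be sketched as follows: fix two generator matrices in SL2 over GF(2^n), map each input bit to one of them, and take the hash to be the matrix product. The field size, reduction polynomial, and generators below are toy choices for illustration only:

    # Toy GF(2^8) arithmetic: field elements are 8-bit ints, reduced modulo
    # the irreducible polynomial x^8 + x^4 + x^3 + x + 1 (0x11b).
    MOD, DEG = 0x11b, 8

    def gf_mul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            if a >> DEG:
                a ^= MOD
            b >>= 1
        return r

    def mat_mul(m, n):
        (a, b), (c, d) = m
        (e, f), (g, h) = n
        return ((gf_mul(a, e) ^ gf_mul(b, g), gf_mul(a, f) ^ gf_mul(b, h)),
                (gf_mul(c, e) ^ gf_mul(d, g), gf_mul(c, f) ^ gf_mul(d, h)))

    # Two toy generators of determinant 1 (X denotes the field element "x");
    # real constructions choose the field and generators with care.
    X = 0b10
    A = ((X, 1), (1, 0))          # encodes an input bit 0
    B = ((X, X ^ 1), (1, 1))      # encodes an input bit 1

    def hash_bits(bits):
        """Hash a bit string to a 2x2 matrix by multiplying generators."""
        h = ((1, 0), (0, 1))      # identity matrix
        for bit in bits:
            h = mat_mul(h, B if bit else A)
        return h

    print(hash_bits([1, 0, 1, 1, 0, 1]))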