Reports and Papers Archive
Incorporating Temporal Capabilities in Existing Key Management Schemes
The problem of key management in access hierarchies studies ways to assign keys to users and classes such that each user, after receiving her secret key(s), is able to independently compute access keys for (and thus obtain access to) the appropriate resources defined by the hierarchical structure. If user privileges additionally are time-based, the key(s) a user receives should permit access to the resources only at the appropriate times. This paper presents a new, provably secure, and efficient solution that can be used to add time-based capabilities to existing hierarchical schemes. It achieves the following performance bounds: (i) to be able to obtain access to an arbitrary contiguous set of time intervals, a user is required to store at most 3 keys; (ii) the keys for a user can be computed by the system in constant time; (iii) key derivation by the user within the authorized time intervals involves a small constant number of inexpensive cryptographic operations; and (iv) if the total number of time intervals in the system is n, then the server needs to maintain public storage larger than n by only a small asymptotic factor, e.g., O(log* n log log n) with a small constant.
Passwords for Everyone: Secure Mnemonic-based Accessible Authentication
Passwords Decay, Words Endure: Secure and Re-usable Multiple Password Mnemonics
Secure and Private Collaborative Linear Programming
Point-Based Trust: Define How Much Privacy Is Worth
This paper studies the notion of point-based policies for trust management, and gives protocols for realizing them in a disclosure-minimizing fashion. Specifically, Bob values each credential with a certain number of points, and requires a minimum total threshold of points before granting Alice access to a resource. In turn, Alice values each of her credentials with a privacy score that indicates her reluctance to reveal that credential. Bob’s valuation of credentials and his threshold are private. Alice’s privacy-valuation of her credentials is also private. Alice wants to find a subset of her credentials that achieves Bob’s required threshold for access, yet is of as small a value to her as possible. We give protocols for computing such a subset of Alice’s credentials without revealing any of the two parties’ above-mentioned private information.
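Leaving privacy aside, the optimization Alice faces resembles a knapsack problem: reach Bob's threshold at the lowest total privacy cost. The following is a minimal plaintext sketch of that selection step only, assuming non-negative integer points and privacy scores. It is not the privacy-preserving protocol from the paper, and the function name and example values are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def min_privacy_subset(
    points: List[int], privacy: List[int], threshold: int
) -> Optional[Tuple[int, List[int]]]:
    """Return (total privacy cost, credential indices) for the cheapest
    subset whose points reach the threshold, or None if none does.

    Plaintext dynamic program; assumes non-negative integer inputs.
    """
    # best[t] holds the cheapest (cost, indices) found so far that earns
    # at least t points, with t capped at the threshold.
    best: List[Tuple[float, List[int]]] = [(math.inf, [])] * (threshold + 1)
    best[0] = (0, [])
    for i, (p, c) in enumerate(zip(points, privacy)):
        # Walk t downward so each credential is used at most once.
        for t in range(threshold, -1, -1):
            cost, chosen = best[t]
            if cost == math.inf:
                continue
            reached = min(threshold, t + p)
            if cost + c < best[reached][0]:
                best[reached] = (cost + c, chosen + [i])
    cost, chosen = best[threshold]
    if cost == math.inf:
        return None
    return int(cost), chosen


# Hypothetical example: four credentials, Bob requires 7 points.
bob_points = [3, 4, 5, 2]
alice_privacy = [2, 6, 9, 1]
result = min_privacy_subset(bob_points, alice_privacy, 7)
```

In this example the cheapest qualifying subset is credentials 0 and 1, with a total privacy cost of 8. In the setting the abstract describes, neither party would see the other's inputs to this computation.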
Words are Not Enough: Sentence Level Natural Language Watermarking
The Hiding Virtues of Ambiguity: Quantifiably Resilient Watermarking of Natural Language Text through Synonym Substitutions
ViWiD: Visible Watermarking-Based Defense Against Phishing
Privacy-preserving distributed mining of association rules on horizontally partitioned data
Data mining can extract important knowledge from large data collections, but sometimes these collections are split among various parties. Privacy concerns may prevent the parties from directly sharing the data and some types of information about the data. We address secure mining of association rules over horizontally partitioned data. The methods incorporate cryptographic techniques to minimize the information shared, while adding little overhead to the mining task.
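One common building block for this kind of computation is a masked "secure sum", in which sites add their local support counts around a ring so that only the total becomes known. The sketch below simulates that idea in a single process. It is an illustration under stated assumptions (honest sites that do not collude, publicly known database sizes, hypothetical function names), not a reproduction of the paper's protocols.

```python
import secrets
from typing import List


def secure_sum(local_counts: List[int], modulus: int) -> int:
    """Simulate a ring-based secure sum of non-negative local counts.

    The initiating site hides its count behind a random mask; every other
    site only ever sees a masked running total. The modulus must exceed
    the true sum for the result to be exact.
    """
    if not local_counts:
        raise ValueError("at least one site is required")
    if any(c < 0 for c in local_counts):
        raise ValueError("counts must be non-negative")
    mask = secrets.randbelow(modulus)
    running = (mask + local_counts[0]) % modulus   # sent by site 0
    for count in local_counts[1:]:
        running = (running + count) % modulus      # each site adds its count
    return (running - mask) % modulus              # site 0 removes the mask


def is_globally_frequent(
    local_supports: List[int], local_sizes: List[int], min_support: float
) -> bool:
    """Decide whether an itemset meets the global support threshold.

    Assumes (for this sketch only) that database sizes are public.
    """
    total_size = sum(local_sizes)
    total_support = secure_sum(local_supports, total_size + 1)
    return total_support >= min_support * total_size


# Hypothetical example: three sites holding 400, 300, and 300 transactions.
frequent = is_globally_frequent([120, 40, 200], [400, 300, 300], 0.3)
```

Here the itemset appears in 360 of 1000 transactions overall, so it passes a 30% threshold even though the second site's local support is well below it.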
TopCat: data mining for topic identification in a text corpus
TopCat (topic categories) is a technique for identifying topics that recur in articles in a text corpus. Natural language processing techniques are used to identify key entities in individual articles, allowing us to represent an article as a set of items. This allows us to view the problem in a database/data mining context: Identifying related groups of items. We present a novel method for identifying related items based on traditional data mining techniques. Frequent itemsets are generated from the groups of items, followed by clusters formed with a hypergraph partitioning scheme. We present an evaluation against a manually categorized ground truth news corpus; it shows this technique is effective in identifying topics in collections of news articles.
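The frequent-itemset step can be illustrated with a brute-force counter over articles represented as sets of entities. This is a minimal sketch: it enumerates small itemsets directly rather than using an optimized miner, it omits the hypergraph partitioning stage entirely, and the entity names and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def frequent_itemsets(
    articles: Iterable[Set[str]], min_count: int, max_size: int = 3
) -> Dict[FrozenSet[str], int]:
    """Count every entity combination up to max_size and keep those that
    occur in at least min_count articles."""
    counts: Counter = Counter()
    for entities in articles:
        items = sorted(entities)  # fixed order so combinations are stable
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_count}


# Hypothetical corpus: each article reduced to its named entities.
corpus = [
    {"Agency A", "Region B", "Person C"},
    {"Agency A", "Region B"},
    {"Person C", "Committee D"},
    {"Agency A", "Region B", "Committee D"},
]
topics = frequent_itemsets(corpus, min_count=3)
```

With a minimum count of 3, only "Agency A", "Region B", and the pair of the two survive, which is the kind of co-occurring group that a later clustering stage would merge into a topic.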
Change Detection in Overhead Imagery Using Neural Networks
Identifying interesting changes from a sequence of overhead imagery, as opposed to clutter, lighting/seasonal changes, etc., has been a problem for some time. Recent advances in data mining have greatly increased the size of datasets that can be attacked with pattern discovery methods. This paper presents a technique for using predictive modeling to identify unusual changes in images. Neural networks are trained to predict “before” and “after” pixel values for a sequence of images. These networks are then used to predict expected values for the same images used in training. Substantial differences between the expected and actual values represent an unusual change. Results are presented on both multispectral and panchromatic imagery.
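The predict-then-compare idea can be sketched with a much simpler model than the one the paper uses. In the example below, a least-squares line stands in for the neural network: it is fitted to the same before/after pixel pairs it is later applied to, and pixels whose actual value differs from the prediction by more than a threshold are flagged. The function names, pixel values, and threshold are all hypothetical.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # constant input: predict the mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def unusual_changes(
    before: List[float], after: List[float], threshold: float
) -> List[int]:
    """Return indices of pixels whose 'after' value is far from the value
    predicted from the 'before' image."""
    slope, intercept = fit_line(before, after)
    return [
        i
        for i, (b, a) in enumerate(zip(before, after))
        if abs(a - (slope * b + intercept)) > threshold
    ]


# Hypothetical pixels: a uniform +2 brightness shift, plus one real change.
before_pixels = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
after_pixels = [12.0, 22.0, 32.0, 90.0, 52.0, 62.0]
flagged = unusual_changes(before_pixels, after_pixels, threshold=20.0)
```

The uniform brightness shift is absorbed by the fitted model, so only the pixel at index 3 is flagged. A least-squares fit is sensitive to outliers, which is one reason a more expressive predictor may be preferable in practice.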
Emerging standards for data mining
This paper presents an overview of data mining, then discusses standards (both existing and proposed) that are relevant to data mining. This includes standards that affect several stages of a data mining project. Summaries of several emerging standards are given, as well as proposals that have the potential to change the way data mining tools are built.
Using sample size to limit exposure to data mining
Data mining introduces new problems in database security. The basic problem of using non-sensitive data to infer sensitive data is made more difficult by the “probabilistic” inferences possible with data mining. This paper shows how lower bounds from pattern recognition theory can be used to determine sample sizes where data mining tools cannot obtain reliable results.
SEMINT: A tool for identifying attribute correspondences in heterogeneous databases using neural networks
One step in interoperating among heterogeneous databases is semantic integration: Identifying relationships between attributes or classes in different database schemas. SEMantic INTegrator (SEMINT) is a tool based on neural networks to assist in identifying attribute correspondences in heterogeneous databases. SEMINT supports access to a variety of database systems and utilizes both schema information and data contents to produce rules for matching corresponding attributes automatically. This paper provides theoretical background and implementation details of SEMINT. Experimental results from large and complex real databases are presented. We discuss the effectiveness of SEMINT and our experiences with attribute correspondence identification in various environments.

