Identification, Authentication, and Privacy
There is a tension between the increased confidence and granularity of authorization that better identification of online entities provides and the need to protect the privacy rights of individuals and organizations. This area includes research in role-based access control (RBAC), biometrics, pervasive surveillance ("Panoptic Effects"), privacy-protecting transformations of data, privacy-protecting data mining methods, privacy regulation (e.g., HIPAA and COPPA), oblivious multiparty computation, and trusted proxy research.
A Framework for Managing the Assured Information Sharing Lifecycle
An Integrated and Utility-Centric Framework for Federated Text Search
Traditional search engines like Google typically ignore a large amount of information that sits behind the search engines of many online text information sources. Federated text search provides one-stop access to this hidden information via a single interface that connects to the search engines of multiple text information sources. Existing federated search solutions focus only on content relevance and ignore a large amount of valuable information about users and information sources. This project includes novel research on: (1) Multiple Type Resource Representation: model important properties of text information sources, such as search response time and search engine effectiveness; (2) Utility-Centric Resource Selection: satisfy a user’s search criteria by considering multiple types of evidence, such as content relevance, search results from past queries, personal information needs, and search response time; (3) Effective and Efficient Results Merging: produce accurate merged ranked results at little cost for acquiring the content of the returned documents; (4) System Adaptation by Results Analysis: analyze the search results from past queries to make federated search solutions more accurate; (5) System Development and Evaluation: build and test algorithms within research environments as well as in a new FedLemur system for a real-world application. The project advances the state of the art in federated search research. It will have broad impact on other applications such as peer-to-peer search. The project Web site (http://www.cs.purdue.edu/~lsi/Federated_Search_Career_Award.html) will be used for results dissemination. The education component of the project will expand information retrieval instruction to address multi-disciplinary requirements, improve the education of the information technology workforce, and stimulate K-12 students’ interest in search technologies.
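Utility-centric resource selection can be pictured as a scoring problem. The following is a minimal sketch, not the project's algorithm: it assumes each source already has a content-relevance estimate in [0, 1] and a measured response time, and it combines the two with hypothetical weights. The Source class, the weights w_relevance and w_speed, and the linear form of the utility are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Source:
    """Hypothetical description of one searchable text information source."""
    name: str
    relevance: float        # estimated content relevance for the query, in [0, 1]
    response_time_s: float  # typical search response time, in seconds


def utility(src: Source, max_time_s: float,
            w_relevance: float = 0.7, w_speed: float = 0.3) -> float:
    """Assumed linear utility: reward relevance, penalize slow responses."""
    # Map response time to a speed score in [0, 1]; the slowest source scores 0.
    speed = 1.0 - min(src.response_time_s / max_time_s, 1.0)
    return w_relevance * src.relevance + w_speed * speed


def select_sources(sources: List[Source], k: int) -> List[Source]:
    """Return the k sources with the highest utility for this query."""
    if not sources:
        return []
    max_time_s = max(s.response_time_s for s in sources)
    if max_time_s <= 0:
        max_time_s = 1.0  # avoid division by zero when all times are zero
    ranked = sorted(sources, key=lambda s: utility(s, max_time_s), reverse=True)
    return ranked[:k]
```

With these assumed weights, a slightly less relevant source that answers quickly can outrank a more relevant but slow one, which is the kind of trade-off that relevance-only selection cannot express.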
Intelligent Privacy Manager
The StreamShield Project
The goal of our research in the StreamShield project is to investigate security and privacy constraints on both data and queries in the context of data stream management systems (DSMSs). Unlike in traditional DBMSs, where access control policies are persistently stored on the server and tend to remain stable, in streaming applications the contexts, and with them the access control policies on the real-time data, may change rapidly. We propose a novel “stream-centric” approach, in which security restrictions are not persistently stored on the server but are instead streamed together with the data. The data provider's access control policies are expressed via security constraints called “data security punctuations” (dsps for short). Server-side policies are specified by administrators in the form of “continuous policy queries”, which emit query security constraints called “query security punctuations” (qsps for short). The advantages of our model include flexibility, dynamism, and speed of enforcement, because both data and query security punctuations are embedded inside the data streams. Administrators can specify complex context-aware authorization policy queries. At run time, continuous policy queries are evaluated, authorizations are produced, and the engine can enforce any context-aware policy automatically. Moreover, DSMSs can adapt not only to data-related but also to security-related selectivities, which helps reduce wasted resources when few subjects have access to the data.
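The stream-centric idea can be illustrated with a small sketch. It is a simplified assumption of how punctuations might work, not the StreamShield implementation: here a punctuation only names the roles allowed to read the tuples that follow it, and the names DataTuple, SecurityPunctuation, and enforce are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Iterator, Union


@dataclass
class DataTuple:
    """An ordinary data item in the stream."""
    payload: dict


@dataclass(frozen=True)
class SecurityPunctuation:
    """Hypothetical, simplified dsp: roles allowed to read the tuples that follow."""
    allowed_roles: FrozenSet[str]


StreamItem = Union[DataTuple, SecurityPunctuation]


def enforce(stream: Iterable[StreamItem],
            query_roles: FrozenSet[str]) -> Iterator[DataTuple]:
    """Yield only the tuples that the querying subject's roles may read."""
    current: FrozenSet[str] = frozenset()  # deny by default until a punctuation arrives
    for item in stream:
        if isinstance(item, SecurityPunctuation):
            # The policy travels in the stream: a new punctuation replaces the old one.
            current = item.allowed_roles
        elif current & query_roles:
            yield item
```

Because the policy is carried in the stream itself, a change of context needs no server-side policy update: the provider emits a new punctuation and enforcement switches at exactly that point in the stream.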
Systematic Control and Management of Data Integrity, Quality and Provenance for Command and Control Applications
Access Control Policy Verification Through Security Analysis And Insider Threat Assessment
Access control is one of the most fundamental security mechanisms in use today; however, the specification and management of access control policies remain challenging problems, and today’s administrators have no effective tools to assist them. This research addresses these needs and the challenges that arise from them by developing new verification techniques for access control policies, along with verification tools that will help administrators specify, understand, and manage their access control policies. In particular, this research studies security analysis and insider threat assessment. Security analysis techniques answer the fundamental question of whether an access control system preserves essential security properties across changes to the authorization state. Insider threat assessment techniques determine what damage insiders can cause if they misuse the trust that has been placed in them. While focusing primarily on the widely deployed Role-Based Access Control model, this project also aims to develop theoretical foundations and general techniques for access control policy verification. Insights obtained from this research will be applicable to other, richer access control models and will help improve the understanding of the power and limitations of access control.
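A toy version of the security analysis question can be written as a reachability check. The sketch below assumes a deliberately simplified, monotonic model in which roles can only be added, and in which an administrative rule (pre, target) means that a user holding role pre may be assigned role target. The rule format and the function names are hypothetical and far simpler than a full administrative RBAC model.

```python
from typing import Dict, Iterable, Set, Tuple

Rule = Tuple[str, str]  # (precondition_role, target_role); assumed rule format


def reachable_roles(initial_roles: Iterable[str], can_assign: Set[Rule]) -> Set[str]:
    """Compute every role the user could hold in some reachable authorization state."""
    reached = set(initial_roles)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for pre, target in can_assign:
            if pre in reached and target not in reached:
                reached.add(target)
                changed = True
    return reached


def violates_safety(initial_roles: Iterable[str], can_assign: Set[Rule],
                    role_perms: Dict[str, Set[str]], forbidden_perm: str) -> bool:
    """True if some reachable state grants the user the forbidden permission."""
    return any(forbidden_perm in role_perms.get(role, set())
               for role in reachable_roles(initial_roles, can_assign))
```

The same computation gives a crude view of insider threat: the permissions attached to the reachable roles bound what a user could obtain through a chain of individually authorized assignments.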
Text Extraction and Data Visualization for Pet Health Surveillance
A Comprehensive Policy-Driven Framework for Online Privacy Protection: Integrating IT, Human, Legal and Economic Perspectives
Privacy is increasingly a major concern that prevents the exploitation of the Internet’s full potential. Consumers are concerned about the trustworthiness of the websites to which they entrust their sensitive information. Although significant industry efforts are seeking to better protect sensitive information online, existing solutions are still fragmented and far from satisfactory. Specifically, existing languages for specifying privacy policies lack a formal and unambiguous semantics, are limited in expressive power and lack enforcement as well as auditing support. Moreover, existing privacy management tools aimed at increasing end-users’ control over their privacy are limited in capability or difficult to use. This project seeks to provide a comprehensive framework for protecting online privacy, covering the entire privacy policy life cycle. This cycle includes enterprise policy creation, enforcement, analysis and auditing, as well as end user agent presentation and privacy policy processing. The project integrates privacy-relevant human, legal and economic perspectives in the proposed framework. This project will develop an expressive, semantics-based formal language for specifying privacy policies, an access control and auditing language for enforcing privacy policies in applications, as well as theory and tools for verifying privacy policies. Additionally, experiments and surveys will be conducted to better understand the axes of users’ privacy concerns and protection objectives. Results from this empirical work will be used to develop an effective paradigm for specifying privacy preferences and methods to present privacy policies to end users in an accurate and accessible way.
The Design and Use of Digital Identities
Digital identity management (DIM) has emerged as a critical foundation for supporting successful interactions in today’s globally interconnected society. It is crucial not only for the conduct of business and government but also for a large and growing body of electronic or online social interactions. In its broadest sense, identity management encompasses definitions and life-cycle management for digital identities and profiles, and the environments for exchanging and validating such information, including anonymous and pseudonymous representations. The project addresses a wide variety of digital identity needs by developing the required Flexible, Multiple and Dependable Digital Identity (FMDDI) technology, based on a sound underlying set of definitions and principles. The FMDDI technology developed in the project will support multiple forms of identity, including nyms, partial identities, and a variety of user properties, credentials, and roles. Relevant research thrusts in the project include: identity schemes and representation formats; use of ontology and issues related to identity interoperability; anonymity, dependability, accountability, and forensic-friendly identification schemes; and psychological and social aspects related to the use of digital identities.
Privacy-Preserving Data Integration and Sharing
Integrating and sharing data from multiple sources has been a long-standing challenge in the database community. This problem is crucial in numerous contexts, including data integration for enterprises and organizations, data sharing on the Internet, collaboration among government agencies, and the exchange of scientific data. Many applications of national importance, such as emergency preparedness and response, as well as research in many scientific domains, require integrating and sharing data among participants.
Data integration is seriously hampered by an inability to ensure privacy. Without a privacy framework, sources are reluctant to share their data. Problems include fear of disclosing confidential information as well as regulations protecting individual privacy. There has been progress in computing aggregations of distributed data without disclosing that data (e.g., privacy-preserving distributed data mining), but this work assumes that data integration problems such as schema matching and record linkage are already solved. As a consequence, the lack of a privacy-preserving data integration framework has become a key bottleneck to deploying data integration.
This project will develop the technology needed to create and manage federated databases while controlling the disclosure of private data. While the emphasis will be on general techniques for data integration that preserve privacy, the project will work in the context of diverse but particularly relevant problem domains, including scientific research and emergency preparedness. Involvement of domain experts from these fields in developing and testing the techniques will ensure impact on areas of national importance.


