Gerome Miklau - University of Massachusetts, Amherst
Students: Fall 2024, unless noted otherwise, sessions will be virtual on Zoom.
Safely Analyzing Sensitive Network Data
Nov 18, 2009
Download: MP4 Video Size: 477.3MBWatch on YouTube
Abstract
Social and communication networks are formed by entities (such as individuals or computer hosts) and their connections (which may be contacts, relationships, or flows of information). Such networks are analyzed to understand the influence of individuals in organizations, the transmission of disease in communities, the operation of computer networks, among many other topics. While network data can now be recorded at unprecedented scale, releasing it can result in unacceptable disclosures about participants and their relationships. As a result, privacy concerns are severely constraining the dissemination of network data and disrupting the emerging field of network science.Our recent work investigates the properties of a network that can be accurately studied without threatening the privacy of individuals and their connections. We adopt the rigorous condition of differential privacy, and develop algorithms for releasing randomly perturbed statistics about the topology of a sensitive network. This talk will focus on two basic analysis tasks: the estimation of the degree distribution of a network and the study of small structural patterns that occur in a network (sometimes called motif analysis). We show that the degree distribution of a network can be very accurately estimated by a novel technique in which constraints are applied to the noisy output to improve utility. This technique is of general interest, and can be used to boost the accuracy of differentially private output in other tasks as well. We show that studying motifs is fundamentally harder, but can be done with acceptable accuracy if the privacy condition is relaxed.
About the Speaker
Gerome Miklau is an Assistant Professor at the University of Massachusetts, Amherst. His primary research interest is the secure management of large-scale data. This includes evaluating threats to privacy in published data, devising techniques for the safe publication of social networks, network traces, and audit logs, designing database management systems to implement security policies, and theoretically analyzing information disclosure. He received an NSF CAREER Award in 2007 and won the 2006 ACM SIGMOD Dissertation Award. He received his Ph.D. in Computer Science from the University of Washington in 2005. He earned Bachelor's degrees in Mathematics and in Rhetoric from the University of California, Berkeley, in 1995.