Privacy Preserving Data Mining on Vertically Partitioned Data
Jaideep Vaidya - Purdue University
Jan 14, 2004
Abstract
Privacy and security concerns can prevent sharing of data, derailing data mining
projects. Distributed knowledge discovery, if done correctly, can alleviate
this problem. The problem lies not so much with the results of data mining, but
rather with the process of data mining. Current data mining algorithms require
some form of access to all of the data, which in and of itself provides an
opportunity for misuse.
The key is to obtain valid results, while providing guarantees on the
(non)disclosure of data. We focus on situations where different sites contain
different attributes for a common set of entities. We present solutions for
doing data mining in such scenarios. Related work in cryptography provides a
strong theoretical foundation for secure computation. Cryptographic approaches
to preserving privacy enable formal guarantees for privacy preservation.
This talk provides a brief introduction to the area as well as a brief synopsis
of solutions for several data mining algorithms. We present an efficient
protocol for securely determining the size of set intersection, and show how
this can be used to perform decision tree classification where multiple parties
have different (and private) information about the same set of individuals. We
also present a privacy-preserving method for k-means clustering. Each site
learns the cluster of each entity, but learns nothing about the attributes at
other sites. This work was presented at KDD '03 where it received the honorable
mention award for the best research paper.
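To make the set-intersection-size idea concrete, here is a minimal sketch of one well-known way to compute it: commutative encryption by modular exponentiation. The abstract does not spell out the protocol, so treat every detail below as an assumption rather than the speaker's method: the modulus `P`, the helper names (`keygen`, `hash_to_group`, `encrypt`, `intersection_size`), and the two-party setting are all hypothetical, and both parties are simulated in a single process. Each party encrypts its own hashed entity IDs, the other party encrypts them again, and because the encryption commutes, equal IDs end up as equal doubly encrypted values that can be counted without revealing which IDs matched.

```python
import hashlib
import secrets
from math import gcd

# Illustrative modulus (the Mersenne prime 2^127 - 1); not a vetted parameter choice.
P = 2**127 - 1


def keygen():
    """Pick a secret exponent coprime to P - 1, so x -> x^k mod P is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if gcd(k, P - 1) == 1:
            return k


def hash_to_group(item):
    """Map an entity ID to an integer in [1, P - 1] via SHA-256."""
    digest = hashlib.sha256(str(item).encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def encrypt(values, key):
    """Exponentiate each value with the key, then shuffle to hide the ordering."""
    out = [pow(v, key, P) for v in values]
    secrets.SystemRandom().shuffle(out)
    return out


def intersection_size(set_a, set_b):
    """Simulate both parties locally and return |set_a & set_b|."""
    key_a, key_b = keygen(), keygen()
    # Each party encrypts its own hashed items with its own key.
    a_once = encrypt([hash_to_group(x) for x in set_a], key_a)
    b_once = encrypt([hash_to_group(x) for x in set_b], key_b)
    # The lists are exchanged and each party applies its key to the other's list.
    a_twice = encrypt(a_once, key_b)
    b_twice = encrypt(b_once, key_a)
    # (x^ka)^kb == (x^kb)^ka mod P, so common items collide; only the count is used.
    return len(set(a_twice) & set(b_twice))


# Hypothetical example: site A knows which entities satisfy some attribute test,
# site B knows which entities carry a given class label.
site_a_ids = {1, 2, 3, 5, 8}
site_b_ids = {2, 3, 4, 8, 9}
print(intersection_size(site_a_ids, site_b_ids))
```

In a decision tree setting, a count of this kind (how many entities satisfy an attribute test held at one site and carry a class label held at another) is the sort of quantity needed to evaluate a candidate split. A real deployment would need vetted group parameters and a proper security analysis, which this sketch does not attempt.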
About the Speaker
Jaideep Vaidya is a Ph.D. candidate working with Prof. Chris Clifton in the
Department of Computer Sciences at Purdue University. He received his B.E.
degree in Computer Engineering from the University of Mumbai, India in 1999 and
his M.S. degree from Purdue in 2001. His research interests lie at the
confluence of privacy, security and data mining.