CERIAS - Center for Education and Research in Information Assurance and Security

Purdue University - Discovery Park

Anonymized Data

Koray Mancuhan - Purdue University

Oct 21, 2015


Privacy has been a hot issue since the early 2000s, in particular with the rise of social networks and data outsourcing. Data privacy is a major concern in data outsourcing because outsourcing involves sharing personal data with third parties. In this talk, I will give an introduction to data privacy, covering privacy standards, data anonymization techniques, and the use of data anonymization in data outsourcing and data mining. Then, I will present our work on data mining using anonymized data. We propose a decision tree learning method for outsourced private data in which the work is split between the data publisher and the third party. The privacy model is anatomization/fragmentation: the third party sees data values, but the link between sensitive and identifying information is encrypted with a key known only to the data publisher. Data publishers have limited processing and storage capability, so both sensitive and identifying information are stored at the third parties. The approach presented also retains most processing at the third parties, and data publisher-side processing is amortized over predictions made by the data publishers. Experiments on various datasets show that the method produces decision trees approaching the accuracy of a non-private decision tree, while substantially reducing the data publisher's computing resource requirements.
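As a rough illustration of the anatomization/fragmentation model described above, the following Python sketch splits a table into an identifying table and a sensitive table that share only a group number, plus a keyed token that the data publisher alone can recompute. It is a minimal sketch under stated assumptions, not the method presented in the talk: the function names, the record layout, and the fixed-size grouping are hypothetical, and an HMAC token stands in for the encrypted link.

```python
import hashlib
import hmac
import random


def link_token(key: bytes, record_id: int) -> str:
    """Keyed token standing in for the encrypted link (assumption: HMAC, not encryption)."""
    return hmac.new(key, str(record_id).encode(), hashlib.sha256).hexdigest()


def anatomize(records, key, group_size=2, seed=0):
    """Split records into an identifying table and a sensitive table.

    Each record is a dict with 'id', 'quasi' (identifying attributes) and
    'sensitive'. Grouping by position is a simplification for illustration.
    """
    identifying, sensitive = [], []
    for start in range(0, len(records), group_size):
        group_id = start // group_size
        group = records[start:start + group_size]
        for rec in group:
            identifying.append(
                {"id": rec["id"], "quasi": rec["quasi"], "group": group_id}
            )
        rows = [
            {
                "group": group_id,
                "sensitive": rec["sensitive"],
                "token": link_token(key, rec["id"]),
            }
            for rec in group
        ]
        # Shuffle within the group so row order does not reveal the link.
        random.Random(seed + group_id).shuffle(rows)
        sensitive.extend(rows)
    return identifying, sensitive


def rejoin(identifying, sensitive, key):
    """Publisher-side step: recompute tokens with the key to restore the link."""
    by_token = {row["token"]: row["sensitive"] for row in sensitive}
    return {row["id"]: by_token[link_token(key, row["id"])] for row in identifying}


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the talk.
    toy = [
        {"id": 1, "quasi": {"age": 34, "zip": "47906"}, "sensitive": "flu"},
        {"id": 2, "quasi": {"age": 51, "zip": "47907"}, "sensitive": "asthma"},
    ]
    ident, sens = anatomize(toy, key=b"publisher-secret")
    print(ident)
    print(sens)
    print(rejoin(ident, sens, key=b"publisher-secret"))
```

In this sketch the third party would hold both tables and see every value, but without the key it can only tell that a person's sensitive value is one of those in the same group. Only the key holder can run `rejoin` to recover the exact pairing.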

About the Speaker

Koray is a PhD student in the Department of Computer Science at Purdue University. He is currently a member of the privacy preserving data mining lab under the supervision of Chris Clifton. His research focuses on building data mining models from anonymized data. The challenge in his research is the uncertainty that anonymization methods inject into the data. In most cases, this uncertainty slows down the data mining models and requires special mechanisms to exploit noisy data. His work includes learning algorithms such as k-NN classification, SVM classification, decision tree classification and frequent itemset mining.

Koray received his master's degree in Computer Science from Purdue University and his undergraduate degree in Computer Engineering from Galatasaray University. During his master's degree, he studied data mining and social fairness and authored papers on this topic. Before joining Purdue CS, he did research in the semantic web area. He was a member of the Complex Networks lab at Galatasaray University, where he worked on developing a new automatic web service annotation tool.

Unless otherwise noted, the security seminar is held on Wednesdays at 4:30 P.M. in STEW G52 (Suite 050B), West Lafayette Campus.


The views, opinions and assumptions expressed in these videos are those of the presenter and do not necessarily reflect the official policy or position of CERIAS or Purdue University. All content included in these videos is the property of Purdue University, the presenter and/or the presenter’s organization, and is protected by U.S. and international copyright laws. The collection, arrangement and assembly of all content in these videos and on the hosting website are the exclusive property of Purdue University. You may not copy, reproduce, distribute, publish, display, perform, modify, create derivative works, transmit, or in any other way exploit any part of copyrighted material without permission from CERIAS, Purdue University.