The Center for Education and Research in Information Assurance and Security (CERIAS)


Koray Mancuhan - Purdue University


Anonymized Data

Oct 21, 2015


Abstract

Privacy has been a hot issue since the early 2000s, in particular with the rise of social networks and data outsourcing. Data privacy is a major concern in data outsourcing because it involves sharing personal data with third parties. In this talk, I will give an introduction to data privacy, covering topics such as privacy standards, data anonymization techniques, and the use of data anonymization in data outsourcing and data mining. Then, I will present our work on data mining using anonymized data. We propose a decision tree learning method for outsourced private data in which the work is split between the data publisher and a third party. The privacy model is anatomization/fragmentation: the third party sees data values, but the link between sensitive and identifying information is encrypted with a key known only to the data publisher. Because data publishers have limited processing and storage capability, both sensitive and identifying information are stored at the third parties. The approach presented also keeps most processing at the third parties, and data publisher-side processing is amortized over the predictions made by the data publishers. Experiments on various datasets show that the method produces decision trees approaching the accuracy of a non-private decision tree, while substantially reducing the data publisher's computing resource requirements.
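To make the anatomization idea in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method presented in the talk: the function names (`link_tag`, `anatomize`, `rejoin`), the table layout, and the fixed-size grouping are all hypothetical. A keyed HMAC stands in for the encrypted link between identifying and sensitive values, and the sketch does not enforce any diversity requirement on the groups.

```python
import hashlib
import hmac
import random


def link_tag(key: bytes, row_id: int) -> str:
    """Keyed tag for a row; an HMAC stands in for the encrypted link (assumption)."""
    return hmac.new(key, str(row_id).encode(), hashlib.sha256).hexdigest()


def anatomize(records, key, group_size=2, seed=0):
    """Split (identifying, sensitive) records into two tables sharing a group id.

    The third party would hold both tables. Inside a group it sees every
    value, but cannot tell which sensitive value belongs to which row
    without the publisher's key.
    """
    id_table, sens_table = [], []
    for row_id, (identifying, sensitive) in enumerate(records):
        group = row_id // group_size  # naive grouping, purely illustrative
        id_table.append({"row_id": row_id, "group": group, **identifying})
        sens_table.append(
            {"group": group, "tag": link_tag(key, row_id), "sensitive": sensitive}
        )
    # Shuffle so row order does not reveal the link.
    random.Random(seed).shuffle(sens_table)
    return id_table, sens_table


def rejoin(id_table, sens_table, key):
    """Publisher-side step: recompute the tags with the key to restore the link."""
    by_tag = {row["tag"]: row["sensitive"] for row in sens_table}
    return [(row["row_id"], by_tag[link_tag(key, row["row_id"])]) for row in id_table]


if __name__ == "__main__":
    records = [
        ({"age": 34, "zip": "47906"}, "flu"),
        ({"age": 51, "zip": "47907"}, "diabetes"),
        ({"age": 29, "zip": "47906"}, "asthma"),
        ({"age": 62, "zip": "47901"}, "flu"),
    ]
    key = b"publisher-secret"  # hypothetical key, known only to the publisher
    id_table, sens_table = anatomize(records, key)
    print(rejoin(id_table, sens_table, key))
```

In this sketch only the holder of the key can match a row in the identifying table to its sensitive value; anyone else learns just the group-level distribution of sensitive values.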

About the Speaker

Koray Mancuhan
Koray is a PhD student in the Department of Computer Science at Purdue University. He is currently a member of the privacy-preserving data mining lab under the supervision of Chris Clifton. His research develops data mining models from anonymized data. The main challenge in his research is the uncertainty that anonymization methods inject into the data. In most cases, this uncertainty slows down data mining models and requires special mechanisms to exploit noisy data. His work includes learning algorithms such as k-NN classification, SVM classification, decision tree classification, and frequent itemset mining.

Koray received his master's degree in Computer Science from Purdue University and his undergraduate degree in Computer Engineering from Galatasaray University. During his master's degree, he studied data mining and social fairness and authored papers on this topic. Before joining Purdue CS, he did research in the semantic web area. He is a former member of the Complex Networks lab at Galatasaray University, where he worked on developing a new automatic web service annotation tool.

