Privacy Preservation in Data Publishing and Sharing

Download

PDF

Author

Tiancheng Li

Tech report number

CERIAS TR 2010-16

Entry type

phdthesis

Abstract

In this information age, data and knowledge extracted by data mining techniques represent a key asset driving research, innovation, and policy-making activities. Many agencies and organizations have recognized the need to accelerate such trends and are therefore willing to release the data they have collected to other parties, for purposes such as research and the formulation of public policies. However, publishing data remains very difficult today. Data often contains personally identifiable information, and therefore releasing such data may result in privacy breaches; this is the case for microdata, e.g., census data and medical data. This thesis studies how we can publish and share microdata in a privacy-preserving manner. We present an extensive study of this problem along three dimensions: (1) designing a simple, intuitive, and robust privacy model; (2) designing an effective anonymization technique that works on sparse and high-dimensional data; and (3) developing a methodology for evaluating the tradeoff between privacy and utility.

Date

2010-06-13

Publication Date

2010-06-13

BibTex-formatted data

To refer to this entry, you may select and copy the text below and paste it into your BibTex document. Note that the text may not contain all macros that BibTex supports.

@Phdthesis{ Li,
	title = "Privacy Preservation in Data Publishing and Sharing",
	author = "Tiancheng Li",
	year = "2010",
	month = "6",
	day = "13",
	abstract = "In this information age, data and knowledge extracted by data mining techniques represent a key asset driving research, innovation, and policy-making activities. Many agencies and organizations have recognized the need to accelerate such trends and are therefore willing to release the data they have collected to other parties, for purposes such as research and the formulation of public policies. However, publishing data remains very difficult today. Data often contains personally identifiable information, and therefore releasing such data may result in privacy breaches; this is the case for microdata, e.g., census data and medical data.

This thesis studies how we can publish and share microdata in a privacy-preserving manner. We present an extensive study of this problem along three dimensions: (1) designing a simple, intuitive, and robust privacy model; (2) designing an effective anonymization technique that works on sparse and high-dimensional data; and (3) developing a methodology for evaluating the tradeoff between privacy and utility.",
}