Providing Privacy through Plausibly Deniable Search

Feb 25, 2009

Query-based web search is becoming an integral part of many people's daily activities. Most users do not realize that their search history can be used to identify them (and their interests). In July 2006, AOL released an anonymized search query log of some 600K randomly selected users. While valuable as a research tool, the anonymization was insufficient: individuals could be identified from the contents of the queries alone. Government requests for such logs serve to increase the concern. To address this problem, we propose a client-centered approach of "plausibly deniable search". Each user query is substituted with a standard, closely related query intended to fetch the desired results. In addition, a set of k-1 cover queries is issued; these have characteristics similar to the standard query but are on unrelated topics. The system guarantees that any of these k queries will produce the same set of k queries, giving k possible topics the user could have been searching for. We use a Latent Semantic Indexing (LSI) based technique to generate queries, and evaluate it on the DMOZ webpage collection to show the effectiveness of the proposed approach.
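The consistency property described above (any of the k queries yields the same set of k queries) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small, hand-written list of canonical queries already partitioned into groups of k = 3, and it uses simple word-overlap (Jaccard) similarity as a stand-in for the LSI-based matching. The names `CANONICAL_GROUPS`, `standard_query`, and `deniable_query_set` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical canonical queries, pre-partitioned into groups of k = 3.
# Each group mixes unrelated topics so the members can cover for one another.
CANONICAL_GROUPS: List[Tuple[str, ...]] = [
    ("diabetes symptoms treatment", "used car prices", "italian pasta recipes"),
    ("mortgage refinance rates", "hiking trails colorado", "guitar chord charts"),
]

# Reverse index: canonical query -> the fixed group it belongs to.
GROUP_OF: Dict[str, Tuple[str, ...]] = {
    q: group for group in CANONICAL_GROUPS for q in group
}


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; an assumed stand-in for LSI similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def standard_query(user_query: str) -> str:
    """Substitute the user's query with the closest canonical query."""
    return max(GROUP_OF, key=lambda q: jaccard(user_query, q))


def deniable_query_set(user_query: str) -> Tuple[str, ...]:
    """Return the k queries to issue: the standard query plus k-1 covers."""
    return GROUP_OF[standard_query(user_query)]


if __name__ == "__main__":
    issued = deniable_query_set("symptoms of diabetes")
    print(issued)
    # Every member of the issued set maps back to the same set.
    print(all(deniable_query_set(q) == issued for q in issued))
```

Because the groups form a fixed partition in this sketch, an observer who sees the k issued queries cannot tell which member triggered them. How the actual system builds its query sets with LSI is not specified in this abstract.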

