An Integrated and Utility-Centric Framework for Federated Text Search


Principal Investigator: Luo Si

Traditional search engines like Google typically ignore the large amount of information hidden behind the search engines of many online text information sources. Federated text search provides one-stop access to this hidden information through a single interface that connects to the search engines of multiple text information sources. Existing federated search solutions focus only on content relevance and ignore a large amount of valuable information about users and information sources. This project includes novel research on: (1) Multiple Type Resource Representation: modeling important information about text information sources, such as search response time and search engine effectiveness; (2) Utility-Centric Resource Selection: satisfying a user’s search criteria by considering multiple types of evidence, such as content relevance, search results from past queries, personal information needs, and search response time; (3) Effective and Efficient Results Merging: producing accurate merged ranked results at little cost for acquiring the content of the returned documents; (4) System Adaptation by Results Analysis: analyzing the search results of past queries to build more accurate federated search solutions; (5) System Development and Evaluation: building and testing algorithms within research environments as well as in a new FedLemur system for a real-world application. The project advances the state of the art in federated search research and will have broad impact on other applications such as peer-to-peer search. The project Web site (http://www.cs.purdue.edu/~lsi/Federated_Search_Career_Award.html) will be used to disseminate results. The education component of the project will expand information retrieval instruction to address multi-disciplinary requirements, improve the education of the information technology workforce, and stimulate the interest of K-12 students in search technologies.

Keywords: federated search, privacy, personal data