Towards Automated Problem Inference from Trouble Tickets
Rahul Potharaju - Purdue University
Apr 17, 2013
Abstract
The growing demand for cloud services is driving the need to give users an always-on, safe experience when accessing their data and applications. Examples include web search, social networking, email, ecommerce, video streaming, data analytics and even mission-critical services such as power grid control. Such environments are required to be highly available and secure. This requirement is often met by having experts monitor the system 24x7 to ensure that problems, if any, are resolved within a reasonable time. The need to solve a problem in the minimum time gives rise to a "whatever-it-takes-to-fix-the-problem" attitude amongst experts and produces a constant flow of informal text documenting the debugging steps taken to resolve problems. Understanding the content of this informal text at scale is the key to uncovering major problem trends that will enable us to learn from mistakes and improve system design.
In this talk, I will present NetSieve, a system we built to automate problem inference from trouble tickets. Specifically, I will show how statistical natural language processing (NLP) can be combined with knowledge representation, ontology modeling and human-guided learning to automatically analyze natural language text in trouble tickets and infer the problem symptoms, troubleshooting activities and resolution actions. I will further discuss fundamental challenges that arise when extracting meaning from such massive open-domain text corpora. Finally, I will discuss how we applied NetSieve in a massive data center setting to automatically analyze 10K+ network trouble tickets and how we used these results to improve several key network operations.
About the Speaker
Rahul Potharaju is a PhD student in the Computer Science department of Purdue University and a member of CERIAS, advised by Prof. Cristina Nita-Rotaru. Prior to that, in 2009, he earned his master's degree in Computer Science from Northwestern University. He has over two years of industrial research experience working on collaboration projects with Microsoft Research, Redmond and Motorola Applied Research Center. His current work focuses on large-scale Internet measurements, problem inference systems, intrusion detection, security aspects of smartphone architectures, and reliability aspects of data centers from both a hardware and a software perspective. A recurring theme in all his research is combining cross-domain techniques, such as those from natural language processing, with statistical machine learning and data mining to make surprising inferences in the networking and smartphone areas.
The views, opinions and assumptions expressed in these videos are those of the presenter and do not necessarily reflect the official policy or position of CERIAS or Purdue University. All content included in these videos is the property of Purdue University, the presenter and/or the presenter’s organization, and is protected by U.S. and international copyright laws. The collection, arrangement and assembly of all content in these videos and on the hosting website is the exclusive property of Purdue University. You may not copy, reproduce, distribute, publish, display, perform, modify, create derivative works, transmit, or in any other way exploit any part of copyrighted material without permission from CERIAS, Purdue University.