Delivering "Always-on" Services Despite Flaky Network Infrastructure
Rahul Potharaju - Purdue University
Feb 26, 2014
Abstract
As computing shifts to a service-oriented world, a key need is to deliver an always-on experience to end users. However, providing a 24x7x365 available service is challenging because failures are the norm rather than the exception in distributed systems. While there has been significant work to improve server and software reliability, networks have become the new “weakest link” in delivering reliable services. Towards improving network service reliability, my research focuses on (a) studying the reliability of datacenter networks, (b) building automated systems for problem inference, and (c) gaining operational experience from real-world deployment of the systems I built.
In this talk, I will answer three key questions on improving service reliability in datacenters:
1. What is the service impact due to network failures? What are their root causes?
2. How do we build geo-distributed cloud services?
3. How do we analyze unstructured data from network operators to improve network management?
The outcomes of this work have either undergone a tech-transfer or are being used by multiple business groups inside a large cloud provider.
About the Speaker
Rahul Potharaju is a PhD candidate in the Computer Science department of Purdue University and a member of CERIAS, advised by Prof. Cristina Nita-Rotaru. Prior to that, in 2009, he earned his master's degree in Computer Science from Northwestern University. He has over three years of industrial research experience working on collaborative projects with Microsoft Research, Redmond and Motorola Applied Research Center. He is passionate about building large-scale data-intensive systems, with a particular interest in analytics-as-a-service clouds and automated problem inference systems. His research has been adopted by several business groups inside Microsoft and has won the Microsoft Trustworthy Reliability Computing Award for 2013.
The views, opinions and assumptions expressed in these videos are those of the presenter and do not necessarily reflect the official policy or position of CERIAS or Purdue University. All content included in these videos is the property of Purdue University, the presenter and/or the presenter’s organization, and is protected by U.S. and international copyright laws. The collection, arrangement and assembly of all content in these videos and on the hosting website is the exclusive property of Purdue University. You may not copy, reproduce, distribute, publish, display, perform, modify, create derivative works from, transmit, or in any other way exploit any part of the copyrighted material without permission from CERIAS, Purdue University.