The Center for Education and Research in Information Assurance and Security (CERIAS)

Rahul Potharaju - Purdue University

Delivering Always-on Services Despite Flaky Network Infrastructure

Feb 26, 2014

Download: MP4 video (129.3 MB)
Watch on YouTube

Abstract

As computing shifts to a service-oriented world, a key need is to deliver an always-on experience to end users. However, providing a service that is available around the clock is challenging because failures are the norm rather than the exception in distributed systems. While there has been significant work to improve server and software reliability, networks have become the new "weakest link" in delivering reliable services. To improve network service reliability, my research focuses on (a) studying the reliability of datacenter networks, (b) building automated systems for problem inference, and (c) gaining operational experience from real-world deployment of the systems I built.

In this talk, I will answer three key questions on improving service reliability in datacenters:

1. What is the service impact due to network failures? What are their root causes?
2. How do we build geo-distributed cloud services?
3. How do we analyze unstructured data from network operators to improve network management?

The outcomes of this work have either undergone tech transfer or are in use by multiple business groups inside a large cloud provider.

About the Speaker

Rahul Potharaju is a PhD candidate in the Computer Science department of Purdue University and a member of CERIAS, advised by Prof. Cristina Nita-Rotaru. Before that, in 2009, he earned his master's degree in Computer Science from Northwestern University. He has over three years of industrial research experience on collaborative projects with Microsoft Research, Redmond, and the Motorola Applied Research Center. He is passionate about building large-scale, data-intensive systems, with a particular interest in analytics-as-a-service clouds and automated problem inference systems. His research has been adopted by several business groups inside Microsoft and won the Microsoft Trustworthy Reliability Computing Award for 2013.
