2015 Symposium Posters


Analytical Environment for Large Scale Network Data



Project Members
Ashrith Barthur, William S. Cleveland
Abstract
Network devices such as IDSs, IPSs, switches, routers, and spam traps generate large volumes of data as they track the complicated web of interconnected devices and people on the Internet. This data needs to be processed, filtered, stored, and made available for analysis in near real time. The data must be informative and concise, and must be held in an efficient data structure. The environment must be able to provide depth and breadth as and when needed. The analytical algorithms must be efficient and custom-built for robust analysis of the data. In our research, we create customized packet-capture and data-processing applications. We use Hadoop and R through RHIPE, a custom, in-house library that interfaces the two. RHIPE originates from the larger theoretical framework called Divide & Recombine, which efficiently divides the data into subsets, analyzes each subset, and recombines the outputs to form the end result.
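
The Divide & Recombine idea can be illustrated with a minimal sketch. This is not RHIPE or the authors' code: it is plain Python that runs on a single machine, and the record layout, the choice of source IP as the division key, and all function names are assumptions made for illustration only. Each subset is analyzed independently, which is the property that would allow the per-subset step to be distributed by a system such as Hadoop.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flow records: (source_ip, destination_ip, bytes_transferred).
# The field layout is an assumption for illustration, not the authors' format.
RECORDS = [
    ("10.0.0.1", "10.0.0.9", 500),
    ("10.0.0.1", "10.0.0.7", 1500),
    ("10.0.0.2", "10.0.0.9", 200),
    ("10.0.0.2", "10.0.0.9", 400),
    ("10.0.0.3", "10.0.0.7", 900),
]


def divide(records):
    """Divide: partition the records into subsets keyed by source IP."""
    subsets = defaultdict(list)
    for src, dst, nbytes in records:
        subsets[src].append((dst, nbytes))
    return dict(subsets)


def analyze(subset):
    """Apply the same analytic method to one subset, independently of the rest."""
    sizes = [nbytes for _, nbytes in subset]
    return {"flows": len(sizes), "mean_bytes": mean(sizes)}


def recombine(per_subset):
    """Recombine: merge per-subset outputs into one overall result.

    The overall mean is the flow-count-weighted average of the subset means.
    """
    total_flows = sum(r["flows"] for r in per_subset.values())
    weighted = sum(r["flows"] * r["mean_bytes"] for r in per_subset.values())
    return {"flows": total_flows, "mean_bytes": weighted / total_flows}


def divide_and_recombine(records):
    """Run the three steps and return both per-subset and overall results."""
    subsets = divide(records)
    per_subset = {key: analyze(subset) for key, subset in subsets.items()}
    return per_subset, recombine(per_subset)
```

Calling `divide_and_recombine(RECORDS)` returns the per-source summaries together with the recombined overall summary. Here the recombination is exact because a weighted mean of subset means equals the mean over all records; for other analytic methods the recombined result may only approximate the all-data result.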