Sivaram Ramanathan - University of Southern California
Improving the Accuracy of Blocklists by Aggregation and Address Reuse Detection
Nov 18, 2020
IP address blocklists are a useful source of information about repeat attackers. Such information can be used to prioritize which traffic to divert for deeper inspection (e.g., repeat offender traffic), or which traffic to serve first (e.g., traffic from sources that are not blocklisted). But blocklists also suffer from overspecialization (each list is geared towards a specific purpose), and they may be inaccurate due to misclassification or stale information. We propose BLAG, a system that evaluates and aggregates multiple blocklist feeds, producing a more useful, accurate, and timely master blocklist tailored to the specific customer network. BLAG uses a sample of the legitimate sources of the customer network's inbound traffic to evaluate the accuracy of each blocklist over regions of address space. It then leverages recommendation systems to select the most accurate information to aggregate into its master blocklist. Finally, BLAG identifies portions of the master blocklist that can be expanded into larger address regions (e.g., /24 prefixes) to uncover more malicious addresses with minimum collateral damage. Our evaluation on blocklists of various attack types and three ground-truth datasets shows that BLAG achieves specificity of up to 99%, improves recall by up to 114 times compared to competing approaches, and detects attacks up to 13.7 days faster, which makes it a promising approach for blocklist generation.
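The prefix-expansion step can be illustrated with a minimal sketch. It is not BLAG's actual algorithm: the function name `expand_to_prefixes` and the rule used here (expand a blocklisted address to its enclosing /24 only when no known legitimate source shares that prefix) are assumptions made for illustration, and the addresses are documentation-range placeholders.

```python
import ipaddress


def expand_to_prefixes(blocklisted, legitimate, prefix_len=24):
    """Expand blocklisted addresses to their enclosing prefix when that is 'safe'.

    Assumed rule (hypothetical, for illustration only): a prefix is safe to
    expand if no known legitimate source falls inside it. Otherwise the single
    address is kept as a /32 entry to limit collateral damage.
    """
    # Prefixes that contain at least one known legitimate source.
    legit_nets = {
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in legitimate
    }

    result = set()
    for addr in blocklisted:
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        if net in legit_nets:
            # Expansion would also block a legitimate source: keep the host only.
            result.add(ipaddress.ip_network(f"{addr}/32"))
        else:
            result.add(net)
    return result


if __name__ == "__main__":
    master = expand_to_prefixes(
        blocklisted=["192.0.2.7", "198.51.100.9"],
        legitimate=["198.51.100.20"],
    )
    for net in sorted(master):
        print(net)
```

In this sketch the first address is widened to its /24, while the second stays a single-host entry because a legitimate source shares its prefix. A real system would weigh the expected gain in recall against the misclassification cost rather than apply an all-or-nothing rule.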
Although the performance of blocklists can be improved, they need to be used carefully. Blocklists can lead to unjust blocking of legitimate users due to IP address reuse, where more users are blocked than intended. IP addresses can be reused either at the same time (Network Address Translation) or over time (dynamic addressing). We present two new techniques to identify reused addresses. We built a crawler using the BitTorrent Distributed Hash Table to detect NATed addresses, and we use the RIPE Atlas measurement logs to detect dynamically allocated address spaces. We then analyze 151 publicly available IPv4 blocklists to show the implications of reused addresses and find that 53-60% of blocklists contain reused addresses, accounting for about 30.6K-45.1K listings of reused addresses. We also find that reused addresses can potentially affect as many as 78 legitimate users for as many as 44 days.
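The kind of measurement reported above (the share of blocklists containing reused addresses, and the number of such listings) can be sketched as follows. This is an illustrative computation only, assuming the set of reused addresses has already been identified by some detection method; the function name `reuse_summary` and the input layout are hypothetical and not taken from the talk.

```python
def reuse_summary(blocklists, reused):
    """Summarize how reused addresses appear across blocklists.

    blocklists: dict mapping a blocklist name to an iterable of IP strings.
    reused:     set of IP strings assumed to be reused (NATed or dynamic).

    Returns (fraction of blocklists with at least one reused address,
             total number of listings of reused addresses).
    """
    affected = 0
    listings = 0
    for addrs in blocklists.values():
        # Count each reused address once per blocklist it appears on.
        hits = len(set(addrs) & reused)
        listings += hits
        if hits:
            affected += 1
    fraction = affected / len(blocklists) if blocklists else 0.0
    return fraction, listings


if __name__ == "__main__":
    lists = {
        "list_a": ["192.0.2.1", "192.0.2.2"],
        "list_b": ["198.51.100.1"],
        "list_c": ["192.0.2.1"],
    }
    fraction, listings = reuse_summary(lists, reused={"192.0.2.1"})
    print(f"{fraction:.0%} of blocklists contain reused addresses "
          f"({listings} listings)")
```

Here an address listed on several blocklists counts as one listing per blocklist, which is one plausible reading of "listings"; the talk may define the term differently.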
About the Speaker
Sivaram is a fifth-year Ph.D. student at the University of Southern California. His research focuses on developing systems to improve internet security and providing better measurements in the network.