Reverse Engineering of Data Structures from Binary

Get BibTex-formatted data

Download

PDF

Author

Zhiqiang Lin

Tech report number

CERIAS TR 2011-05

Entry type

phdthesis

Abstract

Reversing engineering of data structures involves two aspects: (1) given an application binary, infers the data structure definitions; and (2) given a memory dump, infers the data structure instances. These two capabilities have a number of security and forensics applications that include vulnerability discovery, kernel rootkit detection, and memory forensics. In this dissertation, we present an integrated framework for reverse engineering of data structures from binary. There are three key components in our framework: REWARDS, SigGraph and DIMSUM. REWARDS is a data structure definition reverse engineering component that can automatically uncover both the syntax and semantics of data structures. SigGraph and DIMSUM are two data structure instance reverse engineering components that can recognize the data structure instances in a memory dump. In particular, SigGraph can systematically generate non-isomorphic signatures for data structures in an OS kernel and enable the brute force scanning of kernel memory to find the data structure instances. SigGraph relies on memory mapping information, but DIMSUM, which leverages probabilistic inference techniques, can directly scan memory without memory mapping information. We have developed a number of enabling techniques in our framework that include (1) bi-directional (i.e., backward and forward) data flow analysis, (2) signature graph generation and comparison, and (3) belief propagation based probabilistic inference. We demonstrate how we integrate these techniques into our reverse engineering framework in this dissertation. We have obtained the following preliminary experimental results. REWARDS achieved over 80�curacy in revealing data structure definitions accessed during an execution. SigGraph recognized Linux kernel data structure instances with zero false negative and close-to-zero false positives, and had strong robustness in the presence of malicious pointer manipulations. DIMSUM achieved over 20�curacy improvement than previous nonprobabilistic approaches without memory mapping information.

Download

PDF

Date

2011 – 11 – 1

Key alpha

Data Structure Reverse Engineering, Binary Analysis, Memory Forensics, Kernel Rootkit Defense

Publication Date

2011-11-01

BibTex-formatted data

To refer to this entry, you may select and copy the text below and paste it into your BibTex document. Note that the text may not contain all macros that BibTex supports.

@Phdthesis{ Data Structure Reverse Engineering, Binary Analysis, Memory Forensics, Kernel Rootkit Defense,
	title = "Reverse Engineering of Data Structures from Binary",
	author = "Zhiqiang Lin",
	year = "2011",
	month = "11",
	day = "1",
	abstract = "Reversing engineering of data structures involves two aspects: (1) given an application binary, infers the data structure definitions; and (2) given a memory dump, infers the data structure instances. These two capabilities have a number of security and forensics applications that include vulnerability discovery, kernel rootkit detection, and memory
forensics.

In this dissertation, we present an integrated framework for reverse engineering of data structures from binary. There are three key components in our framework: REWARDS, SigGraph and DIMSUM. REWARDS is a data structure definition reverse engineering component that can automatically uncover both the syntax and semantics of data structures. SigGraph and DIMSUM are two data structure instance reverse engineering components that can recognize the data structure instances in a memory dump. In particular, SigGraph can systematically generate non-isomorphic signatures for data structures in an OS kernel and enable the brute force scanning of kernel memory to find the data structure instances. SigGraph relies on memory mapping information, but DIMSUM, which leverages probabilistic inference techniques, can directly scan memory without memory mapping information.

We have developed a number of enabling techniques in our framework that include (1) bi-directional (i.e., backward and forward) data flow analysis, (2) signature graph generation and comparison, and (3) belief propagation based probabilistic inference. We demonstrate how we integrate these techniques into our reverse engineering framework
in this dissertation.

We have obtained the following preliminary experimental results. REWARDS achieved over 80�curacy in revealing data structure definitions accessed during an execution. SigGraph recognized Linux kernel data structure instances with zero false negative and close-to-zero false positives, and had strong robustness in the presence of malicious pointer
manipulations. DIMSUM achieved over 20�curacy improvement than previous nonprobabilistic approaches without memory mapping information.",
}