Ph.D. Dissertation Defense: Khaled ElWazeer
Tuesday, February 25, 2014
2:00 p.m. Room 3258, AVW Bldg.
For More Information:
301 405 3681 firstname.lastname@example.org
ANNOUNCEMENT: Ph.D. Dissertation Defense
Name: Khaled ElWazeer
Committee: Professor Rajeev Barua, Chair/Advisor; Professor Manoj Franklin; Professor Shuvra Bhattacharyya; Professor Donald Yeung; Professor Jeffery Foster, Dean's Representative
Date: February 25, 2014 (Tuesday)
Time: 2:00 - 4:00 PM
Location: Conference Room (3258 AVW)
Title: Deep Analysis of Binary Code to Recover Program Structure
Reverse engineering binary executable code is gaining more interest in the research community. Agencies as diverse as anti-virus companies, security consultants, code forensics consultants, law-enforcement agencies and national security agencies routinely try to understand binary code. Engineers also often need to debug, optimize or instrument binary code during the software development process.
In this dissertation, we present novel techniques that extend the capabilities of existing binary analysis and rewriting tools. The techniques make these tools more scalable, let them handle a larger set of stripped binaries, and produce better, more understandable output. They also ensure that the intermediate representation (IR) recovered from a binary is correct, so that any modified or rewritten binary compiled from this representation works correctly.
In the first part of the dissertation, we present techniques to recover accurate function boundaries from stripped executables. Unlike current techniques, ours ensure complete coverage of live executable code, high-quality recovered code, and functionally correct behavior for most application binaries. We use static and dynamic techniques to remove as much spurious code as possible, safely, without hurting code coverage or IR correctness. Next, we present static techniques to recover correct prototypes for the recovered functions, including the correct number of arguments and return values. Our techniques ensure correct behavior of rewritten binaries for both internal and external functions.
Finally, we present scalable and precise techniques to recover the local variables of every recovered function, as well as global and heap variables. Different techniques are presented for floating-point stack-allocated variables and memory-allocated variables. We also present data type recovery techniques that declare meaningful data types for the detected variables; they can recover integer, pointer, structural and recursive data types. We discuss the correctness of the recovered representation.
All of the proposed methods are evaluated on SecondWrite, a binary rewriting framework developed by our research group. An important metric is whether the IR, together with the recovered information, can be recompiled and run to produce the same output as the input executable. Another metric is analysis time. Further metrics measure the quality of the recovered IR relative to IR produced with source code information available.