Carnegie Mellon University
Li_cmu_0041E_11130.pdf (835.89 kB)

Pangine Disassembler: Implementing and Evaluating Non-sequential Code Analysis with Non-chronological Backtracking

thesis
posted on 2024-04-19, 19:00 authored by Kaiyuan Li

 

Accurate binary disassembly is crucial for understanding and analyzing compiled binaries, particularly in cybersecurity and reverse engineering. Traditional disassembly methods often face accuracy challenges, especially when dealing with complex binaries and error correction. This research proposes a unique methodology that integrates analysis heuristics and algorithms with an advanced form of non-chronological backtracking, known as "time travel." This method seeks to improve the accuracy and reliability of the disassembly process, addressing the limitations of conventional sequential disassemblers.

At the heart of this research is the development of the Pangine Disassembler, a tool that embodies this new approach. The Pangine Disassembler employs a micro-service architecture, combining pattern-matching heuristics with K-set data-flow analysis. This design focuses on simplicity and modularity, facilitating ease of construction and code comprehension. A distinctive feature of the Pangine Disassembler is its capability to have subsequent decisions in the disassembly process supersede earlier ones, effectively tackling the issue of premature optimization in traditional methods. This approach enhances the flexibility and precision of the disassembly process, allowing dynamic adaptation and error correction.
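As a rough illustration of a later decision superseding an earlier one, the toy sketch below keeps one label per byte offset and lets a newer decision override an older one while keeping the overridden decision for inspection. It is not taken from the thesis: the `Decision` and `DecisionStore` names, the "code"/"data" labels, and the heuristic names in `reason` are all hypothetical, and the sketch does not model the non-chronological backtracking itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    step: int     # order in which the decision was made
    offset: int   # byte offset the decision is about
    label: str    # hypothetical labels: "code" or "data"
    reason: str   # hypothetical name of the heuristic that produced it


class DecisionStore:
    """Toy store in which the most recent decision for an offset wins."""

    def __init__(self):
        self.current = {}      # offset -> latest Decision
        self.superseded = []   # earlier decisions that were overridden
        self._next_step = 0

    def record(self, offset, label, reason):
        new = Decision(self._next_step, offset, label, reason)
        self._next_step += 1
        old = self.current.get(offset)
        # Keep the overridden decision only when the label actually changes.
        if old is not None and old.label != label:
            self.superseded.append(old)
        self.current[offset] = new
        return new


store = DecisionStore()
store.record(0x10, "code", "linear sweep guess")
store.record(0x14, "code", "linear sweep guess")
# A later, hypothetical pattern match contradicts the first guess at 0x10.
store.record(0x10, "data", "jump-table pattern")

labels = {offset: d.label for offset, d in store.current.items()}
```

After the third call, `labels` maps 0x10 to "data" and 0x14 to "code", and the original guess for 0x10 sits in `store.superseded`.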

Empirical evaluations of the Pangine Disassembler confirm its effectiveness. The tool, tested against a comprehensive ground truth dataset, showed significant advancements in disassembly accuracy. The Pangine Disassembler achieved a notable harmonic mean F1 score of 96.31% on the benchmark dataset, demonstrating its competitive performance against many established open-source sequential disassemblers. This thesis advocates for the effectiveness of non-sequential, non-chronological strategies in binary analysis. By introducing a clean, modular solution, the research makes a substantial contribution to the field, presenting a viable alternative to traditional sequential disassembly models.
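For readers unfamiliar with the metric, the sketch below shows one way a "harmonic mean F1 score" could be computed. The abstract does not say how the figure was aggregated, so this assumes an F1 score is computed per binary and the harmonic mean is then taken across binaries. The counts are made-up illustration values, not results from the thesis, and the function name `f1_score` is hypothetical.

```python
from statistics import harmonic_mean


def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (true positive, false positive, false negative) counts,
# one tuple per binary. These are illustration values only.
counts = [(90, 10, 10), (80, 20, 0), (50, 0, 50)]

per_binary = [f1_score(tp, fp, fn) for tp, fp, fn in counts]

# Assumed aggregation: harmonic mean of the per-binary F1 scores.
overall = harmonic_mean(per_binary)
```

The harmonic mean is pulled toward the lowest scores more strongly than the arithmetic mean, so under this assumed aggregation a single poorly disassembled binary lowers the overall figure noticeably.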

History

Date

2024-02-16

Degree Type

  • Dissertation

Department

  • Electrical and Computer Engineering

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Jia Limin
