Carnegie Mellon University

Improving Patch Quality by Enhancing Key Components of Automatic Program Repair

thesis
posted on 2021-05-06, 19:33, authored by Mauricio Soto Gonzalez
Repairing errors in software systems has historically been a resource-consuming task that relies heavily on manual developer effort. Automatic program repair (APR) approaches have enabled the repair of software with minimal human interaction, mitigating the burden on developers, reducing the cost of manual debugging, and increasing software quality. However, a fundamental problem with current automatic program repair approaches is that they can generate low-quality patches that overfit to one program specification, the one described by the guiding test suite, and fail to generalize to the intended specification. This dissertation rigorously explores this phenomenon on real-world Java programs and describes a set of mechanisms that enhance key components of the automatic program repair process to generate higher-quality patches.
The first mechanism is an analysis of test suite behavior and of the test suite characteristics that matter for automatic program repair. We analyze the effectiveness of three well-known repair techniques, GenProg, PAR, and TrpAutoRepair, on defects made by the projects’ developers during their regular development process, and we analyze the impact that modifying characteristics such as size, coverage, provenance, and number of failing test cases has on the quality of the produced patches.
The second mechanism toward increasing patch quality addresses a set of research questions aimed at analyzing developer code changes to inform the mutation operator selection distribution. We create a probabilistic model that describes how often human developers choose each of the mutation operators available to automated repair techniques, and we then use this model to create an APR approach informed by this distribution to generate higher-quality patches.
Finally, the third mechanism is a repair technique based on patch diversity as a means to increase the quality of the best-performing patch in a patch population, together with an evaluation of patch consolidation as a mechanism to increase patch quality. Some of the main findings in this dissertation are:
• Using our open-source framework JaRFly, we were able to generate 68 patches for the 357 analyzed defects.
• Fundamental test suite characteristics such as test suite coverage, size, provenance, and number of triggering test cases determine the quality of the resulting plausible patches generated by automated program repair.
• An automatic program repair technique informed by a human-based mutation operator distribution increases the quality of the generated patches when compared to other APR techniques.
• Current APR approaches typically lack diversity in their generated patches. We propose and evaluate a set of diversity-driven techniques that increase the semantic diversity of the patch pool and the quality of the best-performing patch in the patch population. Finally, we analyze how patch consolidation can be used to increase patch quality.
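
The idea of a mutation operator selection distribution informed by developer changes can be illustrated with a minimal sketch. It is not the dissertation's implementation or JaRFly's API: the operator names, the observed counts, and the function names `build_operator_model` and `choose_operator` are all hypothetical, chosen only to show how observed frequencies can bias operator choice instead of selecting operators uniformly.

```python
import random
from collections import Counter


def build_operator_model(observed_edits):
    """Convert a list of observed operator names into a probability distribution."""
    counts = Counter(observed_edits)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}


def choose_operator(model, rng):
    """Sample one mutation operator, weighted by its observed frequency."""
    operators = list(model)
    weights = [model[op] for op in operators]
    return rng.choices(operators, weights=weights, k=1)[0]


# Hypothetical data: which edit kind each mined developer change corresponds to.
observed = ["append"] * 5 + ["replace"] * 3 + ["delete"] * 2

model = build_operator_model(observed)

# A seeded generator keeps the sampling reproducible.
rng = random.Random(0)
sample = [choose_operator(model, rng) for _ in range(1000)]
```

In this sketch a repair loop would call `choose_operator` each time it mutates a candidate patch, so operators that developers use more often are tried more often.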

History

Date

2021-02-24

Degree Type

  • Dissertation

Department

  • Institute for Software Research

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Claire Le Goues
