Carnegie Mellon University
Browse
Mariposa: Measuring SMT Instability in Automated Program Verification.pdf (2.82 MB)

Mariposa: Measuring SMT Instability in Automated Program Verification (Technical Report)

Download (2.82 MB)
report
posted on 2023-08-23, 18:20 authored by Yi ZhouYi Zhou, Jay BosamiyaJay Bosamiya, Yoshiki TakashimaYoshiki Takashima, Jessica Li, Marijn Heule, Bryan ParnoBryan Parno

This is a technical report in supplementary to the paper titiled "Mariposa: Measuring SMT Instability in Automated Program Verification" at  FMCAD 2023.This report includes extended experiments results not presetned the conference paper. 


Program verification has been successfully applied to increasingly  large  and  complex  systems.  Much of  this  recent  success  can  be  attributed  to  the  automation provided  by  dispatching  verification  condition  queries  viaSMT  solvers.  However,  multiple  teams  anecdotally  report that this style of automated verification is plagued by proof instability,  where  semantically  irrelevant  changes  to  the query can have large effects on the SMT solver’s response.

In this work, we present Mariposa, a tool to detect and quantify  instability.  To  better  understand  the  status  quo of  instability,  we  apply  Mariposa  to  a  set  of  17,043  SMT queries from six existing program verification projects. We discover  that  SMT  solver  upgrades  often  make  projects less stable, and that the most recent SMT solver version is unstable on 2.6% of the queries. For individual projects, the unstable ratio can grow to 5.0%. Based on our experimental results, we curate the Mariposa benchmark, which we hope will  help  measure  and  incentivize  stability  improvements in  SMT-based  program  verification

Funding

NSF 1901136

NSF 2015445

NSF 2224279

Grant from Intel Corporation

Grant from Intel Rolls-Royce

Amazon Research Awards (Fall 2022 CFP)

Prabhu and Poonam Goel Graduate Fellowship

Future Enterprise Security initiative at Carnegie Mellon CyLab

History

Date

2023-08-08

Usage metrics

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC