Carnegie Mellon University

Topes: Reusable Abstractions for Validating Data

journal contribution
Posted on 2008-05-01. Authored by Chris Scaffidi, Brad Myers, and Mary Shaw.
Programmers often omit input validation when inputs can appear in many different formats or when validation criteria cannot be precisely specified. To enable validation in these situations, we present a new technique that puts valid inputs into a consistent format and that identifies “questionable” inputs which might be valid or invalid, so that these values can be double-checked by a person or a program. Our technique relies on the concept of a “tope”, which is an application-independent abstraction describing how to recognize and transform values in a category of data. We present our definition of topes and describe a development environment that supports the implementation and use of topes. Experiments with web application and spreadsheet data indicate that using our technique improves the accuracy and reusability of validation code and also improves the effectiveness of subsequent data cleaning such as duplicate identification.
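As a rough illustration of the idea in the abstract, the sketch below models a tope as a set of recognizers that score an input and transformers that rewrite it into one consistent format. Everything here is an assumption made for illustration and not the authors' implementation: the `Tope` class, the `check` method, the three status labels, the 0.5 score for "questionable" inputs, and the choice of US phone numbers as the data category are all hypothetical.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical types: a recognizer returns a confidence in [0, 1], and a
# transformer rewrites a recognized value into the canonical format.
Recognizer = Callable[[str], float]
Transformer = Callable[[str], str]


class Tope:
    """Illustrative stand-in for a tope: recognizers plus transformers."""

    def __init__(self, name: str, formats: List[Tuple[Recognizer, Transformer]]):
        self.name = name
        self.formats = formats

    def check(self, value: str) -> Tuple[str, Optional[str]]:
        """Return (status, normalized value) using the best-scoring format."""
        best_score, best_out = 0.0, None
        for recognize, transform in self.formats:
            score = recognize(value)
            if score > best_score:
                best_score, best_out = score, transform(value)
        if best_score == 0.0:
            return "invalid", None
        status = "valid" if best_score == 1.0 else "questionable"
        return status, best_out


def _digits(s: str) -> str:
    return re.sub(r"\D", "", s)


def _canonical(s: str) -> str:
    # Assumed canonical format: 412-555-0100
    d = _digits(s)
    return f"{d[0:3]}-{d[3:6]}-{d[6:10]}"


def _strict(s: str) -> float:
    # Well-known layouts are accepted with full confidence.
    patterns = (r"\d{3}-\d{3}-\d{4}", r"\(\d{3}\) \d{3}-\d{4}", r"\d{3}\.\d{3}\.\d{4}")
    return 1.0 if any(re.fullmatch(p, s) for p in patterns) else 0.0


def _loose(s: str) -> float:
    # Ten digits in an unfamiliar layout: plausible, but worth a second look.
    return 0.5 if len(_digits(s)) == 10 else 0.0


PHONE = Tope("US phone number", [(_strict, _canonical), (_loose, _canonical)])

for raw in ["(412) 555-0100", "412 555 0100", "hello"]:
    print(raw, "->", PHONE.check(raw))
```

Returning a graded score instead of a yes/no answer is what lets a caller route "questionable" values to a person or a second program for review, while still normalizing them into the same format as the clearly valid ones.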
