Using traits of web macro scripts to predict reuse

Scaffidi, Chris; Bogart, Chris; Burnett, Margaret; Cypher, Allen; Shaw, Mary; Myers, Brad

doi:10.1184/R1/6626432.v1

journal contribution

posted on 2009-09-01, 00:00 authored by Chris Scaffidi, Chris Bogart, Margaret Burnett, Allen Cypher, Mary Shaw, Brad Myers

To help people find code that they might want to reuse, repositories of end-user code typically sort scripts by number of downloads, ratings, or other information based on prior uses of the code. However, this information is unavailable when code is new or when it has not yet been reused. Addressing this problem requires identifying reusable code based solely on information that exists when a script is created. To provide such a model for web macro scripts, we identified script traits that might plausibly predict reuse, then used IBM CoScripter repository logs to statistically test how well each corresponded to actual reuse. These tests confirmed that the traits generally did correspond to higher levels of reuse as anticipated. We then developed a machine learning model that uses these traits as features to predict reuse of macros. Evaluating this model on repository logs showed that its accuracy is comparable to that of existing machine learning models for predicting reuse—but with a much simpler structure. Sensitivity analysis revealed that our model is quite robust; its quality is greatly reduced only when parameters are set to such extreme values that the model becomes inordinately selective. Testing the model with individual traits revealed those that provided the best predictions on their own. Based on these results, we outline opportunities for using our model to improve repositories of end-user code.

History

Publisher Statement

This is the author’s version of a work that was accepted for publication. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version is available at http://dx.doi.org/10.1016/j.jvlc.2010.08.003

Date

2009-09-01

Keywords

end-user programming; end-user software engineering; repositories; reuse; web macros

Licence

In Copyright
