3 files

Modeling Productivity in Open Source GitHub Projects: A Dataset and Codebase

dataset
posted on 21.01.2020 by Samridhi Choudhary, Christopher Bogart, Carolyn Rose, James Herbsleb
Contains events associated with 16,337 Python PyPI projects hosted on GitHub, covering 2012 through October 2017. The data were scraped from GitHub (acceptable use per the terms of service: https://help.github.com/en/github/site-policy/github-terms-of-service). Also contains an analysis of bursts of activity within each project, performed by (1) using a hidden Markov model to identify "busy" spans of days, (2) calculating metrics of social and technical dependencies among the people and artifacts involved in each burst, and (3) calculating sociotechnical congruence by comparing the social and technical dependency networks.
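
As a rough illustration of steps (1) and (3), the sketch below decodes daily event counts into "quiet" and "busy" states with a two-state hidden Markov model, then scores congruence as the share of technical dependency pairs that are matched by a social tie. Everything here is an assumption made for illustration and is not taken from the dataset's codebase: the function names, the Poisson emission model, the hand-picked rates and transition probability, and the ratio-style congruence definition are all hypothetical stand-ins for whatever the authors actually used.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def busy_days(counts, rates=(0.5, 8.0), stay=0.9):
    """Viterbi-decode daily event counts into quiet (0) or busy (1) states.

    rates and stay are illustrative, hand-picked values, not fitted parameters.
    """
    if not counts:
        return []
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)
    # Uniform prior over the two states on the first day.
    score = [math.log(0.5) + poisson_logpmf(counts[0], r) for r in rates]
    back = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            # Best predecessor for state s: stay in s, or switch from the other state.
            stay_score = score[s] + log_stay
            switch_score = score[1 - s] + log_switch
            if stay_score >= switch_score:
                best, prev = stay_score, s
            else:
                best, prev = switch_score, 1 - s
            new_score.append(best + poisson_logpmf(k, rates[s]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    # Trace the most likely state sequence backwards from the final day.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def congruence(technical_pairs, social_pairs):
    """Fraction of technical dependency pairs also present as social ties.

    Pairs are treated as undirected. Returns None when there are no
    technical dependencies to satisfy.
    """
    technical = {frozenset(p) for p in technical_pairs}
    social = {frozenset(p) for p in social_pairs}
    if not technical:
        return None
    return len(technical & social) / len(technical)


if __name__ == "__main__":
    daily_events = [0, 0, 1, 0, 9, 12, 7, 10, 0, 1, 0]  # made-up counts
    print(busy_days(daily_events))
    print(congruence({("ann", "bo"), ("bo", "cy")}, {("bo", "ann")}))
```

In practice the model parameters would be estimated from the event data rather than fixed by hand, and the dependency networks would be built from the people and artifacts observed in each burst.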

Funding

BIGDATA: Collaborative Research: F: Study of a Cyber-Enabled Social Computing Framework for Improving Practice in Online Computing Communities

Directorate for Computer & Information Science & Engineering

CIF21 DIBBs: Building a Scalable Infrastructure for Data-Driven Discovery and Innovation in Education

Directorate for Computer & Information Science & Engineering

BIGDATA: Collaborative Research: IA: OSCAR - Open Source Supply Chains and Avoidance of Risk: An Evidence Based Approach to Improve FLOSS Supply Chains

Directorate for Computer & Information Science & Engineering

History

Date

29/12/2019

Licence
