This README.txt file was generated on 20170210 by Alan Shteyman


-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: Saved Data Results from the Workflow to Analysis Imputation for Microsatellite / Cell Datasets for Reconstructing Cell Lineage Maps.

2. Author Information

   First Author Contact Information
      Name: Alan Shteyman
      Institution: Carnegie Mellon University
      Address:
      Email: ashteyma@andrew.cmu.edu

   Corresponding Author Contact Information
      Name: Russell Schwartz
      Institution: Carnegie Mellon University
      Address:
      Email: russells@andrew.cmu.edu

   Author Contact Information (if applicable)
      Name: Ruchi Asthana
      Institution: Carnegie Mellon University
      Address:
      Email: rasthana@andrew.cmu.edu


---------------------
DATA & FILE OVERVIEW
---------------------

Directory of Files

   A. Filename: "Frumkin2005_treea_processedandreformateddata_Fri Feb 10 2017"
      Short description: A folder holding one of the three datasets taken from Frumkin et al. 2005, converted to a different format, and analyzed with the workflow described in the paper. The original dataset is in the init_datasets subfolder; the outputs shown in the paper are in the other subfolders.

   B. Filename: "Frumkin2005_treeb_processedandreformateddata_Fri Feb 10 2017"
      Short description: A folder holding one of the three datasets taken from Frumkin et al. 2005, converted to a different format, and analyzed with the workflow described in the paper. The original dataset is in the init_datasets subfolder; the outputs shown in the paper are in the other subfolders.

   C. Filename: "Frumkin2005_treec_processedandreformateddata_Fri Feb 10 2017"
      Short description: A folder holding one of the three datasets taken from Frumkin et al. 2005, converted to a different format, and analyzed with the workflow described in the paper. The original dataset is in the init_datasets subfolder; the outputs shown in the paper are in the other subfolders.

   D. Filename: "frumkin2015work"
      Short description: A folder holding all the code used to generate the dataset files entered into the analysis, along with the semi-processed datasets that the code converted into fully usable datasets for the workflow analysis.

   E. Filename: instructionstoeasilyaccessdata.txt
      Short description: Instructions for opening and analyzing the data.

File Naming Convention: For each top-level folder: "datasetsource"_"dataset"_"processedandreformateddata"_"dategenerated"


-----------------------------------------
DATA DESCRIPTION FOR: "Frumkin2005_treea_processedandreformateddata_Fri Feb 10 2017"
-----------------------------------------

1. In subfolder init_datasets\data_tables2a: the data as an Excel file: preprocessMSdataproperlyformateddatafortables2a.xlsx

2. In subfolder init_datasets: all data saved as Python objects in a pickle file: saveddatastructs.pkl

3. All other subfolders: saved data for each phase of the analysis workflow, stored as a folder tree with data in either Excel or pickle (.pkl) format


-----------------------------------------
DATA DESCRIPTION FOR: "Frumkin2005_treeb_processedandreformateddata_Fri Feb 10 2017"
-----------------------------------------

1. In subfolder init_datasets\data_tables2b: the data as an Excel file: preprocessMSdataproperlyformateddatafortables2b.xlsx

2. In subfolder init_datasets: all data saved as Python objects in a pickle file: saveddatastructs.pkl

3. All other subfolders: saved data for each phase of the analysis workflow, stored as a folder tree with data in either Excel or pickle (.pkl) format


-----------------------------------------
DATA DESCRIPTION FOR: "Frumkin2005_treec_processedandreformateddata_Fri Feb 10 2017"
-----------------------------------------

1. In subfolder init_datasets\data_tables2c: the data as an Excel file: preprocessMSdataproperlyformateddatafortables2c.xlsx

2. In subfolder init_datasets: all data saved as Python objects in a pickle file: saveddatastructs.pkl

3. All other subfolders: saved data for each phase of the analysis workflow, stored as a folder tree with data in either Excel or pickle (.pkl) format


--------------------------
METHODOLOGICAL INFORMATION
--------------------------

1. Software-specific information:

   Name: Python
   Version: 2.7.10 on Ubuntu 14
   System Requirements: none
   Additional Notes: Install all packages needed to run the workflow before opening the pickle files; otherwise loading them may fail.

   Name: Java
   Version: 1.5+
   System Requirements: default

   Name: FisherExact
   Version: default
   System Requirements: default

   Name: TreeCmp
   Version: 1.0
   System Requirements: Java 1.5+ (Java 1.7 and 1.8 were used)
   Additional Notes: See the separately stored files containing the actual workflow code for the packages used and their versions.

3. Date of data collection: 20170210
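

--------------------------
EXAMPLE: LOCATING AND LOADING THE SAVED DATA
--------------------------

The folders described above store their data as Excel and pickle (.pkl) files. The following is a minimal sketch, not part of the original workflow code, of how those files might be listed and how a pickle file might be loaded. The function names find_data_files and load_pickle are hypothetical. The sketch assumes that every package the workflow depends on is already installed (see the Additional Notes for Python above), and that the pickle files, having been written under Python 2.7, may need the latin1 fallback when read from Python 3. Pickle files can execute code when loaded, so only open files from a source you trust.

```python
import os
import pickle


def find_data_files(root, extensions=(".pkl", ".xlsx")):
    """Return a sorted list of files under root ending in one of extensions.

    Hypothetical helper: walks the whole folder tree, since the saved data
    for each workflow phase is spread across subfolders.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(tuple(extensions)):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def load_pickle(path):
    """Load one pickle file (hypothetical helper).

    Tries the default settings first. Under Python 3, a pickle written by
    Python 2 can raise UnicodeDecodeError; in that case the file is re-read
    with encoding="latin1" (an assumption about what these files need).
    """
    with open(path, "rb") as handle:
        try:
            return pickle.load(handle)
        except UnicodeDecodeError:
            handle.seek(0)
            return pickle.load(handle, encoding="latin1")


def main():
    # Folder name taken from the Directory of Files; adjust to your location.
    root = "Frumkin2005_treea_processedandreformateddata_Fri Feb 10 2017"
    if not os.path.isdir(root):
        print("Folder not found: " + root)
        return
    for path in find_data_files(root):
        print(path)
    data = load_pickle(os.path.join(root, "init_datasets", "saveddatastructs.pkl"))
    print(type(data))


if __name__ == "__main__":
    main()
```

The .xlsx files can be opened directly in any spreadsheet program. The structure of the objects inside saveddatastructs.pkl is not documented here, so the sketch only prints the type of the loaded object as a starting point for inspection.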