This README.txt file was generated on <2021-01-20> by


-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: The evolution of quantitative sensitivity

2. Author Information

   Co-First Author
      Name: Margaret A. H. Bryer
      Institution: Carnegie Mellon University and University of California, Berkeley

   Co-First Author
      Name: Sarah E. Koopman
      Institution: University of St. Andrews

   Corresponding Author
      Name: Jessica F. Cantlon
      Institution: Carnegie Mellon University

   Author
      Name: Steven T. Piantadosi
      Institution: University of California Berkeley

   Author
      Name: Evan L. MacLean
      Institution: University of Arizona

   Author Contact Information
      Name: Joseph M. Baker
      Institution: Stanford University

   Author
      Name: Michael J. Beran
      Institution: Georgia State University

   Author
      Name: Sarah M. Jones
      Institution: Berea College

   Author
      Name: Kerry E. Jordan
      Institution: Utah State University

   Author
      Name: Salif Mahamane
      Institution: Western Colorado University

   Author
      Name: Andreas Nieder
      Institution: University of Tübingen

   Author
      Name: Bonnie M. Perdue
      Institution: Agnes Scott College

   Author
      Name: Friederike Range
      Institution: University of Veterinary Medicine Vienna

   Author
      Name: Jeffrey R. Stevens
      Institution: University of Nebraska-Lincoln

   Author
      Name: Masaki Tomonaga

   Author
      Name: Dorottya J. Ujfalussy
      Institution: Eötvös Loránd University of Sciences

   Author
      Name: Jennifer Vonk
      Institution: Oakland University


---------------------
DATA & FILE OVERVIEW
---------------------

#
# Directory of Files in Dataset: List and define the different
# files included in the dataset. This serves as its table of
# contents.
#

Directory of Files:

   A. Filename: NumberData.csv
      Short description: Compiled performance data from quantitative discrimination studies across 33 bird and mammal species.

   B. Filename: SpeciesPredictors.csv
      Short description: Compiled brain, social and ecological predictors for 33 bird and mammal species.

   C. Filename: WSpeciesTree.nex
      Short description: Phylogenetic tree information for 33 species, generated at timetree.org.

   D. Filename: model1.stan
      Short description: Bayesian model that predicts Weber fraction per species.

   E. Filename: runmodel1.R
      Short description: R script to run model 1 (File D) using the package rstan and generate Figure 2 boxplot of posterior quantiles of Weber fraction for each species.

   F. Filename: TreeplotWSpecies.R
      Short description: R script to create phylogenetic tree of 33 species for Figure 2.

   G. Filename: Figure2inset.R
      Short description: R script to generate figure of lambda parameter output of model 2 (File H).

   H. Filename: model2.stan
      Short description: Bayesian model that predicts influence of a predictor on Weber fraction.

   I. Filename: runmodel2.R
      Short description: R script to run model 2 (File H) using the package rstan.

   J. Filename: Figure3.R
      Short description: R script to generate B1 posterior credible intervals from model2 (File H) output by predictor.

   K. Filename: Figure4.R
      Short description: R script to generate posterior credible intervals for scale parameters and B1 for one predictor from model 2 (File H) output.

   L. Filename: Figure5.R
      Short description: R script that pulls from SpeciesPredictors.csv (File B) where mean species Weber fraction was recorded from model 1 (File D) output to generate scatterplots.

Additional Notes on File Relationships, Context, or Content (for example, if a user wants to reuse and/or cite your data, what information would you want them to know?):

Our main dataset (NumberData.csv, File A) is a compilation of behavioral performance data collected by coauthors as well as published open access data on number discrimination tasks. The social, ecological and brain variables (SpeciesPredictors.csv, File B) were collected from published literature. The two Bayesian models implemented in this analysis (File D and File H) were written by author Steven T. Piantadosi.
Bayesian models were run using Stan and rstan (the R interface with Stan), so files for model runs and generating figures are .stan or .R files.

#
# File Naming Convention: Define your File Naming Convention
# (FNC), the framework used for naming your files systematically
# to describe what they contain, which could be combined with the
# Directory of Files.
#

File Naming Convention: File names are kept short and simple and correspond to the figure in the publication that they generate, the model that they define or run, or a simple description of their content.

#
# Data Description: A data description, dictionary, or codebook
# defines the variables and abbreviations used in a dataset. This
# information can be included in the README file, in a separate
# file, or as part of the data file. If it is in a separate file
# or in the data file, explain where this information is located
# and ensure that it is accessible without specialized software.
# (We recommend using plain text files or tabular plain text CSV
# files exported from spreadsheet software.)
#

-----------------------------------------
DATA DESCRIPTION FOR: [NumberData.csv]
-----------------------------------------

1. Number of variables: 8

2. Number of cases/rows: 6756

3. Missing data codes:

      Code/symbol     Definition
      NA              Missing data

   Note that there are no missing data in this particular file.

4. Variable List

   A. Name: species
      Description: Species of animal, numbered 1-33

   B. Name: study
      Description: Quantitative research study, numbered. Study number is unique for each species and task.

   C. Name: subject
      Description: Individual animal research subject, numbered.

   D. Name: n1
      Description: The first number being compared in the quantity discrimination.

   E. Name: n2
      Description: The second number being compared in the quantity discrimination.

   F. Name: ntrials
      Description: The total number of trials of that quantity discrimination.

   G. Name: ncorrect
      Description: The number of correct trials of that quantity discrimination in choosing the larger quantity.

   H. Name: task
      Description: Task paradigm
         1-controlled array
         2-sequential
         3-simultaneous

   I. Name: speciesname
      Description: Species scientific name for clarity to the reader to supplement species numbers (A).

   J. Name: refTableS1
      Description: The publication from which the line of data is drawn. Note that one publication can contain multiple studies (B).

   K. Name: taskname
      Description: Task paradigm name for clarity to the reader to supplement task paradigm number (H).

-----------------------------------------
DATA DESCRIPTION FOR: [SpeciesPredictors.csv]
-----------------------------------------

1. Number of variables: 33

2. Number of cases/rows: 33

3. Missing data codes:

      Code/symbol     Definition
      NA              Missing data

4. Variable List

   A. Name: species
      Description: Scientific name of animal species.

   B. Name: name
      Description: Common name of animal species.

   C. Name: species
      Description: Animal species numbered 1-33

   D. Name: group
      Description: Taxonomic group to which an animal species belongs:
         Bird
         Mammal (non-primate mammal)
         Primate

   E. Name: w
      Description: Mean Weber fraction (w) per species generated by model1

   F. Name: sd
      Description: Standard deviation in Weber fraction per species generated by model1

   G. Name: endocranial
      Description: Endocranial volume for each species (see Figure S1 for references)

   H. Name: residualbrain
      Description: Residual brain volume for each species (see Figure S1 for references)

   I. Name: corticalneurons
      Description: Number of cortical or pallium neurons (see Figure S1 for references)

   J. Name: logcortneurons
      Description: Log of number of cortical or pallium neurons

   K. Name: cortneurmillions
      Description: Number of cortical or pallium neurons in millions

   L. Name: ecv.no.outlier
      Description: Endocranial volume with one outlier removed

   M. Name: rbv.no
      Description: Residual brain volume with two outliers removed

   N. Name: cortneur.no
      Description: Cortical or pallium neuron number with one outlier removed

   O. Name: frugivory
      Description: Percentage of the diet made up of fruit (see Figure S1 for references)

   P. Name: logpopsize
      Description: Log of population size (see Figure S1 for references)

   Q. Name: logforagesize
      Description: Log of foraging group size (variable not used in final analyses). The distinction between foraging group size and population size is made in Nunn and van Schaik (2001).

   R. Name: loghomerange
      Description: Log of home range size (see Figure S1 for references)

   S. Name: logdayjourney
      Description: Log of day journey length (see Figure S1 for references)

   T. Name: selfcontrol
      Description: Performance on self-control tasks excluding more recently published corvid values (see Figure S1 for references)

   U. Name: selfcontrol2
      Description: Performance on self-control tasks with all available values including corvids (see Figure S1 for references)

   V. Name: gencog
      Description: General cognitive score from Deaner, van Schaik, and Johnson 2006

   W. Name: gencogrev
      Description: General cognitive score from Deaner, van Schaik, and Johnson 2006 with score reversed for clarity so that a higher score indicates high general cognitive ability.

   X. Name: cortneurdensity
      Description: Cortical or pallium neuron density (see Figure S1 for references)

   Y. Name: logcortneurdens
      Description: Log of cortical or pallium neuron density.

   Z. Name: cerebellneurdensity
      Description: Cerebellum neuron density (see Figure S1 for references)

   AA. Name: logcerebneurdens
       Description: Log of cerebellum neuron density

   AB. Name: cerebellumneurons
       Description: Number of cerebellum neurons (see Figure S1 for references)

   AC. Name: logcerebneurons
       Description: Log of number of cerebellum neurons

   AD. Name: cerebellumneuronsmill
       Description: Number of cerebellum neurons in millions

   AE. Name: cerebellneurnoelephant
       Description: Number of cerebellum neurons with one outlier removed

   AF. Name: logcerebneuronsno
       Description: Log of number of cerebellum neurons with one outlier removed

   AG. Name: cerebellneurnoomill
       Description: Number of cerebellum neurons in millions with one outlier removed

-----------------------------------------
DATA DESCRIPTION FOR: [WSpeciesTree.nex]
-----------------------------------------

This phylogenetic tree information for 33 bird and mammal species was generated at timetree.org, where you can load a list of species and download the tree in Newick tree format. The Newick format describes a phylogenetic tree as a string of text. Parentheses are used to group sequence names, and branch lengths are included using colons followed by the length. The file was converted to NEXUS, which is considered a standard file format for phylogeny information. The file was updated using the R package ape. Updates to tree data are based on Miller and Lambert 2006 and Savolainen 2002.

-----------------------------------------
DESCRIPTION FOR: [model1.stan]
-----------------------------------------

Our first model, defined in the Stan modeling language. The model includes a data block, a parameters block, a model block, and a generated quantities block.

Data block: Defines the data the model expects, which will be pulled from File A (and File B only for species), and assigns these data to Stan variables with specified data types. The covariance matrix from the phylogeny is also passed in here.

Parameters block: Defines the parameters that will be used in the model, specifically B0, lambda_w (parameter of the phylogenetic regression), and the random effects.

Model block: Defines the statistical model, including the priors and the likelihood. The likelihood in the case of our model is based on the psychophysics of number. This block also addresses missing data points through missing data imputation.
Generated quantities block: This block is executed after each sample has been generated and computes the average Weber fraction value per species.

-----------------------------------------
DESCRIPTION FOR: [runmodel1.R]
-----------------------------------------

Using rstan, the R interface to Stan, this R script runs Stan model 1 (File D) and generates Weber fraction values and a boxplot per species (posterior quantiles) to match up with the phylogenetic tree.

-----------------------------------------
DESCRIPTION FOR: [TreeplotWSpecies.R]
-----------------------------------------

Using the R package ggtree ('ggtree' extends the 'ggplot2' plotting system for visualization and annotation of phylogenetic trees), this R script creates a tree of the 33 species using the phylogenetic tree information file (File C).

-----------------------------------------
DESCRIPTION FOR: [Figure2inset.R]
-----------------------------------------

This R script creates a simple figure (Figure 2 inset) of lambda parameter output from model 2 (File H). Before running this script, first generate a .csv file (name it "lambda_inset.csv") containing the lambda_w posterior l.50 values for all predictors from the "Dall.csv" file that model 2 generates. Then calculate the mean of the 0.05 and the mean of the 0.95 uncertainty interval bounds in the "Dall.csv" file from model 2 output, and write these values into the ymin and ymax of the script.

-----------------------------------------
DESCRIPTION FOR: [model2.stan]
-----------------------------------------

Our second model, defined in the Stan modeling language. The model includes a data block, a parameters block, and a model block.

Data block: Defines the data the model expects, which will be pulled from File A and File B, and assigns these data to Stan variables with specified data types. The covariance matrix from the phylogeny is also passed in here.
Unlike in model 1 (File D), model 2 defines and assigns a predictor variable (which will be one of the social, ecological or brain variables in File B).

Parameters block: Defines the parameters that will be used in the model, specifically B0, B1, lambda_w (parameter of the phylogenetic regression), and the random effects.

Model block: Defines the statistical model, including the priors and the likelihood. The likelihood in the case of our model is based on the psychophysics of number. This block also addresses missing data points through missing data imputation.

-----------------------------------------
DESCRIPTION FOR: [runmodel2.R]
-----------------------------------------

Using rstan, the R interface to Stan, this R script runs Stan model 2 (File H) and generates a .csv of posterior quantiles by predictor called "Dall.csv". For each predictor run, the script also includes code that uses the bayesplot R package to plot posterior interval estimates and histograms for the B0, B1 and lambda parameters.

-----------------------------------------
DESCRIPTION FOR: [Figure3.R]
-----------------------------------------

This R script graphs B1 posterior credible intervals from model 2 (File H) output by predictor.

-----------------------------------------
DESCRIPTION FOR: [Figure4.R]
-----------------------------------------

This R script uses the bayesplot package to graph posterior credible intervals for scale parameters and one predictor, group size.

-----------------------------------------
DESCRIPTION FOR: [Figure5.R]
-----------------------------------------

This R script pulls from File B, where mean species Weber fraction (w) (Variable E) was recorded from model 1 (File D) output, and generates scatterplots of w by predictors of interest.
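
The model descriptions above say that the likelihood is based on the psychophysics of number, but this README does not spell out its form. As a rough illustration only, the sketch below assumes a commonly used Weber-fraction formulation, P(correct) = Phi(|n1 - n2| / (w * sqrt(n1^2 + n2^2))), applied to rows laid out like NumberData.csv (File A). It is written in Python with the standard library only and is independent of the R and Stan files in this dataset. The sample rows, the function names, and the grid search are all hypothetical; model1.stan and model2.stan estimate w in a Bayesian framework with phylogenetic covariance and may define the likelihood differently.

```python
import csv
import io
import math

# Hypothetical rows using the NumberData.csv column names.
# The values are invented for illustration, not taken from the dataset.
SAMPLE_CSV = """species,study,subject,n1,n2,ntrials,ncorrect,task
1,1,1,1,2,40,37,3
1,1,1,2,3,40,32,3
1,1,1,4,5,40,27,3
1,1,1,7,8,40,24,3
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with integer values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: int(value) for key, value in row.items()} for row in reader]


def p_correct(n1, n2, w):
    """Assumed probability of choosing the larger quantity for Weber fraction w."""
    z = abs(n1 - n2) / (w * math.sqrt(n1 ** 2 + n2 ** 2))
    p = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF at z
    # Clamp away from 0 and 1 so the logarithms below stay finite.
    return min(max(p, 1e-9), 1.0 - 1e-9)


def log_likelihood(rows, w):
    """Binomial log-likelihood of ncorrect out of ntrials, summed over rows.

    The binomial coefficient is omitted because it does not depend on w.
    """
    total = 0.0
    for r in rows:
        p = p_correct(r["n1"], r["n2"], w)
        k, n = r["ncorrect"], r["ntrials"]
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def fit_w(rows, grid=None):
    """Pick the w on a coarse grid that maximizes the log-likelihood."""
    grid = grid or [i / 100 for i in range(5, 201)]
    return max(grid, key=lambda w: log_likelihood(rows, w))


rows = load_rows(SAMPLE_CSV)
w_hat = fit_w(rows)
for r in rows:
    observed = r["ncorrect"] / r["ntrials"]
    predicted = p_correct(r["n1"], r["n2"], w_hat)
    print(f"{r['n1']} vs {r['n2']}: observed {observed:.2f}, predicted {predicted:.2f}")
print(f"grid-search estimate of w: {w_hat:.2f}")
```

Under this assumed form, predicted accuracy depends only on the ratio between n1 and n2, so 1 vs 2 and 2 vs 4 are equally discriminable, and a smaller w corresponds to sharper quantitative sensitivity.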
--------------------------
METHODOLOGICAL INFORMATION
--------------------------

#
# Software: If specialized software(s) generated your data or
# are necessary to interpret it, please provide for each (if
# applicable): software name, version, system requirements,
# and developer.
#If you developed the software, please provide (if applicable):
#A copy of the software’s binary executable compatible with the system requirements described above.
#A source snapshot or distribution if the source code is not stored in a publicly available online repository.
#All software source components, including pointers to source(s) for third-party components (if any)

1. Software-specific information:

Name: Stan
Version: 2.21.0
System Requirements: runs on all major platforms (Linux, Mac, Windows)
Open Source? (Y/N): Y

(if available and applicable)
Executable URL:
Source Repository URL:
Developer: The Stan Development Team
Product URL: https://mc-stan.org/
Software source components:

Additional Notes(such as, will this software not run on certain operating systems?):


Name: R
Version: 4.1.1
System Requirements: runs on all major platforms (Linux, Mac, Windows)
Open Source? (Y/N): Y

(if available and applicable)
Executable URL:
Source Repository URL:
Developer: R Development Core Team
Product URL: https://www.r-project.org/
Software source components:

Additional Notes(such as, will this software not run on certain operating systems?): Analyses were conducted using the rstan package in R, which is the R interface for Stan.

#
# Equipment: If specialized equipment generated your data,
# please provide for each (if applicable): equipment name,
# manufacturer, model, and calibration information. Be sure
# to include specialized file format information in the data
# dictionary.
#

2. Equipment-specific information:

Manufacturer:
Model:

(if applicable)
Embedded Software / Firmware Name:
Embedded Software / Firmware Version:
Additional Notes:

#
# Dates of Data Collection: List the dates and/or times of
# data collection.
#

3. Date of data collection (single date, range, approximate date): This study is based on a meta-analysis. Data collection dates for behavioral data (File A) are reported in the original publications referred to in Figure S1. Predictor data (File B) were compiled from published literature by SEK and MAHB in 2017 and 2020.