This README.txt file was generated on <2021-01-20> by <Margaret A. H. Bryer>

-------------------
GENERAL INFORMATION
-------------------

1. Title of Dataset: The evolution of quantitative sensitivity


2. Author Information

Co-First Author
    Name: Margaret A. H. Bryer
    Institution: Carnegie Mellon University and University of California, Berkeley

Co-First Author
    Name: Sarah E. Koopman
    Institution: University of St. Andrews

Corresponding Author
    Name: Jessica F. Cantlon
    Institution: Carnegie Mellon University

Author
    Name: Steven T. Piantadosi
    Institution: University of California, Berkeley

Author
    Name: Evan L. MacLean
    Institution: University of Arizona
    
Author
    Name: Joseph M. Baker
    Institution: Stanford University

Author
    Name: Michael J. Beran
    Institution: Georgia State University

Author
    Name: Sarah M. Jones
    Institution: Berea College

Author
    Name: Kerry E. Jordan
    Institution: Utah State University

Author
    Name: Salif Mahamane
    Institution: Western Colorado University

Author
    Name: Andreas Nieder
    Institution: University of Tübingen

Author
    Name: Bonnie M. Perdue
    Institution: Agnes Scott College

Author
    Name: Friederike Range
    Institution: University of Veterinary Medicine Vienna

Author
    Name: Jeffrey R. Stevens
    Institution: University of Nebraska-Lincoln

Author
    Name: Masaki Tomonaga

Author
    Name: Dorottya J. Ujfalussy
    Institution: Eötvös Loránd University of Sciences 

Author
    Name: Jennifer Vonk	
    Institution: Oakland University


---------------------
DATA & FILE OVERVIEW
---------------------


Directory of Files:
   A. Filename:  NumberData.csv      
      Short description: Compiled performance data from quantitative discrimination studies across 33 bird and mammal species.  

        
   B. Filename:  SpeciesPredictors.csv     
      Short description:  Compiled brain, social and ecological predictors for 33 bird and mammal species.      


   C. Filename:  WSpeciesTree.nex      
      Short description: Phylogenetic tree information for 33 species, generated at timetree.org.


   D. Filename:  model1.stan      
      Short description: Bayesian model that predicts Weber fraction per species. 
 

   E. Filename:  runmodel1.R    
      Short description: R script to run model 1 (File D) using the package rstan and generate 
                         Figure 2 boxplot of posterior quantiles of Weber fraction for each species.
 

   F. Filename:  TreeplotWSpecies.R      
      Short description: R script to create phylogenetic tree of 33 species for Figure 2.


   G. Filename:   Figure2inset.R     
      Short description: R script to generate figure of lambda parameter output of model 2 (File H).
 

   H. Filename:  model2.stan   
      Short description: Bayesian model that predicts influence of a predictor on Weber fraction.


   I. Filename:  runmodel2.R 
      Short description: R script to run model 2 (File H) using the package rstan.


   J. Filename:  Figure3.R   
      Short description: R script to generate B1 posterior credible intervals from model 2 (File H) output by predictor.


   K. Filename: Figure4.R
      Short description: R script to generate posterior credible intervals for scale parameters and B1 for one predictor from model 2 (File H) output. 


   L. Filename: Figure5.R 
      Short description: R script that generates scatterplots from SpeciesPredictors.csv (File B), which includes
                         the mean species Weber fraction recorded from model 1 (File D) output.


Additional Notes on File Relationships, Context, or Content:

Our main dataset (NumberData.csv, File A) is a compilation of behavioral performance data collected by coauthors
as well as published open access data on number discrimination tasks. 
The social, ecological and brain variables (SpeciesPredictors.csv, File B) were collected from published literature. 
The two Bayesian models implemented in this analysis (File D and File H) were written by author Steven T. Piantadosi.
Bayesian models were run using Stan and rstan (the R interface with Stan), so files for model runs and generating figures are 
.stan or .R files.
      


File Naming Convention: File names are kept short and simple. Each name corresponds to the figure
in the publication that the file generates, the model that the file defines or runs, or a simple description of the file's content.



-----------------------------------------
DATA DESCRIPTION FOR: [NumberData.csv]
-----------------------------------------


1. Number of variables: 11


2. Number of cases/rows: 6756


3. Missing data codes:
        Code/symbol        Definition
        NA                 Missing data
        Note: there are no missing data in this particular file.


4. Variable List

    A. Name: species
       Description: Species of animal, numbered 1-33

    B. Name: study
       Description: Numeric identifier of the quantitative research study.
                    The study number is unique for each species and task.

    C. Name: subject
       Description: Numeric identifier of the individual animal research subject.

    D. Name: n1
       Description: The first number being compared in the quantity discrimination.

    E. Name: n2
       Description: The second number being compared in the quantity discrimination.

    F. Name: ntrials
       Description: The total number of trials of that quantity discrimination.

    G. Name: ncorrect
       Description: The number of correct trials of that quantity discrimination in choosing the larger quantity. 

    H. Name: task
       Description: Task paradigm, coded as:
                  1 = controlled array
                  2 = sequential
                  3 = simultaneous

    I. Name: speciesname
       Description: Species scientific name for clarity to the reader to supplement species numbers (A).
 
    J. Name: refTableS1
       Description: The publication from which the line of data is drawn.
                    Note that one publication can contain multiple studies (B).

    K. Name: taskname
       Description: Task paradigm name for clarity to the reader, supplementing the task paradigm number (H).
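
As an illustration of how the count variables above relate to one another, the
following Python sketch derives the numerical ratio and the proportion of correct
trials for each row. It is only a sketch: the rows are invented for illustration
and are not taken from NumberData.csv, the helper name summarize is hypothetical,
and the authors' own analysis is carried out in R and Stan (Files D through L).

```python
import csv
import io

# Hypothetical rows laid out with the documented column names; the values are
# invented for illustration and are NOT taken from NumberData.csv.
SAMPLE = """species,study,subject,n1,n2,ntrials,ncorrect,task
1,1,1,2,4,20,18,3
1,1,1,4,5,20,13,3
2,2,1,3,6,10,9,1
"""


def summarize(rows):
    """Return (species, ratio, proportion correct) for each data row."""
    out = []
    for row in rows:
        n1, n2 = int(row["n1"]), int(row["n2"])
        ntrials, ncorrect = int(row["ntrials"]), int(row["ncorrect"])
        ratio = min(n1, n2) / max(n1, n2)   # smaller quantity / larger quantity
        accuracy = ncorrect / ntrials       # proportion of correct trials
        out.append((int(row["species"]), ratio, accuracy))
    return out


summary = summarize(csv.DictReader(io.StringIO(SAMPLE)))
for species, ratio, accuracy in summary:
    print(f"species {species}: ratio {ratio:.2f}, accuracy {accuracy:.2f}")
```

To apply the same steps to the real file, the inline sample could be replaced by
an open file handle on NumberData.csv.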

-----------------------------------------
DATA DESCRIPTION FOR: [SpeciesPredictors.csv]
-----------------------------------------


1. Number of variables: 33


2. Number of cases/rows: 33


3. Missing data codes:
        Code/symbol        Definition
        NA        	   Missing data 


4. Variable List

    A. Name: species
       Description: Scientific name of animal species. 

    B. Name: name
       Description: Common name of animal species. 

    C. Name: species
       Description: Animal species numbered 1-33

    D. Name: group
       Description: Taxonomic group to which an animal species belongs:
                    Bird
                    Mammal (non-primate mammal)
                    Primate

    E. Name: w
       Description: Mean Weber fraction (w) per species generated by model1

    F. Name: sd
       Description: Standard deviation in Weber fraction per species generated by model1

    G. Name: endocranial
       Description: Endocranial volume for each species (see Figure S1 for references)

    H. Name: residualbrain
       Description: Residual brain volume for each species (see Figure S1 for references)

    I. Name: corticalneurons
       Description: Number of cortical or pallium neurons (see Figure S1 for references)

    J. Name: logcortneurons
       Description: Log of number of cortical or pallium neurons

    K. Name: cortneurmillions
       Description: Number of cortical or pallium neurons in millions

    L. Name: ecv.no.outlier
       Description: Endocranial volume with one outlier removed

    M. Name: rbv.no
       Description: Residual brain volume with two outliers removed

    N. Name: cortneur.no
       Description: Cortical or pallium neuron number with one outlier removed

    O. Name: frugivory
       Description: Percentage of the diet made up of fruit (see Figure S1 for references)

    P. Name: logpopsize
       Description: Log of population size (see Figure S1 for references)

    Q. Name: logforagesize
       Description: Log of foraging group size (variable not used in final analyses).
                    Distinction made between foraging group size and population size in Nunn and van Schaik (2001) 

    R. Name: loghomerange
       Description: Log of home range size (see Figure S1 for references)

    S. Name: logdayjourney
       Description: Log of day journey length (see Figure S1 for references) 

    T. Name: selfcontrol
       Description: Performance on self-control tasks excluding more recently published corvid values (see Figure S1 for references) 

    U. Name: selfcontrol2
       Description: Performance on self-control tasks with all available values including corvids (see Figure S1 for references)

    V. Name: gencog
       Description: General cognitive score from Deaner, van Schaik, and Johnson 2006

    W. Name: gencogrev
       Description: General cognitive score from Deaner, van Schaik, and Johnson 2006,
                    reversed for clarity so that a higher score indicates
                    higher general cognitive ability.

    X. Name: cortneurdensity
       Description: Cortical or pallium neuron density (see Figure S1 for references)

    Y. Name: logcortneurdens
       Description: Log of cortical or pallium neuron density.

    Z. Name: cerebellneurdensity
       Description: Cerebellum neuron density (see Figure S1 for references)

    AA. Name: logcerebneurdens
       Description: Log of cerebellum neuron density 

    AB. Name: cerebellumneurons
       Description: Number of cerebellum neurons (see Figure S1 for references)

    AC. Name: logcerebneurons
       Description: Log of number of cerebellum neurons
 
    AD. Name: cerebellumneuronsmill
       Description: Number of cerebellum neurons in millions

    AE. Name: cerebellneurnoelephant
       Description: Number of cerebellum neurons with one outlier removed 

    AF. Name: logcerebneuronsno
       Description: Log of number of cerebellum neurons with one outlier removed 

    AG. Name: cerebellneurnoomill
       Description: Number of cerebellum neurons in millions with one outlier removed 
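
Because this file uses NA as its missing data code, numeric columns need to be
read with that code mapped to a missing value. The Python sketch below shows one
way to do this. The rows are invented, the species labels are placeholders, only
three of the documented column names are used, and the helper name parse_value is
hypothetical.

```python
import csv
import io

# Hypothetical excerpt using three documented column names; all values and the
# species labels are invented for illustration.
SAMPLE = """name,frugivory,gencog
species_a,45,NA
species_b,NA,0.3
"""


def parse_value(text):
    """Convert a CSV field to float, mapping the NA code to None."""
    return None if text == "NA" else float(text)


rows = [
    {
        "name": r["name"],
        "frugivory": parse_value(r["frugivory"]),
        "gencog": parse_value(r["gencog"]),
    }
    for r in csv.DictReader(io.StringIO(SAMPLE))
]

# Keep only the species that have a value for the chosen predictor.
complete = [r["name"] for r in rows if r["frugivory"] is not None]
print(complete)
```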


-----------------------------------------
DATA DESCRIPTION FOR: [WSpeciesTree.nex]
-----------------------------------------
This phylogenetic tree information for 33 bird and mammal species was generated 
at timetree.org where you can load a list of species and download the tree in 
Newick tree format. The Newick format is used to describe a phylogenetic tree 
as a string of text. Parentheses are used to group sequence names, and branch lengths
are included using colons followed by the length.

The file was then converted to NEXUS, a standard file format for phylogenetic
information.

The file was updated using the R package ape. Updates to tree data are based on 
Miller and Lambert 2006 and Savolainen 2002.
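
To make the Newick conventions described above concrete, the Python sketch below
parses a small, made-up Newick string (not the tree in this dataset) and sums the
branch lengths from the root to each tip. It is a minimal illustration that
assumes unquoted labels, no comments, and a terminating semicolon; the function
names parse_newick and leaf_depths are hypothetical. The authors' own tree
handling uses the R packages ape and ggtree.

```python
def parse_newick(text):
    """Parse a Newick string into nested (name, branch_length, children) tuples."""
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1                          # consume "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1                      # consume ","
                children.append(parse_node())
            pos += 1                          # consume ")"
        start = pos
        while text[pos] not in ",():;":       # read an optional node label
            pos += 1
        name = text[start:pos]
        length = None
        if text[pos] == ":":                  # a colon introduces a branch length
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    return parse_node()


def leaf_depths(node, depth=0.0):
    """Return {tip name: summed branch length from the root}."""
    name, length, children = node
    depth += length or 0.0
    if not children:
        return {name: depth}
    out = {}
    for child in children:
        out.update(leaf_depths(child, depth))
    return out


# Toy tree: tips A and B are grouped together, and C is their sister.
tree = parse_newick("((A:1.0,B:1.0):2.0,C:3.0);")
depths = leaf_depths(tree)
print(depths)
```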


-----------------------------------------
DESCRIPTION FOR: [model1.stan]
-----------------------------------------
This file defines our first model in the Stan modeling language. The model includes a data block, a parameters block,
a model block, and a generated quantities block.

Data block: Defines the data the model expects, which will be pulled from File A (and from File B for species information only),
            and assigns these data to Stan variables with specified data types. The covariance matrix from the phylogeny
            is also passed in here.

Parameters block: Defines the parameters that will be used in the model, specifically B0, lambda_w (parameter 
                  of the phylogenetic regression), and the random effects. 

Model block: Defines the statistical model, including the priors and the likelihood. The likelihood in
             the case of our model is based on the psychophysics of number. This block also handles missing data
             points through imputation.

Generated quantities block: This block is executed after a sample has been generated, and computes average
                            Weber fraction value per species. 
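
For readers unfamiliar with Weber fractions, the Python sketch below shows one
commonly used psychophysical formulation in which the probability of choosing the
larger quantity depends on the two numbers and on the Weber fraction w, combined
with a binomial likelihood over ntrials and ncorrect. This is an assumption made
for illustration only: this README does not state the exact likelihood in
model1.stan, so consult that file for the form actually used. The function names
p_correct and binomial_loglik are hypothetical.

```python
import math


def p_correct(n1, n2, w):
    """Probability of choosing the larger quantity, assuming Gaussian number
    representations whose standard deviation grows in proportion to the number
    (scalar variability) with Weber fraction w. Illustrative assumption only."""
    z = abs(n1 - n2) / (w * math.sqrt(n1 ** 2 + n2 ** 2))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF


def binomial_loglik(ncorrect, ntrials, p):
    """Log-likelihood of ncorrect successes in ntrials trials, given 0 < p < 1."""
    return (
        math.log(math.comb(ntrials, ncorrect))
        + ncorrect * math.log(p)
        + (ntrials - ncorrect) * math.log(1.0 - p)
    )


# Under this formulation accuracy depends on the ratio of the two numbers,
# so 2 vs 4 and 4 vs 8 are predicted to be equally discriminable.
print(p_correct(2, 4, 0.3), p_correct(4, 8, 0.3))

# A smaller Weber fraction (sharper discrimination) predicts higher accuracy.
print(p_correct(2, 4, 0.3) > p_correct(2, 4, 0.6))

# Log-likelihood of a made-up observation: 18 correct out of 20 trials on 2 vs 4.
print(binomial_loglik(18, 20, p_correct(2, 4, 0.3)))
```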
            
            
-----------------------------------------
DESCRIPTION FOR: [runmodel1.R]
-----------------------------------------
Using rstan, the R interface to Stan, this R script runs Stan model 1 (File D)
and generates per-species Weber fraction values and a boxplot of posterior quantiles per species
that lines up with the phylogenetic tree.


-----------------------------------------
DESCRIPTION FOR: [TreeplotWSpecies.R]
-----------------------------------------
Using the R package ggtree ('ggtree' extends the 'ggplot2' plotting system for visualization and annotation 
of phylogenetic trees), this R script creates a tree of the 33 species using the phylogenetic tree information 
file (File C).


-----------------------------------------
DESCRIPTION FOR: [Figure2inset.R]
-----------------------------------------
This R script creates a simple figure (Figure 2 inset) of the lambda parameter output from model 2 (File H). Before running this script,
first create a .csv file (name it "lambda_inset.csv") containing the lambda_w posterior l.50 values for all predictors from the "Dall.csv" file
generated by model 2. Then calculate the mean of the 0.05 and the mean of the 0.95 uncertainty interval bounds in "Dall.csv",
and write these values into ymin and ymax in the script.
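
The preparation steps above can be sketched as follows. This Python example is
illustrative only: the layout of "Dall.csv" is not documented in this README, so
the column names used here (predictor, parameter, l.05, l.50, l.95) and all
values are assumptions that should be adjusted to match the file that
runmodel2.R actually produces.

```python
import csv
import io
from statistics import mean

# Hypothetical stand-in for "Dall.csv". The column names and every value are
# assumptions made for illustration, not the documented layout of that file.
SAMPLE = """predictor,parameter,l.05,l.50,l.95
frugivory,lambda_w,0.10,0.40,0.80
loghomerange,lambda_w,0.20,0.50,0.90
loghomerange,B1,-0.30,0.05,0.40
"""

# Keep only the lambda_w rows, one per predictor run.
rows = [r for r in csv.DictReader(io.StringIO(SAMPLE)) if r["parameter"] == "lambda_w"]

# Median (l.50) values that would go into "lambda_inset.csv".
medians = [float(r["l.50"]) for r in rows]

# Mean lower and upper interval bounds, to be entered as ymin and ymax.
ymin = mean(float(r["l.05"]) for r in rows)
ymax = mean(float(r["l.95"]) for r in rows)

print(medians, ymin, ymax)
```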


-----------------------------------------
DESCRIPTION FOR: [model2.stan]
-----------------------------------------
This file defines our second model in the Stan modeling language. The model includes a data block, a parameters block, and
a model block.

Data block: Defines the data the model expects, which will be pulled from File A and File B, and assigns these data 
            to Stan variables with specified data types. The covariance matrix from the phylogeny
            is also passed in here. Unlike in model 1 (File D), model 2 defines and assigns a predictor variable 
            (which will be one of the social, ecological or brain variables in File B). 

Parameters block: Defines the parameters that will be used in the model, specifically B0, B1, lambda_w (parameter 
                  of the phylogenetic regression), and the random effects. 

Model block: Defines the statistical model, including the priors and the likelihood. The likelihood in
             the case of our model is based on the psychophysics of number. This block also handles missing data
             points through imputation.


-----------------------------------------
DESCRIPTION FOR: [runmodel2.R]
-----------------------------------------
Using rstan, the R interface to Stan, this R script runs Stan model 2 (File H)
and generates a .csv file of posterior quantiles by predictor called "Dall.csv". For each predictor run, the script
also includes code that uses the bayesplot R package to plot posterior interval estimates and histograms for
the B0, B1 and lambda parameters.


-----------------------------------------
DESCRIPTION FOR: [Figure3.R]
-----------------------------------------
This R script graphs B1 posterior credible intervals from model 2 (File H) output by predictor.


-----------------------------------------
DESCRIPTION FOR: [Figure4.R]
-----------------------------------------
This R script uses the bayesplot package to graph posterior credible intervals for scale parameters and one predictor, group size.


-----------------------------------------
DESCRIPTION FOR: [Figure5.R]
-----------------------------------------
This R script reads File B, in which the mean species Weber fraction (w) (Variable E) from model 1 (File D) output is recorded,
and generates scatterplots of w by predictors of interest.


--------------------------
METHODOLOGICAL INFORMATION
--------------------------


1. Software-specific information:

Name: Stan
Version: 2.21.0
System Requirements: runs on all major platforms (Linux, Mac, Windows)
Open Source? (Y/N): Y

(if available and applicable)
Executable URL:
Source Repository URL:
Developer: The Stan Development Team
Product URL: https://mc-stan.org/
Software source components:


Additional Notes:



Name: R 
Version: 4.1.1
System Requirements: runs on all major platforms (Linux, Mac, Windows)
Open Source? (Y/N): Y

(if available and applicable)
Executable URL:
Source Repository URL:
Developer: R Development Core Team
Product URL: https://www.r-project.org/
Software source components:


Additional Notes:
Analyses were conducted using the rstan package in R, which is the R interface for Stan.



2. Equipment-specific information:

Manufacturer:
Model:

(if applicable)
Embedded Software / Firmware Name:
Embedded Software / Firmware Version:
Additional Notes:


3. Date of data collection (single date, range, approximate date):
This study is based on a meta-analysis.
Data collection dates for behavioral data (File A) are reported in original publications referred to in Figure S1. 
Predictor data (File B) were compiled from published literature by SEK and MAHB in 2017 and 2020.