BOLD5000 Collection

Posted on 14.09.2021 - 16:26 by Michael Tarr
Brain, Object, Landscape Dataset

Umbrella repository containing data and manuscripts related to BOLD5000

Items in collection:

1) BOLD5000 - BOLD 5000 Release 1.0
2) BOLD5000 Release 2.0 - A complete re-release of functional data from BOLD5000, with optimized procedures for GLM estimation of brain-wide percent signal change in response to the experimental stimuli, yielding significant increases in the reliability of BOLD signal estimates compared to the initial data release
3) BOLD5000, a public fMRI dataset while viewing 5000 visual images - PDF of the Nature Scientific Data paper [you can also get the paper directly here: ]

Motivation. Vision science - particularly machine vision - is being revolutionized by large-scale datasets. State-of-the-art artificial vision models critically depend on large-scale datasets to achieve high performance. In contrast, although large-scale learning models (e.g., AlexNet) have been applied to human neuroimaging data, the stimuli for such neuroimaging experiments include significantly fewer images. The small size of these stimulus sets also translates to limited image diversity.

Contents. Here we dramatically increase the stimulus set size deployed in an fMRI study of visual scene processing. We scanned four participants in a slow event-related design that incorporated 4,916 unique scenes. Data were collected over 16 sessions: 15 task-related sessions, plus one additional session for acquiring high-resolution anatomical scans. In 8 of the 15 task-related sessions, a functional localizer was run in order to independently define scene-selective cortex. In each scanning session, participants filled out a questionnaire (Daily Intake) about their daily routine, covering food and beverage intake, sleep, exercise, ibuprofen use, and comfort in the scanner. During BOLD scanning, physiological data (heart rate and respiration) were also acquired.

The experiment included 4,803 images presented on a single trial, 112 images repeated four times, and one image repeated three times, yielding a total of 5,254 stimulus trials. The stimuli were drawn from three datasets: 1) 1000 images from Scene Images (250 scene categories, based on SUN categories, with four exemplars each); 2) 2000 images from the COCO dataset; and 3) 1916 images from the ImageNet dataset. In the experiment, images were presented for 1 second, with 9 seconds of fixation between trials. Participants were asked to judge whether they liked, disliked, or were neutral about the image.
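The counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the BOLD5000 processing scripts: the constant and function names are illustrative, and the timing figure assumes that every trial takes exactly 1 second of image plus 9 seconds of fixation and that a participant completes the full design. It ignores localizer runs, anatomical scans, and any per-run overhead.

```python
# Counts taken from the collection description; all names are illustrative.
SINGLE_PRESENTATION_IMAGES = 4803   # images shown on one trial only
IMAGES_REPEATED_FOUR_TIMES = 112    # images shown on four trials each
IMAGES_REPEATED_THREE_TIMES = 1     # image shown on three trials

# Number of images drawn from each source dataset.
SOURCE_COUNTS = {"Scene Images": 1000, "COCO": 2000, "ImageNet": 1916}

STIMULUS_SECONDS = 1   # image presentation time per trial
FIXATION_SECONDS = 9   # fixation between trials


def unique_images() -> int:
    """Number of distinct images in the stimulus set."""
    return (SINGLE_PRESENTATION_IMAGES
            + IMAGES_REPEATED_FOUR_TIMES
            + IMAGES_REPEATED_THREE_TIMES)


def total_trials() -> int:
    """Number of stimulus trials, counting every repetition."""
    return (SINGLE_PRESENTATION_IMAGES
            + 4 * IMAGES_REPEATED_FOUR_TIMES
            + 3 * IMAGES_REPEATED_THREE_TIMES)


def total_trial_seconds() -> int:
    """Nominal time spent in stimulus trials (assumption: 10 s per trial)."""
    return total_trials() * (STIMULUS_SECONDS + FIXATION_SECONDS)


if __name__ == "__main__":
    print("unique images:", unique_images())
    print("images by source:", sum(SOURCE_COUNTS.values()))
    print("stimulus trials:", total_trials())
    print("nominal trial hours:", round(total_trial_seconds() / 3600, 1))
```

Both ways of counting the stimulus set (by repetition class and by source dataset) agree with the 4,916 unique scenes stated in the description, and the repetition counts reproduce the 5,254 trials.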

Summary. BOLD5000 is unique in three ways: it is 1) larger than existing slow event-related neural datasets by an order of magnitude, 2) extremely diverse in stimuli, and 3) considerably overlapping with existing computer vision datasets. Our large-scale dataset enables novel neural network training and novel exploration of benchmark computer vision datasets through neuroscience. Finally, the scale advantage of our dataset and the use of a slow event-related design enable, for the first time, joint computer vision and fMRI analyses that span a significant and diverse region of image space using high-performing models.

Please refer to individual datasets for more information and to our website for more details and future news and releases:

For documentation and processing scripts, see:


Chang, Nadine; Pyles, John; Prince, Jacob; Tarr, Michael; Aminoff, Elissa (2021): BOLD5000 Collection. Carnegie Mellon University. Collection.


Understanding Scenes and Events through Joint Parsing, Cognitive Reasoning and Lifelong Learning

United States Department of the Navy

SL-CN: Mapping, Measuring, and Modeling Perceptual Expertise

Directorate for Social, Behavioral & Economic Sciences

CompCog: Human Scene Processing Characterized by Computationally-derived Scene Primitives

Directorate for Social, Behavioral & Economic Sciences

