Carnegie Mellon University

Labeled_AAV8_Capsid_Data

dataset
posted on 2025-05-23, 20:22 authored by Lilianna Gutierrez, Anne Robinson

Data & File Overview

Directory of Files:

All_Patches_resized.zip

  • Size: 9.44 GB
  • Description:
    This is a compressed folder that contains preprocessed patches. Each patch is an image of one of the following classes:
    • full capsid
    • partially full capsid
    • empty capsid
    • aggregation
    • ice
    • broken capsid
    • background

names_labels_df.csv.zip

  • Size: 293 KB
  • Description:
    This is a compressed .csv file that lists 94,830 patch instances, each with a patch name and its associated class label.

Additional Notes:

This data was used in conjunction with the code at https://github.com/lcgutier/capsid-eyes to create the Capsidize app at https://github.com/lcgutier/Capsidize.

File Naming Convention:

Annotated and Preprocessed Images:

The classes are defined as follows: 1=full capsid, 2=partially full capsid, 3=empty capsid, 4=aggregation, 5=ice, 6=broken capsid, 7=background. Training and testing sets were randomly sampled from the resulting patches, and the classes were balanced.
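
The random sampling and class balancing mentioned above could be approximated as in the sketch below. It is not the authors' procedure: downsampling every class to the size of the smallest class, the 20% test fraction, the fixed seed, and the function name balanced_split are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def balanced_split(rows, test_fraction=0.2, seed=0):
    """Split (patch_name, label) rows into class-balanced train and test lists.

    Assumption: balancing is done by downsampling each class to the size of
    the smallest class. The dataset description does not state the method.
    """
    rng = random.Random(seed)

    # Group patch names by their integer class label.
    by_class = defaultdict(list)
    for patch_name, label in rows:
        by_class[label].append(patch_name)

    # Every class contributes the same number of patches.
    n_per_class = min(len(names) for names in by_class.values())
    n_test = int(n_per_class * test_fraction)

    train, test = [], []
    for label in sorted(by_class):
        # Sample without replacement, then cut the sample into test and train.
        picked = rng.sample(by_class[label], n_per_class)
        test.extend((name, label) for name in picked[:n_test])
        train.extend((name, label) for name in picked[n_test:])
    return train, test
```

Calling balanced_split(rows) on the (patch_name, label) pairs read from names_labels_df.csv returns two lists in which every class appears equally often.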

Images are named according to the dataset they belong to, followed by the original image ID number and then the annotation ID number.
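
A patch name built this way can be split back into its three parts. The sketch below is illustrative only: the description does not state the separator or the file extension, so the underscore separator, the example name, and the names PatchName and parse_patch_name are hypothetical and should be checked against the actual file names.

```python
from typing import NamedTuple


class PatchName(NamedTuple):
    dataset: str
    image_id: str
    annotation_id: str


def parse_patch_name(name, sep="_"):
    """Split a patch name into dataset, image ID, and annotation ID.

    Assumption: the three parts are joined with `sep` (underscore here) and
    the last two fields are the image ID and the annotation ID. The real
    separator is not documented in the dataset description.
    """
    # Drop a trailing file extension such as ".png" if one is present.
    stem = name.rsplit(".", 1)[0] if "." in name else name

    # Split from the right so a dataset name containing `sep` stays intact.
    parts = stem.rsplit(sep, 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected patch name format: {name!r}")
    return PatchName(*parts)
```

For a hypothetical name such as "train_set_12_3.png", the function yields the dataset "train_set", image ID "12", and annotation ID "3".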

Data Description

names_labels_df.csv.zip

  1. Columns: patch_name, label
  2. Rows: 94,830 patch instances
  3. Class Types
    • Full Capsids, label=1, Description: A capsid with a dark ring and an even, dark infill density.
    • Partial Capsids, label=2, Description: A capsid with a dark ring and a mid-range infill density.
    • Empty Capsids, label=3, Description: A capsid with a dark ring and a light infill density.
    • Aggregation, label=4, Description: A cluster of capsids.
    • Ice, label=5, Description: Dark globular crystals that range in size.
    • Broken Capsids, label=6, Description: Full, partial, or empty capsids that have broken. These are often only fragments.
    • Background, label=7, Description: This class contains all background masks, including those that result from background differences at the boundary between the carbon grid hole and the grid material.
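
The label file described above can be read with the Python standard library alone. This is a minimal sketch: the column names patch_name and label and the label-to-class mapping come from the description, while the assumption that the archive holds a single .csv member with a header row, and the names read_labels and class_counts, are illustrative.

```python
import csv
import io
import zipfile
from collections import Counter

# Integer label to class name, as listed in the dataset description.
CLASS_NAMES = {
    1: "full capsid",
    2: "partially full capsid",
    3: "empty capsid",
    4: "aggregation",
    5: "ice",
    6: "broken capsid",
    7: "background",
}


def read_labels(zip_file):
    """Return (patch_name, label) tuples from the CSV inside the archive.

    `zip_file` may be a path or a binary file-like object.
    Assumption: the archive contains one .csv member with a header row.
    """
    with zipfile.ZipFile(zip_file) as archive:
        csv_members = [
            name for name in archive.namelist()
            # Skip macOS resource-fork entries that can also end in .csv.
            if name.lower().endswith(".csv") and not name.startswith("__MACOSX/")
        ]
        if not csv_members:
            raise ValueError("no .csv member found in archive")
        with archive.open(csv_members[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            # int(float(...)) also accepts labels stored as "1.0".
            return [(row["patch_name"], int(float(row["label"]))) for row in reader]


def class_counts(rows):
    """Count patches per class name."""
    return Counter(CLASS_NAMES[label] for _, label in rows)
```

After rows = read_labels("names_labels_df.csv.zip"), len(rows) can be compared with the 94,830 instances stated above, and class_counts(rows) gives the number of patches per class.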

Methodological Information

Software-specific information:

Additional Notes: The source code in the capsid-eyes GitHub repository can be run on platforms other than macOS, but the app was compiled on darwin-arm64.

Equipment-specific information:


  • Sample Preparation
    • Manufacturer: Thermo Fisher Scientific (TFS), Waltham, MA, USA
    • Model: Vitrobot Mk 4
    • Use: To freeze samples for imaging; cooled in a bath of liquid nitrogen.
  • Imaging
    • Manufacturer: Thermo Fisher Scientific
    • Model: Krios G3i microscope
    • Use: To collect TEM images. Operated at 300 kV and equipped with a Selectris energy filter and a Falcon 4i direct electron detector camera operating in electron-counting mode.

Funding

This project was funded in part by a grant from the National Institute for Innovation in Manufacturing Biopharmaceuticals (NIIMBL) (grant 70NANB21H06 from the U.S. Department of Commerce, National Institute of Standards and Technology).

ARCS (Achievement Rewards for College Scientists) Fellowship awarded to LG

History

Date

2025-05-22