First, read in capture-area-level SpaceRanger outputs (https://www.10xgenomics.com/support/software/space-ranger/latest/analysis/running-pipelines/space-ranger-count). Then, overwrite spatial coordinates and images to represent group-level samples using sample_info$group, keeping the original coordinates in colData columns ending with the suffix "_original". Next, add information about overlaps (via spe$exclude_overlapping and spe$overlap_key). Finally, return a SpatialExperiment-class object ready for visualization or downstream analysis.

build_spe(
  sample_info,
  coords_dir,
  count_type = "sparse",
  reference_gtf = NULL,
  gtf_cols = c("source", "type", "gene_id", "gene_version", "gene_name", "gene_type"),
  calc_error_metrics = FALSE
)

Arguments

sample_info

A data.frame() with columns capture_area, group, fiji_xml_path, fiji_image_path, spaceranger_dir, intra_group_scalar, and group_hires_scalef. The last two are made by rescale_fiji_inputs().
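As a quick sanity check before calling build_spe(), the required columns can be verified up front. The sketch below is illustrative only: check_sample_info() is a hypothetical helper, not part of the package, and the toy paths are placeholders.

```r
# Hypothetical helper (not part of the package): confirm that sample_info
# has every column listed above before passing it to build_spe()
check_sample_info <- function(sample_info) {
    required_cols <- c(
        "capture_area", "group", "fiji_xml_path", "fiji_image_path",
        "spaceranger_dir", "intra_group_scalar", "group_hires_scalef"
    )
    missing_cols <- setdiff(required_cols, colnames(sample_info))
    if (length(missing_cols) > 0) {
        stop(
            "sample_info is missing column(s): ",
            paste(missing_cols, collapse = ", ")
        )
    }
    invisible(TRUE)
}

# Toy input with placeholder values. The last two columns are deliberately
# absent, as they would be before running rescale_fiji_inputs()
toy_info <- data.frame(
    capture_area = c("A1", "B1"),
    group = "G1",
    fiji_xml_path = "G1.xml",
    fiji_image_path = "G1.png",
    spaceranger_dir = c("A1/outs/spatial", "B1/outs/spatial")
)

# Capture the error message instead of halting the session
result <- tryCatch(
    check_sample_info(toy_info),
    error = function(e) conditionMessage(e)
)
```

Here result holds the error message naming the two columns that rescale_fiji_inputs() would normally add.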

coords_dir

A character(1) vector giving the directory containing sample directories, each with tissue_positions.csv, scalefactors_json.json, and tissue_lowres_image.png files produced from refinement with prep_fiji_coords() and related functions.

count_type

A character(1) vector passed to the type argument of SpatialExperiment::read10xVisium(), defaulting to "sparse".

reference_gtf

Passed to spatialLIBD::read10xVisiumWrapper(). If working on the same system where SpaceRanger was run, the GTF will be found automatically; otherwise, a character(1) path to a gene-annotation GTF file may be supplied, which is used to populate rowData().

gtf_cols

Passed to spatialLIBD::read10xVisiumWrapper(). Columns in the reference GTF to extract and use to populate rowData().

calc_error_metrics

A logical(1) vector indicating whether to calculate error metrics related to mapping spots to well-defined array coordinates. If TRUE, adds euclidean_error and shared_neighbors spot-level metrics to the colData(). The former indicates the distance, in number of inter-spot distances, needed to "move" a spot to its new array position. The latter indicates the fraction of neighbors for the associated capture area that are retained after mapping; it can be quite time-consuming to compute.
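To make the unit of euclidean_error concrete, here is a minimal sketch under stated assumptions: it divides the pixel distance between a spot's position before and after mapping by the inter-spot distance in pixels. The function name, its arguments, and the toy numbers are all hypothetical, and the package's internal computation may differ.

```r
# Hypothetical illustration (not the package's implementation): express the
# displacement of each spot in units of inter-spot distances
euclidean_error_sketch <- function(x, y, x_mapped, y_mapped, inter_spot_px) {
    sqrt((x - x_mapped)^2 + (y - y_mapped)^2) / inter_spot_px
}

# Two toy spots: the first is moved by 5 pixels (3 horizontally and
# 4 vertically), the second is not moved at all. With an assumed inter-spot
# distance of 10 pixels, the first spot moves half an inter-spot distance
err <- euclidean_error_sketch(
    x = c(100, 250), y = c(100, 400),
    x_mapped = c(103, 250), y_mapped = c(104, 400),
    inter_spot_px = 10
)
err
#> [1] 0.5 0.0
```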

Value

A SpatialExperiment-class object with one sample per group specified in sample_info using transformed pixel and array coordinates (including in the spatialCoords()).
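The colData columns mentioned in the description can be inspected with ordinary subsetting. The sketch below uses a toy data.frame as a stand-in for the colData of the returned object so that it runs without any packages. The column values and the key column are invented for illustration, and it assumes that exclude_overlapping is a logical column where TRUE marks spots to drop.

```r
# Toy stand-in for as.data.frame(colData(spe)); all values are invented
toy_coldata <- data.frame(
    key = c("spot1", "spot2", "spot3"),
    array_row = c(0, 1, 2),
    array_row_original = c(0, 1, 0),
    overlap_key = c("", "spot3", "spot2"),
    exclude_overlapping = c(FALSE, FALSE, TRUE)
)

# Columns preserving the pre-stitching coordinates end in "_original"
original_cols <- grep("_original$", colnames(toy_coldata), value = TRUE)

# Keep only spots that are not flagged for exclusion due to overlap
kept <- toy_coldata[!toy_coldata$exclude_overlapping, ]
```

Under the same assumption, the equivalent filter on a real object would be spe[, !spe$exclude_overlapping].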

Author

Nicholas J. Eagles

Examples

########################################################################
#   Prepare sample_info
########################################################################

if (file.exists("sample_info.rds")) {
    sample_info <- readRDS("sample_info.rds")
} else {
    sample_info <- dplyr::tibble(
        group = "Br2719",
        capture_area = c("V13B23-283_A1", "V13B23-283_C1", "V13B23-283_D1")
    )
    #   Add 'spaceranger_dir' column
    sr_dir <- tempdir()
    temp <- unzip(
        spatialLIBD::fetch_data("visiumStitched_brain_spaceranger"),
        exdir = sr_dir
    )
    sample_info$spaceranger_dir <- file.path(
        sr_dir, sample_info$capture_area, "outs", "spatial"
    )

    #   Add Fiji-output-related columns
    fiji_dir <- tempdir()
    temp <- unzip(
        spatialLIBD::fetch_data("visiumStitched_brain_Fiji_out"),
        exdir = fiji_dir
    )
    sample_info$fiji_xml_path <- temp[grep("xml$", temp)]
    sample_info$fiji_image_path <- temp[grep("png$", temp)]

    ## Re-size images and add more information to the sample_info
    sample_info <- rescale_fiji_inputs(sample_info, out_dir = tempdir())

    saveRDS(sample_info, "sample_info.rds")
}

## Preparing Fiji coordinates and images for build_spe()
spe_input_dir <- tempdir()
prep_fiji_coords(sample_info, out_dir = spe_input_dir)
#> [1] "/tmp/RtmpAA9v23/Br2719/tissue_positions.csv"
prep_fiji_image(sample_info, out_dir = spe_input_dir)
#> [1] "/tmp/RtmpAA9v23/Br2719/tissue_lowres_image.png"
#> [2] "/tmp/RtmpAA9v23/Br2719/scalefactors_json.json" 

########################################################################
#   Build the SpatialExperiment
########################################################################

#    Since we don't have access to the original GTF used to run SpaceRanger,
#    we must explicitly supply our own GTF to build_spe(). We use
#    GENCODE release 32, intended to be quite close to the actual GTF used.
#    The actual GTF is available from:
#    https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2024-A.tar.gz
bfc <- BiocFileCache::BiocFileCache()
gtf_cache <- BiocFileCache::bfcrpath(
    bfc,
    paste0(
        "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/",
        "release_32/gencode.v32.annotation.gtf.gz"
    )
)

## Now we can build the stitched SpatialExperiment object
spe <- build_spe(
    sample_info,
    coords_dir = spe_input_dir, reference_gtf = gtf_cache
)
#> Building SpatialExperiment using capture area as sample ID
#> 2024-10-03 14:58:04.842468 SpatialExperiment::read10xVisium: reading basic data from SpaceRanger
#> 2024-10-03 14:58:10.849831 read10xVisiumAnalysis: reading analysis output from SpaceRanger
#> 2024-10-03 14:58:11.244415 add10xVisiumAnalysis: adding analysis output from SpaceRanger
#> 2024-10-03 14:58:11.561437 rtracklayer::import: reading the reference GTF file
#> 2024-10-03 14:58:40.91767 adding gene information to the SPE object
#> Warning: Gene IDs did not match. This typically happens when you are not using the same GTF file as the one that was used by SpaceRanger. For example, one file uses GENCODE IDs and the other one ENSEMBL IDs. read10xVisiumWrapper() will try to convert them to ENSEMBL IDs.
#> Warning: Dropping 2226 out of 38606 genes for which we don't have information on the reference GTF file. This typically happens when you are not using the same GTF file as the one that was used by SpaceRanger.
#> 2024-10-03 14:58:41.149315 adding information used by spatialLIBD
#> Overwriting imgData(spe) with merged images (one per group)
#> Adding array coordinates and overlap info

## Let's explore the stitched SpatialExperiment object
spe
#> class: SpatialExperiment 
#> dim: 36380 14976 
#> metadata(0):
#> assays(1): counts
#> rownames(36380): ENSG00000243485.5 ENSG00000237613.2 ...
#>   ENSG00000198695.2 ENSG00000198727.2
#> rowData names(6): source type ... gene_type gene_search
#> colnames(14976): AAACAACGAATAGTTC-1_V13B23-283_A1
#>   AAACAAGTATCTCCCA-1_V13B23-283_A1 ... TTGTTTGTATTACACG-1_V13B23-283_D1
#>   TTGTTTGTGTAAATTC-1_V13B23-283_D1
#> colData names(31): sample_id in_tissue ... overlap_key
#>   exclude_overlapping
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor