Welcome

Welcome to the spatialLIBD project! It is composed of:

The web application allows you to browse the LIBD human dorsolateral pre-frontal cortex (DLPFC) spatial transcriptomics data generated with the 10x Genomics Visium platform. Through the R/Bioconductor package you can also download the data as well as visualize your own datasets using this web application. Please check the manuscript or bioRxiv pre-print for more details about this project.

If you tweet about this website, the data or the R package please use the #spatialLIBD hashtag. You can find previous tweets that way as shown here. Thank you!

Study design

As a quick overview, the data presented here is from portion of the DLPFC that spans six neuronal layers plus white matter (A) for a total of three subjects with two pairs of spatially adjacent replicates (B). Each dissection of DLPFC was designed to span all six layers plus white matter (C). Using this web application you can explore the expression of known genes such as SNAP25 (D, a neuronal gene), MOBP (E, an oligodendrocyte gene), and known layer markers from mouse studies such as PCP4 (F, a known layer 5 marker gene).

Basics

Install spatialLIBD

R is an open-source statistical environment which can be easily modified to enhance its functionality via packages. spatialLIBD is a R package available via Bioconductor. R can be installed on any operating system from CRAN after which you can install spatialLIBD by using the following commands in your R session:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("spatialLIBD")

## Check that you have a valid Bioconductor installation
BiocManager::valid()

To run all the code in this vignette, you might need to install other R/Bioconductor packages, which you can do with:

BiocManager::install("spatialLIBD", dependencies = TRUE, force = TRUE)

If you want to use the development version of spatialLIBD, you will need to use the R version corresponding to the current Bioconductor-devel branch as described in more detail on the Bioconductor website. Then you can install spatialLIBD from GitHub using the following command.

BiocManager::install("LieberInstitute/spatialLIBD")

Required knowledge

spatialLIBD (Pardo, Spangler, Weber, Hicks, Jaffe, Martinowich, Maynard, and Collado-Torres, 2022) is based on many other packages and in particular in those that have implemented the infrastructure needed for dealing with single cell RNA sequencing data, visualization functions, and interactive data exploration. That is, packages like SingleCellExperiment that allow you to store the data, ggplot2 and plotly for visualizing the data, and shiny for building an interactive interface. A spatialLIBD user who only accesses the web application is not expected to deal with those packages directly. A spatialLIBD user will need to be familiar with SingleCellExperiment and ggplot2 to understand the data provided by spatialLIBD or the graphical results spatialLIBD provides. Furthermore, it’ll be useful for the user to know about shiny and plotly if you wish to adapt the web application provided by spatialLIBD.

If you are asking yourself the question “Where do I start using Bioconductor?” you might be interested in this blog post.

Asking for help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and Bioconductor have a steep learning curve so it is critical to learn where to ask for help. The blog post quoted above mentions some but we would like to highlight the Bioconductor support site as the main resource for getting help regarding Bioconductor. Other alternatives are available such as creating GitHub issues and tweeting. However, please note that if you want to receive help you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.

Citing spatialLIBD

We hope that spatialLIBD will be useful for your research. Please use the following information to cite the package and the research article describing the data provided by spatialLIBD. Thank you!

## Citation info
citation("spatialLIBD")
#> To cite package 'spatialLIBD' in publications use:
#> 
#>   Pardo B, Spangler A, Weber LM, Hicks SC, Jaffe AE, Martinowich K,
#>   Maynard KR, Collado-Torres L (2022). "spatialLIBD: an R/Bioconductor
#>   package to visualize spatially-resolved transcriptomics data." _BMC
#>   Genomics_. doi:10.1186/s12864-022-08601-w
#>   <https://doi.org/10.1186/s12864-022-08601-w>,
#>   <https://doi.org/10.1186/s12864-022-08601-w>.
#> 
#>   Maynard KR, Collado-Torres L, Weber LM, Uytingco C, Barry BK,
#>   Williams SR, II JLC, Tran MN, Besich Z, Tippani M, Chew J, Yin Y,
#>   Kleinman JE, Hyde TM, Rao N, Hicks SC, Martinowich K, Jaffe AE
#>   (2021). "Transcriptome-scale spatial gene expression in the human
#>   dorsolateral prefrontal cortex." _Nature Neuroscience_.
#>   doi:10.1038/s41593-020-00787-0
#>   <https://doi.org/10.1038/s41593-020-00787-0>,
#>   <https://www.nature.com/articles/s41593-020-00787-0>.
#> 
#>   Huuki-Myers LA, Spangler A, Eagles NJ, Montgomergy KD, Kwon SH, Guo
#>   B, Grant-Peters M, Divecha HR, Tippani M, Sriworarat C, Nguyen AB,
#>   Ravichandran P, Tran MN, Seyedian A, Consortium P, Hyde TM, Kleinman
#>   JE, Battle A, Page SC, Ryten M, Hicks SC, Martinowich K,
#>   Collado-Torres L, Maynard KR (2024). "A data-driven single-cell and
#>   spatial transcriptomic map of the human prefrontal cortex."
#>   _Science_. doi:10.1126/science.adh1938
#>   <https://doi.org/10.1126/science.adh1938>,
#>   <https://doi.org/10.1126/science.adh1938>.
#> 
#>   Kwon SH, Parthiban S, Tippani M, Divecha HR, Eagles NJ, Lobana JS,
#>   Williams SR, Mark M, Bharadwaj RA, Kleinman JE, Hyde TM, Page SC,
#>   Hicks SC, Martinowich K, Maynard KR, Collado-Torres L (2023).
#>   "Influence of Alzheimer’s disease related neuropathology on local
#>   microenvironment gene expression in the human inferior temporal
#>   cortex." _GEN Biotechnology_. doi:10.1089/genbio.2023.0019
#>   <https://doi.org/10.1089/genbio.2023.0019>,
#>   <https://doi.org/10.1089/genbio.2023.0019>.
#> 
#> To see these entries in BibTeX format, use 'print(<citation>,
#> bibtex=TRUE)', 'toBibtex(.)', or set
#> 'options(citation.bibtex.max=999)'.

Overview

The spatialLIBD (Pardo, Spangler, Weber et al., 2022) package was developed for analyzing the human dorsolateral prefrontal cortex (DLPFC) spatial transcriptomics data generated with the 10x Genomics Visium technology by researchers at the Lieber Institute for Brain Development (LIBD) (Maynard, Collado-Torres, Weber, Uytingco, Barry, Williams, II, Tran, Besich, Tippani, Chew, Yin, Kleinman, Hyde, Rao, Hicks, Martinowich, and Jaffe, 2021). An initial shiny application was developed for interactively exploring this data and for assigning human brain layer labels to the each spot for each sample generated. While this was useful enough for our project, we made this Bioconductor package in case you want to:

  • access our Visium data to get some data from this new technology and develop methods or infrastructure for other Visium datasets.
  • re-shape your data into what ours is structured as, then re-use the visualization functions and/or the shiny app itself.
  • want to explore our data in more detail. This can range from launching the shiny application locally to diving into the specifics of the data from our project (Maynard, Collado-Torres, Weber et al., 2021).

In this vignette we’ll showcase how you can access the Human DLPFC LIBD Visium dataset (Maynard, Collado-Torres, Weber et al., 2021), the R functions provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022), and an overview of how you can re-shape your own Visium dataset to match the structure we used.

To get started, please load the spatialLIBD package.

Human DLPFC Visium dataset

The human DLPFC 10x Genomics Visium dataset analyzed by LIBD researchers and colleagues is described in detail by Maynard, Collado-Torres et al (Maynard, Collado-Torres, Weber et al., 2021). However, briefly, this dataset is composed of:

  • Three brain subjects (all controls; two males, one female; ages 30-46; details).
  • Four images per subject: two spatially adjacent replicates at position 0, then two more spatially adjacent replicates 300 micrometers away.
  • Slices designed to cover layers 1 through 6 and the white matter (WM) of the dorsolateral prefrontal cortex (DLPFC).

Data specifics

We combined all the Visium data into a single SpatialExperiment (Righelli, Weber, Crowell, Pardo, Collado-Torres, Ghazanfar, Lun, Hicks, and Risso, 2022) object that we typically refer to as spe 1. It has 33,538 genes (rows) and 47,681 spots (columns). This is the initial point for most of our analyses (code available on GitHub). Using spatialLIBD (Pardo, Spangler, Weber et al., 2022) we manually assigned each spot across all 12 images to a layer (L1 through L6 or WM). We then compressed the spot-level data at the layer-level using a pseudo-bulking approach resulting in the SingleCellExperiment object we typically refer to as sce_layer 2. We then computed for each gene t or F statistics assessing whether the gene had higher expression in a given layer compared to the rest (enrichment; t-stat), between one layer and another layer (pairwise; t-stat), or had any expression variability across all layers (anova; F-stat). The results from the models are stored in what we refer to as modeling_results 3.

In summary,

  • spe is the SpatialExperiment object with all the spot-level data and the histology information for visualization of the data.
  • sce_layer is the SingleCellExperiment object with the layer-level data.
  • modeling_results contains the layer-level enrichment, pairwise and anova statistics.

Downloading the data with spatialLIBD

Using spatialLIBD (Pardo, Spangler, Weber et al., 2022) you can download all of these R objects. They are hosted by Bioconductor’s ExperimentHub (Morgan and Shepherd, 2024) resource and you can download them using spatialLIBD::fetch_data(). fetch_data() will query ExperimentHub which in turn will download the data and cache it so you don’t have to download it again. If ExperimentHub is unavailable, then fetch_data() has a backup option that does not cache the files 4 Below we obtain all of these objects.

## Connect to ExperimentHub
ehub <- ExperimentHub::ExperimentHub()
## Download the small example sce data
sce <- fetch_data(type = "sce_example", eh = ehub)
#> 2024-10-31 20:31:48.544302 loading file /github/home/.cache/R/BiocFileCache/4a115950912_sce_sub_for_vignette.Rdata%3Fdl%3D1

## Convert to a SpatialExperiment object
spe <- sce_to_spe(sce)

## If you want to download the full real data (about 2.1 GB in RAM) use:
if (FALSE) {
    if (!exists("spe")) spe <- fetch_data(type = "spe", eh = ehub)
}

## Query ExperimentHub and download the data
if (!exists("sce_layer")) sce_layer <- fetch_data(type = "sce_layer", eh = ehub)
#> 2024-10-31 20:31:55.311001 loading file /github/home/.cache/R/BiocFileCache/4a11f72b28f_Human_DLPFC_Visium_processedData_sce_scran_sce_layer_spatialLIBD.Rdata%3Fdl%3D1
modeling_results <- fetch_data("modeling_results", eh = ehub)
#> 2024-10-31 20:31:55.78616 loading file /github/home/.cache/R/BiocFileCache/4a1260aa093_Human_DLPFC_Visium_modeling_results.Rdata%3Fdl%3D1

Once you have downloaded the objects, we can explore them a little bit

## spot-level data
spe
#> class: SpatialExperiment 
#> dim: 33538 47681 
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
#>   ENSG00000268674
#> rowData names(9): source type ... gene_search is_top_hvg
#> colnames(47681): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
#>   TTGTTTCCATACAACT-1 TTGTTTGTGTAAATTC-1
#> colData names(69): sample_id Cluster ... array_row array_col
#> reducedDimNames(6): PCA TSNE_perplexity50 ... TSNE_perplexity80
#>   UMAP_neighbors15
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor

## layer-level data
sce_layer
#> class: SingleCellExperiment 
#> dim: 22331 76 
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(22331): ENSG00000243485 ENSG00000238009 ... ENSG00000278384
#>   ENSG00000271254
#> rowData names(10): source type ... is_top_hvg is_top_hvg_sce_layer
#> colnames(76): 151507_Layer1 151507_Layer2 ... 151676_Layer6 151676_WM
#> colData names(13): sample_name layer_guess ...
#>   layer_guess_reordered_short spatialLIBD
#> reducedDimNames(6): PCA TSNE_perplexity5 ... UMAP_neighbors15 PCAsub
#> mainExpName: NULL
#> altExpNames(0):

## list of modeling result tables
sapply(modeling_results, class)
#>        anova   enrichment     pairwise 
#> "data.frame" "data.frame" "data.frame"
sapply(modeling_results, dim)
#>      anova enrichment pairwise
#> [1,] 22331      22331    22331
#> [2,]    10         23       65
sapply(modeling_results, function(x) {
    head(colnames(x))
})
#>      anova          enrichment      pairwise          
#> [1,] "f_stat_full"  "t_stat_WM"     "t_stat_WM-Layer1"
#> [2,] "p_value_full" "t_stat_Layer1" "t_stat_WM-Layer2"
#> [3,] "fdr_full"     "t_stat_Layer2" "t_stat_WM-Layer3"
#> [4,] "full_AveExpr" "t_stat_Layer3" "t_stat_WM-Layer4"
#> [5,] "f_stat_noWM"  "t_stat_Layer4" "t_stat_WM-Layer5"
#> [6,] "p_value_noWM" "t_stat_Layer5" "t_stat_WM-Layer6"

The modeling statistics are in wide format, which can make some visualizations complicated. The function sig_genes_extract_all() provides a way to convert them into long format and add some useful information. Let’s do so below.

## Convert to a long format the modeling results
## This takes a few seconds to run
system.time(
    sig_genes <-
        sig_genes_extract_all(
            n = nrow(sce_layer),
            modeling_results = modeling_results,
            sce_layer = sce_layer
        )
)
#>    user  system elapsed 
#>   9.250   0.380   9.653

## Explore the result
class(sig_genes)
#> [1] "DFrame"
#> attr(,"package")
#> [1] "S4Vectors"
dim(sig_genes)
#> [1] 1138881      12

Interactively explore the spatialLIBD data

Now that you have downloaded the data, you can interactively explore the data using a shiny (Chang, Cheng, Allaire, Sievert, Schloerke, Xie, Allen, McPherson, Dipert, and Borges, 2024) web application contained within spatialLIBD (Pardo, Spangler, Weber et al., 2022). To do so, use the run_app() function as shown below:

if (interactive()) {
    run_app(
        spe = spe,
        sce_layer = sce_layer,
        modeling_results = modeling_results,
        sig_genes = sig_genes
    )
}

The spatialLIBD shiny application allows you to browse the spot-level data and interactively label spots, as well as explore the layer-level results. Once you launch it, check the Documentation tab for each view for more details. In order to avoid duplicating the documentation, we provide all the details on the shiny application itself.

Though overall, this application allows you to export all static visualizations as PDF files or all interactive visualizations as PNG files, as well as all result table as CSV files. This is what produces the version you can access without any R use from your side at spatial.libd.org/spatialLIBD/.

spatialLIBD functions

We already covered fetch_data() which allows you to download the Human DLPFC Visium data from LIBD researchers and colleagues (Maynard, Collado-Torres, Weber et al., 2021).

Spot-level clusters and discrete variables

With the spe object that contains the spot-level data, we can visualize any discrete variable such as the layers using vis_clus() and related functions. These functions know where to extract and how to visualize the histology information.

## View our LIBD layers for one sample
vis_clus(
    spe = spe,
    clustervar = "layer_guess_reordered",
    sampleid = "151673",
    colors = libd_layer_colors,
    ... = " LIBD Layers"
)

Most of the variables stored in spe are discrete variables and as such, you can visualize them using vis_clus() and vis_grid_clus() for one or more than one sample respectively.

## This is not fully precise, but gives you a rough idea
## Some integer columns are actually continuous variables
table(sapply(colData(spe), class) %in% c("factor", "integer"))
#> 
#> FALSE  TRUE 
#>    12    57

## This is more precise (one cluster has 28 unique values)
table(sapply(colData(spe), function(x) length(unique(x))) < 29)
#> 
#> FALSE  TRUE 
#>     7    62

Notably, vis_clus() has a spatial logical(1) argument which is useful if you want to visualize the data without the histology information provided by geom_spatial() (a custom ggplot2::layer()). In particular, this is useful if you then want to use plotly::ggplotly() or other similar functions on the resulting ggplot2::ggplot() object.

## View our LIBD layers for one sample
## without spatial information
vis_clus(
    spe = spe,
    clustervar = "layer_guess_reordered",
    sampleid = "151673",
    colors = libd_layer_colors,
    ... = " LIBD Layers",
    spatial = FALSE
)

Some helper functions include get_colors() and sort_clusters() as well as the libd_layer_colors object included in spatialLIBD (Pardo, Spangler, Weber et al., 2022).

## Color palette designed by Lukas M. Weber with feedback from the team.
libd_layer_colors
#>        Layer1        Layer2        Layer3        Layer4        Layer5 
#>     "#F0027F"     "#377EB8"     "#4DAF4A"     "#984EA3"     "#FFD700" 
#>        Layer6            WM            NA           WM2 
#>     "#FF7F00"     "#1A1A1A" "transparent"     "#666666"

Spot-level genes and continuous variables

Similar to vis_clus(), the vis_gene() family of functions use the spe spot-level object to visualize the gene expression or any continuous variable such as the number of cells per spots. That is, vis_gene() can visualize any of the assays(spe) or any of the continuous variables stored in colData(spe). If you want to visualize more than one sample at a time, use vis_grid_gene() instead. And just like vis_clus(), vis_gene() has a spatial logical(1) argument to turn off the custom spatial ggplot2::layer() produced by geom_spatial().

## Available gene expression assays
assayNames(spe)
#> [1] "counts"    "logcounts"

## Not all of these make sense to visualize
## In particular, the key is not useful to visualize.
colnames(colData(spe))
#>  [1] "sample_id"                   "Cluster"                    
#>  [3] "sum_umi"                     "sum_gene"                   
#>  [5] "subject"                     "position"                   
#>  [7] "replicate"                   "subject_position"           
#>  [9] "discard"                     "key"                        
#> [11] "cell_count"                  "SNN_k50_k4"                 
#> [13] "SNN_k50_k5"                  "SNN_k50_k6"                 
#> [15] "SNN_k50_k7"                  "SNN_k50_k8"                 
#> [17] "SNN_k50_k9"                  "SNN_k50_k10"                
#> [19] "SNN_k50_k11"                 "SNN_k50_k12"                
#> [21] "SNN_k50_k13"                 "SNN_k50_k14"                
#> [23] "SNN_k50_k15"                 "SNN_k50_k16"                
#> [25] "SNN_k50_k17"                 "SNN_k50_k18"                
#> [27] "SNN_k50_k19"                 "SNN_k50_k20"                
#> [29] "SNN_k50_k21"                 "SNN_k50_k22"                
#> [31] "SNN_k50_k23"                 "SNN_k50_k24"                
#> [33] "SNN_k50_k25"                 "SNN_k50_k26"                
#> [35] "SNN_k50_k27"                 "SNN_k50_k28"                
#> [37] "GraphBased"                  "Maynard"                    
#> [39] "Martinowich"                 "layer_guess"                
#> [41] "layer_guess_reordered"       "layer_guess_reordered_short"
#> [43] "expr_chrM"                   "expr_chrM_ratio"            
#> [45] "SpatialDE_PCA"               "SpatialDE_pool_PCA"         
#> [47] "HVG_PCA"                     "pseudobulk_PCA"             
#> [49] "markers_PCA"                 "SpatialDE_UMAP"             
#> [51] "SpatialDE_pool_UMAP"         "HVG_UMAP"                   
#> [53] "pseudobulk_UMAP"             "markers_UMAP"               
#> [55] "SpatialDE_PCA_spatial"       "SpatialDE_pool_PCA_spatial" 
#> [57] "HVG_PCA_spatial"             "pseudobulk_PCA_spatial"     
#> [59] "markers_PCA_spatial"         "SpatialDE_UMAP_spatial"     
#> [61] "SpatialDE_pool_UMAP_spatial" "HVG_UMAP_spatial"           
#> [63] "pseudobulk_UMAP_spatial"     "markers_UMAP_spatial"       
#> [65] "spatialLIBD"                 "ManualAnnotation"           
#> [67] "in_tissue"                   "array_row"                  
#> [69] "array_col"

## Visualize a gene
vis_gene(
    spe = spe,
    sampleid = "151673",
    viridis = FALSE
)


## Visualize the estimated number of cells per spot
vis_gene(
    spe = spe,
    sampleid = "151673",
    geneid = "cell_count"
)


## Visualize the fraction of chrM expression per spot
## without the spatial layer
vis_gene(
    spe = spe,
    sampleid = "151673",
    geneid = "expr_chrM_ratio",
    spatial = FALSE
)

As for the color palette, you can either use the color blind friendly palette when viridis = TRUE or a custom palette we determined. Note that if you design your own palette, you have to take into account that values it can be hard to distinguish some colors from the histology set of purple tones that are noticeable when the continuous variable is below or equal to the minCount (in our palettes such points have a 'transparent' color). For more details, check the internal code of vis_gene_p().

Extract significant genes

Earlier we also ran sig_genes_extract_all() in order to run the shiny web application. However, we didn’t explain the output. If you explore it, you’ll notice that it’s a very long table with several columns.

head(sig_genes)
#> DataFrame with 6 rows and 12 columns
#>         top  model_type        test        gene      stat        pval
#>   <integer> <character> <character> <character> <numeric>   <numeric>
#> 1         1  enrichment          WM       NDRG1   16.3053 1.25896e-26
#> 2         2  enrichment          WM      PTP4A2   16.1469 2.25133e-26
#> 3         3  enrichment          WM        AQP1   15.9927 3.97849e-26
#> 4         4  enrichment          WM       PAQR6   15.1971 7.86258e-25
#> 5         5  enrichment          WM      ANP32B   14.9798 1.80183e-24
#> 6         6  enrichment          WM        JAM3   14.7413 4.50756e-24
#>           fdr gene_index         ensembl           in_rows       in_rows_top20
#>     <numeric>  <integer>     <character>     <IntegerList>       <IntegerList>
#> 1 2.51372e-22      10404 ENSG00000104419 1,40215,65023,... 1,156337,223320,...
#> 2 2.51372e-22        487 ENSG00000184007 2,40531,64532,... 2,223323,245661,...
#> 3 2.96145e-22       8201 ENSG00000240583 3,32241,65062,... 3,223314,245643,...
#> 4 4.38948e-21       1501 ENSG00000160781 4,34318,65863,...     4,245654,267982
#> 5 8.04735e-21      10962 ENSG00000136938 5,28964,63931,...     5,223329,267975
#> 6 1.67295e-20      12600 ENSG00000166086 6,40868,65917,... 6,156334,223324,...
#>                                       results
#>                               <CharacterList>
#> 1 WM_top1,WM-Layer1_top20,WM-Layer4_top10,...
#> 2 WM_top2,WM-Layer4_top13,WM-Layer5_top20,...
#> 3   WM_top3,WM-Layer4_top4,WM-Layer5_top2,...
#> 4     WM_top4,WM-Layer5_top13,WM-Layer6_top10
#> 5      WM_top5,WM-Layer4_top19,WM-Layer6_top3
#> 6 WM_top6,WM-Layer1_top17,WM-Layer4_top14,...

The output of sig_genes_extract_all() contains the following columns:

  • top: the rank of the gene for the given test.
  • model_type: either enrichment, pairwise or anova.
  • test: the short notation for the test performed. For example, WM is white matter versus the other layers while WM-Layer1 is white matter greater than layer 1.
  • gene: the gene symbol.
  • stat: the corresponding F or t-statistic.
  • pval: the corresponding p-value (two-sided for t-stats).
  • fdr: the FDR adjusted p-value.
  • gene_index: the row of sce_layer and the original tables in modeling_results to join the tables if necessary.
  • ensembl: Ensembl gene ID.
  • in_rows: an IntegerList() specifying all the rows where that gene is present.
  • in_rows_top20: an IntegerList() specifying all the rows where that gene is present and where its rank (top) is less than or equal to 20. This information is only included for the first occurrence of each gene if that gene is on the top 20 rank for any of the models.
  • results: an CharacterList() specifying all the test results where the gene is ranked (top) among the top 20 genes.

sig_genes_extract_all() uses sig_genes_extract() as it’s workhorse and as such has a very similar output.

Visualize modeling results

After extracting the table of modeling results in long format with sig_genes_extract_all(), we can then use layer_boxplot() to visualize any gene of interest for any of the model types and tests we performed. Below we explore the first one (by default). We also show the color palettes used in the shiny application provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022). The last example has the long title version that uses more information from the sig_genes object we created earlier.

## Note that we recommend setting the random seed so the jittering of the
## points will be reproducible. Given the requirements by BiocCheck this
## cannot be done inside the layer_boxplot() function.

## Create a boxplot of the first gene in `sig_genes`.
set.seed(20200206)
layer_boxplot(sig_genes = sig_genes, sce_layer = sce_layer)


## Viridis colors displayed in the shiny app
## showing the first pairwise model result
## (which illustrates the background colors used for the layers not
## involved in the pairwise comparison)
set.seed(20200206)
layer_boxplot(
    i = which(sig_genes$model_type == "pairwise")[1],
    sig_genes = sig_genes,
    sce_layer = sce_layer,
    col_low_box = viridisLite::viridis(4)[2],
    col_low_point = viridisLite::viridis(4)[1],
    col_high_box = viridisLite::viridis(4)[3],
    col_high_point = viridisLite::viridis(4)[4]
)


## Paper colors displayed in the shiny app
set.seed(20200206)
layer_boxplot(
    sig_genes = sig_genes,
    sce_layer = sce_layer,
    short_title = FALSE,
    col_low_box = "palegreen3",
    col_low_point = "springgreen2",
    col_high_box = "darkorange2",
    col_high_point = "orange1"
)

Correlation of layer-level statistics

Just like we compressed our spe object into sce_layer by pseudo-bulking 5, we can do the same for other single nucleus or single cell RNA sequencing datasets (snRNA-seq, scRNA-seq) and then compute enrichment t-statistics (one group vs the rest; could be one cell type vs the rest or one cluster of cells vs the rest). In our analysis (Maynard, Collado-Torres, Weber et al., 2021), we did this for several datasets including one of our LIBD Human DLPFC snRNA-seq data generated by Matthew N Tran et al 6. In spatialLIBD (Pardo, Spangler, Weber et al., 2022) we include a small set of these statistics for the 31 cell clusters identified by Matthew N Tran et al.

## Explore the enrichment t-statistics derived from Tran et al's snRNA-seq
## DLPFC data
dim(tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer)
#> [1] 692  31
tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer[seq_len(3), seq_len(6)]
#>                      2 (1)       4 (1)      6 (1)      8 (1)     10 (1)
#> ENSG00000104419 -0.6720726  0.30024609  0.5854130  0.1358497  0.3981996
#> ENSG00000184007  0.2535838 -0.28998049 -1.8106052 -0.2096309 -0.4521719
#> ENSG00000240583 -0.2015831 -0.04053423 -0.5120807  0.1688405 -0.3969257
#>                     12 (1)
#> ENSG00000104419 -0.9731938
#> ENSG00000184007 -0.1840286
#> ENSG00000240583 -0.2086954

The function layer_stat_cor() will take as input one such matrix of statistics and correlate them against our layer enrichment results (or other model types) using the subset of Ensembl gene IDs that are observed in both tables.

## Compute the correlation matrix of enrichment t-statistics between our data
## and Tran et al's snRNA-seq data
cor_stats_layer <- layer_stat_cor(
    tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer,
    modeling_results,
    "enrichment"
)

## Explore the correlation matrix
head(cor_stats_layer[, seq_len(3)])
#>               WM       Layer6     Layer5
#> 22 (3) 0.6824669 -0.009192291 -0.1934265
#> 3 (3)  0.7154122 -0.070042729 -0.2290574
#> 23 (3) 0.6637885 -0.031467704 -0.2018306
#> 17 (3) 0.6364983 -0.094216046 -0.2026147
#> 21 (3) 0.6281443 -0.050336358 -0.1988774
#> 7 (4)  0.1850724 -0.197283175 -0.2716890

Once we have computed this correlation matrix, we can then visualize it using layer_stat_cor_plot() as shown below.

## Visualize the correlation matrix
layer_stat_cor_plot(cor_stats_layer, max = max(cor_stats_layer))

In order to fully interpret the resulting heatmap you need to know what each of the cell clusters labels mean. In this case, the syntax is xx (Y) where xx is the cluster number and Y is:

  1. Excit: excitatory neurons.
  2. Inhib: inhibitory neurons.
  3. Oligo: oligodendrocytes.
  4. Astro: astrocytes.
  5. OPC: oligodendrocyte progenitor cells.
  6. Drop: an ambiguous cluster of cells that potentially should be dropped.

You can find the version with the full names here if you are interested in it.

The shiny application provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022) allows users to upload CSV file with these t-statistics, view the correlation heatmaps, download them, and download the correlation matrix. An example CSV file is provided here.

Gene set enrichment

Many researchers have identified lists of genes that increase the risk of a given disorder or disease, are differentially expressed in a given experiment or set of conditions, have been described in a several research papers, among other collections. We can ask for any of our modeling results whether a list of genes is enriched among our significant results.

For illustration purposes, we included in spatialLIBD (Pardo, Spangler, Weber et al., 2022) a set of genes from the Simons Foundation Autism Research Initiative SFARI. In our analysis (Maynard, Collado-Torres, Weber et al., 2021) we used more gene lists. Below, we read in the data and create a list() object with the Ensembl gene IDs that SFARI has provided that are related to autism.

## Read in the SFARI gene sets included in the package
asd_sfari <- utils::read.csv(
    system.file(
        "extdata",
        "SFARI-Gene_genes_01-03-2020release_02-04-2020export.csv",
        package = "spatialLIBD"
    ),
    as.is = TRUE
)

## Format them appropriately
asd_sfari_geneList <- list(
    Gene_SFARI_all = asd_sfari$ensembl.id,
    Gene_SFARI_high = asd_sfari$ensembl.id[asd_sfari$gene.score < 3],
    Gene_SFARI_syndromic = asd_sfari$ensembl.id[asd_sfari$syndromic == 1]
)

After reading the list of genes, we can then compute the enrichment odds ratios and p-values for a given FDR threshold in our statistics.

## Compute the gene set enrichment results
asd_sfari_enrichment <- gene_set_enrichment(
    gene_list = asd_sfari_geneList,
    modeling_results = modeling_results,
    model_type = "enrichment"
)

## Explore the results
head(asd_sfari_enrichment)
#>          OR        Pval   test NumSig SetSize                   ID model_type
#> 1 1.2659915 0.001761332     WM    231     869       Gene_SFARI_all enrichment
#> 2 1.1819109 0.098959488     WM     90     355      Gene_SFARI_high enrichment
#> 3 1.2333378 0.185302060     WM     31     118 Gene_SFARI_syndromic enrichment
#> 4 0.9702022 0.613080610 Layer1     71     869       Gene_SFARI_all enrichment
#> 5 0.7192630 0.949332765 Layer1     22     355      Gene_SFARI_high enrichment
#> 6 1.1216176 0.405453216 Layer1     11     118 Gene_SFARI_syndromic enrichment
#>   fdr_cut
#> 1     0.1
#> 2     0.1
#> 3     0.1
#> 4     0.1
#> 5     0.1
#> 6     0.1

Using the above enrichment table, we can then visualize the odds ratios on a heatmap as shown below. Note that we use the thresholded p-values at -log10(p) = 12 for visualization purposes and only show the odds ratios for -log10(p) > 3 by default.

## Visualize gene set enrichment results
gene_set_enrichment_plot(
    asd_sfari_enrichment,
    xlabs = gsub(".*_", "", unique(asd_sfari_enrichment$ID)),
    layerHeights = c(0, 40, 55, 75, 85, 110, 120, 135)
)

The shiny application provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022) allows users to upload CSV file their gene lists, compute the enrichment statistics, visualize them, download the PDF, and download the enrichment table. An example CSV file is provided here.

Re-shaping your data to our structure

This section gets into the details of how we generated the data (Maynard, Collado-Torres, Weber et al., 2021) behind spatialLIBD (Pardo, Spangler, Weber et al., 2022). This could be useful if you are a Bioconductor developer or a user very familiar with packages such as SingleCellExperiment.

SpatialExperiment support

As of version 1.3.3, spatialLIBD supports the SpatialExperiment class from SpatialExperiment. The functions vis_gene_p(), vis_gene(), vis_grid_clus(), vis_grid_gene(), vis_clus(), vis_clus_p(), geom_spatial() now work with SpatialExperiment objects thanks to the updates in SpatialExperiment. If you have spot-level data formatted in the older SingleCellExperiment objects that were heavily modified for spatialLIBD, you can use sce_to_spe() to convert the objects.

This work was done by Brenda Pardo and Leonardo.

Using spatialLIBD with your own data

Please check the second vignette on how to use spatialLIBD with your own data as exemplified with a public 10x Genomics dataset or go directly to the read10xVisiumWrapper() documentation.

Expected characteristics of the data

If you want to check the key characteristics required by spatialLIBD’s functions or the shiny application, use the check_* family of functions.

check_spe(spe)
#> class: SpatialExperiment 
#> dim: 33538 47681 
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
#>   ENSG00000268674
#> rowData names(9): source type ... gene_search is_top_hvg
#> colnames(47681): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
#>   TTGTTTCCATACAACT-1 TTGTTTGTGTAAATTC-1
#> colData names(69): sample_id Cluster ... array_row array_col
#> reducedDimNames(6): PCA TSNE_perplexity50 ... TSNE_perplexity80
#>   UMAP_neighbors15
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor
check_sce_layer(sce_layer)
#> class: SingleCellExperiment 
#> dim: 22331 76 
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(22331): ENSG00000243485 ENSG00000238009 ... ENSG00000278384
#>   ENSG00000271254
#> rowData names(10): source type ... is_top_hvg is_top_hvg_sce_layer
#> colnames(76): 151507_Layer1 151507_Layer2 ... 151676_Layer6 151676_WM
#> colData names(13): sample_name layer_guess ...
#>   layer_guess_reordered_short spatialLIBD
#> reducedDimNames(6): PCA TSNE_perplexity5 ... UMAP_neighbors15 PCAsub
#> mainExpName: NULL
#> altExpNames(0):
## The output here is too long to print
xx <- check_modeling_results(modeling_results)
identical(xx, modeling_results)
#> [1] TRUE

Generating our data (legacy)

If you are interested in reshaping your data to fit our structure, we do not provide a quick function to do so. That is intentional given the active development by the Bioconductor community for determining the best way to deal with spatial transcriptomics data in general and the 10x Visium data in particular. Having said that, we do have all the steps and reproducibility information documented across several of our analysis (Maynard, Collado-Torres, Weber et al., 2021) scripts.

  • reorganize_folder.R available here re-organizes the raw data we were sent by 10x Genomics.
  • Layer_Notebook.R available here reads in the Visium data and builds a list of RangeSummarizedExperiment() objects from SummarizedExperiment, one per sample (image) that is eventually saved as Human_DLPFC_Visium_processedData_rseList.rda.
  • convert_sce.R available here reads in Human_DLPFC_Visium_processedData_rseList.rda and builds an initial sce object with image data under metadata(sce)$image which is a single data.frame. Subsetting doesn’t automatically subset the image, so you have to do it yourself when plotting as is done by vis_clus_p() and vis_gene_p(). Having the data from all images in a single object allows you to use the spot-level data from all images to compute clusters and do other similar analyses to the ones you would do with sc/snRNA-seq data. The script creates the Human_DLPFC_Visium_processedData_sce.Rdata file.
  • sce_scran.R available here then uses scran to read in Human_DLPFC_Visium_processedData_sce.Rdata, compute the highly variable genes (stored in our final sce object at rowData(sce)$is_top_hvg), perform dimensionality reduction (PCA, TSNE, UMAP) and identify clusters using the data from all images. The resulting data is then stored as Human_DLPFC_Visium_processedData_sce_scran.Rdata and is the main object used throughout our analysis code (Maynard, Collado-Torres, Weber et al., 2021).
  • make-data_spatialLIBD.R available in the source version of spatialLIBD and online here is the script that reads in Human_DLPFC_Visium_processedData_sce_scran.Rdata as well as some other outputs from our analysis and combines them into the final sce and sce_layer objects provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022). This script simplifies some operations in order to simplify the code behind the shiny application provided by spatialLIBD.

You don’t necessarily need to do all of this to use the functions provided by spatialLIBD. Note that external to the R objects, for the shiny application provided by spatialLIBD you will need to have the tissue_lowres_image.png image files in a directory structure by sample as shown here in order for the interactive visualizations made with plotly to work.

More spatially-resolved LIBD datasets

Over time spatialLIBD::fetch_data() has been expanded to provide access to other datasets generated by our teams at the Lieber Institute for Brain Development (LIBD) that have also been analyzed with spatialLIBD.

spatialDLPFC

Through spatialLIBD::fetch_data() you can also download the data from the Integrated single cell and unsupervised spatial transcriptomic analysis defines molecular anatomy of the human dorsolateral prefrontal cortex project, also known as spatialDLPFC (Huuki-Myers, Spangler, Eagles, Montgomergy, Kwon, Guo, Grant-Peters, Divecha, Tippani, Sriworarat, Nguyen, Ravichandran, Tran, Seyedian, Consortium, Hyde, Kleinman, Battle, Page, Ryten, Hicks, Martinowich, Collado-Torres, and Maynard, 2024). See http://research.libd.org/spatialDLPFC/ for more information about this project.

See the Twitter thread 🧵 below for a brief overview of the #spatialDLPFC project.

Visium_SPG_AD

Through spatialLIBD::fetch_data() you can also download the data from the Influence of Alzheimer’s disease related neuropathology on local microenvironment gene expression in the human inferior temporal cortex project, also known as Visium_SPG_AD (Kwon, Parthiban, Tippani, Divecha, Eagles, Lobana, Williams, Mark, Bharadwaj, Kleinman, Hyde, Page, Hicks, Martinowich, Maynard, and Collado-Torres, 2023). See http://research.libd.org/Visium_SPG_AD/ for more information about this project.

See the Twitter thread 🧵 below for a brief overview of the #Visium_SPG_AD project.

TODO

LIBD data outside spatialLIBD

Sometimes our collaborators have shared data through other venues. So not all LIBD spatially-resolved transcriptomics data from the Keri Martinowich, Kristen Maynard, and Leonardo Collado-Torres teams has been released through spatialLIBD. However, it is very much compatible with spatialLIBD and can be analyzed or visualized with spatialLIBD functions.

locus-c

The gene expression landscape of the human locus coeruleus revealed by single-nucleus and spatially-resolved transcriptomics, also known as locus-c, is not available through spatialLIBD, but you might be interested in checking out the excellent WeberDivechaLCdata package for more details. See https://github.com/lmweber/locus-c for more details about the locus-c project.

See the Twitter thread 🧵 below for a brief overview of the locus-c project.

Reproducibility

The spatialLIBD package (Pardo, Spangler, Weber et al., 2022) was made possible thanks to:

  • R (R Core Team, 2024)
  • AnnotationHub (Morgan and Shepherd, 2024)
  • benchmarkme (Gillespie, 2022)
  • BiocFileCache (Shepherd and Morgan, 2024)
  • BiocGenerics (Huber, W., Carey, J., Gentleman, R., Anders, S., Carlson, M., Carvalho, S., Bravo, C., Davis, S., Gatto, L., Girke, T., Gottardo, R., Hahne, F., Hansen, D., Irizarry, A., Lawrence, M., Love, I., MacDonald, J., Obenchain, V., {Ole’s}, K., {Pag`es}, H., Reyes, A., Shannon, P., Smyth, K., Tenenbaum, D., Waldron, L., Morgan, and M., 2015)
  • BiocStyle (Oleś, 2024)
  • cowplot (Wilke, 2024)
  • DT (Xie, Cheng, and Tan, 2024)
  • edgeR (Chen, Chen, Lun, Baldoni, and Smyth, 2024)
  • ExperimentHub (Morgan and Shepherd, 2024)
  • fields (Douglas Nychka, Reinhard Furrer, John Paige, and Stephan Sain, 2021)
  • GenomicRanges (Lawrence, Huber, Pagès, Aboyoun, Carlson, Gentleman, Morgan, and Carey, 2013)
  • ggplot2 (Wickham, 2016)
  • golem (Fay, Guyader, Rochette, and Girard, 2024)
  • IRanges (Lawrence, Huber, Pagès et al., 2013)
  • knitr (Xie, 2014)
  • limma (Ritchie, Phipson, Wu, Hu, Law, Shi, and Smyth, 2015)
  • magick (Ooms, 2024)
  • Matrix (Bates, Maechler, and Jagan, 2024)
  • paletteer (Hvitfeldt, 2021)
  • plotly (Sievert, 2020)
  • RColorBrewer (Neuwirth, 2022)
  • RefManageR (McLean, 2017)
  • rmarkdown (Allaire, Xie, Dervieux, McPherson, Luraschi, Ushey, Atkins, Wickham, Cheng, Chang, and Iannone, 2024)
  • rtracklayer (Lawrence, Gentleman, and Carey, 2009)
  • S4Vectors (Pagès, Lawrence, and Aboyoun, 2024)
  • scater (McCarthy, Campbell, Lun, and Willis, 2017)
  • scuttle (McCarthy, Campbell, Lun et al., 2017)
  • sessioninfo (Wickham, Chang, Flight, Müller, and Hester, 2021)
  • SingleCellExperiment (Amezquita, Lun, Becht, Carey, Carpp, Geistlinger, Marini, Rue-Albrecht, Risso, Soneson, Waldron, Pages, Smith, Huber, Morgan, Gottardo, and Hicks, 2020)
  • shiny (Chang, Cheng, Allaire et al., 2024)
  • SpatialExperiment (Righelli, Weber, Crowell et al., 2022)
  • statmod (Giner and Smyth, 2016)
  • SummarizedExperiment (Morgan, Obenchain, Hester, and Pagès, 2024)
  • testthat (Wickham, 2011)
  • viridisLite (Garnier, Simon, Ross, Noam, Rudis, Robert, Camargo, Pedro, Sciaini, Marco, Scherer, and Cédric, 2023)

Code for creating the vignette

## Create the vignette
library("rmarkdown")
system.time(render("spatialLIBD.Rmd"))

## Extract the R code
library("knitr")
knit("spatialLIBD.Rmd", tangle = TRUE)

Date the vignette was generated.

#> [1] "2024-10-31 20:32:13 UTC"

Wallclock time spent generating the vignette.

#> Time difference of 37.911 secs

R session information.

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       Ubuntu 22.04.4 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       UTC
#>  date     2024-10-31
#>  pandoc   3.2 @ /usr/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package              * version   date (UTC) lib source
#>  abind                  1.4-8     2024-09-12 [1] RSPM (R 4.4.0)
#>  AnnotationDbi          1.66.0    2024-05-01 [1] Bioconductor 3.19 (R 4.4.1)
#>  AnnotationHub          3.12.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  attempt                0.3.1     2020-05-03 [1] RSPM (R 4.4.0)
#>  backports              1.5.0     2024-05-23 [1] RSPM (R 4.4.0)
#>  beachmat               2.20.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  beeswarm               0.4.0     2021-06-01 [1] RSPM (R 4.4.0)
#>  benchmarkme            1.0.8     2022-06-12 [1] RSPM (R 4.4.0)
#>  benchmarkmeData        1.0.4     2020-04-23 [1] RSPM (R 4.4.0)
#>  bibtex                 0.5.1     2023-01-26 [1] RSPM (R 4.4.0)
#>  Biobase              * 2.64.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocFileCache          2.12.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocGenerics         * 0.50.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocIO                 1.14.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocManager            1.30.25   2024-08-28 [1] RSPM (R 4.4.0)
#>  BiocNeighbors          1.22.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocParallel           1.38.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocSingular           1.20.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocStyle            * 2.32.1    2024-06-16 [1] Bioconductor 3.19 (R 4.4.1)
#>  BiocVersion            3.19.1    2024-04-17 [2] Bioconductor 3.19 (R 4.4.1)
#>  Biostrings             2.72.1    2024-06-02 [1] Bioconductor 3.19 (R 4.4.1)
#>  bit                    4.5.0     2024-09-20 [1] RSPM (R 4.4.0)
#>  bit64                  4.5.2     2024-09-22 [1] RSPM (R 4.4.0)
#>  bitops                 1.0-9     2024-10-03 [1] RSPM (R 4.4.0)
#>  blob                   1.2.4     2023-03-17 [1] RSPM (R 4.4.0)
#>  bookdown               0.41      2024-10-16 [1] RSPM (R 4.4.0)
#>  bslib                  0.8.0     2024-07-29 [2] RSPM (R 4.4.0)
#>  cachem                 1.1.0     2024-05-16 [2] RSPM (R 4.4.0)
#>  cli                    3.6.3     2024-06-21 [2] RSPM (R 4.4.0)
#>  codetools              0.2-20    2024-03-31 [3] CRAN (R 4.4.1)
#>  colorspace             2.1-1     2024-07-26 [1] RSPM (R 4.4.0)
#>  config                 0.3.2     2023-08-30 [1] RSPM (R 4.4.0)
#>  cowplot                1.1.3     2024-01-22 [1] RSPM (R 4.4.0)
#>  crayon                 1.5.3     2024-06-20 [2] RSPM (R 4.4.0)
#>  curl                   5.2.3     2024-09-20 [2] RSPM (R 4.4.0)
#>  data.table             1.16.2    2024-10-10 [1] RSPM (R 4.4.0)
#>  DBI                    1.2.3     2024-06-02 [1] RSPM (R 4.4.0)
#>  dbplyr                 2.5.0     2024-03-19 [1] RSPM (R 4.4.0)
#>  DelayedArray           0.30.1    2024-05-07 [1] Bioconductor 3.19 (R 4.4.1)
#>  DelayedMatrixStats     1.26.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  desc                   1.4.3     2023-12-10 [2] RSPM (R 4.4.0)
#>  digest                 0.6.37    2024-08-19 [2] RSPM (R 4.4.0)
#>  doParallel             1.0.17    2022-02-07 [1] RSPM (R 4.4.0)
#>  dotCall64              1.2       2024-10-04 [1] RSPM (R 4.4.0)
#>  dplyr                  1.1.4     2023-11-17 [1] RSPM (R 4.4.0)
#>  DT                     0.33      2024-04-04 [1] RSPM (R 4.4.0)
#>  edgeR                  4.2.2     2024-10-13 [1] Bioconductor 3.19 (R 4.4.1)
#>  evaluate               1.0.1     2024-10-10 [2] RSPM (R 4.4.0)
#>  ExperimentHub          2.12.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  fansi                  1.0.6     2023-12-08 [2] RSPM (R 4.4.0)
#>  farver                 2.1.2     2024-05-13 [1] RSPM (R 4.4.0)
#>  fastmap                1.2.0     2024-05-15 [2] RSPM (R 4.4.0)
#>  fields                 16.3      2024-09-30 [1] RSPM (R 4.4.0)
#>  filelock               1.0.3     2023-12-11 [1] RSPM (R 4.4.0)
#>  foreach                1.5.2     2022-02-02 [1] RSPM (R 4.4.0)
#>  fs                     1.6.5     2024-10-30 [2] RSPM (R 4.4.0)
#>  generics               0.1.3     2022-07-05 [1] RSPM (R 4.4.0)
#>  GenomeInfoDb         * 1.40.1    2024-05-24 [1] Bioconductor 3.19 (R 4.4.1)
#>  GenomeInfoDbData       1.2.12    2024-10-27 [1] Bioconductor
#>  GenomicAlignments      1.40.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  GenomicRanges        * 1.56.2    2024-10-09 [1] Bioconductor 3.19 (R 4.4.1)
#>  ggbeeswarm             0.7.2     2023-04-29 [1] RSPM (R 4.4.0)
#>  ggplot2                3.5.1     2024-04-23 [1] RSPM (R 4.4.0)
#>  ggrepel                0.9.6     2024-09-07 [1] RSPM (R 4.4.0)
#>  glue                   1.8.0     2024-09-30 [2] RSPM (R 4.4.0)
#>  golem                  0.5.1     2024-08-27 [1] RSPM (R 4.4.0)
#>  gridExtra              2.3       2017-09-09 [1] RSPM (R 4.4.0)
#>  gtable                 0.3.6     2024-10-25 [1] RSPM (R 4.4.0)
#>  highr                  0.11      2024-05-26 [2] RSPM (R 4.4.0)
#>  htmltools              0.5.8.1   2024-04-04 [2] RSPM (R 4.4.0)
#>  htmlwidgets            1.6.4     2023-12-06 [2] RSPM (R 4.4.0)
#>  httpuv                 1.6.15    2024-03-26 [2] RSPM (R 4.4.0)
#>  httr                   1.4.7     2023-08-15 [2] RSPM (R 4.4.0)
#>  IRanges              * 2.38.1    2024-07-03 [1] Bioconductor 3.19 (R 4.4.1)
#>  irlba                  2.3.5.1   2022-10-03 [1] RSPM (R 4.4.0)
#>  iterators              1.0.14    2022-02-05 [1] RSPM (R 4.4.0)
#>  jquerylib              0.1.4     2021-04-26 [2] RSPM (R 4.4.0)
#>  jsonlite               1.8.9     2024-09-20 [2] RSPM (R 4.4.0)
#>  KEGGREST               1.44.1    2024-06-19 [1] Bioconductor 3.19 (R 4.4.1)
#>  knitr                  1.48      2024-07-07 [2] RSPM (R 4.4.0)
#>  labeling               0.4.3     2023-08-29 [1] RSPM (R 4.4.0)
#>  later                  1.3.2     2023-12-06 [2] RSPM (R 4.4.0)
#>  lattice                0.22-6    2024-03-20 [3] CRAN (R 4.4.1)
#>  lazyeval               0.2.2     2019-03-15 [1] RSPM (R 4.4.0)
#>  lifecycle              1.0.4     2023-11-07 [2] RSPM (R 4.4.0)
#>  limma                  3.60.6    2024-10-02 [1] Bioconductor 3.19 (R 4.4.1)
#>  locfit                 1.5-9.10  2024-06-24 [1] RSPM (R 4.4.0)
#>  lubridate              1.9.3     2023-09-27 [1] RSPM (R 4.4.0)
#>  magick                 2.8.5     2024-09-20 [1] RSPM (R 4.4.0)
#>  magrittr               2.0.3     2022-03-30 [2] RSPM (R 4.4.0)
#>  maps                   3.4.2     2023-12-15 [1] RSPM (R 4.4.0)
#>  Matrix                 1.7-1     2024-10-18 [3] RSPM (R 4.4.0)
#>  MatrixGenerics       * 1.16.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  matrixStats          * 1.4.1     2024-09-08 [1] RSPM (R 4.4.0)
#>  memoise                2.0.1     2021-11-26 [2] RSPM (R 4.4.0)
#>  mime                   0.12      2021-09-28 [2] RSPM (R 4.4.0)
#>  munsell                0.5.1     2024-04-01 [1] RSPM (R 4.4.0)
#>  paletteer              1.6.0     2024-01-21 [1] RSPM (R 4.4.0)
#>  pillar                 1.9.0     2023-03-22 [2] RSPM (R 4.4.0)
#>  pkgconfig              2.0.3     2019-09-22 [2] RSPM (R 4.4.0)
#>  pkgdown                2.1.1     2024-09-17 [2] RSPM (R 4.4.0)
#>  plotly                 4.10.4    2024-01-13 [1] RSPM (R 4.4.0)
#>  plyr                   1.8.9     2023-10-02 [1] RSPM (R 4.4.0)
#>  png                    0.1-8     2022-11-29 [1] RSPM (R 4.4.0)
#>  promises               1.3.0     2024-04-05 [2] RSPM (R 4.4.0)
#>  purrr                  1.0.2     2023-08-10 [2] RSPM (R 4.4.0)
#>  R6                     2.5.1     2021-08-19 [2] RSPM (R 4.4.0)
#>  ragg                   1.3.3     2024-09-11 [2] RSPM (R 4.4.0)
#>  rappdirs               0.3.3     2021-01-31 [2] RSPM (R 4.4.0)
#>  RColorBrewer           1.1-3     2022-04-03 [1] RSPM (R 4.4.0)
#>  Rcpp                   1.0.13    2024-07-17 [2] RSPM (R 4.4.0)
#>  RCurl                  1.98-1.16 2024-07-11 [1] RSPM (R 4.4.0)
#>  RefManageR           * 1.4.0     2022-09-30 [1] RSPM (R 4.4.0)
#>  rematch2               2.1.2     2020-05-01 [2] RSPM (R 4.4.0)
#>  restfulr               0.0.15    2022-06-16 [1] RSPM (R 4.4.1)
#>  rjson                  0.2.23    2024-09-16 [1] RSPM (R 4.4.0)
#>  rlang                  1.1.4     2024-06-04 [2] RSPM (R 4.4.0)
#>  rmarkdown              2.28      2024-08-17 [2] RSPM (R 4.4.0)
#>  Rsamtools              2.20.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  RSQLite                2.3.7     2024-05-27 [1] RSPM (R 4.4.0)
#>  rsvd                   1.0.5     2021-04-16 [1] RSPM (R 4.4.0)
#>  rtracklayer            1.64.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  S4Arrays               1.4.1     2024-05-20 [1] Bioconductor 3.19 (R 4.4.1)
#>  S4Vectors            * 0.42.1    2024-07-03 [1] Bioconductor 3.19 (R 4.4.1)
#>  sass                   0.4.9     2024-03-15 [2] RSPM (R 4.4.0)
#>  ScaledMatrix           1.12.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  scales                 1.3.0     2023-11-28 [1] RSPM (R 4.4.0)
#>  scater                 1.32.1    2024-07-21 [1] Bioconductor 3.19 (R 4.4.1)
#>  scuttle                1.14.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  sessioninfo          * 1.2.2     2021-12-06 [2] RSPM (R 4.4.0)
#>  shiny                  1.9.1     2024-08-01 [2] RSPM (R 4.4.0)
#>  shinyWidgets           0.8.7     2024-09-23 [1] RSPM (R 4.4.0)
#>  SingleCellExperiment * 1.26.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  spam                   2.11-0    2024-10-03 [1] RSPM (R 4.4.0)
#>  SparseArray            1.4.8     2024-05-24 [1] Bioconductor 3.19 (R 4.4.1)
#>  sparseMatrixStats      1.16.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  SpatialExperiment    * 1.14.0    2024-05-01 [1] Bioconductor 3.19 (R 4.4.1)
#>  spatialLIBD          * 1.17.10   2024-10-31 [1] Bioconductor
#>  statmod                1.5.0     2023-01-06 [1] RSPM (R 4.4.0)
#>  stringi                1.8.4     2024-05-06 [2] RSPM (R 4.4.0)
#>  stringr                1.5.1     2023-11-14 [2] RSPM (R 4.4.0)
#>  SummarizedExperiment * 1.34.0    2024-05-01 [1] Bioconductor 3.19 (R 4.4.1)
#>  systemfonts            1.1.0     2024-05-15 [2] RSPM (R 4.4.0)
#>  textshaping            0.4.0     2024-05-24 [2] RSPM (R 4.4.0)
#>  tibble                 3.2.1     2023-03-20 [2] RSPM (R 4.4.0)
#>  tidyr                  1.3.1     2024-01-24 [1] RSPM (R 4.4.0)
#>  tidyselect             1.2.1     2024-03-11 [1] RSPM (R 4.4.0)
#>  timechange             0.3.0     2024-01-18 [1] RSPM (R 4.4.0)
#>  UCSC.utils             1.0.0     2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  utf8                   1.2.4     2023-10-22 [2] RSPM (R 4.4.0)
#>  vctrs                  0.6.5     2023-12-01 [2] RSPM (R 4.4.0)
#>  vipor                  0.4.7     2023-12-18 [1] RSPM (R 4.4.0)
#>  viridis                0.6.5     2024-01-29 [1] RSPM (R 4.4.0)
#>  viridisLite            0.4.2     2023-05-02 [1] RSPM (R 4.4.0)
#>  withr                  3.0.2     2024-10-28 [2] RSPM (R 4.4.0)
#>  xfun                   0.48      2024-10-03 [2] RSPM (R 4.4.0)
#>  XML                    3.99-0.17 2024-06-25 [1] RSPM (R 4.4.0)
#>  xml2                   1.3.6     2023-12-04 [2] RSPM (R 4.4.0)
#>  xtable                 1.8-4     2019-04-21 [2] RSPM (R 4.4.0)
#>  XVector                0.44.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#>  yaml                   2.3.10    2024-07-26 [2] RSPM (R 4.4.0)
#>  zlibbioc               1.50.0    2024-04-30 [1] Bioconductor 3.19 (R 4.4.1)
#> 
#>  [1] /__w/_temp/Library
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/local/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Bibliography

This vignette was generated using BiocStyle (Oleś, 2024), knitr (Xie, 2014) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes.

Citations made with RefManageR (McLean, 2017).

[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.28. 2024. URL: https://github.com/rstudio/rmarkdown.

[2] R. Amezquita, A. Lun, E. Becht, et al. “Orchestrating single-cell analysis with Bioconductor”. In: Nature Methods 17 (2020), pp. 137–145. URL: https://www.nature.com/articles/s41592-019-0654-x.

[3] D. Bates, M. Maechler, and M. Jagan. Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.7-1. 2024. URL: https://Matrix.R-forge.R-project.org.

[4] W. Chang, J. Cheng, J. Allaire, et al. shiny: Web Application Framework for R. R package version 1.9.1, https://github.com/rstudio/shiny. 2024. URL: https://shiny.posit.co/.

[5] Y. Chen, L. Chen, A. T. L. Lun, et al. “edgeR 4.0: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets”. In: bioRxiv (2024). DOI: 10.1101/2024.01.21.576131.

[6] Douglas Nychka, Reinhard Furrer, John Paige, et al. fields: Tools for spatial data. R package version 16.3. Boulder, CO, USA: University Corporation for Atmospheric Research, 2021. URL: https://github.com/dnychka/fieldsRPackage.

[7] C. Fay, V. Guyader, S. Rochette, et al. golem: A Framework for Robust Shiny Applications. R package version 0.5.1, https://github.com/ThinkR-open/golem. 2024. URL: https://thinkr-open.github.io/golem/.

[8] Garnier, Simon, Ross, et al. viridis(Lite) - Colorblind-Friendly Color Maps for R. viridisLite package version 0.4.2. 2023. DOI: 10.5281/zenodo.4678327. URL: https://sjmgarnier.github.io/viridis/.

[9] C. Gillespie. benchmarkme: Crowd Sourced System Benchmarks. R package version 1.0.8. 2022. URL: https://github.com/csgillespie/benchmarkme.

[10] G. Giner and G. K. Smyth. “statmod: probability calculations for the inverse Gaussian distribution”. In: R Journal 8.1 (2016), pp. 339-351.

[11] Huber, W., Carey, et al. “Orchestrating high-throughput genomic analysis with Bioconductor”. In: Nature Methods 12.2 (2015), pp. 115–121. URL: http://www.nature.com/nmeth/journal/v12/n2/full/nmeth.3252.html.

[12] L. A. Huuki-Myers, A. Spangler, N. J. Eagles, et al. “A data-driven single-cell and spatial transcriptomic map of the human prefrontal cortex”. In: Science (2024). DOI: 10.1126/science.adh1938. URL: https://doi.org/10.1126/science.adh1938.

[13] E. Hvitfeldt. paletteer: Comprehensive Collection of Color Palettes. R package version 1.3.0. 2021. URL: https://github.com/EmilHvitfeldt/paletteer.

[14] S. H. Kwon, S. Parthiban, M. Tippani, et al. “Influence of Alzheimer’s disease related neuropathology on local microenvironment gene expression in the human inferior temporal cortex”. In: GEN Biotechnology (2023). DOI: 10.1089/genbio.2023.0019. URL: https://doi.org/10.1089/genbio.2023.0019.

[15] M. Lawrence, R. Gentleman, and V. Carey. “rtracklayer: an R package for interfacing with genome browsers”. In: Bioinformatics 25 (2009), pp. 1841-1842. DOI: 10.1093/bioinformatics/btp328. URL: http://bioinformatics.oxfordjournals.org/content/25/14/1841.abstract.

[16] M. Lawrence, W. Huber, H. Pagès, et al. “Software for Computing and Annotating Genomic Ranges”. In: PLoS Computational Biology 9 (8 2013). DOI: 10.1371/journal.pcbi.1003118. URL: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118}.

[17] K. R. Maynard, L. Collado-Torres, L. M. Weber, et al. “Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex”. In: Nature Neuroscience (2021). DOI: 10.1038/s41593-020-00787-0. URL: https://www.nature.com/articles/s41593-020-00787-0.

[18] D. J. McCarthy, K. R. Campbell, A. T. L. Lun, et al. “Scater: pre-processing, quality control, normalisation and visualisation of single-cell RNA-seq data in R”. In: Bioinformatics 33 (8 2017), pp. 1179-1186. DOI: 10.1093/bioinformatics/btw777.

[19] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.

[20] M. Morgan, V. Obenchain, J. Hester, et al. SummarizedExperiment: SummarizedExperiment container. R package version 1.34.0. 2024. DOI: 10.18129/B9.bioc.SummarizedExperiment. URL: https://bioconductor.org/packages/SummarizedExperiment.

[21] M. Morgan and L. Shepherd. AnnotationHub: Client to access AnnotationHub resources. R package version 3.12.0. 2024. DOI: 10.18129/B9.bioc.AnnotationHub. URL: https://bioconductor.org/packages/AnnotationHub.

[22] M. Morgan and L. Shepherd. ExperimentHub: Client to access ExperimentHub resources. R package version 2.12.0. 2024. DOI: 10.18129/B9.bioc.ExperimentHub. URL: https://bioconductor.org/packages/ExperimentHub.

[23] E. Neuwirth. RColorBrewer: ColorBrewer Palettes. R package version 1.1-3. 2022.

[24] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.32.1. 2024. DOI: 10.18129/B9.bioc.BiocStyle. URL: https://bioconductor.org/packages/BiocStyle.

[25] J. Ooms. magick: Advanced Graphics and Image-Processing in R. R package version 2.8.5. 2024. URL: https://docs.ropensci.org/magick/https://ropensci.r-universe.dev/magick.

[26] H. Pagès, M. Lawrence, and P. Aboyoun. S4Vectors: Foundation of vector-like and list-like containers in Bioconductor. R package version 0.42.1. 2024. DOI: 10.18129/B9.bioc.S4Vectors. URL: https://bioconductor.org/packages/S4Vectors.

[27] B. Pardo, A. Spangler, L. M. Weber, et al. “spatialLIBD: an R/Bioconductor package to visualize spatially-resolved transcriptomics data”. In: BMC Genomics (2022). DOI: 10.1186/s12864-022-08601-w. URL: https://doi.org/10.1186/s12864-022-08601-w.

[28] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.

[29] D. Righelli, L. M. Weber, H. L. Crowell, et al. “SpatialExperiment: infrastructure for spatially-resolved transcriptomics data in R using Bioconductor”. In: Bioinformatics 38.11 (2022), pp. -3. DOI: https://doi.org/10.1093/bioinformatics/btac299.

[30] M. E. Ritchie, B. Phipson, D. Wu, et al. “limma powers differential expression analyses for RNA-sequencing and microarray studies”. In: Nucleic Acids Research 43.7 (2015), p. e47. DOI: 10.1093/nar/gkv007.

[31] L. Shepherd and M. Morgan. BiocFileCache: Manage Files Across Sessions. R package version 2.12.0. 2024. DOI: 10.18129/B9.bioc.BiocFileCache. URL: https://bioconductor.org/packages/BiocFileCache.

[32] C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC, 2020. ISBN: 9781138331457. URL: https://plotly-r.com.

[33] H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016. ISBN: 978-3-319-24277-4. URL: https://ggplot2.tidyverse.org.

[34] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.

[35] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2, https://r-lib.github.io/sessioninfo/. 2021. URL: https://github.com/r-lib/sessioninfo#readme.

[36] C. Wilke. cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.1.3. 2024. URL: https://wilkelab.org/cowplot/.

[37] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014.

[38] Y. Xie, J. Cheng, and X. Tan. DT: A Wrapper of the JavaScript Library ‘DataTables’. R package version 0.33. 2024. URL: https://github.com/rstudio/DT.


  1. Check this code for details on how we built the spe object. In particular check convert_sce.R and sce_scran.R.↩︎

  2. Check this code for details on how we built the sce_layer object. In particular check spots_per_layer.R and layer_enrichment.R.↩︎

  3. Check this code for details on how we built the modeling_results object. In particular check layer_specificity_fstats.R, layer_specificity.R, and misc_numbers.R.↩︎

  4. You can change the destdir argument and specific a specific location that you will use and re-use. However the default value of destdir is a temporary directory that will be wiped out once you close your R session.↩︎

  5. For more details, check this script.↩︎

  6. For more details, check this script.↩︎