vignettes/spatialLIBD.Rmd
spatialLIBD.Rmd
Welcome to the spatialLIBD
project! It is composed
of:
The web application allows you to browse the LIBD human dorsolateral pre-frontal cortex (DLPFC) spatial transcriptomics data generated with the 10x Genomics Visium platform. Through the R/Bioconductor package you can also download the data as well as visualize your own datasets using this web application. Please check the manuscript or bioRxiv pre-print for more details about this project.
If you tweet about this website, the data or the R package please use
the #spatialLIBD
hashtag. You can find previous tweets that
way as shown
here.
Thank you!
As a quick overview, the data presented here is from portion of the DLPFC that spans six neuronal layers plus white matter (A) for a total of three subjects with two pairs of spatially adjacent replicates (B). Each dissection of DLPFC was designed to span all six layers plus white matter (C). Using this web application you can explore the expression of known genes such as SNAP25 (D, a neuronal gene), MOBP (E, an oligodendrocyte gene), and known layer markers from mouse studies such as PCP4 (F, a known layer 5 marker gene).
spatialLIBD
R
is an open-source statistical environment which can be
easily modified to enhance its functionality via packages. spatialLIBD
is a R
package available via Bioconductor. R
can be installed on any operating system from CRAN after which you can install
spatialLIBD
by using the following commands in your R
session:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("spatialLIBD")
## Check that you have a valid Bioconductor installation
BiocManager::valid()
To run all the code in this vignette, you might need to install other R/Bioconductor packages, which you can do with:
BiocManager::install("spatialLIBD", dependencies = TRUE, force = TRUE)
If you want to use the development version of
spatialLIBD
, you will need to use the R version
corresponding to the current Bioconductor-devel branch as described in
more detail on the Bioconductor
website. Then you can install spatialLIBD
from GitHub
using the following command.
BiocManager::install("LieberInstitute/spatialLIBD")
spatialLIBD (Pardo, Spangler, Weber, Hicks, Jaffe, Martinowich, Maynard, and Collado-Torres, 2022) is based on many other packages and in particular in those that have implemented the infrastructure needed for dealing with single cell RNA sequencing data, visualization functions, and interactive data exploration. That is, packages like SingleCellExperiment that allow you to store the data, ggplot2 and plotly for visualizing the data, and shiny for building an interactive interface. A spatialLIBD user who only accesses the web application is not expected to deal with those packages directly. A spatialLIBD user will need to be familiar with SingleCellExperiment and ggplot2 to understand the data provided by spatialLIBD or the graphical results spatialLIBD provides. Furthermore, it’ll be useful for the user to know about shiny and plotly if you wish to adapt the web application provided by spatialLIBD.
If you are asking yourself the question “Where do I start using Bioconductor?” you might be interested in this blog post.
As package developers, we try to explain clearly how to use our
packages and in which order to use the functions. But R
and
Bioconductor
have a steep learning curve so it is critical
to learn where to ask for help. The blog post quoted above mentions some
but we would like to highlight the Bioconductor support site
as the main resource for getting help regarding Bioconductor. Other
alternatives are available such as creating GitHub issues and tweeting.
However, please note that if you want to receive help you should adhere
to the posting
guidelines. It is particularly critical that you provide a small
reproducible example and your session information so package developers
can track down the source of the error.
spatialLIBD
We hope that spatialLIBD will be useful for your research. Please use the following information to cite the package and the research article describing the data provided by spatialLIBD. Thank you!
## Citation info
citation("spatialLIBD")
#> To cite package 'spatialLIBD' in publications use:
#>
#> Pardo B, Spangler A, Weber LM, Hicks SC, Jaffe AE, Martinowich K,
#> Maynard KR, Collado-Torres L (2022). "spatialLIBD: an R/Bioconductor
#> package to visualize spatially-resolved transcriptomics data." _BMC
#> Genomics_. doi:10.1186/s12864-022-08601-w
#> <https://doi.org/10.1186/s12864-022-08601-w>,
#> <https://doi.org/10.1186/s12864-022-08601-w>.
#>
#> Maynard KR, Collado-Torres L, Weber LM, Uytingco C, Barry BK,
#> Williams SR, II JLC, Tran MN, Besich Z, Tippani M, Chew J, Yin Y,
#> Kleinman JE, Hyde TM, Rao N, Hicks SC, Martinowich K, Jaffe AE
#> (2021). "Transcriptome-scale spatial gene expression in the human
#> dorsolateral prefrontal cortex." _Nature Neuroscience_.
#> doi:10.1038/s41593-020-00787-0
#> <https://doi.org/10.1038/s41593-020-00787-0>,
#> <https://www.nature.com/articles/s41593-020-00787-0>.
#>
#> Huuki-Myers LA, Spangler A, Eagles NJ, Montgomergy KD, Kwon SH, Guo
#> B, Grant-Peters M, Divecha HR, Tippani M, Sriworarat C, Nguyen AB,
#> Ravichandran P, Tran MN, Seyedian A, Consortium P, Hyde TM, Kleinman
#> JE, Battle A, Page SC, Ryten M, Hicks SC, Martinowich K,
#> Collado-Torres L, Maynard KR (2024). "A data-driven single-cell and
#> spatial transcriptomic map of the human prefrontal cortex."
#> _Science_. doi:10.1126/science.adh1938
#> <https://doi.org/10.1126/science.adh1938>,
#> <https://doi.org/10.1126/science.adh1938>.
#>
#> Kwon SH, Parthiban S, Tippani M, Divecha HR, Eagles NJ, Lobana JS,
#> Williams SR, Mark M, Bharadwaj RA, Kleinman JE, Hyde TM, Page SC,
#> Hicks SC, Martinowich K, Maynard KR, Collado-Torres L (2023).
#> "Influence of Alzheimer’s disease related neuropathology on local
#> microenvironment gene expression in the human inferior temporal
#> cortex." _GEN Biotechnology_. doi:10.1089/genbio.2023.0019
#> <https://doi.org/10.1089/genbio.2023.0019>,
#> <https://doi.org/10.1089/genbio.2023.0019>.
#>
#> To see these entries in BibTeX format, use 'print(<citation>,
#> bibtex=TRUE)', 'toBibtex(.)', or set
#> 'options(citation.bibtex.max=999)'.
The spatialLIBD (Pardo, Spangler, Weber et al., 2022) package was developed for analyzing the human dorsolateral prefrontal cortex (DLPFC) spatial transcriptomics data generated with the 10x Genomics Visium technology by researchers at the Lieber Institute for Brain Development (LIBD) (Maynard, Collado-Torres, Weber, Uytingco, Barry, Williams, II, Tran, Besich, Tippani, Chew, Yin, Kleinman, Hyde, Rao, Hicks, Martinowich, and Jaffe, 2021). An initial shiny application was developed for interactively exploring this data and for assigning human brain layer labels to the each spot for each sample generated. While this was useful enough for our project, we made this Bioconductor package in case you want to:
In this vignette we’ll showcase how you can access the Human DLPFC LIBD Visium dataset (Maynard, Collado-Torres, Weber et al., 2021), the R functions provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022), and an overview of how you can re-shape your own Visium dataset to match the structure we used.
To get started, please load the spatialLIBD package.
The human DLPFC 10x Genomics Visium dataset analyzed by LIBD researchers and colleagues is described in detail by Maynard, Collado-Torres et al (Maynard, Collado-Torres, Weber et al., 2021). However, briefly, this dataset is composed of:
We combined all the Visium data into a single SpatialExperiment
(Righelli, Weber, Crowell, Pardo, Collado-Torres, Ghazanfar, Lun, Hicks,
and Risso, 2022) object that we typically refer to as
spe
1. It has 33,538 genes (rows) and 47,681
spots (columns). This is the initial point for most of our analyses (code available on
GitHub). Using spatialLIBD
(Pardo, Spangler, Weber et al., 2022) we manually assigned each spot
across all 12 images to a layer (L1 through L6 or WM). We then
compressed the spot-level data at the layer-level using a pseudo-bulking
approach resulting in the SingleCellExperiment
object we typically refer to as sce_layer
2. We
then computed for each gene t or F statistics assessing whether the gene
had higher expression in a given layer compared to the rest
(enrichment
; t-stat), between one layer and another layer
(pairwise
; t-stat), or had any expression variability
across all layers (anova
; F-stat). The results from the
models are stored in what we refer to as
modeling_results
3.
In summary,
spe
is the SpatialExperiment
object with all the spot-level data and the histology information for
visualization of the data.sce_layer
is the SingleCellExperiment
object with the layer-level data.modeling_results
contains the layer-level
enrichment
, pairwise
and anova
statistics.spatialLIBD
Using spatialLIBD
(Pardo, Spangler, Weber et al., 2022) you can download all of these R
objects. They are hosted by Bioconductor’s ExperimentHub
(Morgan and Shepherd, 2024) resource and you can download them using
spatialLIBD::fetch_data()
. fetch_data()
will
query ExperimentHub
which in turn will download the data and cache it so you don’t have to
download it again. If ExperimentHub
is unavailable, then fetch_data()
has a backup option that
does not cache the files 4 Below we obtain all of these objects.
## Connect to ExperimentHub
ehub <- ExperimentHub::ExperimentHub()
## Download the small example sce data
sce <- fetch_data(type = "sce_example", eh = ehub)
#> 2024-12-16 21:55:22.038067 loading file /github/home/.cache/R/BiocFileCache/5db3dd794ad_sce_sub_for_vignette.Rdata%3Fdl%3D1
## Convert to a SpatialExperiment object
spe <- sce_to_spe(sce)
## If you want to download the full real data (about 2.1 GB in RAM) use:
if (FALSE) {
if (!exists("spe")) spe <- fetch_data(type = "spe", eh = ehub)
}
## Query ExperimentHub and download the data
if (!exists("sce_layer")) sce_layer <- fetch_data(type = "sce_layer", eh = ehub)
#> 2024-12-16 21:55:26.684604 loading file /github/home/.cache/R/BiocFileCache/5db16a912b9_Human_DLPFC_Visium_processedData_sce_scran_sce_layer_spatialLIBD.Rdata%3Fdl%3D1
modeling_results <- fetch_data("modeling_results", eh = ehub)
#> 2024-12-16 21:55:27.096168 loading file /github/home/.cache/R/BiocFileCache/5db39a09009_Human_DLPFC_Visium_modeling_results.Rdata%3Fdl%3D1
Once you have downloaded the objects, we can explore them a little bit
## spot-level data
spe
#> class: SpatialExperiment
#> dim: 33538 47681
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
#> ENSG00000268674
#> rowData names(9): source type ... gene_search is_top_hvg
#> colnames(47681): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
#> TTGTTTCCATACAACT-1 TTGTTTGTGTAAATTC-1
#> colData names(69): sample_id Cluster ... array_row array_col
#> reducedDimNames(6): PCA TSNE_perplexity50 ... TSNE_perplexity80
#> UMAP_neighbors15
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor
## layer-level data
sce_layer
#> class: SingleCellExperiment
#> dim: 22331 76
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(22331): ENSG00000243485 ENSG00000238009 ... ENSG00000278384
#> ENSG00000271254
#> rowData names(10): source type ... is_top_hvg is_top_hvg_sce_layer
#> colnames(76): 151507_Layer1 151507_Layer2 ... 151676_Layer6 151676_WM
#> colData names(13): sample_name layer_guess ...
#> layer_guess_reordered_short spatialLIBD
#> reducedDimNames(6): PCA TSNE_perplexity5 ... UMAP_neighbors15 PCAsub
#> mainExpName: NULL
#> altExpNames(0):
## list of modeling result tables
sapply(modeling_results, class)
#> anova enrichment pairwise
#> "data.frame" "data.frame" "data.frame"
sapply(modeling_results, dim)
#> anova enrichment pairwise
#> [1,] 22331 22331 22331
#> [2,] 10 23 65
sapply(modeling_results, function(x) {
head(colnames(x))
})
#> anova enrichment pairwise
#> [1,] "f_stat_full" "t_stat_WM" "t_stat_WM-Layer1"
#> [2,] "p_value_full" "t_stat_Layer1" "t_stat_WM-Layer2"
#> [3,] "fdr_full" "t_stat_Layer2" "t_stat_WM-Layer3"
#> [4,] "full_AveExpr" "t_stat_Layer3" "t_stat_WM-Layer4"
#> [5,] "f_stat_noWM" "t_stat_Layer4" "t_stat_WM-Layer5"
#> [6,] "p_value_noWM" "t_stat_Layer5" "t_stat_WM-Layer6"
The modeling statistics are in wide format, which can make some
visualizations complicated. The function
sig_genes_extract_all()
provides a way to convert them into
long format and add some useful information. Let’s do so below.
## Convert to a long format the modeling results
## This takes a few seconds to run
system.time(
sig_genes <-
sig_genes_extract_all(
n = nrow(sce_layer),
modeling_results = modeling_results,
sce_layer = sce_layer
)
)
#> user system elapsed
#> 9.671 0.332 10.014
## Explore the result
class(sig_genes)
#> [1] "DFrame"
#> attr(,"package")
#> [1] "S4Vectors"
dim(sig_genes)
#> [1] 1138881 12
spatialLIBD
data
Now that you have downloaded the data, you can interactively explore
the data using a shiny (Chang,
Cheng, Allaire, Sievert, Schloerke, Xie, Allen, McPherson, Dipert, and
Borges, 2024) web application contained within spatialLIBD
(Pardo, Spangler, Weber et al., 2022). To do so, use the
run_app()
function as shown below:
if (interactive()) {
run_app(
spe = spe,
sce_layer = sce_layer,
modeling_results = modeling_results,
sig_genes = sig_genes
)
}
The spatialLIBD shiny application allows you to browse the spot-level data and interactively label spots, as well as explore the layer-level results. Once you launch it, check the Documentation tab for each view for more details. In order to avoid duplicating the documentation, we provide all the details on the shiny application itself.
Though overall, this application allows you to export all static visualizations as PDF files or all interactive visualizations as PNG files, as well as all result table as CSV files. This is what produces the version you can access without any R use from your side at spatial.libd.org/spatialLIBD/.
spatialLIBD
functions
We already covered fetch_data()
which allows you to
download the Human DLPFC Visium data from LIBD researchers and
colleagues (Maynard, Collado-Torres, Weber et al., 2021).
With the spe
object that contains the spot-level data,
we can visualize any discrete variable such as the layers using
vis_clus()
and related functions. These functions know
where to extract and how to visualize the histology information.
## View our LIBD layers for one sample
vis_clus(
spe = spe,
clustervar = "layer_guess_reordered",
sampleid = "151673",
colors = libd_layer_colors,
... = " LIBD Layers"
)
Most of the variables stored in spe
are discrete
variables and as such, you can visualize them using
vis_clus()
and vis_grid_clus()
for one or more
than one sample respectively.
## This is not fully precise, but gives you a rough idea
## Some integer columns are actually continuous variables
table(sapply(colData(spe), class) %in% c("factor", "integer"))
#>
#> FALSE TRUE
#> 12 57
## This is more precise (one cluster has 28 unique values)
table(sapply(colData(spe), function(x) length(unique(x))) < 29)
#>
#> FALSE TRUE
#> 7 62
Notably, vis_clus()
has a spatial
logical(1)
argument which is useful if you want to
visualize the data without the histology information provided by
geom_spatial()
(a custom ggplot2::layer()
). In
particular, this is useful if you then want to use
plotly::ggplotly()
or other similar functions on the
resulting ggplot2::ggplot()
object.
## View our LIBD layers for one sample
## without spatial information
vis_clus(
spe = spe,
clustervar = "layer_guess_reordered",
sampleid = "151673",
colors = libd_layer_colors,
... = " LIBD Layers",
spatial = FALSE
)
Some helper functions include get_colors()
and
sort_clusters()
as well as the
libd_layer_colors
object included in spatialLIBD
(Pardo, Spangler, Weber et al., 2022).
## Color palette designed by Lukas M. Weber with feedback from the team.
libd_layer_colors
#> Layer1 Layer2 Layer3 Layer4 Layer5
#> "#F0027F" "#377EB8" "#4DAF4A" "#984EA3" "#FFD700"
#> Layer6 WM NA WM2
#> "#FF7F00" "#1A1A1A" "transparent" "#666666"
Similar to vis_clus()
, the vis_gene()
family of functions use the spe
spot-level object to
visualize the gene expression or any continuous variable such as the
number of cells per spots. That is, vis_gene()
can
visualize any of the assays(spe)
or any of the continuous
variables stored in colData(spe)
. If you want to visualize
more than one sample at a time, use vis_grid_gene()
instead. And just like vis_clus()
, vis_gene()
has a spatial
logical(1)
argument to turn off
the custom spatial ggplot2::layer()
produced by
geom_spatial()
.
## Available gene expression assays
assayNames(spe)
#> [1] "counts" "logcounts"
## Not all of these make sense to visualize
## In particular, the key is not useful to visualize.
colnames(colData(spe))
#> [1] "sample_id" "Cluster"
#> [3] "sum_umi" "sum_gene"
#> [5] "subject" "position"
#> [7] "replicate" "subject_position"
#> [9] "discard" "key"
#> [11] "cell_count" "SNN_k50_k4"
#> [13] "SNN_k50_k5" "SNN_k50_k6"
#> [15] "SNN_k50_k7" "SNN_k50_k8"
#> [17] "SNN_k50_k9" "SNN_k50_k10"
#> [19] "SNN_k50_k11" "SNN_k50_k12"
#> [21] "SNN_k50_k13" "SNN_k50_k14"
#> [23] "SNN_k50_k15" "SNN_k50_k16"
#> [25] "SNN_k50_k17" "SNN_k50_k18"
#> [27] "SNN_k50_k19" "SNN_k50_k20"
#> [29] "SNN_k50_k21" "SNN_k50_k22"
#> [31] "SNN_k50_k23" "SNN_k50_k24"
#> [33] "SNN_k50_k25" "SNN_k50_k26"
#> [35] "SNN_k50_k27" "SNN_k50_k28"
#> [37] "GraphBased" "Maynard"
#> [39] "Martinowich" "layer_guess"
#> [41] "layer_guess_reordered" "layer_guess_reordered_short"
#> [43] "expr_chrM" "expr_chrM_ratio"
#> [45] "SpatialDE_PCA" "SpatialDE_pool_PCA"
#> [47] "HVG_PCA" "pseudobulk_PCA"
#> [49] "markers_PCA" "SpatialDE_UMAP"
#> [51] "SpatialDE_pool_UMAP" "HVG_UMAP"
#> [53] "pseudobulk_UMAP" "markers_UMAP"
#> [55] "SpatialDE_PCA_spatial" "SpatialDE_pool_PCA_spatial"
#> [57] "HVG_PCA_spatial" "pseudobulk_PCA_spatial"
#> [59] "markers_PCA_spatial" "SpatialDE_UMAP_spatial"
#> [61] "SpatialDE_pool_UMAP_spatial" "HVG_UMAP_spatial"
#> [63] "pseudobulk_UMAP_spatial" "markers_UMAP_spatial"
#> [65] "spatialLIBD" "ManualAnnotation"
#> [67] "in_tissue" "array_row"
#> [69] "array_col"
## Visualize a gene
vis_gene(
spe = spe,
sampleid = "151673",
viridis = FALSE
)
## Visualize the estimated number of cells per spot
vis_gene(
spe = spe,
sampleid = "151673",
geneid = "cell_count"
)
## Visualize the fraction of chrM expression per spot
## without the spatial layer
vis_gene(
spe = spe,
sampleid = "151673",
geneid = "expr_chrM_ratio",
spatial = FALSE
)
As for the color palette, you can either use the color blind friendly
palette when viridis = TRUE
or a custom palette we
determined. Note that if you design your own palette, you have to take
into account that values it can be hard to distinguish some colors from
the histology set of purple tones that are noticeable when the
continuous variable is below or equal to the minCount
(in
our palettes such points have a 'transparent'
color). For
more details, check the internal code of vis_gene_p()
.
Earlier we also ran sig_genes_extract_all()
in order to
run the shiny web
application. However, we didn’t explain the output. If you explore it,
you’ll notice that it’s a very long table with several columns.
head(sig_genes)
#> DataFrame with 6 rows and 12 columns
#> top model_type test gene stat pval
#> <integer> <character> <character> <character> <numeric> <numeric>
#> 1 1 enrichment WM NDRG1 16.3053 1.25896e-26
#> 2 2 enrichment WM PTP4A2 16.1469 2.25133e-26
#> 3 3 enrichment WM AQP1 15.9927 3.97849e-26
#> 4 4 enrichment WM PAQR6 15.1971 7.86258e-25
#> 5 5 enrichment WM ANP32B 14.9798 1.80183e-24
#> 6 6 enrichment WM JAM3 14.7413 4.50756e-24
#> fdr gene_index ensembl in_rows in_rows_top20
#> <numeric> <integer> <character> <IntegerList> <IntegerList>
#> 1 2.51372e-22 10404 ENSG00000104419 1,40215,65023,... 1,156337,223320,...
#> 2 2.51372e-22 487 ENSG00000184007 2,40531,64532,... 2,223323,245661,...
#> 3 2.96145e-22 8201 ENSG00000240583 3,32241,65062,... 3,223314,245643,...
#> 4 4.38948e-21 1501 ENSG00000160781 4,34318,65863,... 4,245654,267982
#> 5 8.04735e-21 10962 ENSG00000136938 5,28964,63931,... 5,223329,267975
#> 6 1.67295e-20 12600 ENSG00000166086 6,40868,65917,... 6,156334,223324,...
#> results
#> <CharacterList>
#> 1 WM_top1,WM-Layer1_top20,WM-Layer4_top10,...
#> 2 WM_top2,WM-Layer4_top13,WM-Layer5_top20,...
#> 3 WM_top3,WM-Layer4_top4,WM-Layer5_top2,...
#> 4 WM_top4,WM-Layer5_top13,WM-Layer6_top10
#> 5 WM_top5,WM-Layer4_top19,WM-Layer6_top3
#> 6 WM_top6,WM-Layer1_top17,WM-Layer4_top14,...
The output of sig_genes_extract_all()
contains the
following columns:
top
: the rank of the gene for the given
test
.model_type
: either enrichment
,
pairwise
or anova
.test
: the short notation for the test performed. For
example, WM
is white matter versus the other layers while
WM-Layer1
is white matter greater than layer 1.gene
: the gene symbol.stat
: the corresponding F or t-statistic.pval
: the corresponding p-value (two-sided for
t-stats).fdr
: the FDR adjusted p-value.gene_index
: the row of sce_layer
and the
original tables in modeling_results
to join the tables if
necessary.ensembl
: Ensembl gene ID.in_rows
: an IntegerList()
specifying all
the rows where that gene is present.in_rows_top20
: an IntegerList()
specifying
all the rows where that gene is present and where its rank
(top
) is less than or equal to 20. This information is only
included for the first occurrence of each gene if that gene is on the
top 20 rank for any of the models.results
: an CharacterList()
specifying all
the test
results where the gene is ranked
(top
) among the top 20 genes.sig_genes_extract_all()
uses
sig_genes_extract()
as it’s workhorse and as such has a
very similar output.
After extracting the table of modeling results in long format with
sig_genes_extract_all()
, we can then use
layer_boxplot()
to visualize any gene of interest for any
of the model types and tests we performed. Below we explore the first
one (by default). We also show the color palettes used in the shiny
application provided by spatialLIBD
(Pardo, Spangler, Weber et al., 2022). The last example has the long
title version that uses more information from the sig_genes
object we created earlier.
## Note that we recommend setting the random seed so the jittering of the
## points will be reproducible. Given the requirements by BiocCheck this
## cannot be done inside the layer_boxplot() function.
## Create a boxplot of the first gene in `sig_genes`.
set.seed(20200206)
layer_boxplot(sig_genes = sig_genes, sce_layer = sce_layer)
## Viridis colors displayed in the shiny app
## showing the first pairwise model result
## (which illustrates the background colors used for the layers not
## involved in the pairwise comparison)
set.seed(20200206)
layer_boxplot(
i = which(sig_genes$model_type == "pairwise")[1],
sig_genes = sig_genes,
sce_layer = sce_layer,
col_low_box = viridisLite::viridis(4)[2],
col_low_point = viridisLite::viridis(4)[1],
col_high_box = viridisLite::viridis(4)[3],
col_high_point = viridisLite::viridis(4)[4]
)
## Paper colors displayed in the shiny app
set.seed(20200206)
layer_boxplot(
sig_genes = sig_genes,
sce_layer = sce_layer,
short_title = FALSE,
col_low_box = "palegreen3",
col_low_point = "springgreen2",
col_high_box = "darkorange2",
col_high_point = "orange1"
)
Just like we compressed our spe
object into
sce_layer
by pseudo-bulking 5, we can do the same
for other single nucleus or single cell RNA sequencing datasets
(snRNA-seq, scRNA-seq) and then compute enrichment t-statistics (one
group vs the rest; could be one cell type vs the rest or one cluster of
cells vs the rest). In our analysis (Maynard, Collado-Torres, Weber et
al., 2021), we did this for several datasets including one of our LIBD
Human DLPFC snRNA-seq data generated by Matthew N Tran et al 6. In spatialLIBD
(Pardo, Spangler, Weber et al., 2022) we include a small set of these
statistics for the 31 cell clusters identified by Matthew N Tran et
al.
## Explore the enrichment t-statistics derived from Tran et al's snRNA-seq
## DLPFC data
dim(tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer)
#> [1] 692 31
tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer[seq_len(3), seq_len(6)]
#> 2 (1) 4 (1) 6 (1) 8 (1) 10 (1)
#> ENSG00000104419 -0.6720726 0.30024609 0.5854130 0.1358497 0.3981996
#> ENSG00000184007 0.2535838 -0.28998049 -1.8106052 -0.2096309 -0.4521719
#> ENSG00000240583 -0.2015831 -0.04053423 -0.5120807 0.1688405 -0.3969257
#> 12 (1)
#> ENSG00000104419 -0.9731938
#> ENSG00000184007 -0.1840286
#> ENSG00000240583 -0.2086954
The function layer_stat_cor()
will take as input one
such matrix of statistics and correlate them against our layer
enrichment results (or other model types) using the subset of Ensembl
gene IDs that are observed in both tables.
## Compute the correlation matrix of enrichment t-statistics between our data
## and Tran et al's snRNA-seq data
cor_stats_layer <- layer_stat_cor(
tstats_Human_DLPFC_snRNAseq_Nguyen_topLayer,
modeling_results,
"enrichment"
)
## Explore the correlation matrix
head(cor_stats_layer[, seq_len(3)])
#> WM Layer6 Layer5
#> 22 (3) 0.6824669 -0.009192291 -0.1934265
#> 3 (3) 0.7154122 -0.070042729 -0.2290574
#> 23 (3) 0.6637885 -0.031467704 -0.2018306
#> 17 (3) 0.6364983 -0.094216046 -0.2026147
#> 21 (3) 0.6281443 -0.050336358 -0.1988774
#> 7 (4) 0.1850724 -0.197283175 -0.2716890
Once we have computed this correlation matrix, we can then visualize
it using layer_stat_cor_plot()
as shown below.
## Visualize the correlation matrix
layer_stat_cor_plot(cor_stats_layer)
In order to fully interpret the resulting heatmap you need to know
what each of the cell clusters labels mean. In this case, the syntax is
xx (Y)
where xx
is the cluster number and
Y
is:
Excit
: excitatory neurons.Inhib
: inhibitory neurons.Oligo
: oligodendrocytes.Astro
: astrocytes.OPC
: oligodendrocyte progenitor cells.Drop
: an ambiguous cluster of cells that potentially
should be dropped.You can find the version with the full names here if you are interested in it.
The shiny application provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022) allows users to upload CSV file with these t-statistics, view the correlation heatmaps, download them, and download the correlation matrix. An example CSV file is provided here.
Many researchers have identified lists of genes that increase the risk of a given disorder or disease, are differentially expressed in a given experiment or set of conditions, have been described in a several research papers, among other collections. We can ask for any of our modeling results whether a list of genes is enriched among our significant results.
For illustration purposes, we included in spatialLIBD
(Pardo, Spangler, Weber et al., 2022) a set of genes from the Simons
Foundation Autism Research Initiative SFARI. In our analysis (Maynard,
Collado-Torres, Weber et al., 2021) we used more gene lists. Below, we
read in the data and create a list()
object with the
Ensembl gene IDs that SFARI has provided that are related to autism.
## Read in the SFARI gene sets included in the package
asd_sfari <- utils::read.csv(
system.file(
"extdata",
"SFARI-Gene_genes_01-03-2020release_02-04-2020export.csv",
package = "spatialLIBD"
),
as.is = TRUE
)
## Format them appropriately
asd_sfari_geneList <- list(
Gene_SFARI_all = asd_sfari$ensembl.id,
Gene_SFARI_high = asd_sfari$ensembl.id[asd_sfari$gene.score < 3],
Gene_SFARI_syndromic = asd_sfari$ensembl.id[asd_sfari$syndromic == 1]
)
After reading the list of genes, we can then compute the enrichment odds ratios and p-values for a given FDR threshold in our statistics.
## Compute the gene set enrichment results
asd_sfari_enrichment <- gene_set_enrichment(
gene_list = asd_sfari_geneList,
modeling_results = modeling_results,
model_type = "enrichment"
)
## Explore the results
head(asd_sfari_enrichment)
#> OR Pval test NumSig SetSize ID model_type
#> 1 1.2659915 0.001761332 WM 231 869 Gene_SFARI_all enrichment
#> 2 1.1819109 0.098959488 WM 90 355 Gene_SFARI_high enrichment
#> 3 1.2333378 0.185302060 WM 31 118 Gene_SFARI_syndromic enrichment
#> 4 0.9702022 0.613080610 Layer1 71 869 Gene_SFARI_all enrichment
#> 5 0.7192630 0.949332765 Layer1 22 355 Gene_SFARI_high enrichment
#> 6 1.1216176 0.405453216 Layer1 11 118 Gene_SFARI_syndromic enrichment
#> fdr_cut
#> 1 0.1
#> 2 0.1
#> 3 0.1
#> 4 0.1
#> 5 0.1
#> 6 0.1
Using the above enrichment table, we can then visualize the odds
ratios on a heatmap as shown below. Note that we use the thresholded
p-values at -log10(p) = 12
for visualization purposes and
only show the odds ratios for -log10(p) > 3
by
default.
## Visualize gene set enrichment results
gene_set_enrichment_plot(
asd_sfari_enrichment,
xlabs = gsub(".*_", "", unique(asd_sfari_enrichment$ID)),
plot_SetSize_bar = TRUE,
model_colors = get_colors(
spatialLIBD::libd_layer_colors,
clusters = unique(asd_sfari_enrichment$test)
)
)
The shiny application provided by spatialLIBD (Pardo, Spangler, Weber et al., 2022) allows users to upload CSV file their gene lists, compute the enrichment statistics, visualize them, download the PDF, and download the enrichment table. An example CSV file is provided here.
This section gets into the details of how we generated the data (Maynard, Collado-Torres, Weber et al., 2021) behind spatialLIBD (Pardo, Spangler, Weber et al., 2022). This could be useful if you are a Bioconductor developer or a user very familiar with packages such as SingleCellExperiment.
As of version 1.3.3, spatialLIBD
supports the SpatialExperiment
class from SpatialExperiment.
The functions vis_gene_p()
, vis_gene()
,
vis_grid_clus()
, vis_grid_gene()
,
vis_clus()
, vis_clus_p()
,
geom_spatial()
now work with SpatialExperiment
objects thanks to the updates in SpatialExperiment.
If you have spot-level data formatted in the older
SingleCellExperiment
objects that were heavily modified for
spatialLIBD
, you can use sce_to_spe()
to
convert the objects.
This work was done by Brenda Pardo and Leonardo.
Please check the second vignette on how to use spatialLIBD
with your own data as exemplified with a public 10x Genomics dataset or
go directly to the read10xVisiumWrapper()
documentation.
If you want to check the key characteristics required by spatialLIBD’s
functions or the shiny
application, use the check_*
family of functions.
check_spe(spe)
#> class: SpatialExperiment
#> dim: 33538 47681
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
#> ENSG00000268674
#> rowData names(9): source type ... gene_search is_top_hvg
#> colnames(47681): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
#> TTGTTTCCATACAACT-1 TTGTTTGTGTAAATTC-1
#> colData names(69): sample_id Cluster ... array_row array_col
#> reducedDimNames(6): PCA TSNE_perplexity50 ... TSNE_perplexity80
#> UMAP_neighbors15
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor
check_sce_layer(sce_layer)
#> class: SingleCellExperiment
#> dim: 22331 76
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(22331): ENSG00000243485 ENSG00000238009 ... ENSG00000278384
#> ENSG00000271254
#> rowData names(10): source type ... is_top_hvg is_top_hvg_sce_layer
#> colnames(76): 151507_Layer1 151507_Layer2 ... 151676_Layer6 151676_WM
#> colData names(13): sample_name layer_guess ...
#> layer_guess_reordered_short spatialLIBD
#> reducedDimNames(6): PCA TSNE_perplexity5 ... UMAP_neighbors15 PCAsub
#> mainExpName: NULL
#> altExpNames(0):
## The output here is too long to print
xx <- check_modeling_results(modeling_results)
identical(xx, modeling_results)
#> [1] TRUE
If you are interested in reshaping your data to fit our structure, we do not provide a quick function to do so. That is intentional given the active development by the Bioconductor community for determining the best way to deal with spatial transcriptomics data in general and the 10x Visium data in particular. Having said that, we do have all the steps and reproducibility information documented across several of our analysis (Maynard, Collado-Torres, Weber et al., 2021) scripts.
reorganize_folder.R
available here
re-organizes the raw data we were sent by 10x Genomics.Layer_Notebook.R
available here
reads in the Visium data and builds a list of
RangeSummarizedExperiment()
objects from SummarizedExperiment,
one per sample (image) that is eventually saved as
Human_DLPFC_Visium_processedData_rseList.rda
.convert_sce.R
available here
reads in Human_DLPFC_Visium_processedData_rseList.rda
and
builds an initial sce
object with image data under
metadata(sce)$image
which is a single data.frame.
Subsetting doesn’t automatically subset the image, so you have to do it
yourself when plotting as is done by vis_clus_p()
and
vis_gene_p()
. Having the data from all images in a single
object allows you to use the spot-level data from all images to compute
clusters and do other similar analyses to the ones you would do with
sc/snRNA-seq data. The script creates the
Human_DLPFC_Visium_processedData_sce.Rdata
file.sce_scran.R
available here
then uses scran to
read in Human_DLPFC_Visium_processedData_sce.Rdata
, compute
the highly variable genes (stored in our final sce
object
at rowData(sce)$is_top_hvg
), perform dimensionality
reduction (PCA, TSNE, UMAP) and identify clusters using the data from
all images. The resulting data is then stored as
Human_DLPFC_Visium_processedData_sce_scran.Rdata
and is the
main object used throughout our analysis code (Maynard, Collado-Torres,
Weber et al., 2021).make-data_spatialLIBD.R
available in the source version
of spatialLIBD
and online
here is the script that reads in
Human_DLPFC_Visium_processedData_sce_scran.Rdata
as well as
some other outputs from our analysis and combines them into the final
sce
and sce_layer
objects provided by spatialLIBD
(Pardo, Spangler, Weber et al., 2022). This script simplifies some
operations in order to simplify the code behind the shiny
application provided by spatialLIBD.You don’t necessarily need to do all of this to use the functions
provided by spatialLIBD.
Note that external to the R objects, for the shiny
application provided by spatialLIBD
you will need to have the tissue_lowres_image.png
image
files in a directory structure by sample as shown here
in order for the interactive visualizations made with plotly to
work.
Over time spatialLIBD::fetch_data()
has been expanded to
provide access to other datasets generated by our teams at the Lieber
Institute for Brain Development (LIBD) that have
also been analyzed with spatialLIBD
.
Through spatialLIBD::fetch_data()
you can also download
the data from the Integrated
single cell and unsupervised spatial transcriptomic analysis defines
molecular anatomy of the human dorsolateral prefrontal cortex
project, also known as spatialDLPFC
(Huuki-Myers, Spangler,
Eagles, Montgomergy, Kwon, Guo, Grant-Peters, Divecha, Tippani,
Sriworarat, Nguyen, Ravichandran, Tran, Seyedian, Consortium, Hyde,
Kleinman, Battle, Page, Ryten, Hicks, Martinowich, Collado-Torres, and
Maynard, 2024). See http://research.libd.org/spatialDLPFC/ for more
information about this project.
See the Twitter thread 🧵 below for a brief overview of the #spatialDLPFC
project.
Hot of the pre-print press! 🔥 Our latest work #spatialDLPFC pairs #snRNAseq and #Visium spatial transcriptomic data in the human #DLPFC building a neuroanatomical atlas of this critical brain region 🧠@LieberInstitute @10xGenomics #scitwitter
— Louise Huuki-Myers (@lahuuki) February 17, 2023
📰 https://t.co/NJWJ1mwB9J pic.twitter.com/l8W154XZ50
Through spatialLIBD::fetch_data()
you can also download
the data from the Influence
of Alzheimer’s disease related neuropathology on local microenvironment
gene expression in the human inferior temporal cortex project,
also known as Visium_SPG_AD
(Kwon, Parthiban, Tippani,
Divecha, Eagles, Lobana, Williams, Mark, Bharadwaj, Kleinman, Hyde,
Page, Hicks, Martinowich, Maynard, and Collado-Torres, 2023). See http://research.libd.org/Visium_SPG_AD/ for more
information about this project.
See the Twitter thread 🧵 below for a brief overview of the #Visium_SPG_AD
project.
TODO
spatialLIBD
Sometimes our collaborators have shared data through other venues. So
not all LIBD spatially-resolved transcriptomics data from the Keri
Martinowich, Kristen
Maynard, and Leonardo
Collado-Torres teams has been released through
spatialLIBD
. However, it is very much compatible with
spatialLIBD
and can be analyzed or visualized with
spatialLIBD
functions.
The
gene expression landscape of the human locus coeruleus revealed by
single-nucleus and spatially-resolved transcriptomics, also
known as locus-c
, is not available through
spatialLIBD
, but you might be interested in checking out
the excellent WeberDivechaLCdata
package for more details. See https://github.com/lmweber/locus-c for more details
about the locus-c
project.
See the Twitter thread 🧵 below for a brief overview of the
locus-c
project.
Very happy to share our preprint on spatially-resolved transcriptomics and single-nucleus RNA-sequencing in the human locus coeruleus! 🎉🧠🔵 https://t.co/L69G2P9PO6
— Lukas Weber ☀️ (@lmwebr) October 31, 2022
The spatialLIBD package (Pardo, Spangler, Weber et al., 2022) was made possible thanks to:
Code for creating the vignette
## Create the vignette
library("rmarkdown")
system.time(render("spatialLIBD.Rmd"))
## Extract the R code
library("knitr")
knit("spatialLIBD.Rmd", tangle = TRUE)
Date the vignette was generated.
#> [1] "2024-12-16 21:55:46 UTC"
Wallclock time spent generating the vignette.
#> Time difference of 36.462 secs
R
session information.
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.4.2 (2024-10-31)
#> os Ubuntu 24.04.1 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language en
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz UTC
#> date 2024-12-16
#> pandoc 3.5 @ /usr/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> abind 1.4-8 2024-09-12 [1] RSPM (R 4.4.0)
#> AnnotationDbi 1.68.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> AnnotationHub 3.14.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> attempt 0.3.1 2020-05-03 [1] RSPM (R 4.4.0)
#> backports 1.5.0 2024-05-23 [1] RSPM (R 4.4.0)
#> beachmat 2.22.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> beeswarm 0.4.0 2021-06-01 [1] RSPM (R 4.4.0)
#> benchmarkme 1.0.8 2022-06-12 [1] RSPM (R 4.4.0)
#> benchmarkmeData 1.0.4 2020-04-23 [1] RSPM (R 4.4.0)
#> bibtex 0.5.1 2023-01-26 [1] RSPM (R 4.4.0)
#> Biobase * 2.66.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocFileCache 2.14.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocGenerics * 0.52.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocIO 1.16.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocManager 1.30.25 2024-08-28 [2] CRAN (R 4.4.2)
#> BiocNeighbors 2.0.1 2024-11-28 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocParallel 1.40.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocSingular 1.22.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocStyle * 2.34.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> BiocVersion 3.20.0 2024-10-21 [2] Bioconductor 3.20 (R 4.4.2)
#> Biostrings 2.74.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> bit 4.5.0.1 2024-12-03 [1] RSPM (R 4.4.0)
#> bit64 4.5.2 2024-09-22 [1] RSPM (R 4.4.0)
#> bitops 1.0-9 2024-10-03 [1] RSPM (R 4.4.0)
#> blob 1.2.4 2023-03-17 [1] RSPM (R 4.4.0)
#> bookdown 0.41 2024-10-16 [1] RSPM (R 4.4.0)
#> bslib 0.8.0 2024-07-29 [2] RSPM (R 4.4.0)
#> cachem 1.1.0 2024-05-16 [2] RSPM (R 4.4.0)
#> Cairo 1.6-2 2023-11-28 [1] RSPM (R 4.4.0)
#> circlize 0.4.16 2024-02-20 [1] RSPM (R 4.4.0)
#> cli 3.6.3 2024-06-21 [2] RSPM (R 4.4.0)
#> clue 0.3-66 2024-11-13 [1] RSPM (R 4.4.0)
#> cluster 2.1.8 2024-12-11 [3] RSPM (R 4.4.0)
#> codetools 0.2-20 2024-03-31 [3] CRAN (R 4.4.2)
#> colorspace 2.1-1 2024-07-26 [1] RSPM (R 4.4.0)
#> ComplexHeatmap 2.22.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> config 0.3.2 2023-08-30 [1] RSPM (R 4.4.0)
#> cowplot 1.1.3 2024-01-22 [1] RSPM (R 4.4.0)
#> crayon 1.5.3 2024-06-20 [2] RSPM (R 4.4.0)
#> curl 6.0.1 2024-11-14 [2] RSPM (R 4.4.0)
#> data.table 1.16.4 2024-12-06 [1] RSPM (R 4.4.0)
#> DBI 1.2.3 2024-06-02 [1] RSPM (R 4.4.0)
#> dbplyr 2.5.0 2024-03-19 [1] RSPM (R 4.4.0)
#> DelayedArray 0.32.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> desc 1.4.3 2023-12-10 [2] RSPM (R 4.4.0)
#> digest 0.6.37 2024-08-19 [2] RSPM (R 4.4.0)
#> doParallel 1.0.17 2022-02-07 [1] RSPM (R 4.4.0)
#> dplyr 1.1.4 2023-11-17 [1] RSPM (R 4.4.0)
#> DT 0.33 2024-04-04 [1] RSPM (R 4.4.0)
#> edgeR 4.4.1 2024-12-02 [1] Bioconductor 3.20 (R 4.4.2)
#> evaluate 1.0.1 2024-10-10 [2] RSPM (R 4.4.0)
#> ExperimentHub 2.14.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> fansi 1.0.6 2023-12-08 [2] RSPM (R 4.4.0)
#> farver 2.1.2 2024-05-13 [1] RSPM (R 4.4.0)
#> fastmap 1.2.0 2024-05-15 [2] RSPM (R 4.4.0)
#> filelock 1.0.3 2023-12-11 [1] RSPM (R 4.4.0)
#> foreach 1.5.2 2022-02-02 [1] RSPM (R 4.4.0)
#> fs 1.6.5 2024-10-30 [2] RSPM (R 4.4.0)
#> generics 0.1.3 2022-07-05 [1] RSPM (R 4.4.0)
#> GenomeInfoDb * 1.42.1 2024-11-28 [1] Bioconductor 3.20 (R 4.4.2)
#> GenomeInfoDbData 1.2.13 2024-12-10 [1] Bioconductor
#> GenomicAlignments 1.42.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> GenomicRanges * 1.58.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> GetoptLong 1.0.5 2020-12-15 [1] RSPM (R 4.4.0)
#> ggbeeswarm 0.7.2 2023-04-29 [1] RSPM (R 4.4.0)
#> ggplot2 3.5.1 2024-04-23 [1] RSPM (R 4.4.0)
#> ggrepel 0.9.6 2024-09-07 [1] RSPM (R 4.4.0)
#> GlobalOptions 0.1.2 2020-06-10 [1] RSPM (R 4.4.0)
#> glue 1.8.0 2024-09-30 [2] RSPM (R 4.4.0)
#> golem 0.5.1 2024-08-27 [1] RSPM (R 4.4.0)
#> gridExtra 2.3 2017-09-09 [1] RSPM (R 4.4.0)
#> gtable 0.3.6 2024-10-25 [1] RSPM (R 4.4.0)
#> htmltools 0.5.8.1 2024-04-04 [2] RSPM (R 4.4.0)
#> htmlwidgets 1.6.4 2023-12-06 [2] RSPM (R 4.4.0)
#> httpuv 1.6.15 2024-03-26 [2] RSPM (R 4.4.0)
#> httr 1.4.7 2023-08-15 [1] RSPM (R 4.4.0)
#> IRanges * 2.40.1 2024-12-05 [1] Bioconductor 3.20 (R 4.4.2)
#> irlba 2.3.5.1 2022-10-03 [1] RSPM (R 4.4.0)
#> iterators 1.0.14 2022-02-05 [1] RSPM (R 4.4.0)
#> jquerylib 0.1.4 2021-04-26 [2] RSPM (R 4.4.0)
#> jsonlite 1.8.9 2024-09-20 [2] RSPM (R 4.4.0)
#> KEGGREST 1.46.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> knitr 1.49 2024-11-08 [2] RSPM (R 4.4.0)
#> labeling 0.4.3 2023-08-29 [1] RSPM (R 4.4.0)
#> later 1.4.1 2024-11-27 [2] RSPM (R 4.4.0)
#> lattice 0.22-6 2024-03-20 [3] CRAN (R 4.4.2)
#> lazyeval 0.2.2 2019-03-15 [1] RSPM (R 4.4.0)
#> lifecycle 1.0.4 2023-11-07 [2] RSPM (R 4.4.0)
#> limma 3.62.1 2024-11-03 [1] Bioconductor 3.20 (R 4.4.2)
#> locfit 1.5-9.10 2024-06-24 [1] RSPM (R 4.4.0)
#> lubridate 1.9.4 2024-12-08 [1] RSPM (R 4.4.0)
#> magick 2.8.5 2024-09-20 [1] RSPM (R 4.4.0)
#> magrittr 2.0.3 2022-03-30 [2] RSPM (R 4.4.0)
#> Matrix 1.7-1 2024-10-18 [3] CRAN (R 4.4.2)
#> MatrixGenerics * 1.18.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> matrixStats * 1.4.1 2024-09-08 [1] RSPM (R 4.4.0)
#> memoise 2.0.1 2021-11-26 [2] RSPM (R 4.4.0)
#> mime 0.12 2021-09-28 [2] RSPM (R 4.4.0)
#> munsell 0.5.1 2024-04-01 [1] RSPM (R 4.4.0)
#> paletteer 1.6.0 2024-01-21 [1] RSPM (R 4.4.0)
#> pillar 1.9.0 2023-03-22 [2] RSPM (R 4.4.0)
#> pkgconfig 2.0.3 2019-09-22 [2] RSPM (R 4.4.0)
#> pkgdown 2.1.1 2024-09-17 [2] RSPM (R 4.4.0)
#> plotly 4.10.4 2024-01-13 [1] RSPM (R 4.4.0)
#> plyr 1.8.9 2023-10-02 [1] RSPM (R 4.4.0)
#> png 0.1-8 2022-11-29 [1] RSPM (R 4.4.0)
#> promises 1.3.2 2024-11-28 [2] RSPM (R 4.4.0)
#> purrr 1.0.2 2023-08-10 [2] RSPM (R 4.4.0)
#> R6 2.5.1 2021-08-19 [2] RSPM (R 4.4.0)
#> ragg 1.3.3 2024-09-11 [2] RSPM (R 4.4.0)
#> rappdirs 0.3.3 2021-01-31 [2] RSPM (R 4.4.0)
#> RColorBrewer 1.1-3 2022-04-03 [1] RSPM (R 4.4.0)
#> Rcpp 1.0.13-1 2024-11-02 [2] RSPM (R 4.4.0)
#> RCurl 1.98-1.16 2024-07-11 [1] RSPM (R 4.4.0)
#> RefManageR * 1.4.0 2022-09-30 [1] RSPM (R 4.4.0)
#> rematch2 2.1.2 2020-05-01 [2] RSPM (R 4.4.0)
#> restfulr 0.0.15 2022-06-16 [1] RSPM (R 4.4.0)
#> rjson 0.2.23 2024-09-16 [1] RSPM (R 4.4.0)
#> rlang 1.1.4 2024-06-04 [2] RSPM (R 4.4.0)
#> rmarkdown 2.29 2024-11-04 [2] RSPM (R 4.4.0)
#> Rsamtools 2.22.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> RSQLite 2.3.9 2024-12-03 [1] RSPM (R 4.4.0)
#> rsvd 1.0.5 2021-04-16 [1] RSPM (R 4.4.0)
#> rtracklayer 1.66.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> S4Arrays 1.6.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> S4Vectors * 0.44.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> sass 0.4.9 2024-03-15 [2] RSPM (R 4.4.0)
#> ScaledMatrix 1.14.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> scales 1.3.0 2023-11-28 [1] RSPM (R 4.4.0)
#> scater 1.34.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> scuttle 1.16.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> sessioninfo * 1.2.2 2021-12-06 [2] RSPM (R 4.4.0)
#> shape 1.4.6.1 2024-02-23 [1] RSPM (R 4.4.0)
#> shiny 1.10.0 2024-12-14 [2] RSPM (R 4.4.0)
#> shinyWidgets 0.8.7 2024-09-23 [1] RSPM (R 4.4.0)
#> SingleCellExperiment * 1.28.1 2024-11-10 [1] Bioconductor 3.20 (R 4.4.2)
#> SparseArray 1.6.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> SpatialExperiment * 1.16.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> spatialLIBD * 1.19.6 2024-12-16 [1] Bioconductor
#> statmod 1.5.0 2023-01-06 [1] RSPM (R 4.4.0)
#> stringi 1.8.4 2024-05-06 [2] RSPM (R 4.4.0)
#> stringr 1.5.1 2023-11-14 [2] RSPM (R 4.4.0)
#> SummarizedExperiment * 1.36.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> systemfonts 1.1.0 2024-05-15 [2] RSPM (R 4.4.0)
#> textshaping 0.4.1 2024-12-06 [2] RSPM (R 4.4.0)
#> tibble 3.2.1 2023-03-20 [2] RSPM (R 4.4.0)
#> tidyr 1.3.1 2024-01-24 [1] RSPM (R 4.4.0)
#> tidyselect 1.2.1 2024-03-11 [1] RSPM (R 4.4.0)
#> timechange 0.3.0 2024-01-18 [1] RSPM (R 4.4.0)
#> UCSC.utils 1.2.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> utf8 1.2.4 2023-10-22 [2] RSPM (R 4.4.0)
#> vctrs 0.6.5 2023-12-01 [2] RSPM (R 4.4.0)
#> vipor 0.4.7 2023-12-18 [1] RSPM (R 4.4.0)
#> viridis 0.6.5 2024-01-29 [1] RSPM (R 4.4.0)
#> viridisLite 0.4.2 2023-05-02 [1] RSPM (R 4.4.0)
#> withr 3.0.2 2024-10-28 [2] RSPM (R 4.4.0)
#> xfun 0.49 2024-10-31 [2] RSPM (R 4.4.0)
#> XML 3.99-0.17 2024-06-25 [1] RSPM (R 4.4.0)
#> xml2 1.3.6 2023-12-04 [2] RSPM (R 4.4.0)
#> xtable 1.8-4 2019-04-21 [2] RSPM (R 4.4.0)
#> XVector 0.46.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> yaml 2.3.10 2024-07-26 [2] RSPM (R 4.4.0)
#> zlibbioc 1.52.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#>
#> [1] /__w/_temp/Library
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/local/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
This vignette was generated using BiocStyle (Oleś, 2024), knitr (Xie, 2014) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes.
Citations made with RefManageR (McLean, 2017).
[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.29. 2024. URL: https://github.com/rstudio/rmarkdown.
[2] R. Amezquita, A. Lun, E. Becht, et al. “Orchestrating single-cell analysis with Bioconductor”. In: Nature Methods 17 (2020), pp. 137–145. URL: https://www.nature.com/articles/s41592-019-0654-x.
[3] D. Bates, M. Maechler, and M. Jagan. Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.7-1. 2024. URL: https://CRAN.R-project.org/package=Matrix.
[4] W. Chang, J. Cheng, J. Allaire, et al. shiny: Web Application Framework for R. R package version 1.10.0, https://github.com/rstudio/shiny. 2024. URL: https://shiny.posit.co/.
[5] Y. Chen, L. Chen, A. T. L. Lun, et al. “edgeR 4.0: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets”. In: bioRxiv (2024). DOI: 10.1101/2024.01.21.576131.
[6] C. Fay, V. Guyader, S. Rochette, et al. golem: A Framework for Robust Shiny Applications. R package version 0.5.1, https://github.com/ThinkR-open/golem. 2024. URL: https://thinkr-open.github.io/golem/.
[7] Garnier, Simon, Ross, et al. viridis(Lite) - Colorblind-Friendly Color Maps for R. viridisLite package version 0.4.2. 2023. DOI: 10.5281/zenodo.4678327. URL: https://sjmgarnier.github.io/viridis/.
[8] C. Gillespie. benchmarkme: Crowd Sourced System Benchmarks. R package version 1.0.8. 2022. URL: https://github.com/csgillespie/benchmarkme.
[9] G. Giner and G. K. Smyth. “statmod: probability calculations for the inverse Gaussian distribution”. In: R Journal 8.1 (2016), pp. 339-351.
[10] Z. Gu, R. Eils, and M. Schlesner. “Complex heatmaps reveal patterns and correlations in multidimensional genomic data”. In: Bioinformatics (2016). DOI: 10.1093/bioinformatics/btw313.
[11] Z. Gu, L. Gu, R. Eils, et al. “circlize implements and enhances circular visualization in R”. In: Bioinformatics 30 (19 2014), pp. 2811-2812.
[12] Huber, W., Carey, et al. “Orchestrating high-throughput genomic analysis with Bioconductor”. In: Nature Methods 12.2 (2015), pp. 115–121. URL: http://www.nature.com/nmeth/journal/v12/n2/full/nmeth.3252.html.
[13] L. A. Huuki-Myers, A. Spangler, N. J. Eagles, et al. “A data-driven single-cell and spatial transcriptomic map of the human prefrontal cortex”. In: Science (2024). DOI: 10.1126/science.adh1938. URL: https://doi.org/10.1126/science.adh1938.
[14] E. Hvitfeldt. paletteer: Comprehensive Collection of Color Palettes. R package version 1.3.0. 2021. URL: https://github.com/EmilHvitfeldt/paletteer.
[15] S. H. Kwon, S. Parthiban, M. Tippani, et al. “Influence of Alzheimer’s disease related neuropathology on local microenvironment gene expression in the human inferior temporal cortex”. In: GEN Biotechnology (2023). DOI: 10.1089/genbio.2023.0019. URL: https://doi.org/10.1089/genbio.2023.0019.
[16] M. Lawrence, R. Gentleman, and V. Carey. “rtracklayer: an R package for interfacing with genome browsers”. In: Bioinformatics 25 (2009), pp. 1841-1842. DOI: 10.1093/bioinformatics/btp328. URL: http://bioinformatics.oxfordjournals.org/content/25/14/1841.abstract.
[17] M. Lawrence, W. Huber, H. Pagès, et al. “Software for Computing and Annotating Genomic Ranges”. In: PLoS Computational Biology 9 (8 2013). DOI: 10.1371/journal.pcbi.1003118. URL: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118}.
[18] K. R. Maynard, L. Collado-Torres, L. M. Weber, et al. “Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex”. In: Nature Neuroscience (2021). DOI: 10.1038/s41593-020-00787-0. URL: https://www.nature.com/articles/s41593-020-00787-0.
[19] D. J. McCarthy, K. R. Campbell, A. T. L. Lun, et al. “Scater: pre-processing, quality control, normalisation and visualisation of single-cell RNA-seq data in R”. In: Bioinformatics 33 (8 2017), pp. 1179-1186. DOI: 10.1093/bioinformatics/btw777.
[20] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.
[21] M. Morgan, V. Obenchain, J. Hester, et al. SummarizedExperiment: A container (S4 class) for matrix-like assays. R package version 1.36.0. 2024. DOI: 10.18129/B9.bioc.SummarizedExperiment. URL: https://bioconductor.org/packages/SummarizedExperiment.
[22] M. Morgan and L. Shepherd. AnnotationHub: Client to access AnnotationHub resources. R package version 3.14.0. 2024. DOI: 10.18129/B9.bioc.AnnotationHub. URL: https://bioconductor.org/packages/AnnotationHub.
[23] M. Morgan and L. Shepherd. ExperimentHub: Client to access ExperimentHub resources. R package version 2.14.0. 2024. DOI: 10.18129/B9.bioc.ExperimentHub. URL: https://bioconductor.org/packages/ExperimentHub.
[24] E. Neuwirth. RColorBrewer: ColorBrewer Palettes. R package version 1.1-3. 2022.
[25] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.34.0. 2024. DOI: 10.18129/B9.bioc.BiocStyle. URL: https://bioconductor.org/packages/BiocStyle.
[26] J. Ooms. magick: Advanced Graphics and Image-Processing in R. R package version 2.8.5. 2024. URL: https://docs.ropensci.org/magick/https://ropensci.r-universe.dev/magick.
[27] H. Pagès, M. Lawrence, and P. Aboyoun. S4Vectors: Foundation of vector-like and list-like containers in Bioconductor. R package version 0.44.0. 2024. DOI: 10.18129/B9.bioc.S4Vectors. URL: https://bioconductor.org/packages/S4Vectors.
[28] B. Pardo, A. Spangler, L. M. Weber, et al. “spatialLIBD: an R/Bioconductor package to visualize spatially-resolved transcriptomics data”. In: BMC Genomics (2022). DOI: 10.1186/s12864-022-08601-w. URL: https://doi.org/10.1186/s12864-022-08601-w.
[29] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.
[30] D. Righelli, L. M. Weber, H. L. Crowell, et al. “SpatialExperiment: infrastructure for spatially-resolved transcriptomics data in R using Bioconductor”. In: Bioinformatics 38.11 (2022), pp. -3. DOI: https://doi.org/10.1093/bioinformatics/btac299.
[31] M. E. Ritchie, B. Phipson, D. Wu, et al. “limma powers differential expression analyses for RNA-sequencing and microarray studies”. In: Nucleic Acids Research 43.7 (2015), p. e47. DOI: 10.1093/nar/gkv007.
[32] L. Shepherd and M. Morgan. BiocFileCache: Manage Files Across Sessions. R package version 2.14.0. 2024. DOI: 10.18129/B9.bioc.BiocFileCache. URL: https://bioconductor.org/packages/BiocFileCache.
[33] C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC, 2020. ISBN: 9781138331457. URL: https://plotly-r.com.
[34] H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016. ISBN: 978-3-319-24277-4. URL: https://ggplot2.tidyverse.org.
[35] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.
[36] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2, https://r-lib.github.io/sessioninfo/. 2021. URL: https://github.com/r-lib/sessioninfo#readme.
[37] C. Wilke. cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.1.3. 2024. URL: https://wilkelab.org/cowplot/.
[38] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014.
[39] Y. Xie, J. Cheng, and X. Tan. DT: A Wrapper of the JavaScript Library ‘DataTables’. R package version 0.33. 2024. URL: https://github.com/rstudio/DT.
Check this
code for details on how we built the spe
object. In
particular check convert_sce.R
and
sce_scran.R
.↩︎
Check this
code for details on how we built the sce_layer
object.
In particular check spots_per_layer.R
and
layer_enrichment.R
.↩︎
Check this
code for details on how we built the modeling_results
object. In particular check layer_specificity_fstats.R
,
layer_specificity.R
, and misc_numbers.R
.↩︎
You can change the destdir
argument and
specific a specific location that you will use and re-use. However the
default value of destdir
is a temporary directory that will
be wiped out once you close your R session.↩︎
For more details, check this script.↩︎
For more details, check this script.↩︎