scran::findMarkers()
.R/findMarkers_1vAll.R
findMarkers_1vAll.Rd
For each cell type, this function computes the statistics comparing that cell type (the "1") against all other cell types combined ("All").
findMarkers_1vAll(
sce,
assay_name = "counts",
cellType_col = "cellType",
add_symbol = FALSE,
mod = NULL,
verbose = TRUE,
direction = "up"
)
A SingleCellExperiment object.
Name of the assay to use for calculation. See
see assayNames(sce)
for possible values.
Column name on colData(sce)
that denotes the cell type.
A logical(1)
indicating whether to add the gene symbol
column to the marker stats table.
A character(1)
string specifying the model used as design in
scran::findMarkers()
. It can be NULL
(default) if there are no blocking
terms with uninteresting
factors as documented at pairwiseTTests.
A logical(1)
choosing whether to print progress messages or
not.
A character(1)
for the choice of direction tested for
gene cell type markers: "up"
(default), "any"
, or "down"
. Impacts
p-values: if "up"
genes with logFC < 0 will have p.value = 1
.
A tibble::tibble()
of 1 vs. ALL standard log fold change + p-values
for each gene x cell type.
gene
is the name of the gene (from rownames(sce)
).
logFC
the log fold change from the DE test
log.p.value
the log of the p-value of the DE test
log.FDR
the log of the False Discovery Rate adjusted p.value
std.logFC
the standard logFC.
cellType.target
the cell type we're finding marker genes for
std.logFC.rank
the rank of std.logFC
for each cell type
std.logFC.anno
is an annotation of the std.logFC
value
helpful for plotting.
See https://github.com/MarioniLab/scran/issues/57 for a more in depth
discussion about the standard log fold change statistics provided by
scran::findMarkers()
.
See also https://youtu.be/IaclszgZb-g for a LIBD rstats club presentation on "Finding and interpreting marker genes in sc/snRNA-seq data". The companion notes are available at https://docs.google.com/document/d/1BeMtKgE7gpmNywInndVC9o_ufopn-U2EZHB32bO7ObM/edit?usp=sharing.
## load example SingleCellExperiment
if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")
#> 2024-10-31 13:27:00.365846 loading file /github/home/.cache/R/BiocFileCache/66141272054e_sce_DLPFC_example.Rdata%3Frlkey%3Dv3z4u8ru0d2y12zgdl1az07q9%26st%3D1dcfqc1i%26dl%3D1
## Explore properties of the sce object
sce_DLPFC_example
#> class: SingleCellExperiment
#> dim: 557 10000
#> metadata(3): Samples cell_type_colors cell_type_colors_broad
#> assays(1): logcounts
#> rownames(557): GABRD PRDM16 ... AFF2 MAMLD1
#> rowData names(7): source type ... gene_type binomial_deviance
#> colnames(10000): 8_AGTGACTGTAGTTACC-1 17_GCAGCCAGTGAGTCAG-1 ...
#> 12_GGACGTCTCTGACAGT-1 1_GGTTAACTCTCTCTAA-1
#> colData names(32): Sample Barcode ... cellType_layer layer_annotation
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
## this data contains logcounts of gene expression
SummarizedExperiment::assays(sce_DLPFC_example)$logcounts[1:5, 1:5]
#> 8_AGTGACTGTAGTTACC-1 17_GCAGCCAGTGAGTCAG-1 3_CTGGACGAGCTTCATG-1
#> GABRD 0 0.9249246 0.000000
#> PRDM16 0 0.0000000 0.000000
#> MICOS10 0 0.0000000 0.000000
#> LINC01141 0 0.0000000 0.000000
#> ADGRB2 0 0.9249246 2.253612
#> 13_CCCTCAAAGTCTAGCT-1 11_TGTAAGCCATTCTGTT-1
#> GABRD 0.000000 0.0000000
#> PRDM16 0.000000 0.0000000
#> MICOS10 0.000000 0.6528615
#> LINC01141 0.000000 0.0000000
#> ADGRB2 2.253454 0.0000000
## nuclei are classified in to cell types
table(sce_DLPFC_example$cellType_broad_hc)
#>
#> Astro EndoMural Micro Oligo OPC Excit Inhib
#> 692 417 316 1970 350 4335 1920
## Get the 1vALL stats for each gene for each cell type defined in
## `cellType_broad_hc`
marker_stats_1vAll <- findMarkers_1vAll(
sce = sce_DLPFC_example,
assay_name = "logcounts",
cellType_col = "cellType_broad_hc",
mod = "~BrNum"
)
#> Running 1vALL Testing for up-regulated genes
#> 2024-10-31 13:27:00.624104 - Find markers for: Oligo
#> 2024-10-31 13:27:01.029763 - Find markers for: Excit
#> 2024-10-31 13:27:01.388754 - Find markers for: Inhib
#> 2024-10-31 13:27:01.725297 - Find markers for: Micro
#> 2024-10-31 13:27:02.846454 - Find markers for: OPC
#> 2024-10-31 13:27:03.180054 - Find markers for: EndoMural
#> 2024-10-31 13:27:03.51722 - Find markers for: Astro
#> 2024-10-31 13:27:03.909627 - Building Table
#> ** Done! **
## explore output, top markers have high logFC
head(marker_stats_1vAll)
#> # A tibble: 6 × 8
#> # Groups: cellType.target [1]
#> gene logFC log.p.value log.FDR std.logFC cellType.target std.logFC.rank
#> <chr> <dbl> <dbl> <dbl> <dbl> <fct> <int>
#> 1 ST18 4.46 -6697. -6691. 4.32 Oligo 1
#> 2 PLP1 4.25 -5233. -5227. 3.50 Oligo 2
#> 3 MOBP 3.18 -4821. -4816. 3.28 Oligo 3
#> 4 ENPP2 2.72 -4521. -4516. 3.12 Oligo 4
#> 5 TF 3.00 -4466. -4462. 3.09 Oligo 5
#> 6 RNF220 3.80 -4390. -4385. 3.05 Oligo 6
#> # ℹ 1 more variable: std.logFC.anno <chr>