For each cell type, this function computes the statistics comparing that cell type (the "1") against all other cell types combined ("All").

findMarkers_1vAll(
  sce,
  assay_name = "counts",
  cellType_col = "cellType",
  add_symbol = FALSE,
  mod = NULL,
  verbose = TRUE,
  direction = "up"
)

Arguments

sce

A SingleCellExperiment object.

assay_name

Name of the assay to use for calculation. See see assayNames(sce) for possible values.

cellType_col

Column name on colData(sce) that denotes the cell type.

add_symbol

A logical(1) indicating whether to add the gene symbol column to the marker stats table.

mod

A character(1) string specifying the model used as design in scran::findMarkers(). It can be NULL (default) if there are no blocking terms with uninteresting factors as documented at pairwiseTTests.

verbose

A logical(1) choosing whether to print progress messages or not.

direction

A character(1) for the choice of direction tested for gene cell type markers: "up" (default), "any", or "down". Impacts p-values: if "up" genes with logFC < 0 will have p.value = 1.

Value

A tibble::tibble() of 1 vs. ALL standard log fold change + p-values for each gene x cell type.

  • gene is the name of the gene (from rownames(sce)).

  • logFC the log fold change from the DE test

  • log.p.value the log of the p-value of the DE test

  • log.FDR the log of the False Discovery Rate adjusted p.value

  • std.logFC the standard logFC.

  • cellType.target the cell type we're finding marker genes for

  • std.logFC.rank the rank of std.logFC for each cell type

  • std.logFC.anno is an annotation of the std.logFC value helpful for plotting.

Details

See https://github.com/MarioniLab/scran/issues/57 for a more in depth discussion about the standard log fold change statistics provided by scran::findMarkers().

See also https://youtu.be/IaclszgZb-g for a LIBD rstats club presentation on "Finding and interpreting marker genes in sc/snRNA-seq data". The companion notes are available at https://docs.google.com/document/d/1BeMtKgE7gpmNywInndVC9o_ufopn-U2EZHB32bO7ObM/edit?usp=sharing.

Examples

## load example SingleCellExperiment
if (!exists("sce_DLPFC_example")) sce_DLPFC_example <- fetch_deconvo_data("sce_DLPFC_example")
#> 2024-10-31 13:27:00.365846 loading file /github/home/.cache/R/BiocFileCache/66141272054e_sce_DLPFC_example.Rdata%3Frlkey%3Dv3z4u8ru0d2y12zgdl1az07q9%26st%3D1dcfqc1i%26dl%3D1
## Explore properties of the sce object
sce_DLPFC_example
#> class: SingleCellExperiment 
#> dim: 557 10000 
#> metadata(3): Samples cell_type_colors cell_type_colors_broad
#> assays(1): logcounts
#> rownames(557): GABRD PRDM16 ... AFF2 MAMLD1
#> rowData names(7): source type ... gene_type binomial_deviance
#> colnames(10000): 8_AGTGACTGTAGTTACC-1 17_GCAGCCAGTGAGTCAG-1 ...
#>   12_GGACGTCTCTGACAGT-1 1_GGTTAACTCTCTCTAA-1
#> colData names(32): Sample Barcode ... cellType_layer layer_annotation
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):

## this data contains logcounts of gene expression
SummarizedExperiment::assays(sce_DLPFC_example)$logcounts[1:5, 1:5]
#>           8_AGTGACTGTAGTTACC-1 17_GCAGCCAGTGAGTCAG-1 3_CTGGACGAGCTTCATG-1
#> GABRD                        0             0.9249246             0.000000
#> PRDM16                       0             0.0000000             0.000000
#> MICOS10                      0             0.0000000             0.000000
#> LINC01141                    0             0.0000000             0.000000
#> ADGRB2                       0             0.9249246             2.253612
#>           13_CCCTCAAAGTCTAGCT-1 11_TGTAAGCCATTCTGTT-1
#> GABRD                  0.000000             0.0000000
#> PRDM16                 0.000000             0.0000000
#> MICOS10                0.000000             0.6528615
#> LINC01141              0.000000             0.0000000
#> ADGRB2                 2.253454             0.0000000

## nuclei are classified in to cell types
table(sce_DLPFC_example$cellType_broad_hc)
#> 
#>     Astro EndoMural     Micro     Oligo       OPC     Excit     Inhib 
#>       692       417       316      1970       350      4335      1920 

## Get the 1vALL stats for each gene for each cell type defined in
## `cellType_broad_hc`
marker_stats_1vAll <- findMarkers_1vAll(
    sce = sce_DLPFC_example,
    assay_name = "logcounts",
    cellType_col = "cellType_broad_hc",
    mod = "~BrNum"
)
#> Running 1vALL Testing for up-regulated genes
#> 2024-10-31 13:27:00.624104 - Find markers for: Oligo
#> 2024-10-31 13:27:01.029763 - Find markers for: Excit
#> 2024-10-31 13:27:01.388754 - Find markers for: Inhib
#> 2024-10-31 13:27:01.725297 - Find markers for: Micro
#> 2024-10-31 13:27:02.846454 - Find markers for: OPC
#> 2024-10-31 13:27:03.180054 - Find markers for: EndoMural
#> 2024-10-31 13:27:03.51722 - Find markers for: Astro
#> 2024-10-31 13:27:03.909627 - Building Table
#> ** Done! **

## explore output, top markers have high logFC
head(marker_stats_1vAll)
#> # A tibble: 6 × 8
#> # Groups:   cellType.target [1]
#>   gene   logFC log.p.value log.FDR std.logFC cellType.target std.logFC.rank
#>   <chr>  <dbl>       <dbl>   <dbl>     <dbl> <fct>                    <int>
#> 1 ST18    4.46      -6697.  -6691.      4.32 Oligo                        1
#> 2 PLP1    4.25      -5233.  -5227.      3.50 Oligo                        2
#> 3 MOBP    3.18      -4821.  -4816.      3.28 Oligo                        3
#> 4 ENPP2   2.72      -4521.  -4516.      3.12 Oligo                        4
#> 5 TF      3.00      -4466.  -4462.      3.09 Oligo                        5
#> 6 RNF220  3.80      -4390.  -4385.      3.05 Oligo                        6
#> # ℹ 1 more variable: std.logFC.anno <chr>