This guided example shows how processed results from the RNA-seq pipeline SPEAQeasy can be used in differential expression analyses and visualization. We also show how existing genotype information for a set of samples can be combined with SPEAQeasy results to resolve sample identity issues that can arise during sequencing.
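As a rough orientation, the differential expression step can be sketched with limma-voom, using the edgeR and limma packages listed in the session information. This is a minimal sketch under stated assumptions, not the exact code of the guided example: the count matrix is simulated, and the `group` variable and its `Control`/`Case` levels are hypothetical. With real SPEAQeasy output, the counts and sample metadata would instead come from the gene-level `RangedSummarizedExperiment` (assumed here to be named `rse_gene`), via `assays(rse_gene)$counts` and `colData(rse_gene)`.

```r
library(edgeR) # DGEList(), calcNormFactors()
library(limma) # voom(), lmFit(), eBayes(), topTable()

set.seed(1)

## Simulated stand-in for SPEAQeasy gene counts (assumption: with real data,
## use assays(rse_gene)$counts and colData(rse_gene) instead)
n_genes <- 200
n_samples <- 8
counts <- matrix(
    rnbinom(n_genes * n_samples, mu = 100, size = 5),
    nrow = n_genes,
    dimnames = list(
        paste0("gene", seq_len(n_genes)),
        paste0("sample", seq_len(n_samples))
    )
)

## Hypothetical phenotype table with a two-level grouping variable
pd <- data.frame(
    group = factor(
        rep(c("Control", "Case"), each = n_samples / 2),
        levels = c("Control", "Case")
    )
)

## Normalize library sizes, then model the mean-variance trend with voom
dge <- calcNormFactors(DGEList(counts = counts))
mod <- model.matrix(~group, data = pd)
v <- voom(dge, design = mod)

## Fit the linear model and compute moderated statistics
fit <- eBayes(lmFit(v, mod))

## Full table of results for the Case vs Control coefficient
top <- topTable(fit, coef = "groupCase", number = Inf, sort.by = "P")
head(top)
```

In a real analysis the design matrix would typically also include technical and demographic covariates. The simulated data contain no true group effect, so few or no genes should pass a multiple-testing threshold here.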
SPEAQeasy
We hope that SPEAQeasy will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!
@article {Eagles2021,
author = {Eagles, Nicholas J. and Burke, Emily E. and Leonard, Jacob and Barry, Brianna K. and Stolz, Joshua M. and Huuki, Louise and Phan, BaDoi N. and Larrios Serrato, Violeta and Guti{\'e}rrez-Mill{\'a}n, Everardo and Aguilar-Ordo{\~n}ez, Israel and Jaffe, Andrew E. and Collado-Torres, Leonardo},
title = {SPEAQeasy: a scalable pipeline for expression analysis and quantification for R/bioconductor-powered RNA-seq analyses},
year = {2021},
doi = {10.1186/s12859-021-04142-3},
publisher = {BioMed Central},
URL = {https://doi.org/10.1186/s12859-021-04142-3},
journal = {BMC Bioinformatics}
}
This is a project by the R/Bioconductor-powered Team Data Science at the Lieber Institute for Brain Development.
The analysis was possible thanks to R and many of its packages:
[1] J. Allaire, Y. Xie, J. McPherson, J. Luraschi, et al. rmarkdown: Dynamic Documents for R. R package version 2.6. 2020. <URL: https://github.com/rstudio/rmarkdown>.
[2] C. Boettiger. knitcitations: Citations for ‘Knitr’ Markdown Files. R package version 1.0.12. 2021. <URL: https://github.com/cboettig/knitcitations>.
[3] M. Carlson. org.Hs.eg.db: Genome wide annotation for Human. R package version 3.12.0. 2020.
[4] L. Collado-Torres, A. E. Jaffe, and E. E. Burke. jaffelab: Commonly used functions by the Jaffe lab. R package version 0.99.32. 2021. <URL: https://github.com/LieberInstitute/jaffelab>.
[5] L. Collado-Torres, A. Nellore, K. Kammers, S. E. Ellis, et al. “Reproducible RNA-seq analysis using recount2”. In: Nature Biotechnology (2017). DOI: 10.1038/nbt.3838. <URL: http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html>.
[6] T. L. Davis and A. Day. getopt: C-Like ‘getopt’ Behavior. R package version 1.20.3. 2019. <URL: https://github.com/trevorld/r-getopt>.
[7] R. Kolde. pheatmap: Pretty Heatmaps. R package version 1.0.12. 2019.
[8] C. Law, Y. Chen, W. Shi, and G. Smyth. “Voom: precision weights unlock linear model analysis tools for RNA-seq read counts”. In: Genome Biology 15 (2014), p. R29.
[9] M. Morgan, V. Obenchain, J. Hester, and H. Pagès. SummarizedExperiment: SummarizedExperiment container. R package version 1.20.0. 2020. <URL: https://bioconductor.org/packages/SummarizedExperiment>.
[10] K. Müller. here: A Simpler Way to Find Your Files. https://here.r-lib.org/, https://github.com/r-lib/here. 2020.
[11] E. Neuwirth. RColorBrewer: ColorBrewer Palettes. R package version 1.1-2. 2014.
[12] V. Obenchain, M. Lawrence, V. Carey, S. Gogarten, et al. “VariantAnnotation: a Bioconductor package for exploration and annotation of genetic variants”. In: Bioinformatics 30.14 (2014), pp. 2076-2078. DOI: 10.1093/bioinformatics/btu168.
[13] A. Oleś, M. Morgan, and W. Huber. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.18.1. 2020. <URL: https://github.com/Bioconductor/BiocStyle>.
[14] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2020. <URL: https://www.R-project.org/>.
[15] M. E. Ritchie, B. Phipson, D. Wu, Y. Hu, et al. “limma powers differential expression analyses for RNA-sequencing and microarray studies”. In: Nucleic Acids Research 43.7 (2015), p. e47. DOI: 10.1093/nar/gkv007.
[16] M. D. Robinson, D. J. McCarthy, and G. K. Smyth. “edgeR: a Bioconductor package for differential expression analysis of digital gene expression data”. In: Bioinformatics 26.1 (2010), pp. 139-140. DOI: 10.1093/bioinformatics/btp616.
[17] H. Wickham. tidyr: Tidy Messy Data. https://tidyr.tidyverse.org, https://github.com/tidyverse/tidyr. 2020.
[18] H. Wickham, J. Hester, and W. Chang. devtools: Tools to Make Developing R Packages Easier. https://devtools.r-lib.org/, https://github.com/r-lib/devtools. 2020.
[19] G. Yu, L. Wang, Y. Han, and Q. He. “clusterProfiler: an R package for comparing biological themes among gene clusters”. In: OMICS: A Journal of Integrative Biology 16.5 (2012), pp. 284-287. DOI: 10.1089/omi.2011.0118.
# Time spent creating this report:
diff(c(timestart, Sys.time()))
## Time difference of 1.044434 secs
# Date this report was generated
message(Sys.time())
## 2023-07-14 13:53:34
# Reproducibility info
options(width = 120)
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.3 (2020-10-10)
## os Ubuntu 20.04 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Etc/UTC
## date 2023-07-14
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## AnnotationDbi * 1.52.0 2020-10-27 [1] Bioconductor
## askpass 1.1 2019-01-13 [2] RSPM (R 4.0.3)
## assertthat 0.2.1 2019-03-21 [2] RSPM (R 4.0.3)
## backports 1.2.1 2020-12-09 [1] RSPM (R 4.0.3)
## base64enc 0.1-3 2015-07-28 [2] RSPM (R 4.0.3)
## bibtex 0.5.1 2023-07-14 [1] Github (ropensci/bibtex@eeb69da)
## Biobase * 2.50.0 2020-10-27 [1] Bioconductor
## BiocFileCache 1.14.0 2020-10-27 [1] Bioconductor
## BiocGenerics * 0.36.1 2021-04-16 [1] Bioconductor
## BiocManager 1.30.10 2019-11-16 [1] RSPM (R 4.0.0)
## BiocParallel 1.24.1 2020-11-06 [1] Bioconductor
## BiocStyle * 2.18.1 2020-11-24 [1] Bioconductor
## biomaRt 2.46.3 2021-02-09 [1] Bioconductor
## Biostrings 2.58.0 2020-10-27 [1] Bioconductor
## bit 4.0.4 2020-08-04 [1] RSPM (R 4.0.5)
## bit64 4.0.5 2020-08-30 [1] RSPM (R 4.0.5)
## bitops 1.0-6 2013-08-17 [1] RSPM (R 4.0.3)
## blob 1.2.1 2020-01-20 [1] RSPM (R 4.0.3)
## bookdown 0.21 2020-10-13 [1] RSPM (R 4.0.2)
## BSgenome 1.58.0 2020-10-27 [1] Bioconductor
## bslib 0.2.4 2021-01-25 [1] RSPM (R 4.0.3)
## bumphunter 1.32.0 2020-10-27 [1] Bioconductor
## cachem 1.0.4 2021-02-13 [2] RSPM (R 4.0.3)
## callr 3.5.1 2020-10-13 [2] RSPM (R 4.0.3)
## checkmate 2.0.0 2020-02-06 [1] RSPM (R 4.0.3)
## cli 2.3.0 2021-01-31 [2] RSPM (R 4.0.3)
## cluster 2.1.1 2021-02-14 [3] RSPM (R 4.0.3)
## clusterProfiler * 3.18.1 2021-02-09 [1] Bioconductor
## codetools 0.2-18 2020-11-04 [3] RSPM (R 4.0.3)
## colorspace 2.0-0 2020-11-11 [1] RSPM (R 4.0.3)
## cowplot 1.1.1 2020-12-30 [1] RSPM (R 4.0.5)
## crayon 1.4.1 2021-02-08 [2] RSPM (R 4.0.3)
## crosstalk 1.1.1 2021-01-12 [2] RSPM (R 4.0.3)
## curl 4.3 2019-12-02 [2] RSPM (R 4.0.3)
## data.table 1.13.6 2020-12-30 [1] RSPM (R 4.0.3)
## DBI 1.1.1 2021-01-15 [1] RSPM (R 4.0.3)
## dbplyr 2.1.0 2021-02-03 [1] RSPM (R 4.0.3)
## DelayedArray 0.16.3 2021-03-24 [1] Bioconductor
## derfinder 1.24.2 2020-12-18 [1] Bioconductor
## derfinderHelper 1.24.1 2020-12-18 [1] Bioconductor
## desc 1.2.0 2018-05-01 [2] RSPM (R 4.0.3)
## devtools * 2.3.2 2020-09-18 [2] RSPM (R 4.0.3)
## digest 0.6.27 2020-10-24 [2] RSPM (R 4.0.3)
## DO.db 2.9 2023-07-14 [1] Bioconductor
## doRNG 1.8.2 2020-01-27 [1] RSPM (R 4.0.3)
## DOSE 3.16.0 2020-10-27 [1] Bioconductor
## downloader 0.4 2015-07-09 [1] RSPM (R 4.0.3)
## dplyr 1.0.4 2021-02-02 [1] RSPM (R 4.0.3)
## DT 0.17 2021-01-06 [2] RSPM (R 4.0.3)
## edgeR * 3.32.1 2021-01-14 [1] Bioconductor
## ellipsis 0.3.1 2020-05-15 [2] RSPM (R 4.0.3)
## enrichplot * 1.10.2 2021-01-28 [1] Bioconductor
## evaluate 0.14 2019-05-28 [2] RSPM (R 4.0.3)
## ExploreModelMatrix * 1.2.0 2020-10-27 [1] Bioconductor
## farver 2.0.3 2020-01-16 [1] RSPM (R 4.0.3)
## fastmap 1.1.0 2021-01-25 [2] RSPM (R 4.0.3)
## fastmatch 1.1-0 2017-01-28 [1] RSPM (R 4.0.3)
## fgsea 1.16.0 2020-10-27 [1] Bioconductor
## foreach 1.5.1 2020-10-15 [1] RSPM (R 4.0.3)
## foreign 0.8-81 2020-12-22 [3] RSPM (R 4.0.3)
## Formula 1.2-4 2020-10-16 [1] RSPM (R 4.0.3)
## fs 1.5.0 2020-07-31 [2] RSPM (R 4.0.3)
## generics 0.1.0 2020-10-31 [1] RSPM (R 4.0.3)
## GenomeInfoDb * 1.26.7 2021-04-08 [1] Bioconductor
## GenomeInfoDbData 1.2.4 2023-07-14 [1] Bioconductor
## GenomicAlignments 1.26.0 2020-10-27 [1] Bioconductor
## GenomicFeatures 1.42.3 2021-04-01 [1] Bioconductor
## GenomicFiles 1.26.0 2020-10-27 [1] Bioconductor
## GenomicRanges * 1.42.0 2020-10-27 [1] Bioconductor
## GEOquery 2.58.0 2020-10-27 [1] Bioconductor
## ggforce 0.3.2 2020-06-23 [1] RSPM (R 4.0.3)
## ggnewscale 0.4.5 2021-01-11 [1] RSPM (R 4.0.3)
## ggplot2 * 3.3.3 2020-12-30 [1] RSPM (R 4.0.3)
## ggraph 2.0.4 2020-11-16 [1] RSPM (R 4.0.3)
## ggrepel 0.9.1 2021-01-15 [1] RSPM (R 4.0.3)
## glue 1.4.2 2020-08-27 [2] RSPM (R 4.0.3)
## GO.db 3.12.1 2023-07-14 [1] Bioconductor
## googledrive 1.0.1 2020-05-05 [1] RSPM (R 4.0.0)
## GOSemSim 2.16.1 2020-10-29 [1] Bioconductor
## graphlayouts 0.7.1 2020-10-26 [1] RSPM (R 4.0.3)
## gridExtra 2.3 2017-09-09 [1] RSPM (R 4.0.3)
## gtable 0.3.0 2019-03-25 [1] RSPM (R 4.0.3)
## here * 1.0.1 2020-12-13 [1] RSPM (R 4.0.3)
## highr 0.8 2019-03-20 [2] RSPM (R 4.0.3)
## Hmisc 4.4-2 2020-11-29 [1] RSPM (R 4.0.3)
## hms 1.0.0 2021-01-13 [1] RSPM (R 4.0.3)
## htmlTable 2.1.0 2020-09-16 [1] RSPM (R 4.0.3)
## htmltools 0.5.1.1 2021-01-22 [2] RSPM (R 4.0.3)
## htmlwidgets 1.5.3 2020-12-10 [2] RSPM (R 4.0.3)
## httpuv 1.5.5 2021-01-13 [1] RSPM (R 4.0.3)
## httr 1.4.2 2020-07-20 [2] RSPM (R 4.0.3)
## igraph 1.2.6 2020-10-06 [1] RSPM (R 4.0.3)
## IRanges * 2.24.1 2020-12-12 [1] Bioconductor
## iterators 1.0.13 2020-10-15 [1] RSPM (R 4.0.3)
## jaffelab * 0.99.32 2023-07-14 [1] Github (LieberInstitute/jaffelab@21e6574)
## jpeg 0.1-8.1 2019-10-24 [1] RSPM (R 4.0.3)
## jquerylib 0.1.3 2020-12-17 [1] RSPM (R 4.0.3)
## jsonlite 1.7.2 2020-12-09 [2] RSPM (R 4.0.3)
## knitcitations * 1.0.12 2021-01-10 [1] RSPM (R 4.0.3)
## knitr 1.31 2021-01-27 [2] RSPM (R 4.0.3)
## labeling 0.4.2 2020-10-20 [1] RSPM (R 4.0.3)
## later 1.1.0.1 2020-06-05 [2] RSPM (R 4.0.3)
## lattice 0.20-41 2020-04-02 [3] CRAN (R 4.0.3)
## latticeExtra 0.6-29 2019-12-19 [1] RSPM (R 4.0.3)
## lazyeval 0.2.2 2019-03-15 [2] RSPM (R 4.0.3)
## lifecycle 1.0.0 2021-02-15 [2] RSPM (R 4.0.3)
## limma * 3.46.0 2020-10-27 [1] Bioconductor
## locfit 1.5-9.4 2020-03-25 [1] RSPM (R 4.0.3)
## lubridate 1.7.9.2 2020-11-13 [1] RSPM (R 4.0.3)
## magrittr 2.0.1 2020-11-17 [2] RSPM (R 4.0.3)
## MASS 7.3-53.1 2021-02-12 [3] RSPM (R 4.0.3)
## Matrix 1.3-2 2021-01-06 [3] RSPM (R 4.0.3)
## MatrixGenerics * 1.2.1 2021-01-30 [1] Bioconductor
## matrixStats * 0.58.0 2021-01-29 [1] RSPM (R 4.0.3)
## memoise 2.0.0 2021-01-26 [2] RSPM (R 4.0.3)
## mime 0.10 2021-02-13 [2] RSPM (R 4.0.3)
## munsell 0.5.0 2018-06-12 [1] RSPM (R 4.0.3)
## nnet 7.3-15 2021-01-24 [3] RSPM (R 4.0.3)
## openssl 1.4.3 2020-09-18 [2] RSPM (R 4.0.3)
## org.Hs.eg.db * 3.12.0 2023-07-14 [1] Bioconductor
## pheatmap * 1.0.12 2019-01-04 [1] RSPM (R 4.0.3)
## pillar 1.4.7 2020-11-20 [2] RSPM (R 4.0.3)
## pkgbuild 1.2.0 2020-12-15 [2] RSPM (R 4.0.3)
## pkgconfig 2.0.3 2019-09-22 [2] RSPM (R 4.0.3)
## pkgload 1.1.0 2020-05-29 [2] RSPM (R 4.0.3)
## plotly * 4.9.3 2021-01-10 [1] RSPM (R 4.0.3)
## plyr 1.8.6 2020-03-03 [1] RSPM (R 4.0.3)
## png 0.1-7 2013-12-03 [1] RSPM (R 4.0.3)
## polyclip 1.10-0 2019-03-14 [1] RSPM (R 4.0.3)
## prettyunits 1.1.1 2020-01-24 [2] RSPM (R 4.0.3)
## processx 3.4.5 2020-11-30 [2] RSPM (R 4.0.3)
## progress 1.2.2 2019-05-16 [1] RSPM (R 4.0.3)
## promises 1.2.0.1 2021-02-11 [2] RSPM (R 4.0.3)
## ps 1.5.0 2020-12-05 [2] RSPM (R 4.0.3)
## purrr 0.3.4 2020-04-17 [2] RSPM (R 4.0.3)
## qvalue 2.22.0 2020-10-27 [1] Bioconductor
## R6 2.5.0 2020-10-28 [2] RSPM (R 4.0.3)
## rafalib * 1.0.0 2015-08-09 [1] RSPM (R 4.0.0)
## rappdirs 0.3.3 2021-01-31 [2] RSPM (R 4.0.3)
## RColorBrewer * 1.1-2 2014-12-07 [1] RSPM (R 4.0.3)
## Rcpp 1.0.6 2021-01-15 [2] RSPM (R 4.0.3)
## RCurl 1.98-1.2 2020-04-18 [1] RSPM (R 4.0.3)
## readr 1.4.0 2020-10-05 [1] RSPM (R 4.0.3)
## recount * 1.16.1 2020-12-18 [1] Bioconductor
## RefManageR 1.4.0 2023-07-14 [1] Github (ropensci/RefManageR@2409e32)
## remotes 2.2.0 2020-07-21 [2] RSPM (R 4.0.3)
## rentrez 1.2.3 2020-11-10 [1] RSPM (R 4.0.3)
## reshape2 1.4.4 2020-04-09 [1] RSPM (R 4.0.3)
## rintrojs 0.2.2 2019-05-29 [1] RSPM (R 4.0.0)
## rlang 0.4.10 2020-12-30 [2] RSPM (R 4.0.3)
## rmarkdown 2.6 2020-12-14 [1] RSPM (R 4.0.3)
## rngtools 1.5 2020-01-23 [1] RSPM (R 4.0.3)
## rpart 4.1-15 2019-04-12 [3] CRAN (R 4.0.3)
## rprojroot 2.0.2 2020-11-15 [2] RSPM (R 4.0.3)
## Rsamtools 2.6.0 2020-10-27 [1] Bioconductor
## RSQLite 2.2.3 2021-01-24 [1] RSPM (R 4.0.3)
## rstudioapi 0.13 2020-11-12 [2] RSPM (R 4.0.3)
## rtracklayer 1.50.0 2020-10-27 [1] Bioconductor
## rvcheck 0.1.8 2020-03-01 [1] RSPM (R 4.0.0)
## S4Vectors * 0.28.1 2020-12-09 [1] Bioconductor
## sass 0.3.1 2021-01-24 [1] RSPM (R 4.0.3)
## scales 1.1.1 2020-05-11 [1] RSPM (R 4.0.3)
## scatterpie 0.1.5 2020-09-09 [1] RSPM (R 4.0.2)
## segmented 1.3-2 2021-02-09 [1] RSPM (R 4.0.3)
## sessioninfo 1.1.1 2018-11-05 [1] RSPM (R 4.0.3)
## shadowtext 0.0.7 2019-11-06 [1] RSPM (R 4.0.0)
## shiny 1.6.0 2021-01-25 [1] RSPM (R 4.0.3)
## shinydashboard 0.7.1 2018-10-17 [1] RSPM (R 4.0.3)
## shinyjs 2.0.0 2020-09-09 [1] RSPM (R 4.0.3)
## statmod 1.4.35 2020-10-19 [1] RSPM (R 4.0.3)
## stringi 1.5.3 2020-09-09 [2] RSPM (R 4.0.3)
## stringr 1.4.0 2019-02-10 [2] RSPM (R 4.0.3)
## SummarizedExperiment * 1.20.0 2020-10-27 [1] Bioconductor
## survival 3.2-7 2020-09-28 [3] CRAN (R 4.0.3)
## testthat 3.0.2 2021-02-14 [2] RSPM (R 4.0.3)
## tibble 3.0.6 2021-01-29 [2] RSPM (R 4.0.3)
## tidygraph 1.2.0 2020-05-12 [1] RSPM (R 4.0.3)
## tidyr 1.1.2 2020-08-27 [1] RSPM (R 4.0.3)
## tidyselect 1.1.0 2020-05-11 [1] RSPM (R 4.0.3)
## tweenr 1.0.1 2018-12-14 [1] RSPM (R 4.0.3)
## usethis * 2.0.1 2021-02-10 [1] RSPM (R 4.0.3)
## VariantAnnotation 1.36.0 2020-10-27 [1] Bioconductor
## vctrs 0.3.6 2020-12-17 [2] RSPM (R 4.0.3)
## viridis 0.5.1 2018-03-29 [1] RSPM (R 4.0.3)
## viridisLite 0.3.0 2018-02-01 [1] RSPM (R 4.0.3)
## withr 2.4.1 2021-01-26 [2] RSPM (R 4.0.3)
## xfun 0.21 2021-02-10 [2] RSPM (R 4.0.3)
## XML 3.99-0.5 2020-07-23 [1] RSPM (R 4.0.3)
## xml2 1.3.2 2020-04-23 [2] RSPM (R 4.0.3)
## xtable 1.8-4 2019-04-21 [1] RSPM (R 4.0.3)
## XVector 0.30.0 2020-10-27 [1] Bioconductor
## yaml 2.2.1 2020-02-01 [2] RSPM (R 4.0.3)
## zlibbioc 1.36.0 2020-10-27 [1] Bioconductor
##
## [1] /__w/_temp/Library
## [2] /usr/local/lib/R/site-library
## [3] /usr/local/lib/R/library