Source: R/gencode_annotated_genes.R (gencode_annotated_genes.Rd)
Based on a TxDb object built by gencode_txdb(), this function annotates the genes. This information is then used by packages like derfinder and derfinderPlot.
gencode_annotated_genes(txdb)
txdb: A GenomicFeatures::TxDb object built with gencode_txdb().
A GRanges object with the annotated genes resulting from bumphunter::annotateTranscripts().
Based on code for the brainflowprobes package at:
https://github.com/LieberInstitute/brainflowprobes/blob/devel/data-raw/create_sysdata.R
## Start from scratch if you want:
if (FALSE) { # \dontrun{
    txdb_v31_hg19_chr21 <- gencode_txdb("31", "hg19", chrs = "chr21")
} # }
## or read in the txdb object for hg19 chr21 from this package
txdb_v31_hg19_chr21 <- AnnotationDbi::loadDb(
    system.file("extdata", "txdb_v31_hg19_chr21.sqlite",
        package = "GenomicState"
    )
)
#> Loading required package: GenomicFeatures
#> Loading required package: S4Vectors
#> Loading required package: stats4
#>
#> Attaching package: ‘S4Vectors’
#> The following object is masked from ‘package:utils’:
#>
#> findMatches
#> The following objects are masked from ‘package:base’:
#>
#> I, expand.grid, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: GenomicRanges
#> Loading required package: AnnotationDbi
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#>
#> Attaching package: ‘Biobase’
#> The following object is masked from ‘package:AnnotationHub’:
#>
#> cache
## Obtain the annotated genes for the Gencode TxDb object
genes_v31_hg19_chr21 <- gencode_annotated_genes(txdb_v31_hg19_chr21)
#> 2024-12-12 22:29:43.929076 annotating the transcripts
#> No annotationPackage supplied. Trying org.Hs.eg.db.
#> Loading required package: org.Hs.eg.db
#>
#> Getting TSS and TSE.
#> Getting CSS and CSE.
#> Getting exons.
#> Annotating genes.
#> 'select()' returned 1:many mapping between keys and columns
## Explore the result
genes_v31_hg19_chr21
#> GRanges object with 823 ranges and 8 metadata columns:
#>         seqnames            ranges strand |       CSS       CSE
#>            <Rle>         <IRanges>  <Rle> | <integer> <integer>
#>     [1]    chr21 43218385-43299591      - |  43221400  43299480
#>     [2]    chr21 45719934-45747259      + |  45719989  45746745
#>     [3]    chr21 33245333-33416946      + |  33245988  33416425
#>     [4]    chr21 47401651-47424964      + |  47401765  47423927
#>     [5]    chr21 34696734-34732170      + |  34697361  34727855
#>     ...      ...               ...    ... .       ...       ...
#>   [819]    chr21 45188201-45191547      + |      <NA>      <NA>
#>   [820]    chr21 25229397-25261466      + |      <NA>      <NA>
#>   [821]    chr21 38709684-38739160      - |      <NA>      <NA>
#>   [822]    chr21 42108244-42126814      + |      <NA>      <NA>
#>   [823]    chr21 25076577-25081662      + |      <NA>      <NA>
#>                           Tx          Geneid         Gene
#>                  <character>           <Rle>        <Rle>
#>     [1] ENSG00000141956.13_3 ENSG00000141956       PRDM15
#>     [2] ENSG00000141959.17_2 ENSG00000141959         PFKL
#>     [3]  ENSG00000142149.9_4 ENSG00000142149         HUNK
#>     [4] ENSG00000142156.14_2 ENSG00000142156       COL6A1
#>     [5] ENSG00000142166.13_4 ENSG00000142166       IFNAR1
#>     ...                  ...             ...          ...
#>   [819]  ENSG00000287507.1_1 ENSG00000287507         <NA>
#>   [820]  ENSG00000287612.1_1 ENSG00000287612         <NA>
#>   [821]  ENSG00000287637.1_1 ENSG00000287637         <NA>
#>   [822]  ENSG00000288069.1_1 ENSG00000288069         <NA>
#>   [823]  ENSG00000288094.1_1 ENSG00000288094 LOC105372749
#>                         Refseq    Nexons
#>                          <Rle> <integer>
#>     [1] NM_001040424 NM_0012..        32
#>     [2] NM_001002021 NM_0026..        20
#>     [3] NM_014586 NP_055401 ..        12
#>     [4]    NM_001848 NP_001839        35
#>     [5] NM_000629 NM_0013844..        13
#>     ...                    ...       ...
#>   [819]                   <NA>         3
#>   [820]                   <NA>         4
#>   [821]                   <NA>         2
#>   [822]                   <NA>         2
#>   [823] XR_007089952 XR_0070..         3
#>                                                             Exons
#>                                                     <IRangesList>
#>     [1] 43218385-43221882,43222872-43223278,43224694-43224883,...
#>     [2] 45719934-45720073,45724158-45724338,45725077-45725300,...
#>     [3] 33245333-33246248,33296780-33297072,33312477-33312532,...
#>     [4] 47401651-47401861,47402548-47402677,47404183-47404383,...
#>     [5] 34696734-34696943,34697209-34697436,34707830-34707953,...
#>     ...                                                       ...
#>   [819]     45188201-45188277,45189155-45189271,45190985-45191547
#>   [820] 25229397-25229860,25231273-25231484,25233548-25233655,...
#>   [821]                       38709684-38711756,38738974-38739160
#>   [822]                       42108244-42108309,42125644-42126814
#>   [823]     25076577-25076893,25077415-25077672,25080924-25081662
#>   -------
#>   seqinfo: 1 sequence from hg19 genome
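Since the result is a GRanges object, it can be filtered and queried with the usual GenomicRanges tools. The following is a minimal sketch, not part of the package examples. It assumes a small stand-in object, toy_genes (a hypothetical name), built from rows 1, 2 and 819 of the output above with only three of the eight metadata columns, and it stores Gene as a plain character vector rather than the Rle used in the real object. The region of interest is also made up for illustration.

```r
library(GenomicRanges)

## Toy stand-in for genes_v31_hg19_chr21 (assumption: only rows 1, 2 and 819
## of the printed result, and only the Geneid, Gene and Nexons columns)
toy_genes <- GRanges(
    seqnames = "chr21",
    ranges = IRanges(
        start = c(43218385, 45719934, 45188201),
        end = c(43299591, 45747259, 45191547)
    ),
    strand = c("-", "+", "+"),
    Geneid = c("ENSG00000141956", "ENSG00000141959", "ENSG00000287507"),
    Gene = c("PRDM15", "PFKL", NA),
    Nexons = c(32L, 20L, 3L)
)

## Keep only the genes that have a gene symbol
with_symbol <- toy_genes[!is.na(toy_genes$Gene)]

## Find the genes overlapping a hypothetical region of interest
region <- GRanges("chr21", IRanges(start = 45720000, end = 45721000))
hits <- subsetByOverlaps(toy_genes, region)
hits$Gene
```

The same subsetting and overlap calls should apply to the full genes_v31_hg19_chr21 object, where the Gene column may need as.character() before comparisons because it is stored as an Rle.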