This function builds a transcript database (TxDb) object, which you can then use to build a Gencode GenomicState object. This function will download the data from Gencode, import it into R, process it and build the TxDb object.
version: A character(1) with the Gencode version number.
genome: A character(1) with the human genome version number. Valid options are 'hg38' or 'hg19'.
chrs: A character() vector with the chromosome (contig) names to keep.
gencode_txdb() returns a GenomicFeatures::TxDb object.
gencode_source_url() returns a character(1) with the URL for the Gencode GTF file of interest.
Based on code for the brainflowprobes package at:
https://github.com/LieberInstitute/brainflowprobes/blob/devel/data-raw/create_sysdata.R
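The download, import, and build steps described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: it writes a small made-up GTF to a temporary file instead of downloading from Gencode, and the names gtf_row, toy_gtf, and make_txdb are hypothetical. It assumes the rtracklayer, GenomeInfoDb, and GenomicFeatures Bioconductor packages are installed. Recent Bioconductor releases provide makeTxDbFromGRanges() in the txdbmaker package, so the sketch uses that package when it is available.

```r
## Hypothetical helper: format one tab-separated GTF record
gtf_row <- function(seqname, type, start, end, attrs) {
    paste(seqname, "toy", type, start, end, ".", "+", ".", attrs, sep = "\t")
}

## A made-up GTF with one gene on chr21 and one on chr22
## (stands in for the downloaded Gencode file)
toy_gtf <- tempfile(fileext = ".gtf")
writeLines(c(
    gtf_row("chr21", "gene", 100, 500, 'gene_id "G1";'),
    gtf_row("chr21", "transcript", 100, 500, 'gene_id "G1"; transcript_id "T1";'),
    gtf_row("chr21", "exon", 100, 200, 'gene_id "G1"; transcript_id "T1";'),
    gtf_row("chr21", "exon", 300, 500, 'gene_id "G1"; transcript_id "T1";'),
    gtf_row("chr22", "gene", 100, 400, 'gene_id "G2";'),
    gtf_row("chr22", "transcript", 100, 400, 'gene_id "G2"; transcript_id "T2";'),
    gtf_row("chr22", "exon", 100, 400, 'gene_id "G2"; transcript_id "T2";')
), toy_gtf)

## Import the GTF into R as a GRanges object
toy_gr <- rtracklayer::import(toy_gtf, format = "gtf")

## Keep only the requested chromosomes (the role played by 'chrs')
toy_gr <- GenomeInfoDb::keepSeqlevels(toy_gr, "chr21", pruning.mode = "coarse")

## Pick whichever package provides makeTxDbFromGRanges() on this system
make_txdb <- if (requireNamespace("txdbmaker", quietly = TRUE)) {
    txdbmaker::makeTxDbFromGRanges
} else {
    GenomicFeatures::makeTxDbFromGRanges
}

## Build the TxDb object from the filtered annotation
toy_txdb <- make_txdb(toy_gr)
GenomicFeatures::transcripts(toy_txdb)
```

With a real annotation, the temporary file would be replaced by the Gencode GTF for the chosen version and genome, which is a much larger download.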
## Start from scratch if you want:
if (FALSE) {
    txdb_v31_hg19_chr21 <- gencode_txdb("31", "hg19", chrs = "chr21")
}
## or read in the txdb object for hg19 chr21 from this package
txdb_v31_hg19_chr21 <- AnnotationDbi::loadDb(
    system.file("extdata", "txdb_v31_hg19_chr21.sqlite",
        package = "GenomicState"
    )
)
## Explore the result
txdb_v31_hg19_chr21
#> TxDb object:
#> # Db type: TxDb
#> # Supporting package: GenomicFeatures
#> # Data source: ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/GRCh37_mapping/gencode.v31lift37.annotation.gtf.gz
#> # Organism: Homo sapiens
#> # Taxonomy ID: 9606
#> # miRBase build ID: NA
#> # Genome: hg19
#> # transcript_nrow: 2813
#> # exon_nrow: 8525
#> # cds_nrow: 2576
#> # Db created by: GenomicFeatures package from Bioconductor
#> # Creation time: 2019-10-07 09:29:35 -0400 (Mon, 07 Oct 2019)
#> # GenomicFeatures version at creation time: 1.36.4
#> # RSQLite version at creation time: 2.1.2
#> # DBSCHEMAVERSION: 1.2
## Locate the GTF file for Gencode version 31 for hg19
gencode_source_url(version = "31", genome = "hg19")
#> [1] "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/GRCh37_mapping/gencode.v31lift37.annotation.gtf.gz"
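The URL printed above follows a regular pattern, which the base R sketch below reproduces. The function name sketch_gencode_url is hypothetical and this is not the package's own code. The hg19 branch is taken from the output shown above; the hg38 branch assumes the main annotation file sits directly in the release directory without the lift37 suffix.

```r
## Hypothetical helper: rebuild the Gencode GTF URL from version and genome
sketch_gencode_url <- function(version, genome = c("hg38", "hg19")) {
    genome <- match.arg(genome)
    release_dir <- paste0(
        "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_",
        version
    )
    if (genome == "hg19") {
        ## hg19 files are the GRCh37 lift-over of the release
        paste0(
            release_dir, "/GRCh37_mapping/gencode.v", version,
            "lift37.annotation.gtf.gz"
        )
    } else {
        ## Assumed layout for hg38 (GRCh38)
        paste0(release_dir, "/gencode.v", version, ".annotation.gtf.gz")
    }
}

sketch_gencode_url("31", "hg19")
```

Because match.arg() validates the genome argument, any value other than 'hg38' or 'hg19' stops with an error instead of producing a malformed URL.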