gencode_txdb() builds a transcript database (TxDb) object, which you can then use to build a Gencode GenomicState object. It downloads the annotation data from Gencode, imports it into R, processes it, and builds the TxDb object. gencode_source_url() returns the URL of the corresponding Gencode GTF file.

gencode_txdb(
  version = "31",
  genome = c("hg38", "hg19"),
  chrs = paste0("chr", c(seq_len(22), "X", "Y", "M"))
)

gencode_source_url(version = "31", genome = c("hg38", "hg19"))

Arguments

version

A character(1) with the Gencode version number.

genome

A character(1) with the human genome build. Valid options are 'hg38' or 'hg19'.

chrs

A character() vector with the chromosome (contig) names to keep.

Value

gencode_txdb() returns a GenomicFeatures::TxDb object.

gencode_source_url() returns a character(1) with the URL for the Gencode GTF file of interest.

References

Based on code for the brainflowprobes package at: https://github.com/LieberInstitute/brainflowprobes/blob/devel/data-raw/create_sysdata.R

Author

Leonardo Collado-Torres

Examples


## Start from scratch if you want:
if (FALSE) { # not run: downloads the GTF file from Gencode
    txdb_v31_hg19_chr21 <- gencode_txdb("31", "hg19", chrs = "chr21")
}

## or read in the txdb object for hg19 chr21 from this package
txdb_v31_hg19_chr21 <- AnnotationDbi::loadDb(
    system.file("extdata", "txdb_v31_hg19_chr21.sqlite",
        package = "GenomicState"
    )
)

## Explore the result
txdb_v31_hg19_chr21
#> TxDb object:
#> # Db type: TxDb
#> # Supporting package: GenomicFeatures
#> # Data source: ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/GRCh37_mapping/gencode.v31lift37.annotation.gtf.gz
#> # Organism: Homo sapiens
#> # Taxonomy ID: 9606
#> # miRBase build ID: NA
#> # Genome: hg19
#> # transcript_nrow: 2813
#> # exon_nrow: 8525
#> # cds_nrow: 2576
#> # Db created by: GenomicFeatures package from Bioconductor
#> # Creation time: 2019-10-07 09:29:35 -0400 (Mon, 07 Oct 2019)
#> # GenomicFeatures version at creation time: 1.36.4
#> # RSQLite version at creation time: 2.1.2
#> # DBSCHEMAVERSION: 1.2
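
Once loaded, the TxDb object can be queried with the standard GenomicFeatures accessors. The following is a minimal sketch, assuming the GenomicState, AnnotationDbi and GenomicFeatures packages are installed; the variable names txdb, tx and ex_by_tx are illustrative only and not part of this package.

```r
## Assumption: GenomicState, AnnotationDbi and GenomicFeatures are installed.
## Load the example TxDb for hg19 chr21 that ships with GenomicState.
txdb <- AnnotationDbi::loadDb(
    system.file("extdata", "txdb_v31_hg19_chr21.sqlite",
        package = "GenomicState"
    )
)

## Transcript ranges as a GRanges object
tx <- GenomicFeatures::transcripts(txdb)

## Exons grouped by transcript as a GRangesList
ex_by_tx <- GenomicFeatures::exonsBy(txdb, by = "tx")

## Number of transcripts and of transcripts that have exons
length(tx)
length(ex_by_tx)
```

The same accessors should work on a TxDb built from scratch with gencode_txdb(), since both are regular TxDb objects.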

## Locate the GTF file for Gencode version 31 for hg19
gencode_source_url(version = "31", genome = "hg19")
#> [1] "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/GRCh37_mapping/gencode.v31lift37.annotation.gtf.gz"
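
To illustrate how such a URL is put together, here is a minimal base R sketch. The helper sketch_source_url() is hypothetical and is not the package's implementation. It assumes the Gencode FTP layout implied by the hg19 URL above, and the hg38 branch in particular is an assumption about that layout rather than something shown on this page.

```r
## Hypothetical helper, NOT the package's implementation. It assumes the
## Gencode FTP layout implied by the hg19 URL shown above.
sketch_source_url <- function(version = "31", genome = c("hg38", "hg19")) {
    ## Pick one genome build; defaults to the first option
    genome <- match.arg(genome)
    is_hg19 <- genome == "hg19"

    paste0(
        "ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_",
        version, "/",
        ## hg19 annotations live in a lifted-over GRCh37 subdirectory
        if (is_hg19) "GRCh37_mapping/" else "",
        "gencode.v", version,
        if (is_hg19) "lift37" else "",
        ".annotation.gtf.gz"
    )
}

## Rebuild the hg19 URL for Gencode version 31
sketch_source_url(version = "31", genome = "hg19")
```

For real work, prefer gencode_source_url() itself, since it reflects whatever layout the package actually targets.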