Based on a TxDb object built by gencode_txdb(), this function builds a GenomicState object that you can then use with derfinder::annotateRegions(). The resulting annotation information is then used by packages like derfinderPlot.

gencode_genomic_state(txdb)

Arguments

txdb

A GenomicFeatures::TxDb object built with gencode_txdb().

Value

A GenomicState object built using derfinder::makeGenomicState(), with the gene symbols added.

Details

Note that not all genes have a symbol: for many genes the symbol is NA.
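
As a minimal sketch of how missing symbols might be handled, the code below builds a toy CharacterList that stands in for the symbol column of a GenomicState element. The gene names and the list contents are made up for illustration and do not come from GENCODE. It assumes the IRanges package is installed.

```r
library("IRanges")

## Toy stand-in for the `symbol` column; the gene names are hypothetical
symbol <- CharacterList(
    NA_character_, # gene without a symbol
    "GENE_A",
    c("GENE_A", "GENE_B"), # hypothetical range linked to two symbols
    character(0) # intergenic range: no gene at all
)

## Flatten the list, then separate the missing symbols from the known ones
flat <- unlist(symbol, use.names = FALSE)
n_missing <- sum(is.na(flat))
known_symbols <- unique(flat[!is.na(flat)])

n_missing
known_symbols
```

The same steps should apply to a real object, for example to the symbol column of the fullGenome element, where empty entries (intergenic ranges) are distinct from NA entries (genes without a symbol).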

References

Based on code for the brainflowprobes package at: https://github.com/LieberInstitute/brainflowprobes/blob/devel/data-raw/create_sysdata.R

Author

Leonardo Collado-Torres

Examples


## Start from scratch if you want:
if (FALSE) {
txdb_v31_hg19_chr21 <- gencode_txdb("31", "hg19", chrs = "chr21")
}

## or read in the txdb object for hg19 chr21 from this package
txdb_v31_hg19_chr21 <- AnnotationDbi::loadDb(
    system.file("extdata", "txdb_v31_hg19_chr21.sqlite",
        package = "GenomicState"
    )
)

## Now build the GenomicState object
gs_v31_hg19_chr21 <- gencode_genomic_state(txdb_v31_hg19_chr21)
#> 2023-05-07 06:38:09.080103 making the GenomicState object
#> extendedMapSeqlevels: sequence names mapped from NCBI to UCSC for species homo_sapiens
#> 'select()' returned 1:1 mapping between keys and columns
#> 2023-05-07 06:38:14.614311 finding gene symbols
#> 'select()' returned 1:many mapping between keys and columns
#> 2023-05-07 06:38:14.989506 adding gene symbols to the GenomicState

## Explore the result
gs_v31_hg19_chr21
#> $fullGenome
#> GRanges object with 7871 ranges and 5 metadata columns:
#>        seqnames            ranges strand |   theRegion         tx_id
#>           <Rle>         <IRanges>  <Rle> | <character> <IntegerList>
#>      1    chr21   9492380-9492817      + |        exon             1
#>      2    chr21   9590294-9590395      + |        exon             2
#>      3    chr21   9647910-9648694      + |        exon           3,4
#>      4    chr21   9648695-9650108      + |      intron           3,4
#>      5    chr21   9650109-9650168      + |        exon           3,4
#>    ...      ...               ...    ... .         ...           ...
#>   7867    chr21 47865683-47878803      * |  intergenic              
#>   7868    chr21 47989929-48018516      * |  intergenic              
#>   7869    chr21 48025122-48055506      * |  intergenic              
#>   7870    chr21 48085037-48110675      * |  intergenic              
#>   7871    chr21 48111139-48129895      * |  intergenic              
#>                                        tx_name          gene          symbol
#>                                <CharacterList> <IntegerList> <CharacterList>
#>      1                     ENST00000625020.1_1           776            <NA>
#>      2                     ENST00000625098.1_1           754            <NA>
#>      3 ENST00000623794.3_1,ENST00000624813.1_1           766            <NA>
#>      4 ENST00000623794.3_1,ENST00000624813.1_1           766            <NA>
#>      5 ENST00000623794.3_1,ENST00000624813.1_1           766            <NA>
#>    ...                                     ...           ...             ...
#>   7867                                                                      
#>   7868                                                                      
#>   7869                                                                      
#>   7870                                                                      
#>   7871                                                                      
#>   -------
#>   seqinfo: 1 sequence from hg19 genome
#> 
#> $codingGenome
#> GRanges object with 10361 ranges and 5 metadata columns:
#>         seqnames            ranges strand |   theRegion         tx_id
#>            <Rle>         <IRanges>  <Rle> | <character> <IntegerList>
#>       1    chr21   9490380-9492379      + |    promoter             1
#>       2    chr21   9492380-9492817      + |        exon             1
#>       3    chr21   9588294-9590293      + |    promoter             2
#>       4    chr21   9590294-9590395      + |        exon             2
#>       5    chr21   9590396-9590493      + |    promoter             2
#>     ...      ...               ...    ... .         ...           ...
#>   10357    chr21 47865683-47876803      * |  intergenic              
#>   10358    chr21 47989929-48018516      * |  intergenic              
#>   10359    chr21 48027122-48053506      * |  intergenic              
#>   10360    chr21 48085037-48108675      * |  intergenic              
#>   10361    chr21 48111139-48129895      * |  intergenic              
#>                     tx_name          gene          symbol
#>             <CharacterList> <IntegerList> <CharacterList>
#>       1 ENST00000625020.1_1           776            <NA>
#>       2 ENST00000625020.1_1           776            <NA>
#>       3 ENST00000625098.1_1           754            <NA>
#>       4 ENST00000625098.1_1           754            <NA>
#>       5 ENST00000625098.1_1           754            <NA>
#>     ...                 ...           ...             ...
#>   10357                                                  
#>   10358                                                  
#>   10359                                                  
#>   10360                                                  
#>   10361                                                  
#>   -------
#>   seqinfo: 1 sequence from hg19 genome
#>
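
## A possible next step, sketched under assumptions: annotate a couple of
## regions with derfinder::annotateRegions(). The two regions below are
## hypothetical and were chosen to fall inside an exon and an intergenic range
## printed above. The guard rebuilds the GenomicState object if it is not
## already in the session.

```r
library("GenomicRanges")

## Rebuild the chr21 GenomicState object if the code above was not run
if (!exists("gs_v31_hg19_chr21")) {
    txdb_v31_hg19_chr21 <- AnnotationDbi::loadDb(
        system.file("extdata", "txdb_v31_hg19_chr21.sqlite",
            package = "GenomicState"
        )
    )
    gs_v31_hg19_chr21 <-
        GenomicState::gencode_genomic_state(txdb_v31_hg19_chr21)
}

## Hypothetical regions of interest on chr21 (hg19 coordinates)
regions <- GRanges(
    seqnames = "chr21",
    ranges = IRanges(
        start = c(9492400, 47870000),
        end = c(9492500, 47871000)
    )
)

## Annotate the regions against the full genome version of the object
annotated <- derfinder::annotateRegions(
    regions = regions,
    genomicState = gs_v31_hg19_chr21$fullGenome
)

## Number of overlapping exon, intron and intergenic ranges per region
annotated$countTable
```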