Parses the junctions returned by read_junction_table() into a STAR-compatible
format (SJ.out) for more convenient use in downstream analyses.
The columns strand, intron_motif and annotated are always set to 0 (undefined)
but, for canonical motifs, can be derived by extracting the dinucleotide
motifs at the given reference coordinates. This function is an
R implementation of the Megadepth helper script; further details of the
column definitions can be found at:
https://github.com/ChristopherWilks/megadepth#junctions.
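As a rough illustration of the derivation mentioned above, the sketch below maps intron-boundary dinucleotides to STAR's strand and intron_motif codes. The helper classify_motif() and its arguments first2 and last2 are hypothetical and not part of megadepth; it assumes you have already extracted the first two and last two intron bases from the forward reference strand (for example with a genome package of your choice), and that start and end follow STAR's convention of marking the first and last intron base.

```r
# Hypothetical helper (not part of megadepth): map the first two and last two
# intron bases (forward reference strand) to STAR's strand and intron_motif.
classify_motif <- function(first2, last2) {
    # STAR intron_motif codes: odd codes are + strand, even codes are - strand
    motif_codes <- c(
        "GT/AG" = 1L, "CT/AC" = 2L,
        "GC/AG" = 3L, "CT/GC" = 4L,
        "AT/AC" = 5L, "GT/AT" = 6L
    )
    key <- paste(toupper(first2), toupper(last2), sep = "/")
    intron_motif <- unname(motif_codes[key])
    intron_motif[is.na(intron_motif)] <- 0L # 0 = non-canonical
    # STAR strand codes: 0 = undefined, 1 = +, 2 = -
    strand <- ifelse(
        intron_motif == 0L, 0L,
        ifelse(intron_motif %% 2L == 1L, 1L, 2L)
    )
    data.frame(strand = strand, intron_motif = intron_motif)
}

# Three made-up boundary pairs: GT/AG, CT/AC and a non-canonical pair
classify_motif(c("GT", "CT", "AA"), c("AG", "AC", "TT"))
```

The annotated column cannot be recovered this way, since it depends on a gene annotation, not on the reference sequence.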
process_junction_table(all_jxs)
A tibble::tibble() containing junction data ("all_jxs.tsv") generated by bam_to_junctions(all_junctions = TRUE) and imported through megadepth::read_junction_table().
Processed junctions in a STAR-compatible format.
## Install if necessary
install_megadepth()
#> The latest megadepth version is 1.2.0
#> This is not an interactive session, therefore megadepth has been installed temporarily to
#> /tmp/RtmpmgolT5/megadepth
## Find the example BAM file
example_bam <- system.file("tests", "test.bam",
package = "megadepth", mustWork = TRUE
)
## Run bam_to_junctions()
example_jxs <- bam_to_junctions(example_bam, overwrite = TRUE)
## Read the junctions in as a tibble
all_jxs <- read_junction_table(example_jxs[["all_jxs.tsv"]])
## Process junctions into a STAR-compatible format
processed_jxs <- process_junction_table(all_jxs)
processed_jxs
#> # A tibble: 5 × 8
#> chr start end strand intron_motif annotated uniquely_mapping_reads
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 chr10 4358579 4581019 0 0 0 0
#> 2 chr10 8458623 8778558 0 0 0 0
#> 3 chr10 8722315 8848720 0 0 0 1
#> 4 chr10 8722508 8870679 0 0 0 1
#> 5 chr10 8756762 8780518 0 0 0 20
#> # ℹ 1 more variable: multimapping_reads <int>
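For downstream tools that read STAR-style junction files, one possible next step is to keep only junctions with unique read support and write them as a headerless, tab-separated file. This is a minimal sketch in base R: the data frame jxs is a made-up stand-in with the same eight columns as processed_jxs, and its values are illustrative only.

```r
# Made-up stand-in for processed_jxs (same eight columns, illustrative values)
jxs <- data.frame(
    chr = c("chr1", "chr1", "chr2"),
    start = c(100, 500, 300),
    end = c(200, 900, 700),
    strand = 0,
    intron_motif = 0,
    annotated = 0,
    uniquely_mapping_reads = c(0L, 3L, 20L),
    multimapping_reads = c(1L, 0L, 2L)
)

# Keep junctions supported by at least one uniquely mapping read
supported <- jxs[jxs$uniquely_mapping_reads >= 1L, ]

# Write tab-separated text without a header, quotes or row names
out_file <- tempfile(fileext = ".tab")
write.table(
    supported, out_file,
    sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE
)
readLines(out_file)
```

STAR's own SJ.out.tab carries a ninth column (maximum spliced alignment overhang) that is absent here, so tools that insist on nine columns may need it added.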