This function creates a directory with the configuration files required for creating a UCSC track hub for the BigWig files from a given project. These files can then be hosted on GitHub or elsewhere. For more details about UCSC track hubs, check https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html.
create_hub(
x,
output_dir = file.path(tempdir(), x$project[1]),
hub_name = "recount3",
email = "someone@somewhere",
show_max = 5,
hub_short_label = "recount3 coverage",
hub_long_label =
"recount3 summaries and queries for large-scaleRNA-seq expression and splicing",
hub_description_url = "https://rna.recount.bio/index.html",
recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3")
)
x
A data.frame created with available_samples() that has typically been subset to a specific project ID from a given organism.
output_dir
A character(1) with the output directory.
hub_name
A character(1) with the UCSC track hub name you want to display.
email
A character(1) with the email used for the UCSC track hub.
show_max
An integer(1) with the number of BigWig tracks to show by default in the UCSC track hub. We recommend a single digit number.
hub_short_label
A character(1) with the UCSC track hub short label.
hub_long_label
A character(1) with the UCSC track hub long label.
hub_description_url
A character(1) with the URL to an html file that will describe the UCSC track hub to users.
recount3_url
A character(1) specifying the home URL for recount3 or a local directory where you have mirrored recount3. Defaults to the load balancer http://duffel.rail.bio/recount3, but can also be https://recount-opendata.s3.amazonaws.com/recount3/release from https://registry.opendata.aws/recount/ or SciServer datascope from IDIES at JHU https://sciserver.org/public-data/recount3/data. You can set the R option recount3_url (for example in your .Rprofile) if you have a favorite mirror.
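As a small illustration of how the recount3_url default is resolved, the sketch below sets the R option for the current session and reads it back with the same getOption() call that appears in the usage. The AWS Open Data mirror is used only as an example; any of the URLs listed above could be substituted.

```r
## Remember the current setting (if any) so it can be restored afterwards
old_opts <- options(recount3_url = NULL)

## With the option unset, the fallback (the load balancer) is returned
default_url <- getOption("recount3_url", "http://duffel.rail.bio/recount3")

## Point this session at a mirror; the same line could go in .Rprofile
options(recount3_url = "https://recount-opendata.s3.amazonaws.com/recount3/release")
mirror_url <- getOption("recount3_url", "http://duffel.rail.bio/recount3")

## Restore whatever was set before
options(old_opts)
```

Because the option is only consulted when recount3_url is not passed explicitly, supplying the argument directly in a call still takes precedence over the option.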
A directory at output_dir with the files needed for a UCSC track hub.
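Before hosting the directory, it can be useful to confirm that it has the layout a UCSC track hub expects. The following is a minimal sketch under stated assumptions: has_hub_layout() is a hypothetical helper, not part of recount3, and the three file names it checks are the ones listed in the example output on this page (genomes.txt, hub.txt, and a trackDb.txt inside a genome directory such as mm10). A mock directory is used so the check runs without downloading anything.

```r
## Hypothetical helper (not part of recount3): check that a directory holds
## the three files seen in the example output for a given genome assembly
has_hub_layout <- function(hub_dir, genome) {
    expected <- file.path(
        hub_dir,
        c("genomes.txt", "hub.txt", file.path(genome, "trackDb.txt"))
    )
    all(file.exists(expected))
}

## Build a mock hub directory with empty files, for illustration only
mock_dir <- file.path(tempdir(), "mock_hub")
dir.create(file.path(mock_dir, "mm10"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(
    file.path(mock_dir, c("genomes.txt", "hub.txt", "mm10/trackDb.txt"))
))

## TRUE for the genome that was mocked, FALSE for one that was not
has_hub_layout(mock_dir, "mm10")
has_hub_layout(mock_dir, "hg38")
```

In practice the first argument would be the directory returned by create_hub() and the second the genome directory listed in its genomes.txt.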
See https://github.com/LieberInstitute/recount3-docs/blob/master/UCSC_hubs/create_hubs.R for an example of how this function was used.
## Find all the mouse samples available from recount3
mouse_samples <- available_samples("mouse")
#> 2023-05-07 00:10:39.698728 caching file sra.recount_project.MD.gz.
## Subset to project DRP001299
info_DRP001299 <- subset(mouse_samples, project == "DRP001299")
hub_dir <- create_hub(info_DRP001299)
## List the files created by create_hub()
hub_files <- list.files(hub_dir, full.names = TRUE, recursive = TRUE)
hub_files
#> [1] "/tmp/RtmpAxQcvh/DRP001299/genomes.txt"
#> [2] "/tmp/RtmpAxQcvh/DRP001299/hub.txt"
#> [3] "/tmp/RtmpAxQcvh/DRP001299/mm10/trackDb.txt"
## Check the file contents
sapply(hub_files, function(x) {
cat(paste(readLines(x), collapse = "\n"))
})
#> genome mm10
#> trackDb mm10/trackDb.txt
#> hub recount3
#> shortLabel recount3 coverage
#> longLabel recount3 summaries and queries for large-scaleRNA-seq expression and splicing
#> genomesFile genomes.txt
#> email someone@somewhere
#> descriptionUrl https://rna.recount.bio/index.html
#> track DRR014696
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/96/sra.base_sums.DRP001299_DRR014696.ALL.bw
#> shortLabel DRR014696
#> longLabel recount3 coverage bigWig for external id DRR014696 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#>
#> track DRR014697
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/97/sra.base_sums.DRP001299_DRR014697.ALL.bw
#> shortLabel DRR014697
#> longLabel recount3 coverage bigWig for external id DRR014697 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#>
#> track DRR014698
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/98/sra.base_sums.DRP001299_DRR014698.ALL.bw
#> shortLabel DRR014698
#> longLabel recount3 coverage bigWig for external id DRR014698 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#>
#> track DRR014699
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/99/sra.base_sums.DRP001299_DRR014699.ALL.bw
#> shortLabel DRR014699
#> longLabel recount3 coverage bigWig for external id DRR014699 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#>
#> track DRR014700
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/00/sra.base_sums.DRP001299_DRR014700.ALL.bw
#> shortLabel DRR014700
#> longLabel recount3 coverage bigWig for external id DRR014700 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#>
#> track DRR014701
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/01/sra.base_sums.DRP001299_DRR014701.ALL.bw
#> shortLabel DRR014701
#> longLabel recount3 coverage bigWig for external id DRR014701 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#>
#> track DRR014702
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/02/sra.base_sums.DRP001299_DRR014702.ALL.bw
#> shortLabel DRR014702
#> longLabel recount3 coverage bigWig for external id DRR014702 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#>
#> track DRR014703
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/03/sra.base_sums.DRP001299_DRR014703.ALL.bw
#> shortLabel DRR014703
#> longLabel recount3 coverage bigWig for external id DRR014703 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#>
#> track DRR014704
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/04/sra.base_sums.DRP001299_DRR014704.ALL.bw
#> shortLabel DRR014704
#> longLabel recount3 coverage bigWig for external id DRR014704 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#>
#> track DRR014705
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/05/sra.base_sums.DRP001299_DRR014705.ALL.bw
#> shortLabel DRR014705
#> longLabel recount3 coverage bigWig for external id DRR014705 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#> $`/tmp/RtmpAxQcvh/DRP001299/genomes.txt`
#> NULL
#>
#> $`/tmp/RtmpAxQcvh/DRP001299/hub.txt`
#> NULL
#>
#> $`/tmp/RtmpAxQcvh/DRP001299/mm10/trackDb.txt`
#> NULL
#>
## You can also check the file contents for this example project at
## https://github.com/LieberInstitute/recount3-docs/tree/master/UCSC_hubs/mouse/sra_DRP001299
## or test it out on UCSC directly through the following URL:
## https://genome.ucsc.edu/cgi-bin/hgTracks?db=mm10&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr1&hubUrl=https://raw.githubusercontent.com/LieberInstitute/recount3-docs/master/UCSC_hubs/mouse/sra_DRP001299/hub.txt
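In the trackDb.txt output above, the first five tracks have visibility show and the remaining ones have visibility hide, which matches the default show_max = 5. The sketch below reproduces that stanza format in base R to make the effect of show_max concrete. It is not the recount3 implementation: make_track_stanzas() is a hypothetical helper, the file source is fixed to sra as in the example, and the sample IDs, project name, and URLs are made-up placeholders.

```r
## Hypothetical helper (not the recount3 implementation): build one trackDb
## stanza per sample, showing only the first `show_max` tracks by default
make_track_stanzas <- function(external_id, project, big_wig_url, show_max = 5) {
    visibility <- ifelse(seq_along(external_id) <= show_max, "show", "hide")
    paste0(
        "track ", external_id, "\n",
        "bigDataUrl ", big_wig_url, "\n",
        "shortLabel ", external_id, "\n",
        "longLabel recount3 coverage bigWig for external id ", external_id,
        " project ", project, " file source sra\n",
        "type bigWig\n",
        "visibility ", visibility, "\n",
        "autoScale on\n"
    )
}

## Made-up sample IDs and placeholder URLs, for illustration only
ids <- sprintf("SAMPLE%02d", 1:7)
urls <- paste0("https://example.org/bigwigs/", ids, ".bw")
stanzas <- make_track_stanzas(ids, "PROJECT1", urls, show_max = 5)

## Print the stanzas separated by blank lines, as in trackDb.txt
cat(stanzas, sep = "\n")
```

With seven samples and show_max = 5, the first five stanzas are marked show and the last two hide, mirroring the pattern in the DRP001299 example.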