This function creates a directory with the configuration files required for creating a UCSC track hub for the BigWig files from a given project. These files can then be hosted on GitHub or elsewhere. For more details about UCSC track hubs, see https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html.

create_hub(
  x,
  output_dir = file.path(tempdir(), x$project[1]),
  hub_name = "recount3",
  email = "someone@somewhere",
  show_max = 5,
  hub_short_label = "recount3 coverage",
  hub_long_label =
    "recount3 summaries and queries for large-scaleRNA-seq expression and splicing",
  hub_description_url = "https://rna.recount.bio/index.html",
  recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3")
)

Arguments

x

A data.frame created with available_samples() that has typically been subset to a specific project ID from a given organism.

output_dir

A character(1) with the output directory.

hub_name

A character(1) with the UCSC track hub name you want to display.

email

A character(1) with the email used for the UCSC track hub.

show_max

An integer(1) with the number of BigWig tracks to show by default in the UCSC track hub. We recommend a single-digit number.

hub_short_label

A character(1) with the UCSC track hub short label.

hub_long_label

A character(1) with the UCSC track hub long label.

hub_description_url

A character(1) with the URL to an HTML file that will describe the UCSC track hub to users.

recount3_url

A character(1) specifying the home URL for recount3 or a local directory where you have mirrored recount3. Defaults to the load balancer http://duffel.rail.bio/recount3, but can also be https://recount-opendata.s3.amazonaws.com/recount3/release (listed at https://registry.opendata.aws/recount/) or the SciServer datascope from IDIES at JHU, https://sciserver.org/public-data/recount3/data. If you have a favorite mirror, you can set the R option recount3_url (for example, in your .Rprofile).
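
As a minimal sketch of the recount3_url option described above, the following base R code switches to one of the listed mirrors for the current session and then restores the previous setting. It only uses options() and getOption(); nothing here calls recount3 itself.

```r
## Switch to the AWS mirror for this session, keeping the previous
## option values so they can be restored afterwards
old_opts <- options(
    recount3_url = "https://recount-opendata.s3.amazonaws.com/recount3/release"
)

## This is the same lookup used for the default value of recount3_url
active_url <- getOption("recount3_url", "http://duffel.rail.bio/recount3")
active_url

## Restore whatever was set before
options(old_opts)
```

Placing the options() call in your .Rprofile makes the choice persistent across sessions.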

Value

The path to the directory at output_dir, which contains the files needed for a UCSC track hub.

Details

See https://github.com/LieberInstitute/recount3-docs/blob/master/UCSC_hubs/create_hubs.R for an example of how this function was used.
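
The effect of show_max can be pictured with a small sketch. In the example output for this function, the first five tracks have "visibility show" and the remaining ones "visibility hide". The helper below, sketch_track_stanzas(), is hypothetical and is not part of recount3; it only illustrates that pattern under the assumption that tracks are marked in the order they appear in x. The sample IDs and URLs are made up.

```r
## Hypothetical helper, not part of recount3: sketches how show_max could map
## onto the "visibility" field of each trackDb.txt stanza.
sketch_track_stanzas <- function(external_id, bigwig_url, show_max = 5) {
    ## Assumption: the first show_max tracks are shown, the rest are hidden
    visibility <- ifelse(seq_along(external_id) <= show_max, "show", "hide")
    paste0(
        "track ", external_id, "\n",
        "bigDataUrl ", bigwig_url, "\n",
        "shortLabel ", external_id, "\n",
        "type bigWig\n",
        "visibility ", visibility, "\n",
        "autoScale on\n"
    )
}

## Made-up sample IDs and URLs, for illustration only
ids <- c("sampleA", "sampleB", "sampleC")
urls <- paste0("https://example.org/", ids, ".bw")
stanzas <- sketch_track_stanzas(ids, urls, show_max = 2)
cat(stanzas, sep = "\n")
```

With show_max = 2, the first two stanzas are marked as shown and the third as hidden, which is why a small value keeps the browser view readable for projects with many samples.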

Examples


## Find all the mouse samples available from recount3
mouse_samples <- available_samples("mouse")
#> 2023-05-07 00:10:39.698728 caching file sra.recount_project.MD.gz.

## Subset to project DRP001299
info_DRP001299 <- subset(mouse_samples, project == "DRP001299")

hub_dir <- create_hub(info_DRP001299)

## List the files created by create_hub()
hub_files <- list.files(hub_dir, full.names = TRUE, recursive = TRUE)
hub_files
#> [1] "/tmp/RtmpAxQcvh/DRP001299/genomes.txt"     
#> [2] "/tmp/RtmpAxQcvh/DRP001299/hub.txt"         
#> [3] "/tmp/RtmpAxQcvh/DRP001299/mm10/trackDb.txt"

## Check the file contents
sapply(hub_files, function(x) {
    cat(paste(readLines(x), collapse = "\n"))
})
#> genome mm10
#> trackDb mm10/trackDb.txt
#> hub recount3
#> shortLabel recount3 coverage
#> longLabel recount3 summaries and queries for large-scaleRNA-seq expression and splicing
#> genomesFile genomes.txt
#> email someone@somewhere
#> descriptionUrl https://rna.recount.bio/index.html
#> track DRR014696
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/96/sra.base_sums.DRP001299_DRR014696.ALL.bw
#> shortLabel DRR014696
#> longLabel recount3 coverage bigWig for external id DRR014696 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#> 
#> track DRR014697
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/97/sra.base_sums.DRP001299_DRR014697.ALL.bw
#> shortLabel DRR014697
#> longLabel recount3 coverage bigWig for external id DRR014697 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#> 
#> track DRR014698
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/98/sra.base_sums.DRP001299_DRR014698.ALL.bw
#> shortLabel DRR014698
#> longLabel recount3 coverage bigWig for external id DRR014698 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#> 
#> track DRR014699
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/99/sra.base_sums.DRP001299_DRR014699.ALL.bw
#> shortLabel DRR014699
#> longLabel recount3 coverage bigWig for external id DRR014699 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#> 
#> track DRR014700
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/00/sra.base_sums.DRP001299_DRR014700.ALL.bw
#> shortLabel DRR014700
#> longLabel recount3 coverage bigWig for external id DRR014700 project DRP001299 file source sra
#> type bigWig
#> visibility show
#> autoScale on
#> 
#> track DRR014701
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/01/sra.base_sums.DRP001299_DRR014701.ALL.bw
#> shortLabel DRR014701
#> longLabel recount3 coverage bigWig for external id DRR014701 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#> 
#> track DRR014702
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/02/sra.base_sums.DRP001299_DRR014702.ALL.bw
#> shortLabel DRR014702
#> longLabel recount3 coverage bigWig for external id DRR014702 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#> 
#> track DRR014703
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/03/sra.base_sums.DRP001299_DRR014703.ALL.bw
#> shortLabel DRR014703
#> longLabel recount3 coverage bigWig for external id DRR014703 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#> 
#> track DRR014704
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/04/sra.base_sums.DRP001299_DRR014704.ALL.bw
#> shortLabel DRR014704
#> longLabel recount3 coverage bigWig for external id DRR014704 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#> 
#> track DRR014705
#> bigDataUrl http://duffel.rail.bio/recount3/mouse/data_sources/sra/base_sums/99/DRP001299/05/sra.base_sums.DRP001299_DRR014705.ALL.bw
#> shortLabel DRR014705
#> longLabel recount3 coverage bigWig for external id DRR014705 project DRP001299 file source sra
#> type bigWig
#> visibility hide
#> autoScale on
#> $`/tmp/RtmpAxQcvh/DRP001299/genomes.txt`
#> NULL
#> 
#> $`/tmp/RtmpAxQcvh/DRP001299/hub.txt`
#> NULL
#> 
#> $`/tmp/RtmpAxQcvh/DRP001299/mm10/trackDb.txt`
#> NULL
#> 

## You can also check the file contents for this example project at
## https://github.com/LieberInstitute/recount3-docs/tree/master/UCSC_hubs/mouse/sra_DRP001299
## or test it out on UCSC directly through the following URL:
## https://genome.ucsc.edu/cgi-bin/hgTracks?db=mm10&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr1&hubUrl=https://raw.githubusercontent.com/LieberInstitute/recount3-docs/master/UCSC_hubs/mouse/sra_DRP001299/hub.txt
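
Once hub.txt is hosted somewhere public, a browser link like the one above can be assembled from its URL. The sketch below uses a hypothetical helper, ucsc_hub_url(), which is not part of recount3. It keeps only the db, position and hubUrl query parameters from the URL above, on the assumption that the remaining parameters are optional.

```r
## Hypothetical helper, not part of recount3: builds a UCSC Genome Browser
## link that loads a hosted track hub.
ucsc_hub_url <- function(hub_txt_url, db, position = "chr1") {
    paste0(
        "https://genome.ucsc.edu/cgi-bin/hgTracks",
        "?db=", db,
        "&position=", position,
        "&hubUrl=", hub_txt_url
    )
}

## hub.txt URL for the example project hosted in recount3-docs
hub_txt_url <- paste0(
    "https://raw.githubusercontent.com/LieberInstitute/recount3-docs/",
    "master/UCSC_hubs/mouse/sra_DRP001299/hub.txt"
)
mouse_hub_link <- ucsc_hub_url(hub_txt_url, db = "mm10")
mouse_hub_link
```

The db value should match the genome listed in genomes.txt, which is mm10 for this mouse project.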