Given a project and an organism of interest, this function constructs the URL(s) for accessing the output files from the recount3 project. You can then download the file(s) using file_retrieve().

locate_url(
  project,
  project_home = project_homes(organism = organism, recount3_url = recount3_url),
  type = c("metadata", "gene", "exon", "jxn", "bw"),
  organism = c("human", "mouse"),
  sample = NULL,
  annotation = annotation_options(organism),
  jxn_format = c("ALL", "UNIQUE"),
  recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3")
)
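
The vector-valued defaults in the usage above (type, organism, jxn_format) follow a common R idiom in which the first element acts as the effective default. The sketch below illustrates that idiom with match.arg() in a hypothetical function, pick_defaults(), which is not part of recount3. It assumes, without confirming from the package source, that locate_url() resolves its arguments in the same way. That assumption is consistent with the Examples, where omitting type yields metadata files and the junction file names contain ALL.

```r
# Hypothetical helper (not part of recount3) showing how vector-valued
# defaults are commonly resolved with match.arg().
pick_defaults <- function(type = c("metadata", "gene", "exon", "jxn", "bw"),
                          organism = c("human", "mouse"),
                          jxn_format = c("ALL", "UNIQUE")) {
    # match.arg() returns the first choice when the argument is missing
    # and raises an error when a supplied value is not among the choices.
    type <- match.arg(type)
    organism <- match.arg(organism)
    jxn_format <- match.arg(jxn_format)
    c(type = type, organism = organism, jxn_format = jxn_format)
}

# No arguments: the first element of each default vector is used.
defaults <- pick_defaults()
print(defaults)

# Supplying a value overrides only that argument.
gene_mouse <- pick_defaults(type = "gene", organism = "mouse")
print(gene_mouse)
```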

Arguments

project

A character(1) with the ID for a given study.

project_home

A character(1) with the home directory for the project. You can find these using project_homes().

type

A character(1) specifying whether you want to access metadata, gene counts, exon counts, exon-exon junctions or base-pair BigWig coverage files (one per sample).

organism

A character(1) specifying which organism you want to download data from. Supported options are "human" or "mouse".

sample

A character() vector with the sample ID(s) you want to download.

annotation

A character(1) specifying which annotation you want to download. Only used when type is either "gene" or "exon".

jxn_format

A character(1) specifying whether the exon-exon junction files are derived from all the reads (ALL) or only from the uniquely mapping reads (UNIQUE). Note that UNIQUE is only available for some projects: GTEx and TCGA for human.

recount3_url

A character(1) specifying the home URL for recount3 or a local directory where you have mirrored recount3. Defaults to the load balancer http://duffel.rail.bio/recount3, but can also be https://recount-opendata.s3.amazonaws.com/recount3/release (from https://registry.opendata.aws/recount/) or the SciServer datascope from IDIES at JHU, https://sciserver.org/public-data/recount3/data. If you have a preferred mirror, you can set the R option recount3_url (for example in your .Rprofile).
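
To make the mirror choice concrete, the sketch below sets the recount3_url option for the current session and reads it back with the same getOption() call that appears as the default in the usage. The mirror URL is one of those listed above. The variable names are illustrative only.

```r
# Set the option for this session and keep the previous value for later.
old <- options(
    recount3_url = "https://recount-opendata.s3.amazonaws.com/recount3/release"
)

# Same expression as the default of the recount3_url argument:
# the option takes precedence over the fallback when it is set.
url_in_use <- getOption("recount3_url", "http://duffel.rail.bio/recount3")
print(url_in_use)

# Restore the previous state. If the option was unset before,
# it is unset again and the fallback URL applies.
options(old)
```

Placing the options() call in your .Rprofile would apply the mirror to every new session.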

Value

A character() with the URL(s) for the file(s) of interest.
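
As the output in the Examples shows, the result is a named character vector in which each name is the file name at the end of the URL. The sketch below rebuilds that shape in base R from two URLs copied from the example output, so it runs without recount3 or network access. The variable names are illustrative.

```r
# Two URLs copied from the gene count examples on this page.
urls <- c(
    "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.G026.gz",
    "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.R109.gz"
)

# Name each URL by its file name, matching the printed example output.
names(urls) <- basename(urls)

# Select a single URL by file name.
r109 <- urls[["sra.gene_sums.SRP009615.R109.gz"]]
print(r109)

# unname() drops the names when only the bare URLs are needed.
bare <- unname(urls)
print(bare)
```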

See also

Other internal functions for accessing the recount3 data: annotation_ext(), create_rse_manual(), file_retrieve(), locate_url_ann(), project_homes(), read_counts(), read_metadata()

Examples


## Example for metadata files from a project from SRA
locate_url(
    "SRP009615",
    "data_sources/sra"
)
#>                                                                                            sra.sra.SRP009615.MD.gz 
#>             "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.sra.SRP009615.MD.gz" 
#>                                                                                sra.recount_project.SRP009615.MD.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_project.SRP009615.MD.gz" 
#>                                                                                     sra.recount_qc.SRP009615.MD.gz 
#>      "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_qc.SRP009615.MD.gz" 
#>                                                                                 sra.recount_seq_qc.SRP009615.MD.gz 
#>  "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_seq_qc.SRP009615.MD.gz" 
#>                                                                                   sra.recount_pred.SRP009615.MD.gz 
#>    "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_pred.SRP009615.MD.gz" 

## Example for metadata files from a project that is part of a collection
locate_url(
    "ERP110066",
    "collections/geuvadis_smartseq",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
)
#> 2023-05-07 00:12:31.505758 caching file geuvadis_smartseq.recount_project.gz.
#>                                                                                                          sra.sra.ERP110066.MD.gz 
#>             "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.sra.ERP110066.MD.gz" 
#>                                                                                              sra.recount_project.ERP110066.MD.gz 
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_project.ERP110066.MD.gz" 
#>                                                                                                   sra.recount_qc.ERP110066.MD.gz 
#>      "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_qc.ERP110066.MD.gz" 
#>                                                                                               sra.recount_seq_qc.ERP110066.MD.gz 
#>  "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_seq_qc.ERP110066.MD.gz" 
#>                                                                                                 sra.recount_pred.ERP110066.MD.gz 
#>    "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_pred.ERP110066.MD.gz" 
#>                                                                                                      geuvadis_smartseq.custom.gz 
#>         "http://snaptron.cs.jhu.edu/data/temp/recount3/human/collections/geuvadis_smartseq/metadata/geuvadis_smartseq.custom.gz" 

## Example for a BigWig file
locate_url(
    "SRP009615",
    "data_sources/sra",
    "bw",
    "human",
    "SRR387777"
)
#>                                                                                    sra.base_sums.SRP009615_SRR387777.ALL.bw 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/base_sums/15/SRP009615/77/sra.base_sums.SRP009615_SRR387777.ALL.bw" 

## Locate example gene count files
locate_url(
    "SRP009615",
    "data_sources/sra",
    "gene"
)
#>                                                                                 sra.gene_sums.SRP009615.G026.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.G026.gz" 
locate_url(
    "SRP009615",
    "data_sources/sra",
    "gene",
    annotation = "refseq"
)
#>                                                                                 sra.gene_sums.SRP009615.R109.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.R109.gz" 

## Example for a gene count file from a project that is part of a collection
locate_url(
    "ERP110066",
    "collections/geuvadis_smartseq",
    "gene",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
)
#> 2023-05-07 00:12:31.703993 caching file geuvadis_smartseq.recount_project.gz.
#>                                                                                               sra.gene_sums.ERP110066.G026.gz 
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/gene_sums/66/ERP110066/sra.gene_sums.ERP110066.G026.gz" 

## Locate example junction files
locate_url(
    "SRP009615",
    "data_sources/sra",
    "jxn"
)
#>                                                                                 sra.junctions.SRP009615.ALL.MM.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/junctions/15/SRP009615/sra.junctions.SRP009615.ALL.MM.gz" 
#>                                                                                 sra.junctions.SRP009615.ALL.RR.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/junctions/15/SRP009615/sra.junctions.SRP009615.ALL.RR.gz" 
#>                                                                                 sra.junctions.SRP009615.ALL.ID.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/junctions/15/SRP009615/sra.junctions.SRP009615.ALL.ID.gz" 

## Another example for metadata files from a project from SRA
locate_url(
    "ERP001942",
    "data_sources/sra"
)
#>                                                                                            sra.sra.ERP001942.MD.gz 
#>             "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.sra.ERP001942.MD.gz" 
#>                                                                                sra.recount_project.ERP001942.MD.gz 
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_project.ERP001942.MD.gz" 
#>                                                                                     sra.recount_qc.ERP001942.MD.gz 
#>      "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_qc.ERP001942.MD.gz" 
#>                                                                                 sra.recount_seq_qc.ERP001942.MD.gz 
#>  "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_seq_qc.ERP001942.MD.gz" 
#>                                                                                   sra.recount_pred.ERP001942.MD.gz 
#>    "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_pred.ERP001942.MD.gz"
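
The gene count URLs in the examples above follow a visible pattern: the base URL, the organism, the project home, a gene_sums directory, a two-character directory that matches the last two characters of the project ID, the project ID, and finally the file name. The sketch below reproduces that pattern in base R. It is inferred only from the output shown on this page and is not the recount3 implementation. sketch_gene_url() and its ext argument are hypothetical, and the sketch does not cover projects in collections, whose URLs point back to their data source as the ERP110066 example shows.

```r
# Hypothetical helper (not part of recount3): rebuilds the gene count URL
# layout seen in the example output for projects under a data source.
sketch_gene_url <- function(project,
                            project_home = "data_sources/sra",
                            organism = "human",
                            ext = "G026",
                            recount3_url = "http://duffel.rail.bio/recount3") {
    # In the examples, the directory before the project ID equals the
    # last two characters of the project ID (SRP009615 -> 15).
    n <- nchar(project)
    shard <- substr(project, n - 1, n)

    # The file name starts with the data source name (here "sra").
    source_name <- basename(project_home)
    file_name <- paste0(source_name, ".gene_sums.", project, ".", ext, ".gz")

    url <- paste(
        recount3_url, organism, project_home,
        "gene_sums", shard, project, file_name,
        sep = "/"
    )
    # Name the URL by its file name, as in the printed output.
    names(url) <- file_name
    url
}

# Reproduces the two gene count URLs shown for SRP009615.
gencode_url <- sketch_gene_url("SRP009615")
refseq_url <- sketch_gene_url("SRP009615", ext = "R109")
print(gencode_url)
print(refseq_url)
```

In practice locate_url() should be used to build these URLs. The sketch is only meant to help read the output.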