Given an organism of interest, this function constructs the URL for accessing
one of the output files from the recount3 project. You can then download
the file using file_retrieve().
locate_url(
    project,
    project_home = project_homes(organism = organism, recount3_url = recount3_url),
    type = c("metadata", "gene", "exon", "jxn", "bw"),
    organism = c("human", "mouse"),
    sample = NULL,
    annotation = annotation_options(organism),
    jxn_format = c("ALL", "UNIQUE"),
    recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3")
)
project: A character(1) with the ID for a given study.
project_home: A character(1) with the home directory for the project. You can
find these using project_homes().
type: A character(1) specifying whether you want to access metadata files,
gene counts, exon counts, exon-exon junctions, or base-pair BigWig coverage
files (one per sample).
organism: A character(1) specifying which organism you want to download data
from. Supported options are "human" or "mouse".
sample: A character() vector with the sample ID(s) you want to download.
annotation: A character(1) specifying which annotation you want to download.
Only used when type is either "gene" or "exon".
jxn_format: A character(1) specifying whether the exon-exon junction files are
derived from all the reads ("ALL") or only from the uniquely mapping reads
("UNIQUE"). Note that "UNIQUE" is only available for some projects: GTEx and
TCGA for human.
recount3_url: A character(1) specifying the home URL for recount3 or a local
directory where you have mirrored recount3. Defaults to the load balancer
http://duffel.rail.bio/recount3, but can also be
https://recount-opendata.s3.amazonaws.com/recount3/release from
https://registry.opendata.aws/recount/ or the SciServer datascope from
IDIES at JHU, https://sciserver.org/public-data/recount3/data. You can
set the R option recount3_url (for example in your .Rprofile) if
you have a favorite mirror.
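Because the default for recount3_url is looked up with getOption(), setting the R option changes which mirror is used whenever the argument is omitted. The following is a minimal sketch in base R only; it does not call recount3 or contact the network, and the mirror chosen is simply one of the URLs mentioned in the argument description:

```r
## Set the option and keep the previous value so it can be restored later
old_opts <- options(
    recount3_url = "https://recount-opendata.s3.amazonaws.com/recount3/release"
)

## Same lookup as the default value of the recount3_url argument
resolved_url <- getOption("recount3_url", "http://duffel.rail.bio/recount3")
print(resolved_url)

## Restore whatever was set before
options(old_opts)
```

Placing the options() call in your .Rprofile would make the choice persistent across sessions.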
Returns a character() vector with the URL(s) for the file(s) of interest.
Other internal functions for accessing the recount3 data:
annotation_ext(), create_rse_manual(), file_retrieve(), locate_url_ann(),
project_homes(), read_counts(), read_metadata()
## Example for metadata files from a project from SRA
locate_url(
    "SRP009615",
    "data_sources/sra"
)
#> sra.sra.SRP009615.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.sra.SRP009615.MD.gz"
#> sra.recount_project.SRP009615.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_project.SRP009615.MD.gz"
#> sra.recount_qc.SRP009615.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_qc.SRP009615.MD.gz"
#> sra.recount_seq_qc.SRP009615.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_seq_qc.SRP009615.MD.gz"
#> sra.recount_pred.SRP009615.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/15/SRP009615/sra.recount_pred.SRP009615.MD.gz"
## Example for metadata files from a project that is part of a collection
locate_url(
    "ERP110066",
    "collections/geuvadis_smartseq",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
)
#> 2023-05-07 00:12:31.505758 caching file geuvadis_smartseq.recount_project.gz.
#> sra.sra.ERP110066.MD.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.sra.ERP110066.MD.gz"
#> sra.recount_project.ERP110066.MD.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_project.ERP110066.MD.gz"
#> sra.recount_qc.ERP110066.MD.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_qc.ERP110066.MD.gz"
#> sra.recount_seq_qc.ERP110066.MD.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_seq_qc.ERP110066.MD.gz"
#> sra.recount_pred.ERP110066.MD.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/metadata/66/ERP110066/sra.recount_pred.ERP110066.MD.gz"
#> geuvadis_smartseq.custom.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/collections/geuvadis_smartseq/metadata/geuvadis_smartseq.custom.gz"
## Example for a BigWig file
locate_url(
    "SRP009615",
    "data_sources/sra",
    "bw",
    "human",
    "SRR387777"
)
#> sra.base_sums.SRP009615_SRR387777.ALL.bw
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/base_sums/15/SRP009615/77/sra.base_sums.SRP009615_SRR387777.ALL.bw"
## Locate example gene count files
locate_url(
    "SRP009615",
    "data_sources/sra",
    "gene"
)
#> sra.gene_sums.SRP009615.G026.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.G026.gz"
locate_url(
    "SRP009615",
    "data_sources/sra",
    "gene",
    annotation = "refseq"
)
#> sra.gene_sums.SRP009615.R109.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.R109.gz"
## Example for a gene count file from a project that is part of a collection
locate_url(
    "ERP110066",
    "collections/geuvadis_smartseq",
    "gene",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
)
#> 2023-05-07 00:12:31.703993 caching file geuvadis_smartseq.recount_project.gz.
#> sra.gene_sums.ERP110066.G026.gz
#> "http://snaptron.cs.jhu.edu/data/temp/recount3/human/data_sources/sra/gene_sums/66/ERP110066/sra.gene_sums.ERP110066.G026.gz"
## Locate example junction files
locate_url(
    "SRP009615",
    "data_sources/sra",
    "jxn"
)
#> sra.junctions.SRP009615.ALL.MM.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/junctions/15/SRP009615/sra.junctions.SRP009615.ALL.MM.gz"
#> sra.junctions.SRP009615.ALL.RR.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/junctions/15/SRP009615/sra.junctions.SRP009615.ALL.RR.gz"
#> sra.junctions.SRP009615.ALL.ID.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/junctions/15/SRP009615/sra.junctions.SRP009615.ALL.ID.gz"
## Example for metadata files from a project from SRA
locate_url(
    "ERP001942",
    "data_sources/sra"
)
#> sra.sra.ERP001942.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.sra.ERP001942.MD.gz"
#> sra.recount_project.ERP001942.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_project.ERP001942.MD.gz"
#> sra.recount_qc.ERP001942.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_qc.ERP001942.MD.gz"
#> sra.recount_seq_qc.ERP001942.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_seq_qc.ERP001942.MD.gz"
#> sra.recount_pred.ERP001942.MD.gz
#> "http://duffel.rail.bio/recount3/human/data_sources/sra/metadata/42/ERP001942/sra.recount_pred.ERP001942.MD.gz"
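The outputs above suggest a regular layout for gene count files from data_sources/sra projects: the base URL, the organism, the project home, a gene_sums directory, a subdirectory named after the last two characters of the project ID, the project ID, and finally the file name. The sketch below reproduces that observed pattern in base R. Note that sketch_gene_url() is a hypothetical helper written only for illustration: it is not part of recount3, it is not how locate_url() is implemented, and it ignores collections, samples, and the other file types. The "G026" default is taken from the example output above.

```r
## Hypothetical helper (not part of recount3): rebuilds the URL pattern
## seen in the gene count examples for data_sources/sra projects
sketch_gene_url <- function(project,
    project_home = "data_sources/sra",
    organism = "human",
    annotation_ext = "G026",
    recount3_url = "http://duffel.rail.bio/recount3") {
    ## Data source name, for example "sra" from "data_sources/sra"
    data_source <- basename(project_home)
    ## Last two characters of the project ID, for example "15" from "SRP009615"
    shard <- substr(project, nchar(project) - 1, nchar(project))
    ## File name as seen in the example outputs
    file_name <- paste0(
        data_source, ".gene_sums.", project, ".", annotation_ext, ".gz"
    )
    paste(
        recount3_url, organism, project_home, "gene_sums",
        shard, project, file_name,
        sep = "/"
    )
}

gene_url <- sketch_gene_url("SRP009615")
print(gene_url)
#> [1] "http://duffel.rail.bio/recount3/human/data_sources/sra/gene_sums/15/SRP009615/sra.gene_sums.SRP009615.G026.gz"
```

In practice, prefer locate_url() itself, since it also handles collections, per-sample files, and annotation lookups that this sketch leaves out.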