This function extracts the data from the BigWig coverage files that is then used by four_panels and plot_coverage. It can take a while to run, depending on your internet connection. It relies on the rtracklayer package for reading BigWig files, which does not work on Windows machines.
brainflowprobes_cov(REGION, PD = brainflowprobes::pd, VERBOSE = TRUE)
Either a single hg19 genomic sequence including the chromosome, start, end, and optionally strand separated by colons (e.g., 'chr20:10199446-10288068:+'), or a character vector of such sequences. Must be character. The chromosome must be preceded by 'chr'.
A list of data.frames with the sumMapped and files columns. Defaults to the data included in this package.
A logical value indicating whether to print updates from the process of loading the data from the BigWig files.
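As a rough illustration of the REGION format described above, the following base R sketch checks strings against a 'chr<name>:<start>-<end>' pattern with an optional ':<strand>' suffix. The helper name is_valid_region and its regular expression are assumptions made for this example; they are not part of brainflowprobes, which may validate its input differently.

```r
# Hypothetical helper (not part of brainflowprobes): checks that each
# element looks like 'chr<name>:<start>-<end>' with an optional ':<strand>'.
is_valid_region <- function(region) {
    stopifnot(is.character(region))
    pattern <- "^chr[0-9A-Za-z_]+:[0-9]+-[0-9]+(:[+*-])?$"
    grepl(pattern, region)
}

regions <- c(
    "chr20:10199446-10288068:+", # with strand
    "chr20:10286777-10288069",   # strand omitted
    "20:10286777-10288069:+"     # missing the 'chr' prefix
)
is_valid_region(regions)
```

Under these assumptions, the first two strings pass the check and the third fails because it lacks the 'chr' prefix.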
A list of region coverage data.frame lists used by
four_panels and plot_coverage. That is, a list with one
element per dataset in pd (so four: Sep, Deg, Cell, Sort).
Each element of the output list is a list with one data.frame per input
region. In the case of four_panels_example_cov there was only one input
region, hence each region coverage data.frame list has one element. A region
coverage data.frame has one column per sample and one row per genome
base-pair for the given region and dataset.
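To make the nesting described above concrete, here is a small base R sketch that builds a mock object with the same shape: one element per dataset, each holding one data.frame per input region. The sample names (sampleA, sampleB) and the zero coverage values are made up for illustration; only the dataset names and the region coordinates come from this page.

```r
# Mock data only: the sample names and coverage values below are made up
# to illustrate the shape of the returned object.
region_width <- 10288069 - 10286777 + 1 # width of chr20:10286777-10288069
datasets <- c("Sep", "Deg", "Cell", "Sort")

mock_cov <- lapply(setNames(datasets, datasets), function(dataset) {
    # One data.frame per input region (a single region here), with
    # one row per base-pair and one column per (made-up) sample.
    list(data.frame(
        sampleA = numeric(region_width),
        sampleB = numeric(region_width)
    ))
})

names(mock_cov)             # one element per dataset
lengths(mock_cov)           # one data.frame per input region
dim(mock_cov[["Sep"]][[1]]) # base-pairs x samples
```

With real output from brainflowprobes_cov(), the number of columns would instead match the number of samples in the corresponding element of PD.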
## This function loads data from BigWig files using the rtracklayer package.
## This functionality is not supported on Windows OS machines!
if (.Platform$OS.type != "windows") {
    ## How long this takes to run will depend on your internet connection.
    example_cov <- brainflowprobes_cov("chr20:10286777-10288069:+",
        PD = lapply(brainflowprobes::pd, head, n = 2)
    )
    ## Output examination:
    # A list with one element per element in brainflowprobes::pd
    stopifnot(is.list(example_cov))
    stopifnot(identical(
        names(example_cov),
        names(brainflowprobes::pd)
    ))
    # For each dataset, brainflowprobes_cov() returns a list of region
    # coverage data.frames. In this example, there was a single input region.
    stopifnot(all(
        sapply(example_cov, length) ==
            length(
                GenomicRanges::GRanges("chr20:10286777-10288069:+")
            )
    ))
    # Then each data.frame itself has 1 row per genome base-pair in the region
    stopifnot(
        all(
            sapply(example_cov, function(x) {
                nrow(x[[1]])
            }) ==
                GenomicRanges::width(
                    GenomicRanges::GRanges("chr20:10286777-10288069:+")
                )
        )
    )
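    # and one column per sample in the dataset, unless you subsetted the data
    # like we did earlier when creating "example_cov". The packaged
    # four_panels_example_cov was made with the full pd, so we check it here.
    stopifnot(identical(
        sapply(brainflowprobes::four_panels_example_cov, function(x) {
            ncol(x[[1]])
        }),
        sapply(brainflowprobes::pd, nrow)
    ))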
}
#> 2023-05-07 06:26:16.90821 getRegionCoverage : attempting to load coverage data from 'files'.
#> 2023-05-07 06:26:16.936391 fullCoverage: processing chromosome chr20
#> 2023-05-07 06:26:16.974097 loadCoverage: finding chromosome lengths
#> 2023-05-07 06:26:17.194185 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br1113C1_polyA.bw
#> 2023-05-07 06:26:17.678744 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br1113N1_polyA.bw
#> 2023-05-07 06:26:18.199766 loadCoverage: applying the cutoff to the merged data
#> 2023-05-07 06:26:18.236082 filterData: originally there were 63025520 rows, now there are 63025520 rows. Meaning that 0 percent was filtered.
#> 2023-05-07 06:26:18.349378 getRegionCoverage: processing chr20
#> 2023-05-07 06:26:18.386649 getRegionCoverage: done processing chr20
#> 2023-05-07 06:26:18.439114 getRegionCoverage : attempting to load coverage data from 'files'.
#> 2023-05-07 06:26:18.452502 fullCoverage: processing chromosome chr20
#> 2023-05-07 06:26:18.504554 loadCoverage: finding chromosome lengths
#> 2023-05-07 06:26:18.702978 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1385_A1_poly.bw
#> 2023-05-07 06:26:19.094863 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1385_A2_poly.bw
#> 2023-05-07 06:26:19.572711 loadCoverage: applying the cutoff to the merged data
#> 2023-05-07 06:26:19.607744 filterData: originally there were 63025520 rows, now there are 63025520 rows. Meaning that 0 percent was filtered.
#> 2023-05-07 06:26:19.69635 getRegionCoverage: processing chr20
#> 2023-05-07 06:26:19.729363 getRegionCoverage: done processing chr20
#> 2023-05-07 06:26:19.756078 getRegionCoverage : attempting to load coverage data from 'files'.
#> 2023-05-07 06:26:19.767685 fullCoverage: processing chromosome chr20
#> 2023-05-07 06:26:19.785837 loadCoverage: finding chromosome lengths
#> 2023-05-07 06:26:19.931797 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974664.bw
#> 2023-05-07 06:26:20.316571 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974665.bw
#> 2023-05-07 06:26:20.662985 loadCoverage: applying the cutoff to the merged data
#> 2023-05-07 06:26:20.679732 filterData: originally there were 63025520 rows, now there are 63025520 rows. Meaning that 0 percent was filtered.
#> 2023-05-07 06:26:20.750237 getRegionCoverage: processing chr20
#> 2023-05-07 06:26:20.761764 getRegionCoverage: done processing chr20
#> 2023-05-07 06:26:20.788908 getRegionCoverage : attempting to load coverage data from 'files'.
#> 2023-05-07 06:26:20.801103 fullCoverage: processing chromosome chr20
#> 2023-05-07 06:26:20.823238 loadCoverage: finding chromosome lengths
#> 2023-05-07 06:26:20.973224 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Minus_12_PolyA.bw
#> 2023-05-07 06:26:21.477607 loadCoverage: loading BigWig file http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Plus_12_PolyA.bw
#> 2023-05-07 06:26:21.833681 loadCoverage: applying the cutoff to the merged data
#> 2023-05-07 06:26:21.858979 filterData: originally there were 63025520 rows, now there are 63025520 rows. Meaning that 0 percent was filtered.
#> 2023-05-07 06:26:21.928455 getRegionCoverage: processing chr20
#> 2023-05-07 06:26:21.94028 getRegionCoverage: done processing chr20
## This is how the example data included in the package was made:
if (FALSE) {
    ## This can take about 10 minutes to run!
    four_panels_example_cov <- brainflowprobes_cov("chr20:10286777-10288069:+")
}
## If you are interested, you could download all the BigWig files
## listed in the 'files' column of the brainflowprobes::pd list of
## data.frames to your disk. Doing so will greatly increase the
## speed of brainflowprobes_cov() and the functions that depend on
## this data. Then edit the 'files' column of brainflowprobes::pd to
## point to your local files.
## Web location of BigWig files
lapply(brainflowprobes::pd, function(x) head(x$files))
#> $Sep
#> [1] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br1113C1_polyA.bw"
#> [2] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br1113N1_polyA.bw"
#> [3] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br2046C_polyA.bw"
#> [4] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br2046N_polyA.bw"
#> [5] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br2074C_polyA.bw"
#> [6] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br2074N_polyA.bw"
#>
#> $Deg
#> [1] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1385_A1_poly.bw"
#> [2] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1385_A2_poly.bw"
#> [3] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1385_A4_poly.bw"
#> [4] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1385_A3_poly.bw"
#> [5] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1729_B1_poly.bw"
#> [6] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Deg/Br1729_B2_poly.bw"
#>
#> $Cell
#> [1] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974664.bw"
#> [2] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974665.bw"
#> [3] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974666.bw"
#> [4] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974667.bw"
#> [5] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974668.bw"
#> [6] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974669.bw"
#>
#> $Sort
#> [1] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Minus_12_PolyA.bw"
#> [2] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Plus_12_PolyA.bw"
#> [3] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Minus_13_PolyA.bw"
#> [4] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Plus_13_PolyA.bw"
#> [5] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Minus_14_PolyA.bw"
#> [6] "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sort/NeuN_Plus_14_PolyA.bw"
#>
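The comments above suggest downloading the BigWig files and pointing the files column at the local copies. A minimal sketch of that path rewrite follows, assuming a hypothetical local directory (bigwig_dir) that mirrors the <dataset>/<file>.bw layout of the URLs. The mock_pd object stands in for brainflowprobes::pd and its sumMapped values are made up; the download step is only mentioned in a comment because it needs network access and disk space.

```r
# Stand-in for brainflowprobes::pd: a list of data.frames with 'files' and
# 'sumMapped' columns. The URLs come from the listing above; the
# sumMapped values are made up.
mock_pd <- list(
    Sep = data.frame(
        files = c(
            "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br1113C1_polyA.bw",
            "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Sep/Br1113N1_polyA.bw"
        ),
        sumMapped = c(1, 1),
        stringsAsFactors = FALSE
    ),
    Cell = data.frame(
        files = "http://brain-flow-rna.s3.us-east-2.amazonaws.com/Cell/SRR1974664.bw",
        sumMapped = 1,
        stringsAsFactors = FALSE
    )
)

# Hypothetical local directory that mirrors the <dataset>/<file>.bw layout.
bigwig_dir <- "/path/to/bigwigs"

local_pd <- lapply(setNames(names(mock_pd), names(mock_pd)), function(dataset) {
    x <- mock_pd[[dataset]]
    # To fetch the files, one could first call, for each URL,
    # utils::download.file(url, destfile, mode = "wb").
    x$files <- file.path(bigwig_dir, dataset, basename(x$files))
    x
})

lapply(local_pd, function(x) x$files)
```

A list built this way from the real brainflowprobes::pd could then be supplied through the PD argument of brainflowprobes_cov(), leaving the packaged data untouched.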