slurmjobs
R is an open-source statistical environment that can be easily extended via packages. slurmjobs is an R package available via the Bioconductor repository for packages. R can be installed on any operating system from CRAN, after which you can install slurmjobs by using the following commands in your R session:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("slurmjobs")
## Check that you have a valid Bioconductor installation
BiocManager::valid()
slurmjobs is designed for interacting with the SLURM job scheduler, and assumes basic familiarity with terms like “job”, “task”, and “array”, as well as the sbatch command. Background knowledge about memory (such as virtual memory and resident set size (RSS)) is helpful but not critical in using this package.
If you are asking yourself the question “Where do I start using Bioconductor?” you might be interested in this blog post.
As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and Bioconductor have a steep learning curve, so it is critical to learn where to ask for help. The blog post mentioned above lists some options, but we would like to highlight the Bioconductor support site as the main resource for getting help: remember to use the slurmjobs tag and check the older posts. Other alternatives are available, such as creating GitHub issues and tweeting. However, please note that if you want to receive help you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.
slurmjobs
We hope that slurmjobs will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!
## Citation info
citation("slurmjobs")
#> To cite package 'slurmjobs' in publications use:
#>
#> LieberInstitute (2025). _slurmjobs: Helper Functions for SLURM Jobs_.
#> doi:10.18129/B9.bioc.slurmjobs
#> <https://doi.org/10.18129/B9.bioc.slurmjobs>,
#> https://github.com/LieberInstitute/slurmjobs/slurmjobs - R package
#> version 1.2.5, <http://www.bioconductor.org/packages/slurmjobs>.
#>
#> LieberInstitute (2025). "slurmjobs: Helper Functions for SLURM Jobs."
#> _bioRxiv_. doi:10.1101/TODO <https://doi.org/10.1101/TODO>,
#> <https://www.biorxiv.org/content/10.1101/TODO>.
#>
#> To see these entries in BibTeX format, use 'print(<citation>,
#> bibtex=TRUE)', 'toBibtex(.)', or set
#> 'options(citation.bibtex.max=999)'.
slurmjobs provides helper functions for interacting with SLURM-managed high-performance-computing environments from R. It includes functions for creating submittable jobs (including array jobs), monitoring partitions, and extracting info about running or complete jobs. In addition to loading slurmjobs, we’ll be using dplyr to manipulate example data about jobs.
library("slurmjobs")
#> Error in get(paste0(generic, ".", class), envir = get_method_env()) :
#> object 'type_sum.accel' not found
library("dplyr")
sbatch
When processing data on a SLURM-managed system, primarily running R code, you’ll likely find yourself writing many “wrapper” shell scripts that can be submitted via sbatch to the job scheduler. This process requires precise SLURM-specific syntax and a large amount of repetition. job_single aims to reduce required configuration from the user to just a handful of options that tend to vary most often between shell scripts (e.g. memory, number of CPUs, time limit), and automate the rest of the shell-script-creation process.
Shell scripts created by job_single log key reproducibility information, such as the user, job ID, job name, node name, and when the job starts and ends.
# With 'create_shell = FALSE', the contents of the potential shell script are
# only printed to the screen
job_single(
    name = "my_shell_script", memory = "10G", cores = 2, create_shell = FALSE
)
#> 2025-01-06 16:47:01.322032 creating the logs directory at: logs
#> #!/bin/bash
#> #SBATCH -p shared
#> #SBATCH --mem=10G
#> #SBATCH --job-name=my_shell_script
#> #SBATCH -c 2
#> #SBATCH -t 1-00:00:00
#> #SBATCH -o logs/my_shell_script.txt
#> #SBATCH -e logs/my_shell_script.txt
#> #SBATCH --mail-type=ALL
#>
#> set -e
#>
#> echo "**** Job starts ****"
#> date
#>
#> echo "**** JHPCE info ****"
#> echo "User: ${USER}"
#> echo "Job id: ${SLURM_JOB_ID}"
#> echo "Job name: ${SLURM_JOB_NAME}"
#> echo "Node name: ${HOSTNAME}"
#> echo "Task id: ${SLURM_ARRAY_TASK_ID}"
#>
#> ## Load the R module
#> module load conda_R/4.4
#>
#> ## List current modules for reproducibility
#> module list
#>
#> ## Edit with your job command
#> Rscript -e "options(width = 120); sessioninfo::session_info()"
#>
#> echo "**** Job ends ****"
#> date
#>
#> ## This script was made using slurmjobs version 1.2.5
#> ## available from http://research.libd.org/slurmjobs/
Similarly, we can specify task_num to create an array job, in this case one with 10 tasks.
job_single(
    name = "my_array_job", memory = "5G", cores = 1, create_shell = FALSE,
    task_num = 10
)
#> 2025-01-06 16:47:01.395031 creating the logs directory at: logs
#> #!/bin/bash
#> #SBATCH -p shared
#> #SBATCH --mem=5G
#> #SBATCH --job-name=my_array_job
#> #SBATCH -c 1
#> #SBATCH -t 1-00:00:00
#> #SBATCH -o logs/my_array_job.%a.txt
#> #SBATCH -e logs/my_array_job.%a.txt
#> #SBATCH --mail-type=ALL
#> #SBATCH --array=1-10%20
#>
#> set -e
#>
#> echo "**** Job starts ****"
#> date
#>
#> echo "**** JHPCE info ****"
#> echo "User: ${USER}"
#> echo "Job id: ${SLURM_JOB_ID}"
#> echo "Job name: ${SLURM_JOB_NAME}"
#> echo "Node name: ${HOSTNAME}"
#> echo "Task id: ${SLURM_ARRAY_TASK_ID}"
#>
#> ## Load the R module
#> module load conda_R/4.4
#>
#> ## List current modules for reproducibility
#> module list
#>
#> ## Edit with your job command
#> Rscript -e "options(width = 120); sessioninfo::session_info()"
#>
#> echo "**** Job ends ****"
#> date
#>
#> ## This script was made using slurmjobs version 1.2.5
#> ## available from http://research.libd.org/slurmjobs/
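Inside an array job, each task can tell itself apart from the others through the SLURM_ARRAY_TASK_ID environment variable, which the generated script echoes above. As a minimal sketch (not part of slurmjobs), the R code run by each task might use it to pick one input. The sample names and the get_task_id() helper are hypothetical and only for illustration.

```r
# Hypothetical inputs: one per array task (task_num = 10 above)
sample_ids <- paste0("sample_", 1:10)

# Read the array task ID that SLURM exports to each task. Fall back to a
# default when run interactively (outside of SLURM), where it is unset
get_task_id <- function(default = 1L) {
    task_id <- Sys.getenv("SLURM_ARRAY_TASK_ID", unset = NA)
    if (is.na(task_id) || task_id == "") {
        return(default)
    }
    as.integer(task_id)
}

# Each task processes the sample matching its own ID
this_sample <- sample_ids[get_task_id()]
message("Processing ", this_sample)
```

The fallback keeps the script testable on a laptop; on the cluster, every task sees its own ID.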
Another function, job_loop(), can be used to create more complex array jobs as compared with job_single(). It’s useful when looping through one or more variables with pre-defined values, and applying the same processing steps. The key difference is that rather than specifying task_num, you specify loops, a named list of variables to loop through. An array job then gets created that can directly refer to the values of these variables, rather than referring to just the array’s task ID.
job_loop(), unlike job_single(), also creates an R script. The idea is that the shell script invokes the R script internally, with a particular combination of variables. The getopt package is then used to read in this combination from the command line, so that each variable can be accessed by name in R. Let’s make that a bit more concrete.
# 'job_loop' returns a list containing the contents of the to-be-created shell
# and R scripts. Let's take a look at the shell script first
script_pair <- job_loop(
    loops = list(region = c("DLPFC", "HIPPO"), feature = c("gene", "exon", "tx", "jxn")),
    name = "bsp2_test"
)
cat(script_pair[["shell"]], sep = "\n")
#> #!/bin/bash
#> #SBATCH -p shared
#> #SBATCH --mem=10G
#> #SBATCH --job-name=bsp2_test
#> #SBATCH -c 1
#> #SBATCH -t 1-00:00:00
#> #SBATCH -o /dev/null
#> #SBATCH -e /dev/null
#> #SBATCH --mail-type=ALL
#> #SBATCH --array=1-8%20
#>
#> ## Define loops and appropriately subset each variable for the array task ID
#> all_region=(DLPFC HIPPO)
#> region=${all_region[$(( $SLURM_ARRAY_TASK_ID / 4 % 2 ))]}
#>
#> all_feature=(gene exon tx jxn)
#> feature=${all_feature[$(( $SLURM_ARRAY_TASK_ID / 1 % 4 ))]}
#>
#> ## Explicitly pipe script output to a log
#> log_path=logs/bsp2_test_${region}_${feature}_${SLURM_ARRAY_TASK_ID}.txt
#>
#> {
#> set -e
#>
#> echo "**** Job starts ****"
#> date
#>
#> echo "**** JHPCE info ****"
#> echo "User: ${USER}"
#> echo "Job id: ${SLURM_JOB_ID}"
#> echo "Job name: ${SLURM_JOB_NAME}"
#> echo "Node name: ${HOSTNAME}"
#> echo "Task id: ${SLURM_ARRAY_TASK_ID}"
#>
#> ## Load the R module
#> module load conda_R/4.4
#>
#> ## List current modules for reproducibility
#> module list
#>
#> ## Edit with your job command
#> Rscript bsp2_test.R --region ${region} --feature ${feature}
#>
#> echo "**** Job ends ****"
#> date
#>
#> } > $log_path 2>&1
#>
#> ## This script was made using slurmjobs version 1.2.5
#> ## available from http://research.libd.org/slurmjobs/
First, note the line Rscript bsp2_test.R --region ${region} --feature ${feature}. Every task of the array job passes a unique combination of ${region} and ${feature} to R.
Notice also that logs from executing this shell script get named with each of the variables’ values in addition to the job name and the array task ID. For example, following the index arithmetic in the script above, the first task (array task ID 1) receives DLPFC and exon, so its log would be logs/bsp2_test_DLPFC_exon_1.txt. Also, the array specifies 8 tasks total (the product of the number of regions and features).
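To see how array task IDs map onto combinations of variables, the shell arithmetic from the generated script can be mirrored in R. This is only an illustrative sketch that re-implements the two subsetting lines shown above; it is not code from slurmjobs itself.

```r
all_region <- c("DLPFC", "HIPPO")
all_feature <- c("gene", "exon", "tx", "jxn")

# Mirror the shell arithmetic: integer division, then modulo. Bash arrays are
# 0-indexed while R vectors are 1-indexed, hence the '+ 1'
task_id <- 1:8
combos <- data.frame(
    task_id = task_id,
    region = all_region[(task_id %/% 4) %% 2 + 1],
    feature = all_feature[(task_id %/% 1) %% 4 + 1]
)
print(combos)

# Every task should receive a distinct (region, feature) pair
stopifnot(!any(duplicated(combos[, c("region", "feature")])))
```

Printing combos gives a quick lookup table from task ID to the region and feature that appear in each log file name.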
Let’s also look at the R script.
cat(script_pair[["R"]], sep = "\n")
#> library(getopt)
#> library(sessioninfo)
#>
#> # Import command-line parameters
#> spec <- matrix(
#> c(
#> c("region", "feature"),
#> c("r", "f"),
#> rep("1", 2),
#> rep("character", 2),
#> rep("Add variable description here", 2)
#> ),
#> ncol = 5
#> )
#> opt <- getopt(spec)
#>
#> message("Using the following parameters:")
#> print(opt)
#>
#> message("Memory usage:")
#> gc()
#>
#> session_info()
#>
#> ## This script was made using slurmjobs version 1.2.5
#> ## available from http://research.libd.org/slurmjobs/
The code related to getopt at the top of the script reads the unique combination of variable values into a list, here called opt. For example, one task of the array job might yield "DLPFC" and "gene" for opt$region and opt$feature, respectively.
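When developing the R script interactively, it can help to mimic what one array task would pass on the command line. The sketch below assumes the getopt package is installed and uses the opt argument of getopt() to supply the arguments explicitly instead of reading them from the command line. The option descriptions and the input path layout are hypothetical.

```r
library(getopt)

# Same kind of specification as in the generated script, written row by row:
# long name, short flag, argument mask (1 = required), type, description
spec <- matrix(
    c(
        "region", "r", "1", "character", "Brain region to process",
        "feature", "f", "1", "character", "Feature type to process"
    ),
    ncol = 5, byrow = TRUE
)

# Emulate 'Rscript bsp2_test.R --region DLPFC --feature gene'
opt <- getopt(spec, opt = c("--region", "DLPFC", "--feature", "gene"))

# Hypothetical use of the parsed values: build an input file path
input_path <- file.path("processed-data", opt$region, paste0(opt$feature, ".rds"))
message("Would read: ", input_path)
```

Once the script behaves as expected for one combination, removing the opt argument restores the normal command-line behaviour.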
Shell scripts created with job_single() or job_loop() may be submitted as batch jobs with sbatch (e.g. sbatch myscript.sh). Note that no additional arguments to sbatch are required, since all configuration is specified within the shell script.
The array_submit() helper function is also intended to make job submission easier. In particular, it addresses a common case where, after a large array job has run, a handful of tasks fail (for example, due to temporary file-system issues). array_submit() helps re-submit the failed tasks.
Below we’ll create an example array job with job_single(), then do a dry run of array_submit() to demonstrate its basic usage.
job_single(
    name = "my_array_job", memory = "5G", cores = 1, create_shell = TRUE,
    task_num = 10
)
#> 2025-01-06 16:47:01.592431 creating the logs directory at: logs
#> 2025-01-06 16:47:01.593619 creating the shell file my_array_job.sh
#> To submit the job use: sbatch my_array_job.sh
# Suppose that tasks 3, 6, 7, and 8 failed
array_submit(name = "my_array_job", task_ids = c(3, 6:8), submit = FALSE)
While task_ids can be provided explicitly as above, the real convenience comes from the ability to run array_submit() without specifying task_ids. As long as the original array job was created with job_single() or job_loop() and submitted as-is (on the full set of tasks), array_submit() can automatically find the failed tasks by reading the shell script (my_array_job.sh), grabbing the original array job ID from the log, and internally calling job_report().
# Not run here, since we aren't on a SLURM cluster
array_submit(name = "my_array_job", submit = FALSE)
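To make the idea of finding failed tasks more concrete, here is a rough sketch of how failed task IDs could be pulled from a per-task status table and collapsed into the comma-and-range syntax that the --array option of sbatch accepts. This is not how array_submit() is implemented; the report data frame and the collapse_ids() helper are hypothetical.

```r
# Hypothetical job report: one row per array task, with its final status
report <- data.frame(
    array_task_id = 1:10,
    status = c(
        "COMPLETED", "COMPLETED", "FAILED", "COMPLETED", "COMPLETED",
        "FAILED", "FAILED", "FAILED", "COMPLETED", "COMPLETED"
    )
)

failed_ids <- sort(report$array_task_id[report$status == "FAILED"])

# Collapse consecutive IDs into ranges, e.g. c(3, 6, 7, 8) -> "3,6-8"
collapse_ids <- function(ids) {
    if (length(ids) == 0) {
        return("")
    }
    # Start a new group wherever the gap to the previous ID is not exactly 1
    groups <- cumsum(c(TRUE, diff(ids) != 1))
    ranges <- tapply(ids, groups, function(x) {
        if (length(x) == 1) as.character(x) else paste0(min(x), "-", max(x))
    })
    paste(ranges, collapse = ",")
}

array_spec <- collapse_ids(failed_ids)
message("Tasks to re-submit: --array=", array_spec)
```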
The job_info() function provides wrappers around the squeue and sstat utilities that SLURM provides for monitoring specific jobs and how busy partitions are. The general idea is to return the information output by squeue as a tibble, while also retrieving memory-utilization information that ordinarily must be retrieved manually on a job-by-job basis with sstat -j [specific job ID].
On a SLURM system, you’d run job_info_df = job_info(user = NULL, partition = "shared") here to get every user’s jobs running on the “shared” partition. We’ll load an example output directly here.
# On a real SLURM system
print(job_info_df)
#> # A tibble: 100 × 11
#> job_id max_rss_gb max_vmem_gb user array_task_id name partition cpus
#> <dbl> <dbl> <dbl> <chr> <int> <chr> <fct> <int>
#> 1 222106 NA NA user1 69 my_job_1 shared 2
#> 2 271213 NA NA user1 37 my_job_2 shared 1
#> 3 280839 NA NA user1 11 my_job_3 shared 2
#> 4 285265 NA NA user1 31 my_job_3 shared 2
#> 5 285275 NA NA user1 41 my_job_3 shared 2
#> 6 285276 NA NA user1 42 my_job_3 shared 2
#> 7 285281 NA NA user1 47 my_job_3 shared 2
#> 8 285282 NA NA user1 48 my_job_3 shared 2
#> 9 301953 NA NA user2 180 my_job_4 shared 2
#> 10 301954 NA NA user2 440 my_job_5 shared 2
#> # ℹ 90 more rows
#> # ℹ 3 more variables: requested_mem_gb <dbl>, status <fct>,
#> # wallclock_time <drtn>
The benefit of having this data in R is that summarizing questions become trivial to answer. First, “how much memory and how many CPUs am I currently using?” Knowing the answer can help ensure fair and civil use of shared computing resources, for example on a computing cluster.
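One way to answer that first question is sketched below on a small made-up table, so that it runs without a SLURM cluster. The example_jobs tibble is a hypothetical stand-in for the output of job_info(); it only contains the user, cpus, and requested_mem_gb columns, which appear in the printed output above.

```r
library(dplyr)

# Hypothetical stand-in for the output of job_info(); values are made up
example_jobs <- tibble(
    user = c("user1", "user1", "user2", "user1"),
    cpus = c(2L, 1L, 2L, 2L),
    requested_mem_gb = c(10, 5, 20, 10)
)

my_usage <- example_jobs |>
    # Replace with your own username
    filter(user == "user1") |>
    # Total CPUs and memory requested across all of this user's jobs
    summarize(
        total_cpus = sum(cpus),
        total_mem_req_gb = sum(requested_mem_gb)
    )
print(my_usage)
```

The same pipeline should apply unchanged to a real job_info() result, assuming these column names.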
Sometimes, it’s useful to know about the partitions as a whole rather than about specific jobs. partition_info() serves this purpose, and parses sinfo output into a tibble. We’ll load an example of the output from partition_info(partition = NULL, all_nodes = FALSE).
print(partition_df)
#> # A tibble: 5 × 7
#> partition free_cpus total_cpus prop_free_cpus free_mem_gb total_mem_gb
#> <chr> <int> <int> <dbl> <dbl> <dbl>
#> 1 partition_1 48 48 1 126. 128.
#> 2 partition_2 324 384 0.844 1050. 1643.
#> 3 partition_3 48 48 1 127. 128.
#> 4 partition_4 412 1024 0.402 2806. 4126.
#> 5 partition_5 76 128 0.594 519. 1000.
#> # ℹ 1 more variable: prop_free_mem_gb <dbl>
Since all_nodes was FALSE, there’s one row per partition, summarizing information across all nodes that compose each partition. Alternatively, set all_nodes to TRUE to yield one row per node.
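To illustrate how node-level rows relate to partition-level rows, here is a sketch that aggregates a made-up node-level table by partition. The node column, the values, and the assumption that node-level output shares the column names of the partition-level output are all hypothetical.

```r
library(dplyr)

# Hypothetical node-level table (one row per node); all values are made up
node_df <- tibble(
    partition = c("partition_1", "partition_1", "partition_2"),
    node = c("node_a", "node_b", "node_c"),
    free_cpus = c(10L, 30L, 8L),
    total_cpus = c(48L, 48L, 64L),
    free_mem_gb = c(100, 60, 200),
    total_mem_gb = c(128, 128, 512)
)

# Sum resources over the nodes of each partition, then recompute proportions
per_partition <- node_df |>
    group_by(partition) |>
    summarize(
        free_cpus = sum(free_cpus),
        total_cpus = sum(total_cpus),
        free_mem_gb = sum(free_mem_gb),
        total_mem_gb = sum(total_mem_gb)
    ) |>
    mutate(prop_free_cpus = free_cpus / total_cpus)
print(per_partition)
```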
With partition_df, let’s summarize how busy the cluster is as a whole, then rank partitions by amount of free memory.
# Print the proportion of CPUs and memory available for the whole cluster
partition_df |>
    summarize(
        prop_free_cpus = sum(free_cpus) / sum(total_cpus),
        prop_free_mem_gb = sum(free_mem_gb) / sum(total_mem_gb)
    ) |>
    print()
#> # A tibble: 1 × 2
#> prop_free_cpus prop_free_mem_gb
#> <dbl> <dbl>
#> 1 0.556 0.659
# Now let's take the top 3 partitions by memory currently available
partition_df |>
    arrange(desc(free_mem_gb)) |>
    select(partition, free_mem_gb) |>
    slice_head(n = 3)
#> # A tibble: 3 × 2
#> partition free_mem_gb
#> <chr> <dbl>
#> 1 partition_4 2806.
#> 2 partition_2 1050.
#> 3 partition_5 519.
The job_report() function returns in-depth information about a single queued, running, or finished job (including a single array job). It combines functionality from SLURM’s sstat and sacct to return a tibble for easy manipulation in R.
Suppose you have a workflow that operates as an array job, and you’d like to profile memory usage across the many tasks. Suppose we’ve done an initial trial, setting memory relatively high just to get the jobs running without issues. One use of job_report could be to determine a better memory request in a data-driven way. The better settings can then be used on the larger dataset after the initial test.
On an actual system with SLURM installed, you’d normally run something like job_df = job_report(slurm_job_id) for the slurm_job_id (character or integer) representing the small test. For convenience, we’ll start from the output of job_report as available in the slurmjobs package.
job_df <- readRDS(
    system.file("extdata", "job_report_df.rds", package = "slurmjobs")
)
print(job_df)
#> # A tibble: 10 × 12
#> job_id user name partition cpus requested_mem_gb max_rss_gb max_vmem_gb
#> <int> <chr> <chr> <fct> <int> <dbl> <dbl> <dbl>
#> 1 297332 user1 broken_… shared 2 5 0.04 0.04
#> 2 297333 user1 broken_… shared 2 5 0.48 0.48
#> 3 297334 user1 broken_… shared 2 5 0.61 0.61
#> 4 297335 user1 broken_… shared 2 5 0.04 0.04
#> 5 297336 user1 broken_… shared 2 5 1.15 1.15
#> 6 297337 user1 broken_… shared 2 5 1.38 1.38
#> 7 297338 user1 broken_… shared 2 5 0.04 0.04
#> 8 297339 user1 broken_… shared 2 5 0.04 0.04
#> 9 297340 user1 broken_… shared 2 5 0.04 0.04
#> 10 297331 user1 broken_… shared 2 5 1.16 1.16
#> # ℹ 4 more variables: array_task_id <int>, exit_code <int>,
#> # wallclock_time <drtn>, status <fct>
Now let’s choose a better memory request:
stat_df <- job_df |>
    # This example includes tasks that fail. We're only interested in memory
    # for successfully completed tasks
    filter(status != "FAILED") |>
    summarize(
        mean_mem = mean(max_vmem_gb),
        std_mem = sd(max_vmem_gb),
        max_mem = max(max_vmem_gb)
    )
# We could choose a new memory request as 3 standard deviations above the mean
# of actual memory usage
new_limit <- stat_df$mean_mem + 3 * stat_df$std_mem
print(
    sprintf(
        "%.02fG is a better memory request than %.02fG, which was used before",
        new_limit,
        job_df$requested_mem_gb[1]
    )
)
#> [1] "2.12G is a better memory request than 5.00G, which was used before"
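The same rule of thumb can be wrapped in a small helper for reuse across workflows. The sketch below is one possible variant, not part of slurmjobs: suggest_mem_gb() is a hypothetical function, and rounding up to whole gigabytes and never going below the observed maximum are assumptions added here for illustration.

```r
# Suggest a memory request (in whole GB) from the observed peak memory of each
# task: the mean plus 'n_sd' standard deviations, never below the observed max
suggest_mem_gb <- function(peak_mem_gb, n_sd = 3) {
    peak_mem_gb <- peak_mem_gb[!is.na(peak_mem_gb)]
    # A standard deviation needs at least two observations
    stopifnot(length(peak_mem_gb) >= 2)
    candidate <- mean(peak_mem_gb) + n_sd * sd(peak_mem_gb)
    ceiling(max(candidate, max(peak_mem_gb)))
}

# Illustrative peak memory values (GB) for successfully completed tasks
suggested <- suggest_mem_gb(c(0.48, 0.61, 1.15, 1.38, 1.16))
message("Suggested request: ", suggested, "G")
```

Rounding up trades a little memory for a simpler request string; a finer granularity may be preferable when jobs are small.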
The slurmjobs package (LieberInstitute, 2025) was made possible thanks to:
This package was developed using biocthis.
Code for creating the vignette
## Create the vignette
library("rmarkdown")
system.time(render("slurmjobs.Rmd", "BiocStyle::html_document"))
## Extract the R code
library("knitr")
knit("slurmjobs.Rmd", tangle = TRUE)
Date the vignette was generated.
#> [1] "2025-01-06 16:47:02 UTC"
Wallclock time spent generating the vignette.
#> Time difference of 1.988 secs
R session information.
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.4.2 (2024-10-31)
#> os Ubuntu 24.04.1 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language en
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz UTC
#> date 2025-01-06
#> pandoc 3.6 @ /usr/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> backports 1.5.0 2024-05-23 [1] RSPM (R 4.4.0)
#> bibtex 0.5.1 2023-01-26 [1] RSPM (R 4.4.0)
#> BiocManager 1.30.25 2024-08-28 [2] CRAN (R 4.4.2)
#> BiocStyle * 2.34.0 2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
#> bookdown 0.41 2024-10-16 [1] RSPM (R 4.4.0)
#> bslib 0.8.0 2024-07-29 [2] RSPM (R 4.4.0)
#> cachem 1.1.0 2024-05-16 [2] RSPM (R 4.4.0)
#> cli 3.6.3 2024-06-21 [2] RSPM (R 4.4.0)
#> crayon 1.5.3 2024-06-20 [2] RSPM (R 4.4.0)
#> desc 1.4.3 2023-12-10 [2] RSPM (R 4.4.0)
#> digest 0.6.37 2024-08-19 [2] RSPM (R 4.4.0)
#> dplyr * 1.1.4 2023-11-17 [1] RSPM (R 4.4.0)
#> evaluate 1.0.1 2024-10-10 [2] RSPM (R 4.4.0)
#> fastmap 1.2.0 2024-05-15 [2] RSPM (R 4.4.0)
#> fs 1.6.5 2024-10-30 [2] RSPM (R 4.4.0)
#> generics 0.1.3 2022-07-05 [1] RSPM (R 4.4.0)
#> glue 1.8.0 2024-09-30 [2] RSPM (R 4.4.0)
#> htmltools 0.5.8.1 2024-04-04 [2] RSPM (R 4.4.0)
#> htmlwidgets 1.6.4 2023-12-06 [2] RSPM (R 4.4.0)
#> httr 1.4.7 2023-08-15 [1] RSPM (R 4.4.0)
#> jquerylib 0.1.4 2021-04-26 [2] RSPM (R 4.4.0)
#> jsonlite 1.8.9 2024-09-20 [2] RSPM (R 4.4.0)
#> knitcitations * 1.0.12 2021-01-10 [1] RSPM (R 4.4.0)
#> knitr 1.49 2024-11-08 [2] RSPM (R 4.4.0)
#> lifecycle 1.0.4 2023-11-07 [2] RSPM (R 4.4.0)
#> lubridate 1.9.4 2024-12-08 [1] RSPM (R 4.4.0)
#> magrittr 2.0.3 2022-03-30 [2] RSPM (R 4.4.0)
#> pillar 1.10.0 2024-12-17 [2] RSPM (R 4.4.0)
#> pkgconfig 2.0.3 2019-09-22 [2] RSPM (R 4.4.0)
#> pkgdown 2.1.1 2024-09-17 [2] RSPM (R 4.4.0)
#> plyr 1.8.9 2023-10-02 [1] RSPM (R 4.4.0)
#> purrr 1.0.2 2023-08-10 [2] RSPM (R 4.4.0)
#> R6 2.5.1 2021-08-19 [2] RSPM (R 4.4.0)
#> ragg 1.3.3 2024-09-11 [2] RSPM (R 4.4.0)
#> Rcpp 1.0.13-1 2024-11-02 [2] RSPM (R 4.4.0)
#> RefManageR * 1.4.0 2022-09-30 [1] RSPM (R 4.4.0)
#> rlang 1.1.4 2024-06-04 [2] RSPM (R 4.4.0)
#> rmarkdown 2.29 2024-11-04 [2] RSPM (R 4.4.0)
#> sass 0.4.9 2024-03-15 [2] RSPM (R 4.4.0)
#> sessioninfo * 1.2.2 2021-12-06 [2] RSPM (R 4.4.0)
#> slurmjobs * 1.2.5 2025-01-06 [1] local
#> stringi 1.8.4 2024-05-06 [2] RSPM (R 4.4.0)
#> stringr 1.5.1 2023-11-14 [2] RSPM (R 4.4.0)
#> systemfonts 1.1.0 2024-05-15 [2] RSPM (R 4.4.0)
#> textshaping 0.4.1 2024-12-06 [2] RSPM (R 4.4.0)
#> tibble 3.2.1 2023-03-20 [2] RSPM (R 4.4.0)
#> tidyselect 1.2.1 2024-03-11 [1] RSPM (R 4.4.0)
#> timechange 0.3.0 2024-01-18 [1] RSPM (R 4.4.0)
#> utf8 1.2.4 2023-10-22 [2] RSPM (R 4.4.0)
#> vctrs 0.6.5 2023-12-01 [2] RSPM (R 4.4.0)
#> withr 3.0.2 2024-10-28 [2] RSPM (R 4.4.0)
#> xfun 0.49 2024-10-31 [2] RSPM (R 4.4.0)
#> xml2 1.3.6 2023-12-04 [2] RSPM (R 4.4.0)
#> yaml 2.3.10 2024-07-26 [2] RSPM (R 4.4.0)
#>
#> [1] /__w/_temp/Library
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/local/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
This vignette was generated using BiocStyle (Oleś, 2024) with knitr (Xie, 2024) and rmarkdown (Allaire, Xie, Dervieux et al., 2024) running behind the scenes.
Citations made with RefManageR (McLean, 2017).
[1] J. Allaire, Y. Xie, C. Dervieux, et al. rmarkdown: Dynamic Documents for R. R package version 2.29. 2024. URL: https://github.com/rstudio/rmarkdown.
[2] LieberInstitute. slurmjobs: Helper Functions for SLURM Jobs. https://github.com/LieberInstitute/slurmjobs/slurmjobs - R package version 1.2.5. 2025. DOI: 10.18129/B9.bioc.slurmjobs. URL: http://www.bioconductor.org/packages/slurmjobs.
[3] M. W. McLean. “RefManageR: Import and Manage BibTeX and BibLaTeX References in R”. In: The Journal of Open Source Software (2017). DOI: 10.21105/joss.00338.
[4] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.34.0. 2024. DOI: 10.18129/B9.bioc.BiocStyle. URL: https://bioconductor.org/packages/BiocStyle.
[5] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2024. URL: https://www.R-project.org/.
[6] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5–10. URL: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.
[7] H. Wickham, W. Chang, R. Flight, et al. sessioninfo: R Session Information. R package version 1.2.2, https://r-lib.github.io/sessioninfo/. 2021. URL: https://github.com/r-lib/sessioninfo#readme.
[8] H. Wickham, R. François, L. Henry, et al. dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://github.com/tidyverse/dplyr. 2023. URL: https://dplyr.tidyverse.org.
[9] Y. Xie. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.49. 2024. URL: https://yihui.org/knitr/.