Basics

Install sgejobs

R is an open-source statistical environment whose functionality can be extended via packages. sgejobs is an R package available on GitHub. R can be installed on any operating system from CRAN, after which you can install sgejobs by running the following commands in your R session:

if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
}

remotes::install_github("LieberInstitute/sgejobs")

Asking for help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But R and JHPCE/SGE have a steep learning curve, so it is critical to learn where to ask for help. For JHPCE questions, please check the JHPCE help page. For sgejobs, please post issues on GitHub. However, please note that if you want to receive help you should adhere to the posting guidelines. It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.
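For example, one way to capture the session information to include in your help request is the same call the generated job scripts use (this assumes the sessioninfo package is installed):

```r
## Capture session information to paste into your help request
## (sessioninfo is a CRAN package; install it first if needed)
options(width = 120)
sessioninfo::session_info()
```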

Citing sgejobs

We hope that sgejobs will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

## Citation info
citation("sgejobs")
#> To cite package 'sgejobs' in publications use:
#> 
#>   Collado-Torres L (2023). _sgejobs: Helper functions for SGE jobs at
#>   JHPCE_. R package version 0.99.2,
#>   <https://github.com/LieberInstitute/sgejobs>.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {sgejobs: Helper functions for SGE jobs at JHPCE},
#>     author = {Leonardo Collado-Torres},
#>     year = {2023},
#>     note = {R package version 0.99.2},
#>     url = {https://github.com/LieberInstitute/sgejobs},
#>   }

Overview

The package sgejobs contains a few helper functions for those of you interacting with a Son of Grid Engine¹ high-performance computing cluster such as JHPCE.

To start off, load the package.
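Assuming you installed sgejobs as described above, loading it is the usual `library()` call:

```r
library("sgejobs")
```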

Creating SGE job bash scripts

At this point, if you are starting from scratch you might want to create a bash script for submitting an SGE job. For this purpose, sgejobs contains two functions that help you build either a single SGE job (which could be an array job) or a script that loops through some variables and creates SGE bash scripts. These are job_single() and job_loop().

job_single() takes many arguments that control common SGE job options such as the number of cores, the cluster queue, and memory requirements. Note that the resulting bash script contains some information that can be useful for reproducibility purposes, such as the SGE job ID, the compute node it ran on, the list of modules loaded by the user, etc.

## job_single() builds a template bash script for a single job
job_single("jhpce_job", create_logdir = FALSE)
#> #!/bin/bash
#> #$ -cwd
#> #$ -l mem_free=10G,h_vmem=10G,h_fsize=100G
#> #$ -N jhpce_job
#> #$ -o logs/jhpce_job.txt
#> #$ -e logs/jhpce_job.txt
#> #$ -m e
#> 
#> echo "**** Job starts ****"
#> date
#> 
#> echo "**** JHPCE info ****"
#> echo "User: ${USER}"
#> echo "Job id: ${JOB_ID}"
#> echo "Job name: ${JOB_NAME}"
#> echo "Hostname: ${HOSTNAME}"
#> echo "Task id: ${SGE_TASK_ID}"
#> 
#> ## Load the R module (absent since the JHPCE upgrade to CentOS v7)
#> module load conda_R
#> 
#> ## List current modules for reproducibility
#> module list
#> 
#> ## Edit with your job command
#> Rscript -e "options(width = 120); sessioninfo::session_info()"
#> 
#> echo "**** Job ends ****"
#> date
#> 
#> ## This script was made using sgejobs version 0.99.2
#> ## available from http://research.libd.org/sgejobs/

## if you specify task_num then it becomes an array job
job_single("jhpce_array_job", create_logdir = FALSE, task_num = 20)
#> #!/bin/bash
#> #$ -cwd
#> #$ -l mem_free=10G,h_vmem=10G,h_fsize=100G
#> #$ -N jhpce_array_job
#> #$ -o logs/jhpce_array_job.$TASK_ID.txt
#> #$ -e logs/jhpce_array_job.$TASK_ID.txt
#> #$ -m e
#> #$ -t 1-20
#> #$ -tc 20
#> 
#> echo "**** Job starts ****"
#> date
#> 
#> echo "**** JHPCE info ****"
#> echo "User: ${USER}"
#> echo "Job id: ${JOB_ID}"
#> echo "Job name: ${JOB_NAME}"
#> echo "Hostname: ${HOSTNAME}"
#> echo "Task id: ${SGE_TASK_ID}"
#> 
#> ## Load the R module (absent since the JHPCE upgrade to CentOS v7)
#> module load conda_R
#> 
#> ## List current modules for reproducibility
#> module list
#> 
#> ## Edit with your job command
#> Rscript -e "options(width = 120); sessioninfo::session_info()"
#> 
#> echo "**** Job ends ****"
#> date
#> 
#> ## This script was made using sgejobs version 0.99.2
#> ## available from http://research.libd.org/sgejobs/

job_loop() is a bit more involved since you have to specify the loops argument, a named list. The loops argument specifies the bash variable names and values to loop through when creating a series of bash scripts that will get submitted to SGE. This type of bash script is something we use frequently, for example in the compute_weights.sh script (Collado-Torres, Burke, Peterson, Shin et al., 2019). I believe this type of script generator is something Alyssa Frazee taught me back in the day; you can see it in some old repositories such as leekgroup/derSoftware. Besides the loops argument, job_loop() shares most of its options with job_single().

job_loop(
    loops = list(region = c("DLPFC", "HIPPO"), feature = c("gene", "exon", "tx", "jxn")),
    name = "bsp2_test"
)
#> #!/bin/bash
#> 
#> ## Usage:
#> # sh bsp2_test.sh
#> 
#> ## Create the logs directory
#> mkdir -p logs
#> 
#> for region in DLPFC HIPPO; do
#>     for feature in gene exon tx jxn; do
#> 
#>     ## Internal script name
#>     SHORT="bsp2_test_${region}_${feature}"
#> 
#>     # Construct shell file
#>     echo "Creating script bsp2_test_${region}_${feature}"
#>     cat > .${SHORT}.sh <<EOF
#> #!/bin/bash
#> #$ -cwd
#> #$ -l mem_free=10G,h_vmem=10G,h_fsize=100G
#> #$ -N ${SHORT}
#> #$ -o logs/${SHORT}.txt
#> #$ -e logs/${SHORT}.txt
#> #$ -m e
#> 
#> echo "**** Job starts ****"
#> date
#> 
#> echo "**** JHPCE info ****"
#> echo "User: \${USER}"
#> echo "Job id: \${JOB_ID}"
#> echo "Job name: \${JOB_NAME}"
#> echo "Hostname: \${HOSTNAME}"
#> echo "Task id: \${SGE_TASK_ID}"
#> 
#> ## Load the R module (absent since the JHPCE upgrade to CentOS v7)
#> module load conda_R
#> 
#> ## List current modules for reproducibility
#> module list
#> 
#> ## Edit with your job command
#> Rscript -e "options(width = 120); print('${region}'); print('${feature}'); sessioninfo::session_info()"
#> 
#> echo "**** Job ends ****"
#> date
#> 
#> ## This script was made using sgejobs version 0.99.2
#> ## available from http://research.libd.org/sgejobs/
#> 
#> 
#> EOF
#> 
#>     call="qsub .${SHORT}.sh"
#>     echo $call
#>     $call
#>     done
#> done

Submitting SGE jobs

Once you fine-tune your SGE bash script, you can submit it with qsub. sgejobs also contains some helper functions for a few different scenarios.

To start off, array_submit() can take a local bash script and submit it for you. However, the need for it becomes more apparent when multiple tasks for an SGE array job fail. You can detect tasks that failed using qstat | grep Eqw or similar SGE commands. To build a vector of tasks that you need to re-run, parse_task_ids() will be helpful.

## This is actual example output from "qstat | grep Eqw"
## 7770638 0.50105 compute_we lcollado     Eqw   08/16/2019 14:57:25                                    1 225001-225007:1,225009-225017:2,225019-225038:1,225040,225043,225047,225049
## parse_task_ids() can then take as input the list of task ids that failed
task_ids <- parse_task_ids("225001-225007:1,225009-225017:2,225019-225038:1,225040,225043,225047,225049")
task_ids
#> [1] "225001-225007:1" "225009-225017:2" "225019-225038:1" "225040-225040"  
#> [5] "225043-225043"   "225047-225047"   "225049-225049"

Next, you can re-submit those task IDs for the given SGE bash script using array_submit(). While in this example we are explicitly providing the output of parse_task_ids() to array_submit(), it is not necessary to do so, since parse_task_ids() is used internally by array_submit().

## Choose a script name
job_name <- paste0("array_submit_example_", Sys.Date())

## Create an array job on the temporary directory
with_wd(tempdir(), {
    ## Create an array job script to use for this example
    job_single(
        name = job_name,
        create_shell = TRUE,
        task_num = 100
    )

    ## Now we can submit the SGE job for a set of task IDs
    array_submit(
        job_bash = paste0(job_name, ".sh"),
        task_ids = task_ids,
        submit = FALSE
    )
})
#> 2023-05-07 07:12:18.80442 creating the logs directory at:  logs
#> 2023-05-07 07:12:18.805849 creating the shell file array_submit_example_2023-05-07.sh
#> To submit the job use: qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.807259 resubmitting the SGE job for task 225001-225007:1
#> qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.807899 resubmitting the SGE job for task 225009-225017:2
#> qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.808508 resubmitting the SGE job for task 225019-225038:1
#> qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.809109 resubmitting the SGE job for task 225040-225040
#> qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.809724 resubmitting the SGE job for task 225043-225043
#> qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.810359 resubmitting the SGE job for task 225047-225047
#> qsub array_submit_example_2023-05-07.sh
#> 2023-05-07 07:12:18.810962 resubmitting the SGE job for task 225049-225049
#> qsub array_submit_example_2023-05-07.sh
#> [1] "array_submit_example_2023-05-07.sh"

Another scenario you might run into when submitting SGE array jobs is a limit on the number of tasks a job can have. For example, at JHPCE that limit is 75,000. If you have a job with 150,000 tasks, you could first submit a job for tasks 1 through 75,000 and then a second one for tasks 75,001 through 150,000. To simplify this process, array_submit_num() does this for you.

## Choose a script name
job_name <- paste0("array_submit_num_example_", Sys.Date())

## Create an array job on the temporary directory
with_wd(tempdir(), {
    ## Create an array job script to use for this example
    job_single(
        name = job_name,
        create_shell = TRUE,
        task_num = 1
    )

    ## Now we can submit the SGE job for a given number of tasks
    array_submit_num(
        job_bash = paste0(job_name, ".sh"),
        array_num = 150000,
        submit = FALSE
    )
})
#> 2023-05-07 07:12:18.977514 creating the logs directory at:  logs
#> 2023-05-07 07:12:18.978831 creating the shell file array_submit_num_example_2023-05-07.sh
#> To submit the job use: qsub array_submit_num_example_2023-05-07.sh
#> 2023-05-07 07:12:18.980417 resubmitting the SGE job for task 1-75000
#> qsub array_submit_num_example_2023-05-07.sh
#> 2023-05-07 07:12:18.981029 resubmitting the SGE job for task 75001-150000
#> qsub array_submit_num_example_2023-05-07.sh
#> [1] "array_submit_num_example_2023-05-07.sh"

SGE job log files and accounting

If you built your SGE job bash scripts using the functions from this package, you can extract information from the log files, such as the dates when the job started and ended as well as the SGE job ID. These two pieces of information can be extracted using log_date() and log_jobid() as shown below:

## Example log file
bsp2_log <- system.file("extdata", "logs", "delete_bsp2.txt", package = "sgejobs")

## Find start/end dates
log_date(bsp2_log)
#>                     start                       end 
#> "2019-09-07 14:40:21 EST" "2019-09-07 14:40:59 EST"

## Find job id
log_jobid(bsp2_log)
#> [1] "92500"

The idea is that these two functions, along with a third function (yet to be implemented), would help you find the accounting file for a given SGE job, such that you could extract the accounting information from that file using qacct. For example: qacct -f /cm/shared/apps/sge/sge-8.1.9/default/common/accounting_20191007_0300.txt -j 92500.

In any case, once you have the SGE job ID (whether it is an array job or not), you can extract the accounting information using the accounting() functions. Since some of them require that qacct be available on your system, the code is split into functions that read the data and functions that parse it.

## If you are at JHPCE you can run this:
if (FALSE) {
    accounting(
        c("92500", "77672"),
        "/cm/shared/apps/sge/sge-8.1.9/default/common/accounting_20191007_0300.txt"
    )
}

## However, if you are not at JHPCE, you can use the example data included
## in the package (avoiding any dependency on JHPCE), where the data for
## job 77672 has been subset to the first two tasks.
accounting_info <- list(
    "92500" = readLines(system.file("extdata", "accounting", "92500.txt",
        package = "sgejobs"
    )),
    "77672" = readLines(system.file("extdata", "accounting", "77672.txt",
        package = "sgejobs"
    ))
)

## Here we parse the data from `qacct` into a data.frame
res <- accounting_parse(accounting_info)
#> 2023-05-07 07:12:19.15694 processing job 92500
#> 2023-05-07 07:12:19.163036 processing job 77672
#> Note: the column 'mem' is now in bytes / second.
res
#>   input_id account ar_sub_time      arid
#> 1  77672.1     sge   undefined undefined
#> 2  77672.2     sge   undefined undefined
#> 3  92500.0     sge   undefined undefined
#>                                                                     category
#> 1 -u lcollado -l h_fsize=100G,h_stack=512M,h_vmem=3G,mem_free=3G -pe local 4
#> 2 -u lcollado -l h_fsize=100G,h_stack=512M,h_vmem=3G,mem_free=3G -pe local 4
#> 3           -u lcollado -l h_fsize=100G,h_stack=512M,h_vmem=10G,mem_free=10G
#>        cpu        department            end_time exit_status failed granted_pe
#> 1 747.887s defaultdepartment 2019-09-04 11:55:14           0      0      local
#> 2 878.682s defaultdepartment 2019-09-04 12:00:26           0      0      local
#> 3   1.559s defaultdepartment 2019-09-07 14:40:59           0      0       NONE
#>          group               hostname          io    iow
#> 1 lieber_jaffe compute-051.cm.cluster 41198000000 0.000s
#> 2 lieber_jaffe compute-051.cm.cluster 35579000000 0.000s
#> 3 lieber_jaffe compute-089.cm.cluster      779092 0.000s
#>                                 jobname jobnumber   maxvmem         mem
#> 1 compute_aucs_duplicatesRemoved_v0.4.0     77672 953344000 6.93543e+11
#> 2 compute_aucs_duplicatesRemoved_v0.4.0     77672 953348000 6.31653e+11
#> 3                           delete_bsp2     92500   5867000 6.07126e+05
#>      owner priority project    qname           qsub_time ru_idrss ru_inblock
#> 1 lcollado        0    NONE shared.q 2019-09-04 11:45:54        0   33755664
#> 2 lcollado        0    NONE shared.q 2019-09-04 11:45:54        0   40880344
#> 3 lcollado        0    NONE shared.q 2019-09-07 14:40:00        0         64
#>   ru_ismrss ru_isrss ru_ixrss ru_majflt ru_maxrss ru_minflt ru_msgrcv ru_msgsnd
#> 1         0        0        0         1    953348    124265         0         0
#> 2         0        0        0         0    953348    132866         0         0
#> 3         0        0        0         0      7004     53903         0         0
#>   ru_nivcsw ru_nsignals ru_nswap ru_nvcsw ru_oublock ru_stime ru_utime
#> 1     44094           0        0  3237249        192 135.616s 612.272s
#> 2     63844           0        0  3572098        192 155.540s 723.142s
#> 3       107           0        0     8837        184   0.814s   0.745s
#>   ru_wallclock slots          start_time    taskid
#> 1         526s     4 2019-09-04 11:46:28         1
#> 2         832s     4 2019-09-04 11:46:34         2
#> 3          43s     1 2019-09-07 14:40:16 undefined

## Check the maximum memory use
as.numeric(res$maxvmem)
#> [1] 953344000 953348000   5867000

## The smallest of the maxvmem values (least memory used by any job/task)
pryr:::show_bytes(min(res$maxvmem))
#> 5.87 MB

## And the absolute maximum
pryr:::show_bytes(max(res$maxvmem))
#> 953 MB

The accounting() functions work for array jobs as well as regular jobs, which can be helpful for identifying whether some tasks are failing (exit_status other than 0) as well as their maximum memory use. With these functions you can store more information about your SGE jobs than what you get from the automatic emails (#$ -m e).
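For instance, once you have the parsed data.frame from accounting_parse(), a quick check for failed tasks might look like this (a sketch using the res object from above; the column names match the output shown earlier):

```r
## Identify tasks with a non-zero exit status (i.e., failures)
## and inspect their memory use
failed <- res[res$exit_status != 0, c("input_id", "exit_status", "maxvmem")]
failed
```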

Conclusions

We hope that the functions in sgejobs will make your work more reproducible and your life easier in terms of interacting with SGE. If you think of other potential use cases for sgejobs, please let us know. Also note that you might be interested in clusterRundown, written by John Muschelli, for getting a rundown of cluster resources through qstat, qmem, qhost, and qacct.

Reproducibility

The sgejobs package (Collado-Torres, 2023) was made possible thanks to:

Code for creating the vignette

## Create the vignette
library("rmarkdown")
system.time(render("sgejobs-quickstart.Rmd", "BiocStyle::html_document"))

## Extract the R code
library("knitr")
knit("sgejobs-quickstart.Rmd", tangle = TRUE)

Date the vignette was generated.

#> [1] "2023-05-07 07:12:19 UTC"

Wallclock time spent generating the vignette.

#> Time difference of 3.101 secs

R session information.

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21)
#>  os       Ubuntu 22.04.2 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       UTC
#>  date     2023-05-07
#>  pandoc   2.19.2 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package       * version date (UTC) lib source
#>  backports       1.4.1   2021-12-13 [1] CRAN (R 4.3.0)
#>  bibtex          0.5.1   2023-01-26 [1] RSPM (R 4.3.0)
#>  BiocManager     1.30.20 2023-02-24 [2] CRAN (R 4.3.0)
#>  BiocStyle     * 2.28.0  2023-04-25 [1] Bioconductor
#>  bookdown        0.33    2023-03-06 [1] RSPM (R 4.3.0)
#>  bslib           0.4.2   2022-12-16 [2] RSPM (R 4.3.0)
#>  cachem          1.0.8   2023-05-01 [2] RSPM (R 4.3.0)
#>  cli             3.6.1   2023-03-23 [2] RSPM (R 4.3.0)
#>  codetools       0.2-19  2023-02-01 [3] CRAN (R 4.3.0)
#>  crayon          1.5.2   2022-09-29 [2] RSPM (R 4.3.0)
#>  curl            5.0.0   2023-01-12 [2] RSPM (R 4.3.0)
#>  desc            1.4.2   2022-09-08 [2] RSPM (R 4.3.0)
#>  digest          0.6.31  2022-12-11 [2] RSPM (R 4.3.0)
#>  dplyr           1.1.2   2023-04-20 [1] RSPM (R 4.3.0)
#>  evaluate        0.20    2023-01-17 [2] RSPM (R 4.3.0)
#>  fansi           1.0.4   2023-01-22 [2] RSPM (R 4.3.0)
#>  fastmap         1.1.1   2023-02-24 [2] RSPM (R 4.3.0)
#>  fs              1.6.2   2023-04-25 [2] RSPM (R 4.3.0)
#>  generics        0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#>  glue            1.6.2   2022-02-24 [2] RSPM (R 4.3.0)
#>  hms             1.1.3   2023-03-21 [1] RSPM (R 4.3.0)
#>  htmltools       0.5.5   2023-03-23 [2] RSPM (R 4.3.0)
#>  httr            1.4.5   2023-02-24 [2] RSPM (R 4.3.0)
#>  jquerylib       0.1.4   2021-04-26 [2] RSPM (R 4.3.0)
#>  jsonlite        1.8.4   2022-12-06 [2] RSPM (R 4.3.0)
#>  knitcitations * 1.0.12  2021-01-10 [1] RSPM (R 4.3.0)
#>  knitr           1.42    2023-01-25 [2] RSPM (R 4.3.0)
#>  lifecycle       1.0.3   2022-10-07 [2] RSPM (R 4.3.0)
#>  lubridate       1.9.2   2023-02-10 [1] RSPM (R 4.3.0)
#>  magrittr        2.0.3   2022-03-30 [2] RSPM (R 4.3.0)
#>  memoise         2.0.1   2021-11-26 [2] RSPM (R 4.3.0)
#>  pillar          1.9.0   2023-03-22 [2] RSPM (R 4.3.0)
#>  pkgconfig       2.0.3   2019-09-22 [2] RSPM (R 4.3.0)
#>  pkgdown         2.0.7   2022-12-14 [2] RSPM (R 4.3.0)
#>  plyr            1.8.8   2022-11-11 [1] CRAN (R 4.3.0)
#>  pryr            0.1.6   2023-01-17 [1] RSPM (R 4.3.0)
#>  purrr           1.0.1   2023-01-10 [2] RSPM (R 4.3.0)
#>  R6              2.5.1   2021-08-19 [2] RSPM (R 4.3.0)
#>  ragg            1.2.5   2023-01-12 [2] RSPM (R 4.3.0)
#>  Rcpp            1.0.10  2023-01-22 [2] RSPM (R 4.3.0)
#>  readr           2.1.4   2023-02-10 [1] RSPM (R 4.3.0)
#>  RefManageR      1.4.0   2022-09-30 [1] CRAN (R 4.3.0)
#>  rlang           1.1.1   2023-04-28 [2] RSPM (R 4.3.0)
#>  rmarkdown       2.21    2023-03-26 [2] RSPM (R 4.3.0)
#>  rprojroot       2.0.3   2022-04-02 [2] RSPM (R 4.3.0)
#>  sass            0.4.6   2023-05-03 [2] RSPM (R 4.3.0)
#>  sessioninfo   * 1.2.2   2021-12-06 [2] RSPM (R 4.3.0)
#>  sgejobs       * 0.99.2  2023-05-07 [1] local
#>  stringi         1.7.12  2023-01-11 [2] RSPM (R 4.3.0)
#>  stringr         1.5.0   2022-12-02 [2] RSPM (R 4.3.0)
#>  systemfonts     1.0.4   2022-02-11 [2] RSPM (R 4.3.0)
#>  textshaping     0.3.6   2021-10-13 [2] RSPM (R 4.3.0)
#>  tibble          3.2.1   2023-03-20 [2] RSPM (R 4.3.0)
#>  tidyr           1.3.0   2023-01-24 [1] RSPM (R 4.3.0)
#>  tidyselect      1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#>  timechange      0.2.0   2023-01-11 [1] RSPM (R 4.3.0)
#>  tzdb            0.3.0   2022-03-28 [1] CRAN (R 4.3.0)
#>  utf8            1.2.3   2023-01-31 [2] RSPM (R 4.3.0)
#>  vctrs           0.6.2   2023-04-19 [2] RSPM (R 4.3.0)
#>  withr           2.5.0   2022-03-03 [2] RSPM (R 4.3.0)
#>  xfun            0.39    2023-04-20 [2] RSPM (R 4.3.0)
#>  xml2            1.3.4   2023-04-27 [2] RSPM (R 4.3.0)
#>  yaml            2.3.7   2023-01-23 [2] RSPM (R 4.3.0)
#> 
#>  [1] /__w/_temp/Library
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/local/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Bibliography

This vignette was generated using BiocStyle (Oleś, 2023) with knitr (Xie, 2014) and rmarkdown (Allaire, Xie, Dervieux, McPherson et al., 2023) running behind the scenes.

Citations made with knitcitations (Boettiger, 2021).

[1] J. Allaire, Y. Xie, C. Dervieux, J. McPherson, et al. rmarkdown: Dynamic Documents for R. R package version 2.21. 2023. https://github.com/rstudio/rmarkdown.

[2] C. Boettiger. knitcitations: Citations for ‘Knitr’ Markdown Files. R package version 1.0.12. 2021. https://github.com/cboettig/knitcitations.

[3] L. Collado-Torres. sgejobs: Helper functions for SGE jobs at JHPCE. R package version 0.99.2. 2023. https://github.com/LieberInstitute/sgejobs.

[4] L. Collado-Torres, E. E. Burke, A. Peterson, J. Shin, et al. “Regional Heterogeneity in Gene Expression, Regulation, and Coherence in the Frontal Cortex and Hippocampus across Development and Schizophrenia”. In: Neuron 103.2 (Jul. 2019), pp. 203-216.e8. DOI: 10.1016/j.neuron.2019.05.013. https://doi.org/10.1016/j.neuron.2019.05.013.

[5] G. Csárdi, J. Hester, H. Wickham, W. Chang, et al. remotes: R Package Installation from Remote Repositories, Including ‘GitHub’. https://remotes.r-lib.org, https://github.com/r-lib/remotes#readme. 2021.

[6] G. Grolemund and H. Wickham. “Dates and Times Made Easy with lubridate”. In: Journal of Statistical Software 40.3 (2011), pp. 1-25. https://www.jstatsoft.org/v40/i03/.

[7] J. Hester and J. Bryan. glue: Interpreted String Literals. https://github.com/tidyverse/glue, https://glue.tidyverse.org/. 2022.

[8] A. Oleś. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 2.28.0. 2023. DOI: 10.18129/B9.bioc.BiocStyle. https://bioconductor.org/packages/BiocStyle.

[9] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2023. https://www.R-project.org/.

[10] H. Wickham. pryr: Tools for Computing on the Language. R package version 0.1.6. 2023. https://github.com/hadley/pryr.

[11] H. Wickham. “testthat: Get Started with Testing”. In: The R Journal 3 (2011), pp. 5-10. https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf.

[12] H. Wickham, W. Chang, R. Flight, K. Müller, et al. sessioninfo: R Session Information. https://github.com/r-lib/sessioninfo#readme, https://r-lib.github.io/sessioninfo/. 2021.

[13] H. Wickham and L. Henry. purrr: Functional Programming Tools. https://purrr.tidyverse.org/, https://github.com/tidyverse/purrr. 2023.

[14] H. Wickham, J. Hester, and J. Bryan. readr: Read Rectangular Text Data. https://readr.tidyverse.org, https://github.com/tidyverse/readr. 2023.

[15] H. Wickham, D. Vaughan, and M. Girlich. tidyr: Tidy Messy Data. https://tidyr.tidyverse.org, https://github.com/tidyverse/tidyr. 2023.

[16] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014.



  1. The open-source version of Sun Grid Engine.↩︎