SGE limits the number of tasks an array job can have at submission time. At JHPCE that limit is 75000, so to run 75001 tasks you would first need to submit the job for task IDs 1 through 75000 and then submit it again for task 75001 by itself. This function automates that process.

array_submit_num(
  job_bash,
  array_num,
  submit =
    file.exists("/cm/shared/apps/sge/sge-8.1.9/default/common/accounting_20191007_0300.txt"),
  restore = TRUE,
  array_max = 75000L
)

task_ids_num(array_num, array_max = 75000L)

Arguments

job_bash

A character(1) vector with the name of a bash script in the current working directory.

array_num

An integer(1) number of tasks to submit for the given job. If array_num is greater than array_max, the tasks are split across several array jobs of at most array_max tasks each.

submit

A logical(1) vector determining whether to actually submit the tasks using qsub.

restore

A logical(1) vector determining whether to restore the script to the original state.

array_max

The maximum number of tasks per SGE array job. At JHPCE, that is 75000.

Value

array_submit_num: Uses array_submit() to submit an array job for a given number of tasks as determined by task_ids_num().

task_ids_num: A character vector of SGE task IDs compliant with array_max.

Author

Leonardo Collado-Torres

Examples


## Choose a script name
job_name <- paste0("array_submit_num_example_", Sys.Date())

## Create an array job on the temporary directory
with_wd(tempdir(), {
    ## Create an array job script to use for this example
    job_single(
        name = job_name,
        create_shell = TRUE,
        task_num = 1
    )

    ## Now we can submit the SGE job for a given number of tasks
    array_submit_num(
        job_bash = paste0(job_name, ".sh"),
        array_num = 75001,
        submit = FALSE
    )
})
#> 2023-05-07 07:12:13.717048 creating the logs directory at:  logs
#> 2023-05-07 07:12:13.718313 creating the shell file array_submit_num_example_2023-05-07.sh
#> To submit the job use: qsub array_submit_num_example_2023-05-07.sh
#> 2023-05-07 07:12:13.719863 resubmitting the SGE job for task 1-75000
#> qsub array_submit_num_example_2023-05-07.sh
#> 2023-05-07 07:12:13.720528 resubmitting the SGE job for task 75001-75001
#> qsub array_submit_num_example_2023-05-07.sh
#> [1] "array_submit_num_example_2023-05-07.sh"


## Get the list of task IDs for a different set of task numbers
task_ids_num(1)
#> [1] "1-1"
task_ids_num(75000)
#> [1] "1-75000"
task_ids_num(75001)
#> [1] "1-75000"     "75001-75001"
task_ids_num(150001)
#> [1] "1-75000"       "75001-150000"  "150001-150001"
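
The chunking shown in the output above can be reproduced with a few lines of base R. The following is a minimal sketch, not the package's actual implementation: the function name chunk_task_ids() is hypothetical, and it assumes that both arguments are positive whole numbers small enough to be stored as R integers.

```r
## Hypothetical helper (not part of the package): split the task IDs
## 1..array_num into consecutive ranges of at most array_max tasks each.
chunk_task_ids <- function(array_num, array_max = 75000L) {
    ## Coerce to integer so that large values are never formatted in
    ## scientific notation (e.g. "1e+05") when building the strings
    array_num <- as.integer(array_num)
    array_max <- as.integer(array_max)
    stopifnot(array_num >= 1L, array_max >= 1L)

    ## First task ID of every chunk: 1, array_max + 1, 2 * array_max + 1, ...
    starts <- as.integer(seq.int(1L, array_num, by = array_max))

    ## Last task ID of every chunk, capped at array_num for the final chunk
    ends <- pmin(starts + array_max - 1L, array_num)

    ## Format as "start-end" ranges
    paste0(starts, "-", ends)
}

chunk_task_ids(75001)
#> [1] "1-75000"     "75001-75001"
chunk_task_ids(10, array_max = 4)
#> [1] "1-4"  "5-8"  "9-10"
```

Each element of the result is one task range, so the number of elements is the number of separate array jobs that would need to be submitted.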