3 All Command Options

3.1 SPEAQeasy Options

On the command line inside the SPEAQeasy repository, one can run Software/nextflow run main.nf --help (at JHPCE, nextflow run main.nf --help) to display the full list of SPEAQeasy options.

3.1.1 Mandatory Parameters

  • --sample “single” or “paired”: the orientation of your reads
  • --strand “unstranded”, “forward”, or “reverse”: the strandness of your reads. Since strandness is inferred by sample in the pipeline, this option informs the pipeline to generate appropriate warnings if unexpected strandness is inferred. See here for recommendations about choosing this parameter.
  • --reference “hg38”, “hg19”, “mm10”, or “rat”: the reference genome to which reads are aligned, where “rat” refers to either “Rnor_6.0” or “mRatBN7.2”, depending on the choice of ensembl_version_rat

3.1.2 Optional Parameters

  • --annotation The path to the directory containing pipeline annotations. Defaults to “[SPEAQeasy dir]/Annotation”. Note only full paths may be used! If annotations are not found here, the pipeline includes a step to build them.
  • --coverage Include this flag to produce coverage bigWigs and compute expressed genomic regions. These steps are a useful precursor for analyses involving finding differentially expressed regions (DERs). Default: false
  • --custom_anno [label] Include this flag to indicate that the directory specified with --annotation [dir] includes user-provided annotation files to use instead of the default files. See the “Using custom annotation” section for more details.
  • --ercc Include this flag to enable ERCC quantification with Kallisto. This is appropriate for experiments where samples have ERCC spike-ins.
  • --experiment Name of the experiment being run (ex: “alzheimer”). Defaults to “Jlab_experiment”
  • --fullCov Include this flag to perform full coverage analysis.
  • --help Provides an explanation about each of these command options, and exits.
  • --input The path to the directory with the “samples.manifest” file. Defaults to “[SPEAQeasy dir]/input”. Note only full paths may be used!
  • --keep_unpaired include this flag to keep unpaired reads output from trimming paired-end samples, for use in alignment. Default: false, as this can cause issues in downstream tools like FeatureCounts.
  • --output The path to the directory to store pipeline output files/ objects. Defaults to “[SPEAQeasy dir]/results”. Note only full paths may be used!
  • --prefix An additional label (in conjunction with “experiment”) to use for the pipeline run- affecting result file names.
  • --qsva [path] Optional full path to a text file, containing one Ensembl transcript ID per line for each transcript desired in the final transcripts R output object (called rse_tx)
  • --small_test Uses sample files located in the test folder as input. Overrides the “–input” option.
  • --strand_mode [mode] Determines how to handle disagreement between user-asserted and SPEAQeasy-inferred strandness:
    “accept”: warn about disagreement but continue, using SPEAQeasy-inferred strandness downstream
    “declare”: warn about disagreement but continue, using user-asserted strandness downstream
    “strict”: (default) halt with an error if any disagreement occurs (since this often indicates problems in the input data) See here for recommendations about choosing this parameter.
  • --trim_mode [mode] Determines the conditions under which trimming occurs:
    “skip”: do not perform trimming on samples
    “adaptive”: [default] perform trimming on samples that have failed the FastQC “Adapter content” metric
    “force”: perform trimming on all samples
  • --unalign Include this flag to save discordant reads (when using HISAT2) or unmapped reads (when using STAR) after the alignment step (false/ not included by default)
  • --use_salmon Include this flag to quantify transcripts with Salmon rather than the default of Kallisto.
  • --use_star Include this flag to use STAR during alignment, instead of the default of HISAT2.

3.2 Nextflow Options

The nextflow command itself provides many additional options you may add to your “main” script. A few of the most commonly applicable ones are documented below. For a full list, type [path to nextflow] run -h- the full list does not appear to be documented at nextflow’s website.

  • -w [path] Path to the directory where nextflow will place temporary files. This directory can fill up very quickly, especially for large experiments, and so it can be useful to set this to a scratch directory or filesystem with plenty of storage capacity.
  • -resume Include this flag if pipeline execution halts with an error for any reason, and you wish to continue where you left off from last run. Otherwise, by default, nextflow will restart execution from the beginning.
  • -with-report [filename] Include this to produce an html report with execution details (such as memory usage, completion details, and much more)
  • N [email address] Sends email to the specified address to notify the user regarding pipeline completion. Note that nextflow relies on the sendmail tool for this functionality- therefore sendmail must be available for this option to work.