General Usage¶
CBIcall execution requires:
Input Files¶
A folder containing paired-end FASTQ files, for example:
Example input data is available under examples/input/.
Parameters File¶
CBIcall is configured through a YAML parameters file.
Below are the main parameters and their default values.
Essential Parameters¶
mode: single
pipeline: wes
input_dir: null
sample_map: null
workflow_engine: bash
gatk_version: gatk-4.6
genome: b37
cleanup_bam: false
workflow_rule: null
allow_partial_run: false
project_dir: cbicall
Optional Parameters¶
CBIcall creates a dedicated project directory to store analysis outputs. Multiple runs can coexist without modifying the original inputs.
Parameter Reference¶
cleanup_bam¶
Set to true to delete 01_bam/*.{bam,bai} after the workflow completes.
gatk_version¶
Supported values:
gatk-3.5gatk-4.6
genome¶
Supported values:
b37(default for non-mtDNA runs)hg38rsrs(forced forpipeline: mit)
mode¶
Supported values:
singlecohort
pipeline¶
Supported values:
weswgsmit
For cohort analysis, complete the corresponding single analysis for each sample first.
project_dir¶
Prefix used for the generated run directory, for example cancer_sample_001.
It can also contain a path such as foo/cancer_sample_001.
When input_dir is defined, the final run directory is created below input_dir, with a unique run identifier appended automatically.
input_dir¶
Path, relative or absolute, to the directory containing FASTQ inputs for analysis.
Example:
sample_map¶
Cohort-mode parameter pointing to the TSV file containing sample identifiers and GVCF paths.
Example:
workflow_engine¶
Supported workflow engines:
bashsnakemake
workflow_rule¶
Optional workflow target used to start a partial run. Leave it unset for a normal full workflow execution.
Partial runs are currently supported for the snakemake workflow engine.
allow_partial_run¶
Safety switch for partial runs. Set it to true together with workflow_rule to explicitly allow a partial workflow start.
If workflow_rule is set but allow_partial_run is not true, CBIcall refuses to start the run.
CLI Synopsis¶
Arguments:
-p,--param: Parameters input file (YAML)-t,--threads: Number of CPUs/Cores/Threads
Options:
-debug,--debug: Debugging level-h,--help: Brief help message-man: Full documentation-v,--version: Show version information-verbose: Enable verbose output-nc,--no-color: Disable ANSI colors in STDOUT
Example Commands¶
bin/cbicall -p param_file.yaml -t 8
bin/cbicall -p param_file.yaml -t 4 -verbose
bin/cbicall -p param_file.yaml -t 16 > log 2>&1
python3 -u ../../bin/cbicall -p param.yaml -t 8 --no-color > log
$path_to_cbicall/bin/cbicall -p param_file.yaml -t 8 -debug 5
nohup bin/cbicall -p param_file.yaml -t 4 &
Partial Runs¶
Partial runs are intended for workflow-layer restarts or targeted execution, without changing the output directory naming scheme.
Example:
Behavior:
- if
workflow_ruleis unset, the full workflow runs - if
workflow_ruleis set, the run is marked as partial in metadata and CLI output - if
workflow_ruleis set withoutallow_partial_run: true, the run is rejected
Outputs and Logging¶
Each run creates a project directory containing:
- the workflow log file
log.jsonwith CLI arguments, resolved configuration, and YAML parameters- workflow-generated outputs
The project directory name is not changed when workflow_rule is used.
Recommended Specifications¶
CBIcall is optimized for multi-core Linux desktop, workstation, or server environments.
- Works on amd64 and arm64 architectures, including Apple Silicon
- Debian-based distributions are preferred, though others may work
- At least 8 GB RAM
- At least 4 CPU cores
- At least 250 GB disk space
Additional Requirements for Source Installation¶
These are required only when installing CBIcall from source:
- Python 3.8 or newer
- Java 8
- Snakemake