Configuration Reference

CBIcall runs from a YAML parameters file plus the CLI thread setting.

bin/cbicall run -p parameters.yaml -t 4

Run configuration is defined in YAML. Unknown YAML keys are rejected, so misspellings fail early instead of being silently ignored. The CLI supplies only runtime controls such as the parameter file, thread count, color output, and validation/test commands; it does not override YAML analysis keys.

Minimal WES single-sample run

mode:            single
pipeline:        wes
workflow_engine: bash
gatk_version:    gatk-4.6
input_dir:       CNAG999_exome/CNAG99901P_ex
genome:          b37

Core Keys

Key	Default	Values	Use
`mode`	`single`	`single`, `cohort`	Selects one-sample processing or cohort-level processing.
`pipeline`	`wes`	`wes`, `wgs`, `mit`	Selects the analysis type.
`workflow_engine`	`bash`	`bash`, `snakemake`	Selects the execution backend supported by the current workflows.
`profile`	`local`	`local`, `cnag-hpc`	Selects the runtime environment file. `cnag-hpc` uses `cnag-hpc-env.sh` instead of the default `env.sh` for Bash workflows.
`gatk_version`	`gatk-3.5`	`gatk-3.5`, `gatk-4.6`	Selects the workflow version. Use `gatk-4.6` for current WES/WGS workflows.
`resource`	`cbicall-germline-resources-v1`	resource key	Selects one bundle entry from `resources/cbicall-resource-catalog.json`.
`genome`	inferred	`b37`, `hg38`, `rsrs`	Reference genome. If omitted, CBIcall uses `b37` for WES/WGS and `rsrs` for mtDNA.
`input_dir`	`null`	path	Input sample or project directory. Relative paths are resolved from the YAML file location.
`sample_map`	`null`	path	Cohort-mode TSV containing sample IDs and gVCF paths. Relative paths are resolved from the YAML file location.
`project_dir`	`cbicall`	path or prefix	Prefix for the generated run directory.
`cleanup_bam`	`false`	`true`, `false`	Deletes intermediate BAM and BAI files after successful WES/WGS single-sample runs.
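
As an illustration, the sketch below writes out the core keys for a single-sample WES run. Every value is the default from the table above, except gatk_version, which is set to gatk-4.6 as recommended for current WES/WGS workflows. The input_dir value is the example path used elsewhere on this page, and sample_map is left out because it applies to cohort mode. The sketch assumes the default resource is compatible with this workflow. Keys left at their defaults can simply be omitted.

```yaml
# Sketch: single-sample WES run with the core keys written out explicitly.
mode:            single
pipeline:        wes
workflow_engine: bash
profile:         local                          # default
gatk_version:    gatk-4.6                       # table default is gatk-3.5
resource:        cbicall-germline-resources-v1  # default
genome:          b37                            # inferred for WES when omitted
input_dir:       CNAG999_exome/CNAG99901P_ex    # example path, relative to this YAML file
project_dir:     cbicall                        # default
cleanup_bam:     false                          # default
```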

Compatibility Matrix

Workflow	Supported
`gatk-4.6` + `bash` + `wes single/cohort`	Yes
`gatk-4.6` + `bash` + `wgs single/cohort`	Yes
`gatk-4.6` + `snakemake` + `wes single/cohort`	Yes
`gatk-4.6` + `snakemake` + `wgs single/cohort`	Yes
`gatk-3.5` + `bash` + `wes single/cohort`	Legacy
`gatk-3.5` + `bash` + `mit single/cohort`	Yes, x86_64 only
`mit` + `snakemake`	No
`gatk-3.5` + `snakemake`	No

Genome rules

pipeline: mit always uses genome: rsrs.
genome: hg38 is supported only with pipeline: wgs.
pipeline: wes currently uses genome: b37.
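
To make the hg38 rule concrete, here is a minimal sketch of a single-sample WGS configuration on hg38. The input_dir value is a hypothetical placeholder, not a path from the CBIcall examples. Whether the selected resource bundle ships hg38 reference data is not stated on this page, so the configuration is best checked with validate-param before a real run.

```yaml
# Sketch: single-sample WGS run on hg38 (hg38 is supported only with pipeline: wgs).
mode:            single
pipeline:        wgs
workflow_engine: bash
gatk_version:    gatk-4.6
input_dir:       path/to/wgs_sample   # hypothetical placeholder
genome:          hg38
```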

Input Rules

Single-Sample WES/WGS

Set input_dir to the sample directory that contains the paired FASTQ files.

mode:            single
pipeline:        wes
workflow_engine: bash
gatk_version:    gatk-4.6
input_dir:       CNAG999_exome/CNAG99901P_ex
genome:          b37

Cohort WES/WGS

Set sample_map to a TSV file that lists sample identifiers and gVCF paths.

mode:            cohort
pipeline:        wes
workflow_engine: bash
gatk_version:    gatk-4.6
genome:          b37
sample_map:      ./sample_map.tsv

mtDNA

mtDNA workflows consume BAMs from previous WES/WGS runs. They do not start from FASTQ files.

mode:            single
pipeline:        mit
workflow_engine: bash
gatk_version:    gatk-3.5
input_dir:       CNAG999_exome/CNAG99901P_ex

Bundle Provenance

resource selects the external tools and reference data expected for the run. CBIcall checks that the selected resource is compatible with the resolved workflow and records resource provenance in log.json and run-report.json.

Use Resource Validation for resource checks and Run Comparison to compare repeated runs.

Pipeline Implementation Version

Each workflow registry entry has a CBIcall pipeline implementation version, currently v1 for the bundled workflows. Normal YAML files do not need to set this; the registry provides the default.

Set pipeline_version only when a registry entry exposes more than one implementation and a run must pin a non-default one.
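
For reference, a pin would look like the fragment below. The value format is an assumption based on the v1 label used on this page. Because v1 is already the registry default for the bundled workflows, this line only shows the syntax and has no effect today.

```yaml
pipeline_version: v1   # assumed value format; v1 is the current registry default
```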

Runtime Profiles

Profiles select the environment mapping used by a workflow. The default profile is local; additional profiles can be declared in the workflow registry when the same workflow needs more than one env.sh layout, for example on a shared HPC system.

Select a non-default profile in YAML:

profile: cnag-hpc

Validate the parameters YAML and resolved setup without starting the workflow:

bin/cbicall validate-param -p parameters.yaml

During a real run, the resolved profile and selected environment file are written to log.json. validate-param prints the same resolved values without creating a run directory or log file.

Command Utilities

Command	Use
`bin/cbicall run -p parameters.yaml -t 4`	Execute a normal analysis run.
`bin/cbicall validate-param -p parameters.yaml`	Dry-run preflight for one concrete run. It validates the parameters YAML, workflow, profile env file, and selected resource without launching the workflow.
`bin/cbicall validate-resources`	Check the resource catalog and, optionally, one resource key.
`bin/cbicall compare-runs RUN_A RUN_B [RUN_C ...]`	Compare two or more run directories or `run-report.json` files.
`bin/cbicall test --wes`, `--mit`, or `--all`	Run the bundled integration examples without having to remember the script path.

Advanced Keys

Key	Default	Use
`pipeline_version`	Registry default, currently `v1`	Advanced pin for a specific CBIcall pipeline implementation. Leave unset for normal runs.
`workflow_rule`	`null`	Snakemake target for a partial run. Leave unset for normal full runs.
`allow_partial_run`	`false`	Must be `true` when `workflow_rule` is set. This prevents accidental partial starts.
`organism`	`Homo sapiens`	Metadata field.
`technology`	`Illumina HiSeq`	Metadata field.

Partial runs

Partial runs are intended for targeted Snakemake execution and restarts. If workflow_rule is set without allow_partial_run: true, CBIcall refuses to start.
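
A partial run could be configured as in the following sketch. The rule name my_target_rule is a hypothetical placeholder; real rule names come from the Snakemake workflow in use. The example uses gatk-4.6 because gatk-3.5 is not supported with Snakemake.

```yaml
# Sketch: partial Snakemake run that stops at one target rule.
mode:              single
pipeline:          wes
workflow_engine:   snakemake
gatk_version:      gatk-4.6
input_dir:         CNAG999_exome/CNAG99901P_ex
genome:            b37
workflow_rule:     my_target_rule   # hypothetical rule name
allow_partial_run: true             # required whenever workflow_rule is set
```

Removing allow_partial_run, or setting it to false, while workflow_rule is present makes CBIcall refuse to start.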

Output Directory Naming

Every run gets a generated directory:

<project_dir>_<workflow_engine>_<pipeline>_<mode>_<genome>_<gatk_version>_<run-id>/

When input_dir is set, this directory is created inside input_dir. See Outputs for the files produced by each workflow.
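
As a worked example of the naming pattern, the sketch below repeats the minimal WES configuration and shows the directory name it would produce. It assumes each placeholder is filled with the literal YAML value and that project_dir is left at its default, cbicall. The run-id part is generated by CBIcall and is left as a placeholder here.

```yaml
mode:            single
pipeline:        wes
workflow_engine: bash
gatk_version:    gatk-4.6
input_dir:       CNAG999_exome/CNAG99901P_ex
genome:          b37

# Expected run directory under the assumptions above:
#   CNAG999_exome/CNAG99901P_ex/cbicall_bash_wes_single_b37_gatk-4.6_<run-id>/
```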
