
Outputs

CBIcall writes each run into a run directory named from the workflow choices:

cbicall_<backend>_<software-stack>_<pipeline>_<mode>_<genome>_<run-id>/
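As a minimal sketch of the naming pattern above, the helper below joins the six workflow choices in the documented order. The function name `run_dir_name` and the example values are illustrative assumptions, not part of CBIcall.

```python
def run_dir_name(backend, software_stack, pipeline, mode, genome, run_id):
    """Join the workflow choices into the documented run-directory name."""
    parts = [backend, software_stack, pipeline, mode, genome, run_id]
    return "cbicall_" + "_".join(str(part) for part in parts)


# Hypothetical values for illustration; they are not taken from a real run.
print(run_dir_name("bash", "gatk-4.6", "wes", "single", "b37", "1234567890"))
# prints: cbicall_bash_gatk-4.6_wes_single_b37_1234567890
```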

The exact files depend on the selected pipeline and mode. The tables below are derived from the workflow output definitions and the checked-in example runs.

Where run directories are created

Native CBIcall pipelines (wes, wgs, and mit) create the run directory under the discovered sample/input directory. External nf-core workflows are different: because their inputs are supplied through nfcore_parameters and may point anywhere, CBIcall creates the run directory in the directory where cbicall run is launched.

Most users only need these
  • WES/WGS: use the final QC VCF in 02_varcall/.
  • WES/WGS single-sample runs: keep the gVCF if you plan cohort joint genotyping.
  • mtDNA: use 01_mtoolbox/mit_prioritized_variants.txt and the browser report in 02_browser/.
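For scripted downstream steps, the WES/WGS files listed above can be located with simple glob patterns. This is a sketch that assumes the file-name suffixes documented on this page; `primary_outputs` is a hypothetical helper, not a CBIcall function.

```python
from pathlib import Path


def primary_outputs(run_dir):
    """Return the final QC VCFs and gVCFs found under 02_varcall/."""
    varcall = Path(run_dir) / "02_varcall"
    return {
        # Matches <id>.hc.QC.vcf.gz and cohort.gv.QC.vcf.gz, but not the .tbi indexes.
        "final_vcf": sorted(varcall.glob("*.QC.vcf.gz")),
        # Present only for single-sample runs.
        "gvcf": sorted(varcall.glob("*.g.vcf.gz")),
    }
```

An empty list means that no file with that suffix exists in the run directory, for example `gvcf` in a cohort run.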

Common Run Files

| File | Meaning |
| --- | --- |
| log.json | Structured record of CLI arguments, resolved configuration, selected runtime profile, compact resources.bundle provenance, and runtime parameters. |
| cbicall-execution-contract.json | Backend-ready execution plan created after CBIcall validates and resolves the parameters YAML. It records the command, CBIcall-controlled environment overrides, backend/provider identity, and generated backend launch files. |
| run-report.json | Compact audit report with CBIcall version, Python version, Java version, workflow backend version, status, elapsed time, workflow file fingerprints, execution-contract fingerprint, resource key/version/fingerprint, output file inventory fingerprint, output fingerprints when available, and workflow log path. |
| run-report.html | Human-readable tabbed rendering of run-report.json for browsing a completed run without reading JSON directly. It separates overview, evidence, outputs, and raw JSON views; links the main run evidence; and shows software-version evidence when available. Generate it from an existing run with bin/cbicall report RUN_DIR --html. |
| cbicall_mqc/ | Optional MultiQC custom-content directory generated with bin/cbicall run --multiqc, bin/cbicall report RUN_DIR --multiqc, or bin/cbicall compare-runs ... --multiqc. It lets standard MultiQC reports include compact CBIcall run/QC summaries, pairwise comparison tables, and audit-similarity heatmaps without installing a CBIcall MultiQC plugin. |
| <backend>_<software-stack>_<pipeline>_<mode>_<genome>.log | Main workflow log for the selected backend. |
| logs/*.log | Per-rule or per-step logs for Snakemake/GATK 4.6 workflows. |

Use config.resources.bundle.fingerprint inside log.json to check whether two runs used the same declared external dependency set.
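A minimal sketch of that check, assuming the dotted name maps onto nested JSON objects in log.json. The helpers `lookup` and `same_resource_bundle` are hypothetical and not part of the CBIcall CLI.

```python
import json
from pathlib import Path


def lookup(document, dotted_key):
    """Follow a dotted key such as 'config.resources.bundle.fingerprint'."""
    node = document
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None  # key path is absent in this document
        node = node[part]
    return node


def same_resource_bundle(run_a, run_b):
    """True when both log.json files record the same bundle fingerprint."""
    key = "config.resources.bundle.fingerprint"
    values = []
    for run_dir in (run_a, run_b):
        with open(Path(run_dir) / "log.json", encoding="utf-8") as handle:
            values.append(lookup(json.load(handle), key))
    # A missing fingerprint is treated as "not comparable", never as a match.
    return values[0] is not None and values[0] == values[1]
```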

Use workflow.fingerprint inside run-report.json to check whether two runs used the same resolved workflow file contents. If the fingerprint differs, inspect workflow.files to see which entrypoint, helper, Snakefile, or config file changed. The matching run-report.html file presents the same core audit fields in a browser-friendly view.
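The inspection step can be sketched as a small diff over two parsed run-report.json documents. The layout of workflow.files is not specified here, so the code assumes a list of objects with `path` and `sha256` keys; adjust the `index` helper if the real structure differs.

```python
def changed_workflow_files(report_a, report_b):
    """List workflow files whose recorded hash differs between two reports."""

    def index(report):
        # Assumed layout: workflow.files is a list of {"path", "sha256"} objects.
        return {item["path"]: item["sha256"] for item in report["workflow"]["files"]}

    if report_a["workflow"]["fingerprint"] == report_b["workflow"]["fingerprint"]:
        return []  # identical resolved workflow contents
    files_a, files_b = index(report_a), index(report_b)
    return sorted(
        path
        for path in files_a.keys() | files_b.keys()
        if files_a.get(path) != files_b.get(path)
    )
```

Files present in only one report are also listed, because their hash is missing on the other side.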

Screenshot of the native CBIcall run-report HTML overview tab, showing the run summary and audit sections.

Screenshot of the native CBIcall run-report HTML evidence tab, showing workflow fingerprints, resource identity, runtime versions, and linked audit artifacts.

Screenshot of the native CBIcall run-report HTML outputs tab, showing canonical files, output inventory, and file-size summaries.

Use runtime.java and runtime.configured_java to audit the Java visible on PATH and the native workflow Java configured through env.sh or shared backend config when available.

Use execution_contract.fingerprint to check whether two runs used the same normalized backend-ready execution plan. The raw contract keeps paths and run IDs for audit, while the normalized fingerprint replaces the run directory and run ID so repeated runs can still compare cleanly.
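The idea behind a normalized fingerprint can be illustrated as follows. This is a conceptual sketch, not CBIcall's actual normalization: the placeholder strings and the `normalized_fingerprint` helper are assumptions.

```python
import hashlib
import json


def normalized_fingerprint(contract, run_dir, run_id):
    """Hash a contract after masking the run directory and run ID."""
    # Sorted keys give a stable serialization regardless of dict order.
    text = json.dumps(contract, sort_keys=True, ensure_ascii=False)
    # Mask the run directory first so a run ID embedded in the path
    # disappears together with it.
    text = text.replace(run_dir, "<RUN_DIR>").replace(run_id, "<RUN_ID>")
    return hashlib.sha256(text.encode("utf-8")).hexdigest()
```

Two contracts that differ only in where and when they ran then hash to the same value, while a changed command or environment override still changes the fingerprint.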

Use execution_trace to audit task count and peak RAM when the backend emits a trace. For nf-core/Nextflow runs, CBIcall parses pipeline_info/execution_trace_*.txt and records maximum peak RSS and VMEM. Native Bash runs do not have RAM summaries unless the workflow is instrumented to write them.
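To reproduce such a summary independently, a Nextflow trace file can be parsed as tab-separated text. The sketch below assumes the default `peak_rss` and `peak_vmem` columns with human-readable values such as `1.5 GB`, or plain byte counts when raw tracing is enabled. It is not the parser CBIcall uses.

```python
import csv

UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def to_bytes(value):
    """Convert a trace memory value such as '1.5 GB' to bytes (None if absent)."""
    value = value.strip()
    if value in ("", "-"):
        return None
    parts = value.split()
    if len(parts) == 1:  # raw trace output: plain byte count
        return int(float(parts[0]))
    number, unit = parts
    return int(float(number) * UNITS[unit.upper()])


def summarize_trace(path):
    """Return task count and maximum peak_rss / peak_vmem in bytes."""
    summary = {"tasks": 0, "max_peak_rss": None, "max_peak_vmem": None}
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            summary["tasks"] += 1
            for column in ("peak_rss", "peak_vmem"):
                size = to_bytes(row.get(column) or "")
                key = "max_" + column
                if size is not None and (summary[key] is None or size > summary[key]):
                    summary[key] = size
    return summary
```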

Use software_versions.sha256 to audit the tool-version table when available. Native workflows use declared tool versions from the selected resource catalog entry. External nf-core workflows use the software-version YAML generated by the nf-core pipeline.

MultiQC Custom Content

CBIcall can write a MultiQC custom-content directory for a completed run:

bin/cbicall report completed_run/ --multiqc
multiqc completed_run/

The generated cbicall_mqc/ directory contains several small *_mqc.yaml files. MultiQC renders these as compact CBIcall sections: numeric run statistics, workflow/resource identity, final-output fingerprints, and native sample QC when 03_stats/*.coverage.txt or 03_stats/*.sex.txt files are present. The full CBIcall audit remains in run-report.json and run-report.html; MultiQC is a companion summary for projects that already collect QC with MultiQC. No CBIcall-specific MultiQC plugin is required. Source installs include multiqc from requirements.txt so users can render the report directly.

During a new run, use:

bin/cbicall run -p parameters.yaml -t 4 --multiqc

Use outputs.file_inventory.sha256 to check whether two run directories contain the same relative file layout. This is a manifest hash of file paths, not a content hash. outputs.file_inventory.total_bytes records the total size of files included in that inventory; the HTML report renders this in human-readable units and shows the largest files separately so large runs remain readable. WES/WGS single-sample runs also include parsed VCF hash reports under outputs.vcf_hash_reports when 03_stats/*.vcf.sha256.txt is present.
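The difference between a manifest hash and a content hash can be shown with a short sketch. It hashes only sorted relative paths, skips a top-level work/ directory, and sums file sizes. The exact serialization CBIcall uses is not documented here, so the digests produced by this hypothetical `file_inventory` helper will not match CBIcall's.

```python
import hashlib
from pathlib import Path


def file_inventory(run_dir, excluded=("work",)):
    """Hash the sorted relative file paths and total their sizes."""
    root = Path(run_dir)
    entries = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if path.is_file() and relative.parts[0] not in excluded:
            entries.append((relative.as_posix(), path.stat().st_size))
    digest = hashlib.sha256()
    for name, _size in sorted(entries):
        # Only the path enters the hash, so file contents do not affect it.
        digest.update(name.encode("utf-8") + b"\n")
    return {
        "sha256": digest.hexdigest(),
        "total_bytes": sum(size for _name, size in entries),
    }
```

Two runs with the same layout but different file contents therefore share the manifest hash and can still differ in total_bytes.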

Two or more runs can be compared directly:

bin/cbicall compare-runs run_a/ run_b/ run_c/ --alias local cloud hpc --output compare-report.txt

The text report is the audit artifact. CBIcall also writes compare-report.html by default for browsing, including field-level matrices and combined pairwise audit matrices with derived categories plus report-level similarity scores. See Run Comparison for details and an example screenshot.

External nf-core Workflows

For workflow_provider: nf-core, CBIcall keeps the external workflow output layout native:

| File or directory | Meaning |
| --- | --- |
| cbicall_external_nextflow.params.yaml | Params file generated by CBIcall and passed to nextflow run; its hash is also recorded in the execution contract. |
| cbicall_external_nextflow.config | Nextflow config generated by CBIcall to cap process CPU requests from -t/--threads and configure optional container cache paths; its hash is also recorded in the execution contract. |
| <pipeline>/ | Native nf-core output directory, for example demo/ or sarek/. |
| work/ | Nextflow work directory, excluded from the compact run file-inventory hash. |
| nf-core_<pipeline>_<mode>.log | Main Nextflow launcher log for the external nf-core workflow. |

run-report.json records the nf-core source, pinned release, nf-core profile, generated params/config-file hashes, workflow output directory, pointers to pipeline_info/MultiQC reports, the nf-core software-version YAML, and a summary of task count and peak RAM from the Nextflow execution trace when available.

The generated params file also records max_cpus from the CBIcall -t/--threads value. nf-core parameters such as max_memory can be passed through nfcore_parameters. The generated Nextflow config applies the CPU value, and max_memory when present, through process.resourceLimits.

When nfcore_singularity_cache_dir is set, CBIcall writes a user/project-owned Singularity and Apptainer cache/library path to the generated Nextflow config. Environment variables such as NXF_SINGULARITY_CACHEDIR belong in the shell or SLURM bootstrap, not in CBIcall's Python runner. On ARM64 hosts using the Docker profile, the generated config also pins Docker to linux/amd64 because many nf-core containers are published primarily for AMD64.

For registered external workflows, the workflow registry can declare canonical outputs. The Sarek entry declares the HaplotypeCaller VCF pattern under sarek/variant_calling/haplotypecaller/. When a matching VCF exists, CBIcall records it under outputs.canonical_outputs and adds a normalized VCF hash to outputs.vcf_hash_reports for compare-runs.

WES/WGS Single-Sample

Applies to pipeline: wes or pipeline: wgs with mode: single.

| File | Use |
| --- | --- |
| 02_varcall/<id>.hc.QC.vcf.gz | Final filtered single-sample VCF. This is the primary workflow VCF for downstream tools or review. |
| 02_varcall/<id>.hc.QC.vcf.gz.tbi | Tabix index for the final VCF. |
| 02_varcall/<id>.hc.g.vcf.gz | Per-sample gVCF. Use this as input for cohort joint genotyping. |
| 02_varcall/<id>.hc.g.vcf.gz.tbi | Tabix index for the gVCF. |
| 03_stats/<id>.coverage.txt | Coverage summary. |
| 03_stats/<id>.sex.txt | Sex inference result from the final VCF. |
| 03_stats/<id>.vcf.sha256.txt | Per-VCF SHA-256 report with raw and normalized VCF fingerprints. |
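To illustrate why a report might carry both a raw and a normalized fingerprint, the sketch below hashes a compressed VCF twice: once byte for byte, and once over the decompressed text with `##` meta-information lines dropped. Dropping those lines is an assumption about what normalization means; the rule CBIcall actually applies may differ.

```python
import gzip
import hashlib


def vcf_fingerprints(path):
    """Return raw and header-insensitive SHA-256 digests for a .vcf.gz file."""
    raw = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large VCFs are not loaded into memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            raw.update(chunk)
    normalized = hashlib.sha256()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if not line.startswith("##"):  # keep the #CHROM line and records
                normalized.update(line.encode("utf-8"))
    return {"raw": raw.hexdigest(), "normalized": normalized.hexdigest()}
```

Under this assumption, two VCFs with identical records but different header metadata, such as command lines or dates, share the normalized digest while their raw digests differ.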

Intermediate files

| File | Meaning |
| --- | --- |
| 01_bam/<fastq-prefix>.rg.bam | Lane-level BAM after alignment and read-group assignment. |
| 01_bam/<id>.rg.merged.bam | BAM after merging lanes for the sample. |
| 01_bam/<id>.rg.merged.dedup.bam | Duplicate-marked BAM. |
| 01_bam/<id>.rg.merged.dedup.metrics.txt | Duplicate-marking metrics. |
| 01_bam/<id>.rg.merged.dedup.recal.table | BQSR recalibration table. |
| 01_bam/<id>.rg.merged.dedup.recal.bam | Recalibrated BAM used for variant calling. |
| 01_bam/*.bai or 01_bam/*.bam.bai | BAM indexes. |
| 02_varcall/<id>.hc.raw.vcf.gz | Raw VCF from GenotypeGVCFs. |
| 02_varcall/<id>.hc.raw.vcf.gz.tbi | Tabix index for the raw VCF. |

Conditional VQSR files

These appear only when the run has enough SNPs or indels to build VQSR models.

| File | Meaning |
| --- | --- |
| 02_varcall/<id>.hc.snp.recal.vcf.gz | SNP VQSR model output. |
| 02_varcall/<id>.hc.snp.tranches.txt | SNP VQSR tranche diagnostics. |
| 02_varcall/<id>.hc.post_snp.vcf.gz | VCF after applying SNP VQSR. |
| 02_varcall/<id>.hc.indel.recal.vcf.gz | INDEL VQSR model output. |
| 02_varcall/<id>.hc.indel.tranches.txt | INDEL VQSR tranche diagnostics. |
| 02_varcall/<id>.hc.vqsr.vcf.gz | VCF after applying SNP and INDEL VQSR. |

Note: If VQSR is skipped because there are too few variants, the final *.hc.QC.vcf.gz is still produced by hard filtering.
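One rough way to tell which path a run took is to check for the conditional VQSR files. This sketch relies only on the file-name suffixes listed above and infers the result from file presence; it does not read the workflow logs, and `vqsr_outputs_present` is a hypothetical helper.

```python
from pathlib import Path


def vqsr_outputs_present(run_dir):
    """Report which conditional VQSR files exist under 02_varcall/."""
    varcall = Path(run_dir) / "02_varcall"
    return {
        "snp": any(varcall.glob("*.snp.recal.vcf.gz")),
        "indel": any(varcall.glob("*.indel.recal.vcf.gz")),
        "vqsr_vcf": any(varcall.glob("*.vqsr.vcf.gz")),
    }
```

If every value is False while a final *.QC.vcf.gz exists, the run most likely used the hard-filtering fallback.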

WES/WGS Cohort

Applies to pipeline: wes or pipeline: wgs with mode: cohort.

| File | Use |
| --- | --- |
| 02_varcall/cohort.gv.QC.vcf.gz | Final filtered cohort VCF. This is the primary joint-genotyped variant file. |
| 02_varcall/cohort.gv.QC.vcf.gz.tbi | Tabix index for the final cohort VCF. |

Intermediate files

| File | Meaning |
| --- | --- |
| 02_varcall/cohort.genomicsdb.<run-id>/ | GenomicsDB workspace used by GenomicsDBImport. |
| 02_varcall/genomicsdbimport.done | Snakemake marker showing that GenomicsDB import completed. |
| 02_varcall/cohort.gv.raw.vcf.gz | Raw cohort VCF from GenotypeGVCFs. |
| 02_varcall/cohort.gv.raw.vcf.gz.tbi | Tabix index for the raw cohort VCF. |
| logs/01_genomicsdbimport.log | GenomicsDB import log. |
| logs/02_genotype_gvcfs.log | Cohort genotyping log. |
| logs/03_vqsr_and_qc.log | VQSR and final filtering log. |

Conditional VQSR files

| File | Meaning |
| --- | --- |
| 02_varcall/cohort.snp.recal.vcf.gz | SNP VQSR model output. |
| 02_varcall/cohort.snp.tranches.txt | SNP VQSR tranche diagnostics. |
| 02_varcall/cohort.post_snp.vcf.gz | VCF after applying SNP VQSR. |
| 02_varcall/cohort.indel.recal.vcf.gz | INDEL VQSR model output. |
| 02_varcall/cohort.indel.tranches.txt | INDEL VQSR tranche diagnostics. |
| 02_varcall/cohort.vqsr.vcf.gz | VCF after applying SNP and INDEL VQSR. |

mtDNA Single-Sample

Applies to pipeline: mit with mode: single.

| File | Use |
| --- | --- |
| 01_mtoolbox/mit_prioritized_variants.txt | Final prioritized mtDNA variant report with GT, DP, and heteroplasmic fraction columns appended by CBIcall. |
| 01_mtoolbox/VCF_file.vcf | mtDNA VCF from MToolBox. |
| 02_browser/<run-id>.html | Interactive HTML report. |
| 02_browser/mit.json | JSON used by the browser report. |
| 02_browser/README.txt | Local instructions for opening the browser report. |

Intermediate files

| File | Meaning |
| --- | --- |
| 01_mtoolbox/<id>-DNA_MIT.bam | Extracted mitochondrial BAM used as MToolBox input. |
| 01_mtoolbox/<id>-DNA_MIT.bam.bai | BAM index. |
| 01_mtoolbox/prioritized_variants.txt | Raw MToolBox prioritized variant list before CBIcall appends genotype/depth/HF fields. |
| 01_mtoolbox/mit.raw.json | Raw JSON conversion of the final prioritized report. |
| 01_mtoolbox/mt_classification_best_results.csv | MToolBox haplogroup/classification output. |
| 01_mtoolbox/processed_fastq.tar.gz | MToolBox processed FASTQ archive. |
| 01_mtoolbox/summary_*.txt | MToolBox run summary. |
| 01_mtoolbox/OUT_*/ | MToolBox working directory with alignment, pileup, coverage, and annotation intermediates. |
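For scripted review of the final mtDNA report, a sketch such as the following can load it into memory. It assumes that mit_prioritized_variants.txt is tab-delimited with a header row, which is not stated on this page, so check the file before relying on it. Column names are taken from the file itself and none are hard-coded.

```python
import csv
from pathlib import Path


def read_prioritized_variants(run_dir):
    """Load 01_mtoolbox/mit_prioritized_variants.txt as a list of dicts."""
    report = Path(run_dir) / "01_mtoolbox" / "mit_prioritized_variants.txt"
    with open(report, newline="", encoding="utf-8") as handle:
        # Assumption: tab-delimited text whose first line names the columns.
        return list(csv.DictReader(handle, delimiter="\t"))
```

Each returned dictionary maps a column name from the header to that row's value as a string.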

mtDNA Cohort

Applies to pipeline: mit with mode: cohort.

The cohort workflow uses the same output directories as mtDNA single-sample mode, but extracts mtDNA BAMs from all matching sibling sample directories before running MToolBox jointly.

| File | Use |
| --- | --- |
| 01_mtoolbox/mit_prioritized_variants.txt | Final joint mtDNA variant report with per-sample GT, DP, and heteroplasmic fraction fields. |
| 01_mtoolbox/VCF_file.vcf | Joint mtDNA VCF from MToolBox. |
| 02_browser/<run-id>.html | Interactive cohort HTML report. |
| 02_browser/mit.json | JSON used by the browser report. |
| 02_browser/README.txt | Local instructions for opening the browser report. |

Intermediate files

| File | Meaning |
| --- | --- |
| 01_mtoolbox/<sample-id>-DNA_MIT.bam | Extracted mitochondrial BAM for each cohort sample. |
| 01_mtoolbox/<sample-id>-DNA_MIT.bam.bai | BAM index for each extracted mitochondrial BAM. |
| 01_mtoolbox/prioritized_variants.txt | Raw MToolBox prioritized variant list. |
| 01_mtoolbox/missing_variants.txt | Temporary variant list used while appending cohort genotype/depth/HF fields. |
| 01_mtoolbox/mit.raw.json | Raw JSON conversion of the final prioritized report. |
| 01_mtoolbox/OUT_*/ | MToolBox working directories and intermediate outputs. |