Outputs

CBIcall writes each run into a run directory named from the workflow choices:

cbicall_<engine>_<pipeline>_<mode>_<genome>_<gatk-version>_<run-id>/
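
As an illustration of the naming scheme, the sketch below splits a run directory name back into its components. It assumes that only the trailing run-id may contain underscores; the `RunDirName` type, the `parse_run_dir` helper, and the example values are hypothetical and not part of CBIcall.

```python
from typing import NamedTuple


class RunDirName(NamedTuple):
    engine: str
    pipeline: str
    mode: str
    genome: str
    gatk_version: str
    run_id: str


def parse_run_dir(name: str) -> RunDirName:
    """Split a run directory name into the documented components.

    Assumption: the first five fields contain no underscores, so only
    the trailing run-id may include them.
    """
    name = name.rstrip("/")
    prefix = "cbicall_"
    if not name.startswith(prefix):
        raise ValueError(f"not a CBIcall run directory: {name!r}")
    parts = name[len(prefix):].split("_", 5)
    if len(parts) != 6 or not all(parts):
        raise ValueError(f"unexpected run directory layout: {name!r}")
    return RunDirName(*parts)


# Hypothetical example values, chosen only to illustrate the layout.
example = parse_run_dir("cbicall_bash_wes_single_b37_gatk-3.5_1234567890")
print(example.pipeline, example.mode, example.run_id)  # wes single 1234567890
```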

The exact files depend on the selected pipeline and mode. The tables below are derived from the workflow output definitions and the checked-in example runs.

Most users only need these
  • WES/WGS: use the final QC VCF in 02_varcall/.
  • WES/WGS single-sample runs: keep the gVCF if you plan cohort joint genotyping.
  • mtDNA: use 01_mtoolbox/mit_prioritized_variants.txt and the browser report in 02_browser/.
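
The list above can be turned into a small lookup. The following is a minimal sketch, not a CBIcall API: the `primary_output_patterns` and `primary_outputs` helpers are hypothetical, and the glob patterns are copied from the file names documented on this page.

```python
from pathlib import Path


def primary_output_patterns(pipeline: str, mode: str) -> list[str]:
    """Return glob patterns for the files most users need."""
    if pipeline in ("wes", "wgs"):
        if mode == "single":
            # Final QC VCF, plus the gVCF kept for cohort joint genotyping.
            return ["02_varcall/*.hc.QC.vcf.gz", "02_varcall/*.hc.g.vcf.gz"]
        if mode == "cohort":
            return ["02_varcall/cohort.gv.QC.vcf.gz"]
    elif pipeline == "mit" and mode in ("single", "cohort"):
        return ["01_mtoolbox/mit_prioritized_variants.txt", "02_browser/*.html"]
    raise ValueError(f"unsupported combination: {pipeline!r}, {mode!r}")


def primary_outputs(run_dir, pipeline: str, mode: str) -> list[Path]:
    """List the primary outputs that actually exist in a run directory."""
    root = Path(run_dir)
    found = []
    for pattern in primary_output_patterns(pipeline, mode):
        found.extend(root.glob(pattern))
    return sorted(found)
```

An empty result means that none of the expected files were found, for example because the run has not finished.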

Common Run Files

| File | Meaning |
| --- | --- |
| log.json | Structured record of CLI arguments, resolved configuration, selected profile, compact resources.bundle provenance, and runtime parameters. |
| run-report.json | Compact audit report with CBIcall version, status, elapsed time, workflow file fingerprints, resource fingerprint, output fingerprints when available, and workflow log path. |
| <engine>_<pipeline>_<mode>_<genome>_<gatk-version>.log | Main wrapper log for Bash runs. |
| logs/*.log | Per-rule or per-step logs for Snakemake/GATK 4.6 workflows. |

Use config.resources.bundle.fingerprint inside log.json to check whether two runs used the same declared external dependency set.
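
A minimal sketch of that check, assuming the dotted name maps to nested JSON objects (config, then resources, then bundle, then fingerprint). The `get_dotted` and `same_resource_bundle` helpers are hypothetical and not part of CBIcall.

```python
import json
from pathlib import Path


def get_dotted(record: dict, dotted_key: str):
    """Walk nested dicts following a dotted key such as 'a.b.c'."""
    value = record
    for part in dotted_key.split("."):
        value = value[part]
    return value


def same_resource_bundle(run_a, run_b,
                         key: str = "config.resources.bundle.fingerprint") -> bool:
    """Return True when both runs record the same bundle fingerprint.

    Assumption: each run directory holds a log.json in which `key`
    resolves through nested objects.
    """
    fingerprints = []
    for run_dir in (run_a, run_b):
        with open(Path(run_dir) / "log.json", encoding="utf-8") as handle:
            fingerprints.append(get_dotted(json.load(handle), key))
    return fingerprints[0] == fingerprints[1]
```

A `KeyError` from `get_dotted` indicates that the assumed nesting does not match the actual file, in which case the key path should be adjusted.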

Use workflow.fingerprint inside run-report.json to check whether two runs used the same resolved workflow file contents. If the fingerprint differs, inspect workflow.files to see which entrypoint, helper, Snakefile, or config file changed. WES/WGS single-sample runs also include parsed VCF hash reports under outputs.vcf_hash_reports when 03_stats/*.vcf.sha256.txt is present.
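
The workflow comparison can be sketched in the same spirit. The exact shape of workflow.files is not specified here, so this example assumes it is a mapping from file name to fingerprint; the `changed_workflow_files` helper and the sample values are hypothetical.

```python
def changed_workflow_files(report_a: dict, report_b: dict) -> list[str]:
    """Name the workflow files whose fingerprints differ between two reports.

    Assumption: report["workflow"]["files"] maps file name -> fingerprint.
    """
    wf_a, wf_b = report_a["workflow"], report_b["workflow"]
    if wf_a["fingerprint"] == wf_b["fingerprint"]:
        return []  # identical resolved workflow contents
    files_a, files_b = wf_a["files"], wf_b["files"]
    names = files_a.keys() | files_b.keys()
    # A file missing from one report counts as changed.
    return sorted(n for n in names if files_a.get(n) != files_b.get(n))


# Hypothetical in-memory reports standing in for two parsed run-report.json files.
report_a = {"workflow": {"fingerprint": "f1",
                         "files": {"Snakefile": "aaa", "config.yaml": "bbb"}}}
report_b = {"workflow": {"fingerprint": "f2",
                         "files": {"Snakefile": "ccc", "config.yaml": "bbb"}}}
print(changed_workflow_files(report_a, report_b))  # ['Snakefile']
```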

Two or more runs can be compared directly:

bin/cbicall compare-runs run_a/ run_b/ run_c/ --output compare-report.txt --html compare-report.html

The text report is the audit artifact. The optional HTML report renders the same information for browsing. See Run Comparison for details and an example screenshot.

WES/WGS Single-Sample

Applies to pipeline: wes or pipeline: wgs with mode: single.

| File | Use |
| --- | --- |
| 02_varcall/<id>.hc.QC.vcf.gz | Final filtered single-sample VCF. This is the primary workflow VCF for downstream tools or review. |
| 02_varcall/<id>.hc.QC.vcf.gz.tbi | Tabix index for the final VCF. |
| 02_varcall/<id>.hc.g.vcf.gz | Per-sample gVCF. Use this as input for cohort joint genotyping. |
| 02_varcall/<id>.hc.g.vcf.gz.tbi | Tabix index for the gVCF. |
| 03_stats/<id>.coverage.txt | Coverage summary. |
| 03_stats/<id>.sex.txt | Sex inference result from the final VCF. |

Intermediate files

| File | Meaning |
| --- | --- |
| 01_bam/<fastq-prefix>.rg.bam | Lane-level BAM after alignment and read-group assignment. |
| 01_bam/<id>.rg.merged.bam | BAM after merging lanes for the sample. |
| 01_bam/<id>.rg.merged.dedup.bam | Duplicate-marked BAM. |
| 01_bam/<id>.rg.merged.dedup.metrics.txt | Duplicate-marking metrics. |
| 01_bam/<id>.rg.merged.dedup.recal.table | BQSR recalibration table. |
| 01_bam/<id>.rg.merged.dedup.recal.bam | Recalibrated BAM used for variant calling. |
| 01_bam/*.bai or 01_bam/*.bam.bai | BAM indexes. |
| 02_varcall/<id>.hc.raw.vcf.gz | Raw VCF from GenotypeGVCFs. |
| 02_varcall/<id>.hc.raw.vcf.gz.tbi | Tabix index for the raw VCF. |

Conditional VQSR files

These appear only when the run has enough SNPs or indels to build VQSR models.

| File | Meaning |
| --- | --- |
| 02_varcall/<id>.hc.snp.recal.vcf.gz | SNP VQSR model output. |
| 02_varcall/<id>.hc.snp.tranches.txt | SNP VQSR tranche diagnostics. |
| 02_varcall/<id>.hc.post_snp.vcf.gz | VCF after applying SNP VQSR. |
| 02_varcall/<id>.hc.indel.recal.vcf.gz | INDEL VQSR model output. |
| 02_varcall/<id>.hc.indel.tranches.txt | INDEL VQSR tranche diagnostics. |
| 02_varcall/<id>.hc.vqsr.vcf.gz | VCF after applying SNP and INDEL VQSR. |

Note: If VQSR is skipped because there are too few variants, the final *.hc.QC.vcf.gz is still produced by hard filtering.
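
One way to see which route a single-sample run took is to list the conditional VQSR files that exist. This is a minimal sketch with a hypothetical `vqsr_files_present` helper; inferring the filtering route from file presence alone is an assumption, and the run logs remain the authoritative record.

```python
from pathlib import Path

# Suffixes of the conditional single-sample VQSR files listed on this page.
VQSR_SUFFIXES = (
    "hc.snp.recal.vcf.gz",
    "hc.snp.tranches.txt",
    "hc.post_snp.vcf.gz",
    "hc.indel.recal.vcf.gz",
    "hc.indel.tranches.txt",
    "hc.vqsr.vcf.gz",
)


def vqsr_files_present(run_dir, sample_id: str) -> list[str]:
    """Return the names of conditional VQSR files found in 02_varcall/."""
    varcall = Path(run_dir) / "02_varcall"
    names = [f"{sample_id}.{suffix}" for suffix in VQSR_SUFFIXES]
    return [name for name in names if (varcall / name).exists()]
```

An empty list next to an existing final *.hc.QC.vcf.gz suggests that the run used hard filtering only.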

WES/WGS Cohort

Applies to pipeline: wes or pipeline: wgs with mode: cohort.

| File | Use |
| --- | --- |
| 02_varcall/cohort.gv.QC.vcf.gz | Final filtered cohort VCF. This is the primary joint-genotyped variant file. |
| 02_varcall/cohort.gv.QC.vcf.gz.tbi | Tabix index for the final cohort VCF. |

Intermediate files

| File | Meaning |
| --- | --- |
| 02_varcall/cohort.genomicsdb.<run-id>/ | GenomicsDB workspace used by GenomicsDBImport. |
| 02_varcall/genomicsdbimport.done | Snakemake marker showing that GenomicsDB import completed. |
| 02_varcall/cohort.gv.raw.vcf.gz | Raw cohort VCF from GenotypeGVCFs. |
| 02_varcall/cohort.gv.raw.vcf.gz.tbi | Tabix index for the raw cohort VCF. |
| logs/01_genomicsdbimport.log | GenomicsDB import log. |
| logs/02_genotype_gvcfs.log | Cohort genotyping log. |
| logs/03_vqsr_and_qc.log | VQSR and final filtering log. |

Conditional VQSR files

| File | Meaning |
| --- | --- |
| 02_varcall/cohort.snp.recal.vcf.gz | SNP VQSR model output. |
| 02_varcall/cohort.snp.tranches.txt | SNP VQSR tranche diagnostics. |
| 02_varcall/cohort.post_snp.vcf.gz | VCF after applying SNP VQSR. |
| 02_varcall/cohort.indel.recal.vcf.gz | INDEL VQSR model output. |
| 02_varcall/cohort.indel.tranches.txt | INDEL VQSR tranche diagnostics. |
| 02_varcall/cohort.vqsr.vcf.gz | VCF after applying SNP and INDEL VQSR. |
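
The cohort files above give a rough picture of how far a run has progressed. The sketch below maps each step to the file it leaves behind; the stage labels mirror the log names, and the `cohort_stage_status` helper is hypothetical. Treating file presence as step completion is an assumption, since a file can exist from an interrupted step.

```python
from pathlib import Path

# (stage label, file expected once that stage has run), in workflow order.
COHORT_STAGES = [
    ("genomicsdbimport", "02_varcall/genomicsdbimport.done"),
    ("genotype_gvcfs", "02_varcall/cohort.gv.raw.vcf.gz"),
    ("vqsr_and_qc", "02_varcall/cohort.gv.QC.vcf.gz"),
]


def cohort_stage_status(run_dir) -> dict[str, bool]:
    """Report, per stage, whether its expected output file exists."""
    root = Path(run_dir)
    return {stage: (root / rel).exists() for stage, rel in COHORT_STAGES}
```

If a stage reports `False`, the matching log under logs/ is the first place to look.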

mtDNA Single-Sample

Applies to pipeline: mit with mode: single.

| File | Use |
| --- | --- |
| 01_mtoolbox/mit_prioritized_variants.txt | Final prioritized mtDNA variant report with GT, DP, and heteroplasmic fraction columns appended by CBIcall. |
| 01_mtoolbox/VCF_file.vcf | mtDNA VCF from MToolBox. |
| 02_browser/<run-id>.html | Interactive HTML report. |
| 02_browser/mit.json | JSON used by the browser report. |
| 02_browser/README.txt | Local instructions for opening the browser report. |

Intermediate files

| File | Meaning |
| --- | --- |
| 01_mtoolbox/<id>-DNA_MIT.bam | Extracted mitochondrial BAM used as MToolBox input. |
| 01_mtoolbox/<id>-DNA_MIT.bam.bai | BAM index. |
| 01_mtoolbox/prioritized_variants.txt | Raw MToolBox prioritized variant list before CBIcall appends genotype/depth/HF fields. |
| 01_mtoolbox/mit.raw.json | Raw JSON conversion of the final prioritized report. |
| 01_mtoolbox/mt_classification_best_results.csv | MToolBox haplogroup/classification output. |
| 01_mtoolbox/processed_fastq.tar.gz | MToolBox processed FASTQ archive. |
| 01_mtoolbox/summary_*.txt | MToolBox run summary. |
| 01_mtoolbox/OUT_*/ | MToolBox working directory with alignment, pileup, coverage, and annotation intermediates. |

mtDNA Cohort

Applies to pipeline: mit with mode: cohort.

The cohort workflow uses the same output directories as mtDNA single-sample mode, but extracts mtDNA BAMs from all matching sibling sample directories before running MToolBox jointly.

| File | Use |
| --- | --- |
| 01_mtoolbox/mit_prioritized_variants.txt | Final joint mtDNA variant report with per-sample GT, DP, and heteroplasmic fraction fields. |
| 01_mtoolbox/VCF_file.vcf | Joint mtDNA VCF from MToolBox. |
| 02_browser/<run-id>.html | Interactive cohort HTML report. |
| 02_browser/mit.json | JSON used by the browser report. |
| 02_browser/README.txt | Local instructions for opening the browser report. |

Intermediate files

| File | Meaning |
| --- | --- |
| 01_mtoolbox/<sample-id>-DNA_MIT.bam | Extracted mitochondrial BAM for each cohort sample. |
| 01_mtoolbox/<sample-id>-DNA_MIT.bam.bai | BAM index for each extracted mitochondrial BAM. |
| 01_mtoolbox/prioritized_variants.txt | Raw MToolBox prioritized variant list. |
| 01_mtoolbox/missing_variants.txt | Temporary variant list used while appending cohort genotype/depth/HF fields. |
| 01_mtoolbox/mit.raw.json | Raw JSON conversion of the final prioritized report. |
| 01_mtoolbox/OUT_*/ | MToolBox working directories and intermediate outputs. |