
Performance

Runtime Behavior

CBIcall adds negligible orchestration overhead. The Python wrapper typically stays below 2% of the memory on a 16 GB system, does not process reads or variants, and does not create Python worker threads. The -t/--threads value is passed through to the selected workflow backend.

Most memory and CPU usage comes from external tools:

  • BWA-MEM: Memory usage increases with thread count and reference size. BWA does not provide an internal memory cap, so limiting RAM requires external mechanisms such as ulimit.

  • GATK and Picard: These tools default to 8 GB of memory. The value can be adjusted through the CBIcall GATK 4.6 environment file or the Snakemake configuration file.
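
The external cap mentioned for BWA-MEM can also be applied from a Python launcher. The following is a minimal sketch, assuming a POSIX system; run_with_memory_cap is a hypothetical helper, not part of CBIcall. It sets RLIMIT_AS, the same kernel limit that `ulimit -v` controls, on the child process only.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(cmd, max_gb):
    """Run cmd with its virtual address space capped at max_gb GiB.

    Hypothetical helper for illustration; POSIX only.
    """
    limit_bytes = int(max_gb * 1024**3)

    def apply_limit():
        # Executed in the child process just before the command starts,
        # so the parent process keeps its own limits.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(
        cmd, preexec_fn=apply_limit, capture_output=True, text=True
    )


if __name__ == "__main__":
    # Demonstration with a harmless child that reports its own limit.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"
    result = run_with_memory_cap([sys.executable, "-c", probe], max_gb=4)
    print(result.stdout.strip())
```

The limit applies to virtual address space, which is usually larger than resident memory, so the cap should be set with some margin above the expected RAM use.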

Cohort mode with GATK 4.6

Joint genotyping defaults to 64 GB of RAM for GenomicsDBImport and GenotypeGVCFs. The value is controlled by MEM_GENOTYPE in the GATK 4.6 environment file and by mem_genotype in the Snakemake workflow.

Python driver overhead

The Python driver is expected to require one CPU core only during short setup phases. For long-running variant-calling jobs, scheduler CPU and memory requests should be sized for the selected external tools and the requested workflow threads, not for the CBIcall Python process itself.
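
As a rough illustration of this sizing rule, a scheduler request can be derived from the workflow thread count and the memory setting of the heaviest external tool. The helper name and the 1 GB headroom are assumptions made for this sketch, not CBIcall defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchedulerRequest:
    cpus: int
    mem_gb: float


def size_request(threads, tool_mem_gb, headroom_gb=1.0):
    """Size a job for the external tools, not for the Python driver.

    threads:      value passed to -t/--threads
    tool_mem_gb:  memory setting of the heaviest tool in the workflow
    headroom_gb:  assumed margin for the driver and process overhead
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return SchedulerRequest(cpus=threads, mem_gb=tool_mem_gb + headroom_gb)


if __name__ == "__main__":
    print(size_request(threads=4, tool_mem_gb=8))    # single-sample default
    print(size_request(threads=4, tool_mem_gb=64))   # cohort joint genotyping
```

With these assumptions, a 4-thread single-sample job requests 4 CPUs and 9 GB, while a cohort job using the 64 GB joint-genotyping default requests 65 GB.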

Parallelization

Parallel execution is supported, but performance does not scale linearly with additional threads. In practice, optimal throughput is usually achieved with 4-6 threads per task.

For example, on a 12-core workstation, running 3 tasks with 4 threads each is typically preferable to running 1 task with all 12 threads.

The benchmark below shows the shape of this scaling for one WES single-sample run. The biggest gain comes from moving from 2 to 4 threads; after 6 threads, the improvement is small.

Run time versus number of threads for WES single-mode

Threads    Runtime (minutes)
2          28.5
4          23.4
6          22.1
8          22.0
10         21.9
12         21.4

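
The scaling in the benchmark can be quantified with a short calculation. The sketch below uses only the runtimes listed above; the batch estimate assumes that concurrent tasks do not slow each other down (no contention for disk, memory, or memory bandwidth), which is an idealization.

```python
import math

# Runtime in minutes by thread count, copied from the benchmark table.
RUNTIME_MIN = {2: 28.5, 4: 23.4, 6: 22.1, 8: 22.0, 10: 21.9, 12: 21.4}


def speedup(threads, base=2):
    """Speedup relative to the run with `base` threads."""
    return RUNTIME_MIN[base] / RUNTIME_MIN[threads]


def efficiency(threads, base=2):
    """Measured speedup divided by the ideal speedup (threads / base)."""
    return speedup(threads, base) / (threads / base)


def batch_minutes(samples, cores, threads):
    """Estimated wall time for a batch, assuming no interference
    between tasks that run at the same time."""
    concurrent = max(1, cores // threads)
    waves = math.ceil(samples / concurrent)
    return waves * RUNTIME_MIN[threads]


if __name__ == "__main__":
    for t in sorted(RUNTIME_MIN):
        print(f"{t:>2} threads: speedup {speedup(t):.2f}, "
              f"efficiency {efficiency(t):.2f}")
    print("3 samples, 12 cores, 4 threads each:", batch_minutes(3, 12, 4))
    print("3 samples, 12 cores, 12 threads each:", batch_minutes(3, 12, 12))
```

Under that assumption, three samples on 12 cores finish in about 23.4 minutes with 4 threads per task, against about 64.2 minutes when each sample is run in turn with all 12 threads.
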
Practical default

For batch processing, start with 4 threads per task and scale by running more tasks in parallel when the machine or scheduler has available cores.
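
One way to apply this default from a small driver script is sketched below. It is an illustration under stated assumptions: max_parallel_tasks and run_batch are hypothetical helpers, and each entry in commands stands for whatever command line launches one task with 4 threads (the exact CBIcall invocation is not shown here).

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def max_parallel_tasks(total_cores, threads_per_task=4):
    """Number of tasks that fit on the machine at once (at least 1)."""
    return max(1, total_cores // threads_per_task)


def run_batch(commands, total_cores=None, threads_per_task=4):
    """Run each command as a subprocess, a limited number at a time.

    Returns the exit codes in the same order as `commands`.
    """
    cores = total_cores or os.cpu_count() or 1
    workers = max_parallel_tasks(cores, threads_per_task)

    def run_one(cmd):
        # The launcher thread only waits for the external process.
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Placeholder commands; replace with the real per-sample invocations.
    placeholder = [sys.executable, "-c", "pass"]
    print(run_batch([placeholder] * 3, total_cores=12))
```

Because every concurrent task brings its own tool memory, the number of parallel tasks should also be checked against available RAM, not only against the core count.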