mtDNA Pipelines

These pipelines extract mitochondrial reads from exome data and run MToolBox to generate mtDNA variant calls, annotations, and heteroplasmy estimates.

There are two processing modes:

  • Single-sample analysis: mit_single
  • Cohort / family analysis: mit_cohort

Both modes consume the outputs of the WES single-sample pipeline and assume that pipeline's directory and file nomenclature.


Choosing a Pipeline

Use Case | Pipeline | Description
Analyze one individual | mit_single | Fast, sample-specific mtDNA variant calling + HF/DP/GT extraction
Analyze a full family or cohort | mit_cohort | Joint variant calling across samples; useful for transmission and segregation checks

Workflow Details

mtDNA Single-Sample Pipeline

Workflow Diagram

[Figure: mtDNA single-sample workflow]

Summary

The mit_single pipeline processes one individual at a time.
It extracts mtDNA reads, runs MToolBox, and enriches the prioritized variants with:

  • GT — genotype
  • DP — depth
  • HF — heteroplasmic fraction
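
To illustrate what DP and HF represent, the sketch below derives HF from read counts at a single mtDNA position. It assumes HF is simply the fraction of reads supporting the variant allele; the function name and the example counts are hypothetical, and MToolBox's own calculation (quality filtering, confidence intervals) is more involved.

```python
def heteroplasmic_fraction(alt_reads: int, depth: int) -> float:
    """Return the fraction of reads at a site that support the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must be between 0 and depth")
    return alt_reads / depth


# Hypothetical site: 30 of 120 reads carry the variant allele.
hf = heteroplasmic_fraction(30, 120)
print(f"DP=120 HF={hf:.2f}")  # prints "DP=120 HF=0.25"
```

An HF near 1.0 would indicate a homoplasmic variant, while intermediate values indicate heteroplasmy.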

Inputs

  • Run from inside a directory named cbicall_bash_mit_single_*
  • Requires the BAM produced by the WES single-sample run: ../../cbicall_bash_wes_single*/01_bam/input.merged.filtered.realigned.fixed.bam
  • env.sh provides:
    • REF
    • SAM path
    • MTOOLBOXDIR
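
Before launching a run, it may help to confirm that these inputs are in place. The following is a minimal pre-flight sketch, not part of the pipeline itself: check_inputs is a hypothetical helper, it assumes exactly one matching BAM is expected, and it only checks the REF and MTOOLBOXDIR variables because the variable name for the SAM path is not given here.

```python
import glob
import os

# Relative BAM location as listed under Inputs.
BAM_PATTERN = "../../cbicall_bash_wes_single*/01_bam/input.merged.filtered.realigned.fixed.bam"
# Variables expected from env.sh (the SAM path variable name is not specified).
REQUIRED_VARS = ("REF", "MTOOLBOXDIR")


def check_inputs(pattern=BAM_PATTERN, env=None):
    """Return a list of problems found; an empty list means the inputs look OK."""
    env = os.environ if env is None else env
    problems = []
    bams = glob.glob(pattern)
    if len(bams) != 1:  # assumption: one WES run per mit_single run
        problems.append(f"expected one BAM matching {pattern}, found {len(bams)}")
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_inputs():
        print("WARNING:", problem)
```

Run it from inside the cbicall_bash_mit_single_* directory after sourcing env.sh, so that the relative path and the environment variables resolve as the pipeline would see them.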

Outputs

File | Description
VCF_file.vcf | mtDNA VCF from MToolBox
prioritized_variants.txt | Raw prioritized list
mit_prioritized_variants.txt | Final prioritized list with GT/DP/HF
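
As a rough idea of how GT, DP, and HF could be read back out of a VCF record, the sketch below parses one tab-separated line. It assumes a single-sample VCF whose FORMAT column names the keys GT, DP, and HF; extract_gt_dp_hf is a hypothetical helper and the record is made up for illustration, so this is not the pipeline's own extraction code.

```python
def extract_gt_dp_hf(vcf_line: str) -> dict:
    """Pair FORMAT keys with the first sample's values and keep GT, DP and HF."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated VCF columns")
    keys = fields[8].split(":")    # FORMAT column
    values = fields[9].split(":")  # first sample column
    sample = dict(zip(keys, values))
    record = {"POS": fields[1], "REF": fields[3], "ALT": fields[4]}
    for key in ("GT", "DP", "HF"):
        record[key] = sample.get(key, ".")  # "." marks a missing value
    return record


# Made-up record for illustration only, not real pipeline output.
line = "\t".join(
    ["chrM", "3243", ".", "A", "G", ".", "PASS", ".", "GT:DP:HF", "0/1:250:0.35"]
)
print(extract_gt_dp_hf(line))
```

Reading the keys from the FORMAT column, rather than relying on fixed positions, keeps the sketch working if extra per-sample fields are present.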

When to Use Each Pipeline

Use mit_single when:

  • You need results for one individual.
  • You are adding or reprocessing a single relative.
  • You want faster turnaround for a standalone case.

Use mit_cohort when:

  • You want a joint variant table across multiple individuals.
  • You are analyzing mtDNA inheritance within a family.
  • You need a cohort-level mtDNA table for downstream review or comparison.

Background Information

The CBIcall mtDNA pipelines build on MToolBox v1.0, which performs the following steps:


1. Preprocessing (PicardTools)

Converts BAM → FASTQ using:

  • SortSam.jar
  • MarkDuplicates.jar
  • SamFormatConverter.jar

2. Alignment

  • Aligns reads to the RSRS (Reconstructed Sapiens Reference Sequence) via mapExome.py
  • Uses GSNAP (version 2015-12-31.v7)

3. Variant Calling & Annotation (MToolBox)

Pipeline steps include:

  • mpileup (SAMtools)
  • mtVariantCaller.py
  • VCFoutput.py (with PyVCF)
  • mt-classifier.py (haplogroup prediction)
  • variants_functional_annotation.py
  • prioritization.py
  • summary.py

Reference

  1. Calabrese C. et al.
    MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in high-throughput sequencing.
    Bioinformatics (2014).