Troubleshooting
Use the error text from the terminal or workflow log to find the matching section. Most failures fall into three groups: missing external data, GATK/Picard input problems, or mtDNA-specific MToolBox issues.
- Check the main run log in the run directory.
- For Snakemake/GATK 4.6 runs, also check logs/*.log.
- Check log.json to confirm the resolved input_dir, sample_map, genome, workflow, and run directory.
Installation and External Data
External data or tool path not found
Symptom
/usr/bin/bash: line 9: /media/mrueda/2TBS/NGSutils/gatk/gatk-4.6.2.0/gatk: No such file or directory
Likely cause
DATADIR does not point to the directory where databases and external tools are installed or mounted.
Fix
Update the data directory in the workflow configuration:
workflows/bash/gatk-4.6/env.sh
workflows/snakemake/gatk-4.6/config.yaml
For containers, make sure the host data directory is bind-mounted at the same path used by the workflow configuration.
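Before rerunning, it can help to confirm that the data directory and the tool path actually exist where the workflow expects them, both on the host and inside the container. The helper below is a minimal sketch: check_datadir is a hypothetical name, and the tool sub-path in the usage comment is only inferred from the example error message above, so adjust it to your installation.

```shell
# Hypothetical helper (not part of CBIcall): verify that a data directory
# exists and that a tool below it is executable.
# Usage sketch (sub-path assumed from the error message above):
#   check_datadir "$DATADIR" NGSutils/gatk/gatk-4.6.2.0/gatk
check_datadir() {
    cd_datadir=$1
    cd_tool=$2
    if [ ! -d "$cd_datadir" ]; then
        echo "DATADIR not found: $cd_datadir" >&2
        return 1
    fi
    if [ ! -x "$cd_datadir/$cd_tool" ]; then
        echo "Tool missing or not executable: $cd_datadir/$cd_tool" >&2
        return 2
    fi
    echo "OK: $cd_datadir/$cd_tool"
}
```

Return code 1 points at a wrong or unmounted data directory, while return code 2 suggests the directory is visible but the tool is missing or lacks execute permission.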
Relative input paths resolve somewhere unexpected
Symptom
CBIcall cannot find FASTQ files, BAMs, or sample_map.tsv, even though the path looks correct from your current shell.
Likely cause
Relative input_dir and sample_map paths are resolved from the YAML file location, not from the shell's current working directory.
Fix
Use absolute paths, or keep the YAML file next to the relative paths it references. Confirm the resolved paths in log.json.
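To predict where a relative path will land, you can reproduce the rule described above by hand. The function below is a sketch of that rule only, not CBIcall's actual implementation; resolve_from_yaml is a hypothetical name, and the result is not normalized (a leading ../ stays in the printed path).

```shell
# Hypothetical helper: print the path a relative entry would resolve to
# when it is interpreted relative to the directory holding the YAML file.
# Absolute paths are printed unchanged.
resolve_from_yaml() {
    rfy_yaml=$1
    rfy_path=$2
    case $rfy_path in
        /*)
            printf '%s\n' "$rfy_path"
            ;;
        *)
            rfy_dir=$(cd "$(dirname "$rfy_yaml")" && pwd) || return 1
            printf '%s\n' "$rfy_dir/$rfy_path"
            ;;
    esac
}
```

For example, resolve_from_yaml /projects/run1/params.yaml data/sample_map.tsv would print /projects/run1/data/sample_map.tsv if /projects/run1 exists, regardless of the directory you call it from. Compare the printed path with the resolved value in log.json.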
GATK and Picard
NaN LOD value during recalibration
Symptom
NaN LOD value assigned
Likely cause
There are too few variants to train a reliable VQSR model, often too few INDELs.
Fix
Use the existing thresholds that skip VQSR when the variant count is too small, or increase the minimum threshold before rerunning. The final *.QC.vcf.gz is still produced by hard filtering when VQSR is skipped.
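A quick count of variants by type can show whether a call set is plausibly too small for VQSR before you change any threshold. The sketch below is a rough approximation under stated assumptions: count_variant_types is a hypothetical name, a record counts as INDEL when any ALT allele differs in length from REF, and symbolic or spanning-deletion alleles are not handled specially. It does not reproduce the workflow's own thresholds.

```shell
# Hypothetical helper: read an uncompressed VCF on stdin and print a
# crude count of equal-length (SNP/MNP) versus length-changing (INDEL)
# records. Usage sketch: gzip -dc sample.vcf.gz | count_variant_types
count_variant_types() {
    awk -F'\t' '
        /^#/ { next }                      # skip header lines
        {
            n = split($5, alts, ",")       # ALT may list several alleles
            is_indel = 0
            for (i = 1; i <= n; i++)
                if (length(alts[i]) != length($4)) is_indel = 1
            if (is_indel) indels++; else snps++
        }
        END { printf "SNP=%d INDEL=%d\n", snps, indels }
    '
}
```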
Not enough columns in dbSNP line
Symptom
there aren't enough columns for line ... dbsnp_137.hg19.vcf
Likely cause
The dbSNP VCF contains malformed or truncated records.
Fix
Inspect the reported line in the dbSNP VCF, replace the database file if possible, or correct the malformed record locally and document the change.
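If the error message does not make the offending record obvious, a column count over the whole file can locate it. VCF data lines need at least eight tab-separated columns, so the sketch below reports any non-header line with fewer; find_short_vcf_lines is a hypothetical name, and the input is assumed to be an uncompressed VCF.

```shell
# Hypothetical helper: list non-header VCF lines with fewer than the
# eight mandatory tab-separated columns. Reads the named files, or
# stdin when no file is given.
find_short_vcf_lines() {
    awk -F'\t' '!/^#/ && NF < 8 { printf "line %d: %d columns\n", NR, NF }' "$@"
}
```

No output means every data line has at least eight columns; other kinds of corruption, such as malformed INFO fields, are not detected by this check.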
Error parsing text SAM file
Symptom
Error parsing text SAM file. Not enough fields; File /dev/stdin; Line ...
Likely cause
Secondary or supplementary alignments can introduce records that Picard/GATK rejects when the alignment stream is passed directly into read-group assignment.
Fix
Filter secondary and supplementary alignments before adding read groups:
bwa mem -M -t "$THREADS" "$REFGZ" "$R1" "$R2" \
| samtools view -bSh -F 0x900 - \
| gatk AddOrReplaceReadGroups ...
mtDNA and MToolBox
MToolBox fails on ARM / aarch64
Symptom
mit_single cannot be performed with: aarch64
or:
mit_cohort cannot be performed with: aarch64
Likely cause
The bundled MToolBox workflow is x86_64-only.
Fix
Run mtDNA workflows on an x86_64 Linux host. WES/WGS GATK 4.6 workflows can still run on supported ARM systems.
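A wrapper script can check the host architecture before launching an mtDNA run, so the failure surfaces immediately. This is a minimal sketch: arch_supports_mtdna is a hypothetical name, and it accepts only the x86_64 string that uname -m reports on such Linux hosts.

```shell
# Hypothetical helper: succeed only on x86_64. The optional argument
# overrides the detected architecture, which is convenient for testing.
arch_supports_mtdna() {
    case ${1:-$(uname -m)} in
        x86_64) return 0 ;;
        *)      return 1 ;;
    esac
}
```

A caller might use it as: arch_supports_mtdna || echo "mtDNA workflows need an x86_64 host" >&2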
No usable BAM found for mtDNA
Symptom
ERROR: Could not find BAM for ID ...
or:
ERROR: No usable sample BAMs found. Nothing to do.
Likely cause
The mtDNA workflow expects BAMs from previous WES/WGS single-sample runs in the expected project layout.
Fix
Run WES/WGS single-sample processing first, keep the 01_bam outputs, and then rerun the mtDNA workflow from the sample or project directory described in the mtDNA example.
Unsupported N CIGAR operations
Symptom
MToolBox fails due to unsupported N operations in CIGAR strings.
Likely cause
Some reads contain skipped-region CIGAR operations that MToolBox cannot process.
Fix
Add this flag in the relevant MToolBox alignment or SAM-processing step:
--filter_reads_with_N_cigar
Low coverage and unreliable heteroplasmy fractions
Symptom
mtDNA coverage is low, or heteroplasmy fraction estimates look unstable.
Likely cause
Below roughly 10x mtDNA coverage, heteroplasmy fraction (HF) estimates are unreliable.
Fix
Flag samples below 10x median mtDNA coverage, interpret HF values cautiously, exclude low-coverage samples from HF-based analyses when needed, and consider resequencing if mtDNA interpretation is critical.
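The 10x rule can be applied mechanically once per-base depth is available. The sketch below assumes three-column, tab-separated input (contig, position, depth) such as samtools depth prints for a single BAM; median_depth and flag_low_coverage are hypothetical names, and the contig name in the usage comment (chrM) is an assumption that depends on your reference.

```shell
# Hypothetical helper: median of the third column of depth records on stdin.
# Usage sketch (contig name assumed):
#   samtools depth -a -r chrM sample.bam | median_depth
median_depth() {
    cut -f3 | sort -n | awk '
        { d[NR] = $1 }
        END {
            if (NR == 0) { print "NA"; exit }
            if (NR % 2) print d[(NR + 1) / 2]
            else print (d[NR / 2] + d[NR / 2 + 1]) / 2
        }'
}

# Hypothetical helper: print sample, median, and LOW or OK.
# The threshold defaults to 10, matching the guidance above.
flag_low_coverage() {
    flc_sample=$1
    flc_median=$2
    flc_min=${3:-10}
    awk -v s="$flc_sample" -v m="$flc_median" -v t="$flc_min" \
        'BEGIN { print s "\t" m "\t" (m + 0 < t + 0 ? "LOW" : "OK") }'
}
```

A median of exactly 10 is reported as OK, and a sample with no depth records (median NA) is reported as LOW, since a non-numeric median compares as zero.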
Variant Interpretation
Unexpected de novo rates in trios
Symptom
Observed de novo rates differ strongly from expectations.
Likely cause
Large deviations can indicate sample, data-quality, annotation, or pipeline issues.
Reference values
| Sample type | Typical de novo rate |
|---|---|
| Proband | ~1% |
| Parent | ~10% |
Fix
Check sample identity, pedigree labels, coverage, variant filters, and annotation assumptions before interpreting the result biologically.
Next Steps
- Confirm generated files in Outputs.
- Confirm YAML choices in Configuration Reference.
- Review runtime guidance in Performance.