CLI Reference

The main entry point is bin/bff-tools.

bin/bff-tools --help

Most users only need one mode at a time. Start from the data you already have, then choose the matching command.

Modes

| Mode | Use when | Main output |
| --- | --- | --- |
| validate | You have XLSX metadata or existing BFF JSON collections | BFF JSON metadata collections |
| vcf | You have VCF or VCF.gz genomic input | BFF genomicVariations |
| tsv | You have SNP-array TSV or TXT input | VCF-like intermediates and BFF genomicVariations |
| load | You already have BFF files and want MongoDB collections | imported MongoDB collections |
| full | You want genomic conversion plus MongoDB loading in one run | BFF output and imported MongoDB collections |
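
The table above can be turned into a small lookup, sketched here as a shell function. `suggest_mode` is a hypothetical helper, not part of bff-tools, and the extension-to-mode mapping is an assumption read off the "Use when" column. JSON inputs are deliberately left out because an existing BFF collection may call for either validate or load.

```shell
# Hypothetical helper (not shipped with bff-tools): suggest a mode
# from the input file name. The mapping is an assumption based on
# the Modes table, not on how bff-tools inspects its input.
suggest_mode() {
  case "$1" in
    *.xlsx)                          echo validate ;;
    *.vcf|*.vcf.gz)                  echo vcf ;;
    *.tsv|*.tsv.gz|*.txt|*.txt.gz)   echo tsv ;;
    *)                               echo unknown; return 1 ;;
  esac
}
```

For example, `suggest_mode input.vcf.gz` prints `vcf`, and an unrecognized name prints `unknown` with a non-zero exit status.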

Command Model

Most commands follow this shape:

bin/bff-tools <mode> -i <input> -p <param.yaml> [options]

validate is the exception: it checks metadata only, so it does not need a genomic parameter file:

bin/bff-tools validate -i metadata.xlsx --out-dir bff_out

Common Commands

Validate Metadata

bin/bff-tools validate -i metadata.xlsx --out-dir bff_out

Use this before genomic conversion or MongoDB loading. It catches structural metadata problems early and writes BFF entity collections.

Convert VCF

bin/bff-tools vcf -t 4 -i input.vcf.gz -p param.yaml

Use this for sequencing VCFs. The genome value in param.yaml must match the reference build of the input VCF (for example, hg38).
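
A pre-flight check for the genome value might look like the sketch below. `param_genome` and `check_genome` are hypothetical helpers, not part of bff-tools. The sketch assumes `genome` is an unquoted top-level key on its own line, as in the examples in this page, and that you already know which build your VCF was called against.

```shell
# Hypothetical helper: print the top-level `genome` value from a
# parameter file. Assumes a plain `genome: value` line (no quotes).
param_genome() {
  sed -n 's/^genome:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: compare the configured genome with the build
# you expect. Returns non-zero and prints a message on mismatch.
check_genome() {
  actual=$(param_genome "$1")
  if [ "$actual" = "$2" ]; then
    return 0
  fi
  echo "genome mismatch: param file has '$actual', expected '$2'" >&2
  return 1
}
```

Calling `check_genome param.yaml hg38` before the vcf step stops a run early if the parameter file names a different build.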

Convert SNP-array TSV

bin/bff-tools tsv -i input.txt.gz -p param.yaml

Use this for SNP-array style files. This mode creates VCF-like intermediates before generating BFF genomic variation output.

Load BFF Collections

bin/bff-tools load -p param.yaml

Use this after metadata and genomic variation files exist. The parameter file must point to the BFF collections and MongoDB configuration.
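
Since load depends on files that earlier steps produce, it can help to confirm they exist first. The sketch below is an illustration only: `missing_bff_files` is a hypothetical helper, and it takes the directory and file names as arguments rather than parsing the parameter file.

```shell
# Hypothetical helper: print each expected file that is missing from
# the given directory. Exit status is 0 only if nothing is missing.
missing_bff_files() {
  dir=$1
  shift
  status=0
  for f in "$@"; do
    if [ ! -f "$dir/$f" ]; then
      echo "$f"
      status=1
    fi
  done
  return $status
}
```

A call such as `missing_bff_files bff_out runs.json cohorts.json biosamples.json individuals.json analyses.json datasets.json` lists whatever is absent, using the file names from the parameter file example in this page.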

Convert and Load

bin/bff-tools full -t 4 -i input.vcf.gz -p param.yaml

Use this when the parameter file already points to the metadata collections and MongoDB settings.
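
If you prefer to run the stages separately, for instance to inspect the output between steps, the sequence could be sketched as follows. `BFF_TOOLS`, `DRY_RUN`, `run_step`, and `staged_run` are hypothetical names introduced for this example. The sketch assumes a single param.yaml already carries both the genome setting and the bff paths that load needs, and it uses only the flags shown earlier in this page.

```shell
# Hypothetical variables for this sketch: BFF_TOOLS is the path to the
# entry point, DRY_RUN=1 prints each command instead of running it.
BFF_TOOLS=${BFF_TOOLS:-bin/bff-tools}

run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$BFF_TOOLS $*"
  else
    "$BFF_TOOLS" "$@"
  fi
}

# $1 = metadata workbook, $2 = VCF input, $3 = parameter file.
# Each stage runs only if the previous one succeeded.
staged_run() {
  run_step validate -i "$1" --out-dir bff_out &&
  run_step vcf -t 4 -i "$2" -p "$3" &&
  run_step load -p "$3"
}
```

Setting `DRY_RUN=1` before calling `staged_run metadata.xlsx input.vcf.gz param.yaml` prints the three commands without touching any data, which is a cheap way to review the plan.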

Options You Will Use Often

| Option | Applies to | Purpose |
| --- | --- | --- |
| -i FILE | validate, vcf, tsv, full | input workbook, VCF, TSV, or genomic file |
| -p FILE | vcf, tsv, load, full | runtime parameter file |
| -t N | vcf, full | number of threads for supported stages |
| --out-dir DIR | validate | metadata validation output directory |
| --projectdir-override DIR | vcf, tsv, load, full | explicit run directory name |
| --ignore-validation | validate | write generated JSON for inspection even when validation is noisy |

Parameter File Essentials

Minimal VCF or TSV conversion:

genome: hg38

Generate static browser output as part of the run:

genome: hg38
bff2html: true

Load BFF files into MongoDB:

bff:
  metadatadir: bff_out
  runs: runs.json
  cohorts: cohorts.json
  biosamples: biosamples.json
  individuals: individuals.json
  analyses: analyses.json
  datasets: datasets.json
  genomicVariationsVcf: beacon_my_project/vcf/genomicVariationsVcf.json.gz

Choosing the Wrong Mode

| If you have... | Do not start with | Use |
| --- | --- | --- |
| only metadata | vcf or full | validate |
| only a VCF and no metadata paths configured | load | vcf first |
| existing BFF collections | validate only | load after validating if needed |
| a failed conversion directory | rerunning into the same directory | a new --projectdir-override value |
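
For the last row, one way to get a fresh directory name on every retry is to append a timestamp. `fresh_run_dir` and the `beacon_retry` prefix are hypothetical, chosen for this sketch; any name that does not collide with the failed directory should serve.

```shell
# Hypothetical helper: build a run directory name that is unique per
# second, e.g. beacon_retry_20240131_142500. The prefix is optional.
fresh_run_dir() {
  prefix=${1:-beacon_retry}
  echo "${prefix}_$(date +%Y%m%d_%H%M%S)"
}
```

The result can then be passed on the command line, as in `bin/bff-tools vcf -t 4 -i input.vcf.gz -p param.yaml --projectdir-override "$(fresh_run_dir)"`.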

For copy-paste workflows, see Command Recipes. For an end-to-end explanation, see Data Beaconization.