Supported Inputs and Outputs

Use this page to check whether your data fits one of the supported bff-tools workflows.

Main Data Paths

| Starting data | Command | Main output | Notes |
| --- | --- | --- | --- |
| Beacon metadata workbook (.xlsx) | bff-tools validate | BFF JSON collections | Use the Beacon v2 workbook template for individuals, biosamples, runs, datasets, and related entities. |
| Existing BFF JSON metadata | bff-tools validate | Validated BFF JSON collections | Useful when metadata was produced outside the workbook template. |
| VCF or VCF.gz | bff-tools vcf | BFF genomicVariations | The genome setting must match the input reference build and configured annotation resources. |
| SNP-array TSV or TXT | bff-tools tsv | VCF-like intermediates and BFF genomicVariations | Intended for SNP-array style data such as direct-to-consumer genotype exports. |
| BFF JSON collections | bff-tools load | MongoDB collections | Requires MongoDB and valid paths for mongoimport and mongosh. |
| Metadata plus VCF or TSV input | bff-tools full | BFF output plus MongoDB load | Convenience mode when configuration already points to the metadata collections. |
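
As a rough illustration of the mapping above, the following Python sketch suggests a command from a file name. The function name suggest_command and the lookup table are hypothetical helpers written for this page; they are not part of bff-tools. The sketch looks only at the file extension, so it cannot tell whether a TSV or TXT file is really SNP-array style data, and it suggests validate for every .json file even though already validated collections would go to bff-tools load.

```python
from pathlib import Path
from typing import Optional

# Hypothetical lookup derived from the "Main Data Paths" table.
# It is an illustration only and is not shipped with bff-tools.
SUFFIX_TO_COMMAND = {
    ".xlsx": "bff-tools validate",
    ".json": "bff-tools validate",
    ".vcf": "bff-tools vcf",
    ".vcf.gz": "bff-tools vcf",
    ".tsv": "bff-tools tsv",
    ".txt": "bff-tools tsv",
}


def suggest_command(filename: str) -> Optional[str]:
    """Return the command that usually fits this file name, or None."""
    # Compare case-insensitively on the file name, ignoring directories.
    name = Path(filename).name.lower()
    for suffix, command in SUFFIX_TO_COMMAND.items():
        if name.endswith(suffix):
            return command
    return None


if __name__ == "__main__":
    for example in ("cohort.vcf.gz", "metadata.xlsx", "reads.bam"):
        print(example, "->", suggest_command(example))
```

A None result means the file type is not covered by any row in the table.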

Metadata Entities

Metadata validation can produce the standard Beacon v2 entity collections used by the toolkit:

analyses.json
biosamples.json
cohorts.json
datasets.json
individuals.json
runs.json

The exact files depend on the sheets or JSON collections present in the input.
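
Because the set of files varies, it can help to check which collections a validation run actually produced. The sketch below assumes the files sit directly in one output directory; the function name entity_files_status is a hypothetical helper, and the real output location depends on your configuration.

```python
from pathlib import Path

# File names taken from the list above.
ENTITY_FILES = (
    "analyses.json",
    "biosamples.json",
    "cohorts.json",
    "datasets.json",
    "individuals.json",
    "runs.json",
)


def entity_files_status(directory):
    """Map each expected entity file name to True if it exists in directory."""
    base = Path(directory)
    return {name: (base / name).is_file() for name in ENTITY_FILES}


if __name__ == "__main__":
    # Report on the current directory (an assumption for this example).
    for name, present in entity_files_status(".").items():
        print(("found   " if present else "missing ") + name)
```

A missing file is not necessarily an error: it may simply mean that the corresponding sheet or collection was absent from the input.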

Genomic Variation Output

VCF and TSV workflows generate genomic variation data in BFF form. The most common final output is:

genomicVariationsVcf.json.gz

The output is normally written inside a run-specific project directory, for example:

beacon_*/vcf/genomicVariationsVcf.json.gz
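
To locate these files across several runs, the path pattern above can be passed to a glob. This is a minimal sketch, assuming the project directories live under a single root directory; find_variation_outputs and looks_gzipped are hypothetical helper names. The second helper only checks the two-byte gzip signature and says nothing about whether the JSON inside is valid.

```python
from pathlib import Path

# Pattern copied from the example path above.
OUTPUT_PATTERN = "beacon_*/vcf/genomicVariationsVcf.json.gz"


def find_variation_outputs(root="."):
    """Return all matching output files under root, sorted by path."""
    return sorted(Path(root).glob(OUTPUT_PATTERN))


def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 1f 8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


if __name__ == "__main__":
    for output in find_variation_outputs("."):
        print(output, "gzip" if looks_gzipped(output) else "not gzip")
```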

Optional Inspection Paths

| Need | Tool or option | Output |
| --- | --- | --- |
| Browse static BFF files without MongoDB | bff2html: true or bff-browser | Local browser-oriented files |
| Query BFF collections loaded in MongoDB | bff-portal | Lightweight API and web interface |
| Queue many local ingestion jobs | bff-queue | Local job queue and status tracking |


Current Limits

  • The genomic workflow is aimed at DNA sequencing VCFs and SNP-array style TSV input.
  • Support for structural variants and copy-number variation is limited.
  • Biological interpretation remains the user's responsibility; schema-valid output is not the same as clinically validated output.
  • The reference genome label (for example hg19, hg38, hs37, or b37) must match both the build the input file was produced against and the local reference data.
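
One way to sanity-check the reference genome point before a run is to look at the VCF header. The sketch below reads the optional ##reference and ##contig meta lines defined by the VCF format and reports whether contig names carry a chr prefix. The helper name summarize_vcf_header is hypothetical, and the interpretation is a convention rather than a rule: hg19 and hg38 files commonly use chr-prefixed names, while b37 and hs37 files commonly use unprefixed names. Treat the result as a hint and confirm the build with whoever produced the file.

```python
def summarize_vcf_header(lines):
    """Collect the ##reference value and contig IDs from VCF header lines."""
    reference = None
    contigs = []
    for line in lines:
        line = line.strip()
        if not line.startswith("##"):
            break  # meta lines end at the #CHROM column header
        if line.startswith("##reference="):
            reference = line[len("##reference="):]
        elif line.startswith("##contig=<"):
            # Example: ##contig=<ID=chr1,length=248956422>
            fields = line[len("##contig=<"):].rstrip(">").split(",")
            for field in fields:
                if field.startswith("ID="):
                    contigs.append(field[3:])
    return {
        "reference": reference,
        "contigs": contigs,
        # False when no contig lines are present, since nothing can be inferred.
        "chr_prefixed": bool(contigs) and all(c.startswith("chr") for c in contigs),
    }


if __name__ == "__main__":
    example_header = [
        "##fileformat=VCFv4.2",
        "##reference=file:///refs/hg38.fa",  # hypothetical path
        "##contig=<ID=chr1,length=248956422>",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    print(summarize_vcf_header(example_header))
```

For a compressed VCF, the same function can be fed lines from gzip.open(path, "rt").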

For copy-paste commands, continue to Command Recipes.