Overview

Build Beacon v2-ready datasets from metadata and genomic files.

beacon2-cbi-tools helps you prepare data for Beacon v2 deployments based on the Beacon Friendly Format (BFF). It validates Beacon metadata, converts VCF or SNP-array input into BFF, and loads the resulting collections into MongoDB.

With this toolkit you can:

  • validate metadata from XLSX or JSON files against Beacon v2 schemas
  • convert VCF or SNP-array TSV input into BFF genomicVariations
  • load BFF collections into MongoDB
  • optionally inspect the resulting data with lightweight utilities

Research-use disclaimer

This toolkit is intended for research use. Do not use generated annotations or results for medical decisions.

Typical Workflow

Most users follow this sequence:

  1. Prepare and validate metadata with bff-tools validate.
  2. Convert genomic data with bff-tools vcf or bff-tools tsv.
  3. Load the generated BFF collections into MongoDB with bff-tools load or bff-tools full.

Input: XLSX or BFF metadata; VCF or SNP-array TSV
Process: validate; vcf / tsv / load / full
Output: BFF JSON collections; MongoDB and browser files
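
The sequence above can be sketched as a small planning helper. This is only an illustration, not part of the toolkit: the function name plan_workflow and the file-suffix lists are assumptions, and the helper returns subcommand names only, because no command-line options are described here.

```python
from pathlib import Path

# Hypothetical suffix lists, chosen for illustration only.
METADATA_SUFFIXES = (".xlsx", ".json")
VCF_SUFFIXES = (".vcf", ".vcf.gz")
TSV_SUFFIXES = (".tsv",)


def plan_workflow(metadata, genomic, load_into_mongodb=False):
    """Return the ordered bff-tools subcommands for the given inputs.

    Hypothetical helper: it only names the subcommands to run and
    does not build or execute any command line.
    """
    meta_name = Path(metadata).name.lower()
    if not meta_name.endswith(METADATA_SUFFIXES):
        raise ValueError(f"unsupported metadata file: {metadata}")
    steps = ["validate"]  # step 1: prepare and validate metadata

    geno_name = Path(genomic).name.lower()
    if geno_name.endswith(VCF_SUFFIXES):
        steps.append("vcf")  # step 2: convert VCF or VCF.gz
    elif geno_name.endswith(TSV_SUFFIXES):
        steps.append("tsv")  # step 2: convert SNP-array TSV
    else:
        raise ValueError(f"unsupported genomic file: {genomic}")

    if load_into_mongodb:
        steps.append("load")  # step 3: load BFF collections into MongoDB
    return steps


if __name__ == "__main__":
    print(plan_workflow("metadata.xlsx", "calls.vcf.gz", load_into_mongodb=True))
```

The sketch keeps conversion and loading as separate steps. bff-tools full is listed as a one-step alternative for conversion plus loading.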

If you are new to the toolkit, use this order:

  1. Read the installation overview and pick Docker unless your environment requires Apptainer or a direct install.
  2. Use What should I run? to choose the right command for your input.
  3. Check Supported Inputs and Outputs to confirm your data fits a supported path.
  4. Run the Quick Start with the bundled test data.
  5. Use Command Recipes for copy-paste commands.
  6. Read the data beaconization tutorial before adapting the workflow to your own data.
  7. Check Validation and Reproducibility and Outputs when reviewing generated files and logs.
  8. Keep the FAQ open while configuring reference genomes, annotation resources, and MongoDB loading.

What You Need Before Starting

Requirement | Why it matters
Metadata in XLSX or BFF JSON | Required for Beacon entities such as individuals, biosamples, runs, and datasets
VCF, VCF.gz, or SNP-array TSV input | Used to generate BFF genomicVariations
Reference genome choice | Must match your genomic input, for example hg19, hg38, hs37, or b37
External reference data | Required by the genomic conversion workflow
MongoDB | Required only when you want to load and query BFF collections
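
As a rough pre-flight check, the requirements above could be collected in one place before any command is run. The sketch below is an assumption-laden illustration: the RunInputs fields and the missing_requirements function are hypothetical names, not part of beacon2-cbi-tools, and the check only tests that each item has been provided, not that it is valid.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunInputs:
    """Hypothetical container for the items listed in the table above."""
    metadata: Optional[str] = None            # XLSX or BFF JSON file
    genomic: Optional[str] = None             # VCF, VCF.gz, or SNP-array TSV
    genome: Optional[str] = None              # for example "hg38"
    reference_data_dir: Optional[str] = None  # external reference data
    mongodb_uri: Optional[str] = None         # only needed when loading
    load_into_mongodb: bool = False


def missing_requirements(inputs: RunInputs) -> List[str]:
    """Return a human-readable list of requirements not yet provided."""
    missing = []
    if not inputs.metadata:
        missing.append("metadata (XLSX or BFF JSON)")
    if not inputs.genomic:
        missing.append("genomic input (VCF, VCF.gz, or SNP-array TSV)")
    if not inputs.genome:
        missing.append("reference genome choice")
    if not inputs.reference_data_dir:
        missing.append("external reference data")
    # MongoDB is required only when collections will be loaded and queried.
    if inputs.load_into_mongodb and not inputs.mongodb_uri:
        missing.append("MongoDB connection")
    return missing


if __name__ == "__main__":
    inputs = RunInputs(metadata="metadata.xlsx", genomic="calls.vcf.gz", genome="hg38")
    for item in missing_requirements(inputs):
        print("missing:", item)
```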

Main Commands

The main entry point is bff-tools.

  • bff-tools validate: validate metadata and write BFF JSON collections
  • bff-tools vcf: convert a VCF or VCF.gz file into BFF
  • bff-tools tsv: convert a SNP-array TSV file into BFF
  • bff-tools load: load BFF collections into MongoDB
  • bff-tools full: run conversion plus loading in one step

Utilities

The toolkit also includes optional utilities for browsing and querying BFF data and for queueing ingestion jobs:

  • bff-browser: browse static BFF files without a database
  • bff-portal: query BFF data stored in MongoDB
  • bff-queue: run and monitor many ingestion jobs on a workstation