OMOP to BFF

Information

The Beacon v2 schema requires certain properties for validation to succeed. When the OMOP source offers no suitable match for a required property, a DEFAULT value is used so that the output still conforms to the schema.

OMOP SPECIMEN rows can now be emitted as first-class Beacon biosamples, but only in entity-aware BFF mode such as -obff --entities biosamples --out-dir out/ or -obff --entities individuals biosamples --out-dir out/.

OMOP SPECIMEN to Beacon biosamples support should still be considered experimental. The mapping is implemented and covered by local tests and schema validation, but it is still pending review and validation with external collaborators.

With --stream, OMOP BFF output is written as line-delimited JSON suitable for MongoDB-style ingestion. Stream mode supports individuals, biosamples, or both together, each written to its own file in --out-dir. Aggregate entities such as datasets and cohorts are not available in stream mode.
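As an illustration of consuming the stream output, the following minimal sketch reads a line-delimited JSON file one document at a time. The function name `read_ndjson` is hypothetical, and the sketch assumes plain, uncompressed UTF-8 text; the actual file names written to --out-dir are not specified here.

```python
import json
from pathlib import Path
from typing import Iterator


def read_ndjson(path: Path) -> Iterator[dict]:
    """Yield one JSON document per non-empty line of a line-delimited file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                yield json.loads(line)
```

Because the reader is a generator, documents can be passed to a bulk loader in batches without holding the whole file in memory.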

If biosamples are explicitly requested and the OMOP input does not contain the SPECIMEN table, the conversion fails with a focused error. If SPECIMEN exists but is empty, the conversion succeeds and emits an empty biosamples collection.
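The two cases above (table missing versus table present but empty) can be sketched as follows. This is not the converter's code: the function, the error class, and the in-memory `tables` dictionary are all hypothetical names used only to show the decision.

```python
from typing import Dict, List, Optional


class MissingSpecimenTableError(ValueError):
    """Hypothetical error type; the real tool's error and message may differ."""


def specimen_rows_for_biosamples(
    tables: Dict[str, List[dict]], requested_entities: List[str]
) -> Optional[List[dict]]:
    """Return SPECIMEN rows to convert, or None when biosamples were not requested."""
    if "biosamples" not in requested_entities:
        return None
    if "SPECIMEN" not in tables:
        # Explicit request but no table at all: fail with a focused error.
        raise MissingSpecimenTableError(
            "biosamples were requested but the OMOP input has no SPECIMEN table"
        )
    # An empty table is valid and leads to an empty biosamples collection.
    return tables["SPECIMEN"]
```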

Version 0.31

Target model: BFF

Entity: individuals, biosamples

Terms

diseases

| Source field | Target field | Notes |
| --- | --- | --- |
| CONDITION_OCCURRENCE.condition_concept_id | diseases.diseaseCode | Mapped through OHDSI concepts |
| CONDITION_OCCURRENCE.condition_start_date + PERSON.birth_datetime | diseases.ageOfOnset | Derived age |
| CONDITION_OCCURRENCE.condition_status_concept_id | diseases.stage | Defaulted when absent |
| CONDITION_OCCURRENCE.* | diseases._info.CONDITION_OCCURRENCE.OMOP_columns | Provenance payload |
| VISIT_OCCURRENCE context | diseases._visit | Added when visit context is available |
| missing CONDITION_OCCURRENCE.condition_status_concept_id | diseases.stage | Defaults to NCIT:C126101 / Not Available |
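Several rows on this page are marked "Derived age", meaning an event date is combined with PERSON.birth_datetime. The sketch below shows one way such a value could be computed. It assumes the result is a Beacon Age object holding a whole-year ISO 8601 duration; the converter's actual precision and output shape may differ, and `derived_age` is a hypothetical name.

```python
from datetime import date


def derived_age(birth: date, event: date) -> dict:
    """Return the age at `event` as a whole-year ISO 8601 duration (assumed shape)."""
    # Subtract one year when the birthday has not yet occurred in the event year.
    before_birthday = (event.month, event.day) < (birth.month, birth.day)
    years = event.year - birth.year - int(before_birthday)
    return {"age": {"iso8601duration": f"P{years}Y"}}


# One day before the 40th birthday the age is still 39 years.
print(derived_age(date(1980, 6, 15), date(2020, 6, 14)))
```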

ethnicity

| Source field | Target field | Notes |
| --- | --- | --- |
| PERSON.race_source_value | ethnicity | Normalized through ontology lookup |

exposures

| Source field | Target field | Notes |
| --- | --- | --- |
| OBSERVATION.observation_concept_id | exposures.exposureCode | Only observations classified as exposures are used |
| OBSERVATION.observation_date + PERSON.birth_datetime | exposures.ageAtExposure | Derived age |
| OBSERVATION.observation_date | exposures.date | Direct |
| OBSERVATION.unit_concept_id | exposures.unit | Defaulted when absent |
| OBSERVATION.value_as_number | exposures.value | \N is converted to -1 |
| DEFAULT | exposures.duration | Added for Beacon completeness |
| OBSERVATION.* | exposures._info.OBSERVATION.OMOP_columns | Provenance payload |
| missing OBSERVATION.unit_concept_id | exposures.unit | Defaults to NCIT:C126101 / Not Available |
| DEFAULT | exposures.duration | Defaults to P0Y in the OMOP-specific path |
| OBSERVATION.value_as_number = \N | exposures.value | Defaults to -1 |
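The three exposure defaults listed above can be sketched in a few lines. The function name and the way the unit term is passed in are assumptions made for illustration, as is the conversion of present values to `float`; only the default values themselves (NCIT:C126101 / Not Available, P0Y, and -1) come from the table.

```python
from typing import Optional

# Beacon ontology term used by the defaults in the table above.
NOT_AVAILABLE = {"id": "NCIT:C126101", "label": "Not Available"}
OMOP_NULL = "\\N"  # literal backslash-N, the null marker in OMOP exports


def exposure_defaults(row: dict, unit_term: Optional[dict] = None) -> dict:
    """Fill the unit, value, and duration fields of an exposure (hypothetical helper)."""
    raw_value = row.get("value_as_number", OMOP_NULL)
    value = -1 if raw_value == OMOP_NULL else float(raw_value)
    return {
        "unit": unit_term if unit_term is not None else dict(NOT_AVAILABLE),
        "value": value,
        "duration": "P0Y",  # always defaulted in the OMOP-specific path
    }
```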

geographicOrigin

| Source field | Target field | Notes |
| --- | --- | --- |
| OBSERVATION.value_as_concept_id | geographicOrigin | Preferred when the observation represents Country of birth; normalized through ontology lookup |
| OBSERVATION.value_as_string | geographicOrigin | Preferred string fallback when the observation represents Country of birth |
| OBSERVATION.value_source_value | geographicOrigin | Preferred string fallback when the observation represents Country of birth |
| PERSON.ethnicity_source_value | geographicOrigin | Fallback when no Country of birth observation can be resolved |
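The fallback chain for geographicOrigin can be sketched as a simple ordered lookup. Two points are assumptions rather than documented behaviour: the relative order of the two string fallbacks is taken from the table order, and the values treated as "empty" (including concept id 0) are a guess. The ontology normalization step is omitted and the function name is hypothetical.

```python
from typing import Optional, Tuple

# Values treated as "no usable data" in this sketch (assumption).
EMPTY = {None, "", "\\N", "0", 0}


def pick_geographic_origin(
    country_obs: Optional[dict], person: dict
) -> Optional[Tuple[str, object]]:
    """Return (source field, raw value) for geographicOrigin, or None."""
    if country_obs is not None:
        # Concept id first, then the two string fallbacks in table order.
        for field in ("value_as_concept_id", "value_as_string", "value_source_value"):
            value = country_obs.get(field)
            if value not in EMPTY:
                return f"OBSERVATION.{field}", value
    fallback = person.get("ethnicity_source_value")
    if fallback not in EMPTY:
        return "PERSON.ethnicity_source_value", fallback
    return None
```

Here `country_obs` stands for an OBSERVATION row already identified as a Country of birth observation; how that identification is made is outside this sketch.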

id

| Source field | Target field | Notes |
| --- | --- | --- |
| PERSON.person_id | id | Stringified in Beacon output |

info

| Source field | Target field | Notes |
| --- | --- | --- |
| PERSON.* | info.PERSON.OMOP_columns | Raw OMOP row is preserved |
| PERSON.birth_datetime | info.dateOfBirth | Timestamp form |
| convertPheno | info.convertPheno | Emitted outside --test mode |
| missing PERSON.gender_concept_id | none | The participant is skipped entirely in this direction |

interventionsOrProcedures

| Source field | Target field | Notes |
| --- | --- | --- |
| PROCEDURE_OCCURRENCE.procedure_concept_id | interventionsOrProcedures.procedureCode | Mapped through OHDSI concepts |
| PROCEDURE_OCCURRENCE.procedure_date + PERSON.birth_datetime | interventionsOrProcedures.ageAtProcedure | Derived age |
| PROCEDURE_OCCURRENCE.procedure_date | interventionsOrProcedures.dateOfProcedure | Direct |
| DEFAULT | interventionsOrProcedures.bodySite | Added for Beacon completeness |
| PROCEDURE_OCCURRENCE.* | interventionsOrProcedures._info.PROCEDURE_OCCURRENCE.OMOP_columns | Provenance payload |
| VISIT_OCCURRENCE context | interventionsOrProcedures._visit | Added when visit context is available |
| DEFAULT | interventionsOrProcedures.bodySite | Defaults to NCIT:C126101 / Not Available |

karyotypicSex

NA

measures

| Source field | Target field | Notes |
| --- | --- | --- |
| MEASUREMENT.measurement_concept_id | measures.assayCode | Mapped through OHDSI concepts |
| MEASUREMENT.measurement_date | measures.date | Direct |
| MEASUREMENT.value_as_concept_id | measures.measurementValue | Used for ontology-valued measurements |
| MEASUREMENT.value_as_number | measures.measurementValue.quantity.value | Used for numeric measurements |
| MEASUREMENT.unit_concept_id | measures.measurementValue.quantity.unit | Defaulted when absent |
| MEASUREMENT.operator_concept_id + numeric value + unit | measures.measurementValue.quantity.referenceRange | Derived range payload |
| MEASUREMENT.measurement_date + PERSON.birth_datetime | measures.observationMoment | Derived age |
| MEASUREMENT.measurement_date + PERSON.birth_datetime | measures.procedure.ageAtProcedure | Mirrors observationMoment |
| MEASUREMENT.measurement_date | measures.procedure.dateOfProcedure | Direct |
| MEASUREMENT.measurement_type_concept_id | measures.procedure.procedureCode | Mapped through OHDSI concepts |
| DEFAULT | measures.procedure.bodySite | Added for Beacon completeness |
| MEASUREMENT.* | measures._info.MEASUREMENT.OMOP_columns | Provenance payload |
| VISIT_OCCURRENCE context | measures._visit | Added when visit context is available |
| missing MEASUREMENT.unit_concept_id | measures.measurementValue.quantity.unit | Defaults to NCIT:C126101 / Not Available |
| MEASUREMENT.value_as_number = \N and no value_as_concept_id | measures.measurementValue.quantity | Defaults to quantity -1 with Not Available unit and -1/-1 reference range |
| missing MEASUREMENT.measurement_concept_id | none | The row is skipped rather than emitting a default measure |
| DEFAULT | measures.procedure.bodySite | Defaults to NCIT:C126101 / Not Available |
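How measurementValue is chosen between a numeric quantity, an ontology term, and the default can be sketched as below. The precedence (a numeric value wins when both are present) is an assumption, as are the function name and the `low`/`high`/`unit` keys used for the reference range. The operator-derived reference range for real numeric values is left out.

```python
from typing import Optional

NOT_AVAILABLE = {"id": "NCIT:C126101", "label": "Not Available"}
OMOP_NULL = "\\N"  # literal backslash-N, the null marker in OMOP exports


def measurement_value(
    row: dict,
    concept_term: Optional[dict] = None,
    unit_term: Optional[dict] = None,
) -> dict:
    """Pick the measurementValue payload for one MEASUREMENT row (sketch)."""
    raw = row.get("value_as_number", OMOP_NULL)
    if raw != OMOP_NULL:
        # Numeric measurement; the unit falls back to Not Available.
        unit = unit_term if unit_term is not None else dict(NOT_AVAILABLE)
        return {"quantity": {"value": float(raw), "unit": unit}}
    if concept_term is not None:
        # Ontology-valued measurement, already resolved from value_as_concept_id.
        return concept_term
    # Neither a number nor a concept: emit the documented default quantity.
    return {
        "quantity": {
            "value": -1,
            "unit": dict(NOT_AVAILABLE),
            "referenceRange": {"low": -1, "high": -1, "unit": dict(NOT_AVAILABLE)},
        }
    }
```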

pedigrees

NA

phenotypicFeatures

| Source field | Target field | Notes |
| --- | --- | --- |
| OBSERVATION.observation_concept_id | phenotypicFeatures.featureType | Only non-exposure observations are used |
| OBSERVATION.observation_date + PERSON.birth_datetime | phenotypicFeatures.onset | Derived age |
| OBSERVATION.* | phenotypicFeatures._info.OBSERVATION.OMOP_columns | Provenance payload |
| VISIT_OCCURRENCE context | phenotypicFeatures._visit | Added when visit context is available |

sex

| Source field | Target field | Notes |
| --- | --- | --- |
| PERSON.gender_concept_id | sex | Mapped through OHDSI concepts and then normalized to Beacon terms |
| missing PERSON.gender_concept_id | none | The participant is skipped before an individual is emitted |

treatments

| Source field | Target field | Notes |
| --- | --- | --- |
| DRUG_EXPOSURE.drug_concept_id | treatments.treatmentCode | Mapped through OHDSI concepts |
| DRUG_EXPOSURE.drug_exposure_start_date + PERSON.birth_datetime | treatments.ageAtOnset | Derived age |
| DEFAULT | treatments.routeOfAdministration | Placeholder |
| DEFAULT | treatments.doseIntervals | Initialized as an empty list |
| DRUG_EXPOSURE.* | treatments._info.DRUG_EXPOSURE.OMOP_columns | Provenance payload |
| VISIT_OCCURRENCE context | treatments._visit | Added when visit context is available |
| DEFAULT | treatments.routeOfAdministration | Defaults to NCIT:C126101 / Not Available |
| DEFAULT | treatments.doseIntervals | Defaults to an empty list |

Biosamples

biosamples

| Source field | Target field | Notes |
| --- | --- | --- |
| SPECIMEN.specimen_id | biosamples.id | Stringified in Beacon output |
| SPECIMEN.person_id | biosamples.individualId | Stringified in Beacon output |
| SPECIMEN.specimen_concept_id | biosamples.sampleOriginType | Mapped through OHDSI concepts; defaulted when absent |
| SPECIMEN.anatomic_site_concept_id | biosamples.sampleOriginDetail | Mapped through OHDSI concepts when present |
| SPECIMEN.specimen_type_concept_id | biosamples.obtentionProcedure.procedureCode | Mapped through OHDSI concepts when present |
| SPECIMEN.specimen_date | biosamples.collectionDate | Direct |
| SPECIMEN.specimen_date + PERSON.birth_datetime | biosamples.collectionMoment | Derived age |
| SPECIMEN.disease_status_concept_id | biosamples.histologicalDiagnosis | Mapped through OHDSI concepts when present |
| SPECIMEN.specimen_source_id / SPECIMEN.specimen_source_value | none | Kept only in provenance; not promoted to Beacon schema fields by default |
| DEFAULT | biosamples.biosampleStatus | Defaulted for Beacon completeness |
| convertPheno | biosamples.info.convertPheno | Emitted outside --test mode |
| SPECIMEN.* | biosamples.info.SPECIMEN.OMOP_columns | Provenance payload |
| missing SPECIMEN.specimen_concept_id | biosamples.sampleOriginType | Defaults to NCIT:C126101 / Not Available |
| DEFAULT | biosamples.biosampleStatus | Defaults to NCIT:C126101 / Not Available |
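To make the shape of the result concrete, the sketch below builds a minimal biosample from one SPECIMEN row using only the direct, stringified, default, and provenance rows of the table. Concept lookups, collectionMoment, and info.convertPheno are left out. The function name is hypothetical, and `origin_type` stands in for an ontology term already resolved from specimen_concept_id.

```python
from typing import Optional

NOT_AVAILABLE = {"id": "NCIT:C126101", "label": "Not Available"}


def specimen_to_biosample(row: dict, origin_type: Optional[dict] = None) -> dict:
    """Map one OMOP SPECIMEN row to a minimal Beacon biosample (sketch)."""
    return {
        "id": str(row["specimen_id"]),            # stringified
        "individualId": str(row["person_id"]),    # stringified
        "sampleOriginType": (
            origin_type if origin_type is not None else dict(NOT_AVAILABLE)
        ),
        "biosampleStatus": dict(NOT_AVAILABLE),   # always defaulted
        "collectionDate": row["specimen_date"],   # copied directly
        # The raw OMOP row is kept as provenance, including specimen_source_id
        # and specimen_source_value, which are not promoted to schema fields.
        "info": {"SPECIMEN": {"OMOP_columns": dict(row)}},
    }
```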

About exposures

The terms that classify an observation as an exposure are obtained from this CSV file. You can supply a different CSV file with the --exposures-file option.
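The split between exposures and phenotypicFeatures can be pictured as a membership test against the concepts listed in that CSV file. The sketch below assumes a header row with a `concept_id` column, which is a hypothetical layout; check the actual file before relying on it. Both function names are likewise illustrative.

```python
import csv
import io
from typing import Iterable, List, Set, Tuple


def load_exposure_concepts(handle: Iterable[str], column: str = "concept_id") -> Set[str]:
    """Collect the concept ids listed in an exposures CSV (column name assumed)."""
    return {row[column] for row in csv.DictReader(handle) if row.get(column)}


def split_observations(
    rows: List[dict], exposure_ids: Set[str]
) -> Tuple[List[dict], List[dict]]:
    """Partition OBSERVATION rows into (exposures, phenotypic features)."""
    exposures: List[dict] = []
    features: List[dict] = []
    for row in rows:
        is_exposure = str(row["observation_concept_id"]) in exposure_ids
        (exposures if is_exposure else features).append(row)
    return exposures, features


# Inline CSV used in place of a real file for demonstration.
ids = load_exposure_concepts(io.StringIO("concept_id,label\n111,Example exposure\n"))
print(split_observations(
    [{"observation_concept_id": 111}, {"observation_concept_id": 222}], ids
))
```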