REDCap

Experimental

REDCap conversion is still experimental. Please use it with caution.

REDCap stands for Research Electronic Data Capture. For details, see the REDCap documentation.

Role: Input
Accepted input: CSV export, dictionary, mapping file
Main output: BFF, PXF, OMOP-CDM
Notes: Keep exports as UTF-8 text files

REDCap as input

REDCap projects are inherently "free format": the project creator decides the identifiers for variables, data dictionaries, and other elements.

REDCap project creation user’s guide

“We always recommend reviewing your variable names with a statistician or whoever will be analyzing your data. This is especially important if this is the first time you are building a database.”

Because REDCap projects are so flexible, it is challenging to develop a solution that accommodates the full range of possibilities. Nonetheless, we have successfully converted data from REDCap project exports to both Beacon v2 and Phenopackets v2 formats using a mapping file. These conversions were carried out as part of the 3TR Project.

About REDCap longitudinal data

REDCap stores event information, but Beacon v2 Models currently lack a way to store longitudinal data. To address this, we store event data under the info property.

About REDCap export formats

REDCap provides various options for exporting data. We accept the option "All data (all records and fields)" including CSV and Microsoft Excel format, along with an accompanying data dictionary in CSV format. Exports in REDCap CDISC ODM (XML) format are discussed in the section on CDISC-ODM.

Keep REDCap text exports as UTF-8 text files

Do not open and resave REDCap CSV exports or CSV data dictionaries with spreadsheet software such as Excel before running convert-pheno. Doing so may alter the UTF-8 encoding and corrupt non-ASCII characters such as µ, accents, or degree symbols, which can then break dictionary values and ontology mappings.
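One way to confirm that an export has not been re-encoded is to validate its bytes before conversion. The sketch below is an assumption on our side and not part of convert-pheno: the helper names is_utf8 and check_exports are hypothetical, and it relies on the standard iconv utility being installed. It only detects invalid byte sequences; it cannot tell whether characters were already replaced during an earlier resave.

```shell
# is_utf8 FILE: succeed (exit 0) only if FILE decodes as valid UTF-8.
# Hypothetical helper; iconv fails when it meets an invalid byte sequence.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1
}

# check_exports FILE...: report every file and return non-zero
# if at least one of them is not valid UTF-8.
check_exports() {
  status=0
  for f in "$@"; do
    if is_utf8 "$f"; then
      echo "OK         $f"
    else
      echo "NOT UTF-8  $f"
      status=1
    fi
  done
  return "$status"
}
```

For example, check_exports redcap.csv dictionary.csv would flag either file if it fails the check, so the conversion can be stopped before any mapping is attempted.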

We'll need three files:

  1. REDCap export (CSV)
  2. REDCap data dictionary (CSV)
  3. Mapping file (YAML or JSON) (see tutorial)

Can CSV files be compressed?

Yes. Gzip-compressed files are also accepted as input.
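As a minimal sketch of preparing a compressed export, the helper below writes a .gz copy next to the original file and checks its integrity. The function name compress_export is hypothetical, and passing the resulting .gz path where the plain CSV would normally go is our assumption about how compressed input is supplied.

```shell
# compress_export FILE: write FILE.gz next to FILE, keep the original,
# and verify the archive before it is handed to convert-pheno.
compress_export() {
  gzip -c < "$1" > "$1.gz" && gzip -t "$1.gz"
}

# Hypothetical usage (assumes the export is named redcap.csv):
#   compress_export redcap.csv
#   convert-pheno -iredcap redcap.csv.gz --redcap-dictionary dictionary.csv --mapping-file mapping.yaml -obff individuals.json
```

Writing the archive with gzip -c leaves the uncompressed export untouched, which keeps the original available for auditing.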

convert-pheno -iredcap redcap.csv --redcap-dictionary dictionary.csv --mapping-file mapping.yaml -obff individuals.json

By default, the generated BFF keeps a copy of the source row under info.REDCap_columns so users can audit mapped values against the original REDCap export. Use --no-source-info to omit that raw source snapshot.

If you want to inspect ontology search results, you can also request a TSV audit file:

convert-pheno -iredcap redcap.csv --redcap-dictionary dictionary.csv --mapping-file mapping.yaml -obff individuals.json --search-audit-tsv search-audit.tsv

If you also want synthesized Beacon datasets and cohorts, keep -obff and use entity mode:

convert-pheno -iredcap redcap.csv --redcap-dictionary dictionary.csv --mapping-file mapping.yaml -obff --entities individuals datasets cohorts --out-dir out/

In this mode, the top-level beacon section of the mapping file can override dataset and cohort metadata such as id, name, description, version, or cohortType.

The search audit file written by --search-audit-tsv is tab-separated and currently includes columns such as:

  • row
  • original_term_label
  • converted_term_label
  • converted_term_id
  • ontology
  • match_status
  • match_source

This file is useful for reviewing how REDCap source values were resolved against the SQLite-backed ontologies before trusting the final conversion.
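A quick way to get an overview of the audit file is to count rows per match_status value. The following is a minimal sketch using awk, not a convert-pheno feature; the function name summarize_match_status is hypothetical, and the column is located by its header name because the exact column order is not guaranteed here.

```shell
# summarize_match_status FILE: print "<status><TAB><count>" for every
# distinct value found in the match_status column of a tab-separated file.
summarize_match_status() {
  awk -F '\t' '
    NR == 1 {
      # Find the match_status column from the header row.
      for (i = 1; i <= NF; i++) if ($i == "match_status") col = i
      if (!col) { print "match_status column not found" > "/dev/stderr"; exit 1 }
      next
    }
    { count[$col]++ }
    END { for (status in count) printf "%s\t%d\n", status, count[status] }
  ' "$1" | sort
}
```

Running summarize_match_status search-audit.tsv prints one line per status value with its row count, which makes it easier to decide which rows deserve a manual look.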