CLI Reference

Command

bin/omop-csv-validator --ddl DDL.sql --input DATA.csv [options]

Required options

`--ddl`

Path to the PostgreSQL DDL file containing CREATE TABLE definitions.

`--input`

Path to the input CSV file to validate.

The CLI accepts one CSV file per run. It does not take multiple OMOP tables in a single invocation.

Validation is streamed row by row, so large files can be processed without loading the full input into memory first.

Optional options

`--sep`

This is a fallback option.

The validator normally infers the separator from the input file.

Use --sep only when you want to override detection explicitly or when the separator cannot be inferred reliably from the file, for example --sep $'\t' for tab-separated input.
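For automation clients, the one-file-per-run rule and the optional separator override can be sketched as a small Python helper that builds one argument list per CSV file. The helper name `build_command` and the file names are hypothetical; only the flags shown in this reference are used.

```python
from typing import List, Optional

VALIDATOR = "bin/omop-csv-validator"  # path as shown in the Command section


def build_command(ddl: str, csv_path: str, sep: Optional[str] = None) -> List[str]:
    """Build the argument list for validating a single CSV file."""
    cmd = [VALIDATOR, "--ddl", ddl, "--input", csv_path]
    if sep is not None:
        # Passing "\t" in an argument list matches --sep $'\t' in a shell.
        cmd += ["--sep", sep]
    return cmd


# One invocation per file, because the CLI takes a single CSV per run.
# File names are invented for illustration.
commands = [build_command("DDL.sql", name) for name in ["PERSON.csv", "DRUG_EXPOSURE.csv"]]

# Explicit override for a tab-separated file.
tab_command = build_command("DDL.sql", "PERSON.tsv", sep="\t")
```

Leaving `sep` unset keeps the default behavior, where the validator infers the separator itself.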

`--table`, `-t`

Explicitly choose the table schema instead of inferring it from the CSV filename.

`--save-schemas`

Write the generated schema set to a JSON file.

`--no-color`, `-nc`

Disable ANSI color output.

`--json`

Emit a machine-readable JSON result object instead of the default human-readable output.

This is the recommended mode for R or other automation clients.

The CLI still validates the file row by row in this mode and only accumulates failing rows for the final row_errors payload.

`--turbo`

Use the compiled fast-path validator instead of the default JSON::Validator engine.

This mode is optional and is mainly intended for large CSV files where validation throughput becomes a practical issue.

For normal-sized files, stay on the default engine unless you have a specific reason to switch.

The external behavior stays the same:

same exit codes
same JSON output shape
same row numbering
same report formats

For implementation details and benchmark numbers, see Implementation.

`--report-tsv`

Write a tab-separated validation report that spreadsheet users can open directly in Excel or LibreOffice.

The report keeps the original input columns and appends these validation columns:

_validation_row
_validation_status
_validation_error_count
_validation_messages

Use this when you want to sort, filter, or review failing rows in a spreadsheet.

The report is TSV, not native .xlsx, so it does not carry Excel cell colors by itself.

The report is written incrementally while the input is being validated, which makes this mode suitable for large files as well.

When this mode is enabled, the CLI keeps stdout compact on validation failure and does not print the full row-by-row error listing.
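Outside a spreadsheet, the TSV report can also be filtered in a script. The following is a minimal Python sketch that keeps only rows needing review. It assumes the status column holds the literal value OK for passing rows (the value named for the Excel report) and that _validation_row is an integer; the sample rows and the input columns `person_id` and `year_of_birth` are invented for illustration.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical report content: two input columns plus the four validation columns.
SAMPLE_REPORT = (
    "person_id\tyear_of_birth\t_validation_row\t_validation_status\t"
    "_validation_error_count\t_validation_messages\n"
    "1\t1980\t1\tOK\t0\t\n"
    "2\tabc\t2\tERROR\t1\texample message\n"
)


def rows_needing_review(report_text: str) -> List[Tuple[int, str]]:
    """Return (row number, messages) for every row whose status is not OK."""
    reader = csv.DictReader(io.StringIO(report_text), delimiter="\t")
    return [
        (int(row["_validation_row"]), row["_validation_messages"])
        for row in reader
        if row["_validation_status"] != "OK"  # assumption: passing rows are marked OK
    ]


failing = rows_needing_review(SAMPLE_REPORT)
```

For a real report file, the same reader can be fed from `open(path, newline="")` instead of an in-memory string, which avoids loading a large report at once.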

`--report-xlsx`

Write a native Excel workbook with two sheets:

Summary
Validation

The Validation sheet keeps the original input columns and appends the same _validation_* columns used in the TSV report.

This mode also adds spreadsheet-oriented presentation:

colored OK and ERROR status cells
conditional row coloring in the validation sheet
frozen header row and autofilter

Use this when your reviewers work primarily in Excel and want a ready-to-open workbook instead of plain text output.

When this mode is enabled, the CLI keeps stdout compact on validation failure and does not print the full row-by-row error listing.

The workbook rows are written as validation proceeds, so this mode does not require the full CSV to be held in memory first.

`--help`, `-h`

Show the built-in help text.

`--version`, `-V`

Show the CLI version.

Exit behavior

exits 0 when validation succeeds
exits 1 when validation errors are found
in --json mode, exits 2 for fatal setup errors such as a missing schema
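A calling script can branch on these exit codes directly. The sketch below maps the documented codes to labels and wraps `subprocess.run`. The function names and label strings are hypothetical, and any other exit code is treated as unexpected because this reference does not define it.

```python
import subprocess
from typing import List

# Labels are illustrative; the numeric codes are the documented ones.
OUTCOMES = {
    0: "valid",
    1: "validation_errors",
    2: "fatal_setup_error",  # documented for --json mode
}


def classify_exit_code(code: int) -> str:
    """Translate a validator exit code into a short outcome label."""
    return OUTCOMES.get(code, "unexpected")


def run_and_classify(cmd: List[str]) -> str:
    """Run the validator command and classify its exit code."""
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return classify_exit_code(completed.returncode)
```

`run_and_classify` expects a full argument list, for example one that starts with `bin/omop-csv-validator` and includes `--json`, so that fatal setup errors are reported with exit code 2.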

JSON output shape

When --json is enabled, the CLI writes one top-level result object with these fields:

input_file
schema_name
ok
error_count
row_errors

For fatal setup errors, the object also includes:

fatal_error

Each row_errors entry includes:

row
messages
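As an illustration of how a client might consume this object, here is a minimal Python sketch. The field names come from the lists above; the value types (a boolean `ok`, an integer `row`, a list of strings for `messages`) and all sample values are assumptions, and `collect_row_errors` is a hypothetical helper.

```python
import json
from typing import Dict, List

# Hypothetical payload: keys follow the reference, values are invented.
SAMPLE_RESULT = """
{
  "input_file": "PERSON.csv",
  "schema_name": "PERSON",
  "ok": false,
  "error_count": 2,
  "row_errors": [
    {"row": 3, "messages": ["example message A"]},
    {"row": 7, "messages": ["example message B"]}
  ]
}
"""


def collect_row_errors(payload: str) -> Dict[int, List[str]]:
    """Parse the result object and index the messages by row number."""
    result = json.loads(payload)
    if "fatal_error" in result:
        # Present only for fatal setup errors, such as a missing schema.
        raise RuntimeError(f"validator setup failed: {result['fatal_error']}")
    return {entry["row"]: entry["messages"] for entry in result["row_errors"]}


errors_by_row = collect_row_errors(SAMPLE_RESULT)
```

Checking for `fatal_error` before reading `row_errors` keeps setup problems separate from ordinary validation failures.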

JSON contract stability

Treat --json as the supported automation interface for this tool.

That means the following are intended to remain stable for R, Python, and workflow clients:

the top-level JSON object shape
the documented keys
the row-level row and messages fields
exit code 0 for success
exit code 1 for validation failures
exit code 2 for fatal setup errors

Human-readable output is intended for interactive use and may change more freely than the JSON mode.
