Utilities
reorder-csv.pl
utils/reorder-csv.pl is a supporting utility for reordering CSV columns to match the order defined in a DDL file.
When you need it
You do not need to reorder columns before using omop-csv-validator.
The validator matches data by header name, so validation is header-based rather than position-based.
reorder-csv.pl is useful when a downstream step expects columns in canonical OMOP table order, for example:
- a PostgreSQL
COPYor\copyworkflow that loads fields positionally - a SQLite
.importworkflow that loads fields positionally - a loader that assumes table-order columns
- an exchange or normalization step where you want a consistent DDL-derived layout
It is useful when:
- your source CSV contains the right fields but not in the expected order
- you want a DDL-driven reorder step before import or in other tooling where column position matters
Typical workflow
In most cases the flow is:
- validate the exported file as-is
- reorder it only if the next tool expects canonical column order
- import or hand off the reordered file
If your PostgreSQL load command already provides an explicit column list, you may not need reordering at all.
Example
utils/reorder-csv.pl \
--ddl ddl/OMOPCDM_postgresql_5.4_ddl.sql \
--input PERSON.csv \
--output PERSON.reordered.csv
For a SQLite DDL:
utils/reorder-csv.pl \
--ddl ddl/OMOPCDM_sqlite_5.4_ddl.sql \
--ddl-type sqlite \
--input PERSON.csv \
--output PERSON.reordered.csv
Quick SQLite debugging workflow
If you want a lightweight local check before loading into PostgreSQL, SQLite can be useful for quick debugging.
The important detail is that SQLite .import is positional. That is why reordering can matter here even though validation itself does not require ordered columns.
Example:
bin/omop-csv-validator \
--ddl path/to/OMOPCDM_sqlite_5.4_ddl.sql \
--input PERSON.csv
utils/reorder-csv.pl \
--ddl path/to/OMOPCDM_sqlite_5.4_ddl.sql \
--ddl-type sqlite \
--input PERSON.csv \
--output PERSON.reordered.csv
sqlite3 omop_debug.db <<'EOF'
.read path/to/OMOPCDM_sqlite_5.4_ddl.sql
.mode csv
.separator "\t"
.import '| tail -n +2 PERSON.reordered.csv' person
SELECT COUNT(*) FROM person;
EOF
This is useful for:
- quick local sanity checks
- debugging transformed exports
- checking whether a reordered file can be loaded structurally before moving to PostgreSQL
It is not a substitute for a full PostgreSQL load when you need production-faithful behavior.
Notes
- this script is a helper, not the main validation entry point
- it is mainly for downstream load preparation, not for core validation
- it supports
--tableif the filename does not map cleanly to the target table - it defaults
--ddl-typetopostgresql - it inserts
\Nfor missing columns in the reordered output - it drops extra input columns that are not present in the target DDL order
- it can infer the separator, but
--sepis still available as an override