๐Ÿ  Introduction

ClarID-Tools

Welcome to the documentation for ClarID-Tools

๐Ÿ“ Description

ClarID-Tools is a toolkit for working with the ClarID specification, including the reference command-line implementation, preprocessing utilities, and example workflows. The objective is to standardize how subject and biosample metadata are transformed into compact, informative IDs for downstream integration, tracking, and exchange.


🔬 Key Features

  • 🧬 Biosample and Subject ID generation from structured metadata
  • 🩺 Support for clinical and experimental metadata, including species, tissue, assay, condition, and more
  • 📄 Human-readable and stub-formatted modes for verbose or compact identifiers, respectively
  • 🧪 Bulk and single-record encoding/decoding
  • ✅ Schema validation using JSON Schema and YAML codebooks
  • 📦 Command-line interface

Before You Start

Choose the command style that matches how you are using ClarID-Tools:

  • If you cloned this repository, run commands from the repository root with bin/clarid-tools.
  • If you installed ClarID-Tools system-wide or with CPAN, run clarid-tools.
  • You can omit --codebook unless you want to use a custom codebook. The packaged default codebook is used automatically.
  • The qrcode subcommand requires external tools:
      • Host install or git checkout: install qrencode and zbarimg (zbar-tools).
      • Docker image: these tools are already included.
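
The checks above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names clarid_cmd and missing_tools are hypothetical and not part of ClarID-Tools, and the script assumes a POSIX shell started from the directory you normally work in.

```shell
#!/bin/sh
# Sketch only: clarid_cmd and missing_tools are hypothetical helper names,
# not commands shipped with ClarID-Tools.

# Print the command style to use: the repository script when run from a
# checkout root, otherwise the installed clarid-tools found on PATH.
clarid_cmd() {
    if [ -x "bin/clarid-tools" ]; then
        echo "bin/clarid-tools"
    elif command -v clarid-tools >/dev/null 2>&1; then
        echo "clarid-tools"
    else
        echo "clarid-tools not found" >&2
        return 1
    fi
}

# Print the name of every listed tool that is not on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# On a host install or git checkout, the qrcode subcommand needs both tools.
missing=$(missing_tools qrencode zbarimg)
if [ -n "$missing" ]; then
    echo "Missing tools for the qrcode subcommand:" $missing >&2
fi
```

If the last check reports anything, install the listed tools before using the qrcode subcommand, or use the Docker image, which already includes them.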

Start Here

If you are new to ClarID-Tools, use this order:

  1. Read the Quickstart for copy-paste examples.
  2. Check the CLI Reference for all command options.
  3. Review the Specification if you need the exact identifier structure.
  4. See the Use Cases for real GDC-derived examples.

Choosing an ID Format

Use human format when the identifier will be read directly by people in reports, spreadsheets, filenames, or manual review workflows.

Use stub format when compactness matters more than readability, for example in QR codes, labels, or constrained downstream systems.

Both formats are generated from the same metadata and codebook, so you can encode and decode between structured metadata and IDs consistently.

What This Repository Implements

The ClarID paper defines the identifier concept and rationale. This repository provides the reference command-line implementation, the codebook/schema validation workflow, and end-to-end examples for subject and biosample identifiers.