CLI Reference¶

The public CLI entry point is:

spectral-library

Repository code behind the CLI is now split into:

spectral_library.mapping
spectral_library.distribution
spectral_library.sources
spectral_library.normalization

Only the commands below are part of the public CLI contract.

Global Options¶

Option	Meaning
`--version`	print the package version and exit
`--json-errors`	emit machine-readable JSON errors to `stderr`
`--json-logs`	emit structured JSON log events to `stderr`

Public Commands¶

`build-mapping-library`¶

Build the prepared runtime used by mapping.

Required arguments:

Flag	Meaning
`--siac-root`	SIAC-style export root containing metadata and normalized spectra tables
`--source-sensor`	one or more source sensor ids to precompute
`--output-root`	output directory for the prepared runtime

Optional arguments:

Flag	Meaning
`--dtype`	floating-point output dtype, default `float32`
`--srf-root`	when provided, load extra local sensor-definition JSON documents alongside built-in `rsrf` sensors; each file must be a valid `rsrf_sensor_definition`, with mapping segment metadata in `bands[].extensions.spectral_library.segment`
`--knn-index-backend`	optionally persist ANN indexes for `faiss`, `pynndescent`, or `scann` during build

When you rely on built-in rsrf sensors instead of local JSON definitions, install rsrf>=0.3.1. On first use, rsrf can bootstrap the matching canonical runtime data into its local cache automatically. Set RSRF_ROOT only to force a specific checkout or mirrored runtime root, and preseed RSRF_CACHE_DIR for offline environments.

`download-prepared-library`¶

Download a pre-built prepared runtime from a GitHub Release or a direct URL.

Required arguments:

Flag	Meaning
`--output-root`	local directory to extract the runtime into

Optional arguments:

Flag	Meaning
`--url`	direct URL to a `.tar.gz` runtime archive (skips GitHub Release lookup)
`--tag`	GitHub Release tag to download from (e.g. `v0.6.3`); defaults to latest
`--sha256`	expected SHA-256 hex digest for the archive
`--no-verify`	skip runtime validation after extraction

`validate-prepared-library`¶

Validate a prepared runtime root.

Required arguments:

Flag	Meaning
`--prepared-root`	runtime root created by `build-mapping-library`

Optional arguments:

Flag	Meaning
`--no-verify-checksums`	skip file hashing while still requiring the full runtime layout

`map-reflectance`¶

Map one sample from a source sensor to a target sensor or to a reconstructed spectral output.

Required arguments:

Flag	Meaning
`--source-sensor`	source sensor id
`--input`	single-sample CSV
`--output-mode`	one of `target_sensor`, `vnir_spectrum`, `swir_spectrum`, `full_spectrum`
`--output`	output CSV path

Conditionally required:

Flag	Meaning
`--target-sensor`	required when `--output-mode target_sensor`

Optional arguments:

Flag	Meaning
`--prepared-root`	override the default published runtime with a custom prepared runtime root
`--k`	nearest-neighbor count, default `10`
`--min-valid-bands`	minimum valid source bands per segment, default `1`
`--neighbor-estimator`	`mean`, `distance_weighted_mean`, or `simplex_mixture`, default `mean`
`--knn-backend`	shortlist search backend: `numpy`, `scipy_ckdtree`, `faiss`, `pynndescent`, or `scann`; default `numpy`
`--knn-eps`	approximation slack / search-accuracy knob for supported ANN backends; `0` keeps `scipy_ckdtree` exact
`--exclude-row-id`	exclude one or more prepared row ids from neighbor selection
`--exclude-sample-name`	exclude one or more prepared `sample_name` values from neighbor selection
`--diagnostics-output`	optional JSON summary path
`--neighbor-review-output`	optional CSV path with one row per segment-neighbor review record

Single-sample input layouts:

Layout	Columns
long	`band_id,reflectance[,valid]`
wide	one row with one source-band column per band id, optional `valid_<band_id>` columns

Single-sample outputs:

Output mode	CSV shape
`target_sensor`	`band_id,segment,reflectance`
spectral outputs	`wavelength_nm,reflectance`

If --neighbor-review-output is set, the command also writes a CSV review table with one row per segment-neighbor pair. It includes the sample id, segment status, neighbor rank, distance, estimator weight, query band values, and the selected neighbor's source-band values. The review rows also record knn_backend and knn_eps.

When --prepared-root is omitted, the CLI resolves the package-matched published runtime into the user cache and reuses it on later runs.

The JSON diagnostics also include a heuristic confidence_score at the overall mapping level and per segment. It is derived from valid-band coverage, neighbor distances, estimator weight concentration, and source-space fit RMSE. Treat it as a ranking aid, not a calibrated probability. The diagnostics also include confidence_policy with the current routing thresholds:

high / accept: >= 0.85
medium / manual_review: >= 0.60 and < 0.85
low / reject: < 0.60

`map-reflectance-batch`¶

Map many samples from one CSV.

Required arguments:

Flag	Meaning
`--source-sensor`	source sensor id
`--input`	batch CSV
`--output-mode`	public output mode
`--output`	output CSV path

Conditionally required:

Flag	Meaning
`--target-sensor`	required when `--output-mode target_sensor`

Optional arguments:

Flag	Meaning
`--prepared-root`	override the default published runtime with a custom prepared runtime root
`--k`	nearest-neighbor count
`--min-valid-bands`	minimum valid source bands per segment
`--neighbor-estimator`	`mean`, `distance_weighted_mean`, or `simplex_mixture`
`--knn-backend`	shortlist search backend: `numpy`, `scipy_ckdtree`, `faiss`, `pynndescent`, or `scann`
`--knn-eps`	approximation slack / search-accuracy knob for supported ANN backends
`--exclude-row-id`	exclude one or more prepared row ids for every batch sample
`--exclude-sample-name`	exclude one or more prepared `sample_name` values for every batch sample
`--self-exclude-sample-id`	exclude rows whose prepared `sample_name` matches each batch `sample_id`
`--diagnostics-output`	optional JSON summary path
`--neighbor-review-output`	optional CSV path with one row per sample-segment-neighbor review record

Batch input layouts:

Layout	Columns
long	`sample_id,band_id,reflectance[,valid][,exclude_row_id]`
wide	`sample_id`, optional `exclude_row_id`, one column per source band, and optional `valid_<band_id>` columns

If exclude_row_id is present in the batch CSV, the command excludes that one exact prepared row for the corresponding sample before retrieval. This is the recommended way to run held-out full-library examples without removing other rows that share the same class or source.

Batch outputs:

Output mode	CSV shape
`target_sensor`	one row per sample, `sample_id` plus one column per target band
spectral outputs	one row per sample, `sample_id` plus `nm_<wavelength>` columns

If --neighbor-review-output is set, the command writes a long-form CSV with one row per sample, segment, and retained neighbor. This is the easiest public way to audit which candidates were shortlisted and how the estimator reweighted them.

When --prepared-root is omitted, the CLI resolves the package-matched published runtime into the user cache and reuses it on later runs.

`benchmark-mapping`¶

Benchmark retrieval against the regression baseline.

Required arguments:

Flag	Meaning
`--source-sensor`	source sensor id
`--target-sensor`	target sensor id
`--report`	JSON output path

Optional arguments:

Flag	Meaning
`--prepared-root`	override the default published runtime with a custom prepared runtime root
`--k`	nearest-neighbor count, default `10`
`--test-fraction`	held-out fraction, default `0.2`
`--max-test-rows`	optional cap on held-out rows, default `0` in the CLI and `None` in Python
`--random-seed`	split seed, default `0`
`--neighbor-estimator`	retrieval estimator to benchmark: `mean`, `distance_weighted_mean`, or `simplex_mixture`; default `mean`
`--knn-backend`	shortlist search backend: `numpy`, `scipy_ckdtree`, `faiss`, `pynndescent`, or `scann`; default `numpy`
`--knn-eps`	approximation slack / search-accuracy knob for supported ANN backends

When --prepared-root is omitted, the CLI resolves the package-matched published runtime into the user cache and reuses it on later runs.

Optional backend install extras:

python3 -m pip install "spectral-library[knn]"
python3 -m pip install "spectral-library[knn-faiss]"
python3 -m pip install "spectral-library[knn-pynndescent]"
python3 -m pip install "spectral-library[knn-scann]"

Benchmark Automation¶

For larger prepared runtimes, the repository also ships a multi-scenario benchmark runner:

PYTHONPATH=src python scripts/run_full_library_benchmarks.py \
  --prepared-root path/to/prepared_runtime \
  --neighbor-estimator simplex_mixture \
  --knn-backend numpy \
  --k 10 \
  --max-test-rows 512 \
  --output-root build/full-library-benchmarks \
  --thresholds benchmarks/default_thresholds.json \
  --fail-on-thresholds

It writes per-scenario JSON reports under runs/, plus aggregate summary.csv, summary.json, and reports.json.

For non-numpy backends, the runner also records a same-scenario comparison to the exact numpy baseline and can fail when backend drift exceeds the baseline_deltas thresholds in benchmarks/default_thresholds.json.

Exit Codes¶

Code	Meaning
`0`	Success
`1`	General failure (invalid input, runtime error, or validation failure)
`2`	Argument parsing error (missing required flags, unknown options)

All non-zero exits produce a human-readable message on stderr. Use --json-errors for machine-readable error envelopes.

Error Codes¶

Each structured error carries an error_code field for programmatic handling:

`error_code`	Raised by	Meaning
`invalid_sensor_schema`	`build-mapping-library`	Sensor schema data from `rsrf` or local JSON is malformed, unsupported, or violates band rules
`prepare_failed`	`build-mapping-library`	Runtime build failed (missing SIAC files, array errors)
`invalid_prepared_library`	`validate-prepared-library`, `map-reflectance`	Runtime layout or checksums are invalid
`prepared_library_incompatible`	`validate-prepared-library`, `map-reflectance`	Runtime schema version is not supported by this package version
`invalid_mapping_input`	`map-reflectance`, `map-reflectance-batch`	Input reflectance is invalid, sensor not found, or no valid segments

Error Behavior¶

Public commands provide:

non-zero exit codes on failure
human-readable errors by default
JSON error envelopes with --json-errors
newline-delimited JSON log events with --json-logs

The error envelope includes:

error_code
message
command
context

The JSON log envelope includes:

command
event
level
timestamp
context
elapsed_ms on completion and failure events

Event values:

command_started
command_completed
command_failed

Level values:

info for start and completion
error for failures

Internal Commands¶

Source acquisition and library-packaging workflows are intentionally separated from the public mapping CLI. Maintainers should use:

spectral-library-internal --help

That maintainer-only entrypoint exposes the retained internal commands:

plan-matrix
fetch-source
fetch-batch
assemble-database
tidy-results
normalize-sources
plot-quality
filter-coverage
build-library-package

Legacy direct invocation through spectral-library <internal-command> still works for repository automation, but those commands are hidden from the public help and are not part of the stable public mapping contract. See Internal Build Pipeline for the maintained workflow.

CLI Reference¶

Global Options¶

Public Commands¶

build-mapping-library¶

download-prepared-library¶

validate-prepared-library¶

map-reflectance¶

map-reflectance-batch¶

benchmark-mapping¶

Benchmark Automation¶

Exit Codes¶

Error Codes¶

Error Behavior¶

Related Docs¶

Internal Commands¶

`build-mapping-library`¶

`download-prepared-library`¶

`validate-prepared-library`¶

`map-reflectance`¶

`map-reflectance-batch`¶

`benchmark-mapping`¶