Spectral-Library Additions Required To Remove Sensor Definition And Runtime Bridging From SIAC
Goal
spectral-library should become the source of truth for custom and canonical
sensor definitions used during spectral mapping. SIAC should stop owning:
- per-band RSRF lookup and realization
- custom sensor-definition assembly
- vnir/swir segment assignment for mapping runtimes
- temporary srf_root JSON emission
- temporary siac_root export formatting
After this change, SIAC should only pass source observations, target-band
requests, and a hyperspectral library into spectral-library, then consume the
mapping outputs.
Problem Statement
SIAC still contains a spectral-response and runtime-bridge layer that should belong upstream.
Current SIAC-owned seams:
- python/siac/adapters/rsrf.py loads per-band response definitions from rsrf and rebuilds SensorConfig objects with attached sampled curves.
- python/siac/adapters/satellite/sentinel2.py depends on that adapter and still falls back to SIAC-local built-in sensor metadata when upstream lookup fails.
- python/siac/algorithms/surface/spectral_mapping.py still:
  - realizes source and target band responses
  - assigns runtime band ids
  - assigns vnir/swir
  - writes custom rsrf_sensor_definition JSON to srf_root
  - exports a SIAC-formatted hyperspectral library to siac_root
This split is wrong because the mapping package already depends on rsrf and
already validates sensor definitions internally. SIAC should not need a second
bridge that reconstructs the same concepts.
Current Boundary
| Area | SIAC today | spectral-library today | Why the split is wrong |
|---|---|---|---|
| Canonical sensor metadata | Maintains local SensorConfig and SensorBand definitions, plus sensor-specific preprocessor attachment | Knows canonical mapping sensors and their mapping-specific band layout | Canonical sensor ownership is duplicated across packages |
| Canonical RSRF lookup | Calls rsrf directly through python/siac/adapters/rsrf.py | Also calls rsrf internally when building runtimes | SIAC should not be a second rsrf client for the same mapping path |
| Custom sensor definitions | Builds rsrf_sensor_definition payloads in python/siac/algorithms/surface/spectral_mapping.py | Requires those payloads, but only after SIAC serializes them to disk | SIAC should not have to write srf_root JSON |
| Target sensor registration | Stages both source and target definitions into the runtime build area so the target is present in sensor_schema.json | Precomputes only source_sensors during build and resolves target_sensor later from packaged schemas | SIAC should not need a second staging path just to make custom targets visible at runtime |
| Segment assignment | Assigns segment from band center wavelength and clips realized support before serialization | Validates segment support, but does not own the caller-facing segment policy | Segment ownership is split across packages |
| Hyperspectral library export | Writes SIAC tabular CSV files to a temporary siac_root | Requires siac_root on disk for runtime build | SIAC should be able to pass an in-memory library |
| Runtime identity and caching | Hashes custom sensor payloads and upstream versions to decide cache reuse | Builds prepared runtimes from on-disk inputs | The caller should not need to know the runtime cache artifact format |
Target Boundary
The desired ownership rule is:
- rsrf owns response-definition semantics and realization.
- spectral-library owns sensor-definition coercion, segment policy, runtime build inputs, and prepared-runtime caching.
- SIAC owns scene data, xarray integration, source reflectance arrays, target band requests, and output packaging.
In the target state, SIAC should be able to say:
- Here is the hyperspectral library.
- Here is the source sensor or source band set.
- Here is the target sensor or target band set.
- Map these reflectance rows.
Everything below that boundary should belong to spectral-library.
Required Additions In spectral-library
| Capability | What spectral-library must provide | What SIAC can delete |
|---|---|---|
| In-memory hyperspectral library input | Accept canonical wavelengths, spectra, and sample ids directly, without requiring caller-written siac_root tabular files | _export_hyperspectral_library_root, _ensure_siac_library_root, and SIAC-owned temporary library export handling |
| In-memory sensor input | Accept canonical sensor ids, rsrf_sensor_definition mappings, or neutral custom sensor inputs directly, without requiring caller-written srf_root JSON | _write_sensor_schema and SIAC-owned temporary sensor-schema emission |
| Custom band input model | Accept per-band input as one of: canonical rsrf band reference, sampled response_definition, or center/FWHM gaussian / band_spec input | SIAC-owned response-definition assembly and per-band bridge code |
| Segment policy ownership | Own the rule for vnir / swir assignment when the caller does not provide a segment. Default policy should be center-wavelength-based. | SIAC segment_for_band(...) as a mapping-runtime concern |
| Internal segment clipping | When rsrf realizes a gaussian / band_spec, clip the realized positive support to the declared segment before runtime validation | SIAC segmentize_curve(...) as a runtime-serialization concern |
| Stable sensor and band identity | Preserve caller band ids when provided, generate deterministic ids when omitted, preserve caller order, and return output columns in the same logical order | SIAC band-id rewriting, runtime band-id bookkeeping, and output remapping glue |
| Internal custom-sensor caching | Hash normalized custom sensor inputs and cache the prepared runtime internally | SIAC-owned runtime signature logic for custom sensor payloads |
| Direct runtime build API | Build a runtime from in-memory library + sensor inputs and return a reusable runtime object | Caller knowledge of siac_root, srf_root, and prepared-runtime file staging |
| Target-sensor availability in runtime | Accept target sensors during runtime build, or register them lazily through the same public input API used for source sensors | SIAC-owned target schema staging solely to satisfy runtime lookup |
| Clear error model | Distinguish invalid sensor input, missing canonical sensor, missing band, invalid segment policy, and realization failure | SIAC-side generic fallback paths and broad warning-based recovery |
| Optional prepared-runtime persistence | Allow callers to persist the runtime to disk when they want distribution or offline reuse, but do not require filesystem staging for normal use | SIAC staging directories as a mandatory integration detail |
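The internal custom-sensor caching row could be approached roughly as follows. This is a minimal sketch, assuming the normalized sensor input is a JSON-serializable mapping; the helper name sensor_cache_key and the payload fields are hypothetical and do not describe an existing spectral-library API.

```python
import hashlib
import json
from typing import Any, Mapping


def sensor_cache_key(normalized_sensor: Mapping[str, Any]) -> str:
    """Return a stable hex digest for a normalized custom sensor payload.

    Hypothetical helper: assumes the payload is JSON-serializable. Mapping
    keys are sorted, while list order (for example band order) is preserved,
    so reordering bands yields a different key.
    """
    canonical = json.dumps(normalized_sensor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two payloads that differ only in mapping key order share one cache key.
payload_a = {"sensor_id": "custom", "bands": [{"band_id": "b1", "fwhm_nm": 20.0}]}
payload_b = {"bands": [{"fwhm_nm": 20.0, "band_id": "b1"}], "sensor_id": "custom"}
assert sensor_cache_key(payload_a) == sensor_cache_key(payload_b)
```

Hashing the normalized payload, rather than the raw caller input, means equivalent inputs expressed in different forms can reuse the same prepared runtime.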
Required Integration Contract For SIAC
SIAC should consume a narrow upstream API with the following behavior:
Sensor input
spectral-library should accept any of:
- canonical sensor id string such as "sentinel-2a_msi"
- full rsrf_sensor_definition mapping
- neutral custom SensorInput object
The neutral custom SensorInput contract should support:
- sensor_id: str | None
- bands: Sequence[BandInput]
- band_id_policy: "preserve" | "deterministic"
- segment_policy: "explicit" | "center_wavelength"
If segment_policy="center_wavelength", the package shall assign:
- vnir for center wavelength < 1000 nm
- swir for center wavelength >= 1000 nm
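The center-wavelength rule above is small enough to state as code. The sketch below is illustrative only: the function name assign_segment and the positivity check are assumptions, while the 1000 nm boundary and the vnir / swir labels come from the rule itself.

```python
VNIR_SWIR_BOUNDARY_NM = 1000.0  # boundary stated in the rule above


def assign_segment(center_wavelength_nm: float) -> str:
    """Return "vnir" below the boundary and "swir" at or above it.

    Hypothetical helper; rejecting non-positive wavelengths is an assumption.
    """
    if center_wavelength_nm <= 0:
        raise ValueError("center wavelength must be positive")
    return "vnir" if center_wavelength_nm < VNIR_SWIR_BOUNDARY_NM else "swir"


assert assign_segment(865.0) == "vnir"
assert assign_segment(1000.0) == "swir"  # the boundary itself belongs to swir
```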
Each BandInput should support:
- band_id: str | None
- center_wavelength_nm: float | None
- fwhm_nm: float | None
- response_definition: Mapping[str, Any] | None
- rsrf_sensor_id: str | None
- rsrf_band_id: str | None
- rsrf_representation_variant: str | None
- segment: str | None
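One possible shape for this contract is a frozen dataclass with the fields listed above. The kind() method and its precedence order (rsrf reference, then sampled response definition, then center/FWHM) are assumptions added for illustration; the document does not specify how conflicting fields should be resolved.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class BandInput:
    """Sketch of the neutral per-band input; field names follow the list above."""

    band_id: Optional[str] = None
    center_wavelength_nm: Optional[float] = None
    fwhm_nm: Optional[float] = None
    response_definition: Optional[Mapping[str, Any]] = None
    rsrf_sensor_id: Optional[str] = None
    rsrf_band_id: Optional[str] = None
    rsrf_representation_variant: Optional[str] = None
    segment: Optional[str] = None

    def kind(self) -> str:
        """Classify which input form this band uses (precedence is assumed)."""
        if self.rsrf_sensor_id is not None and self.rsrf_band_id is not None:
            return "rsrf_reference"
        if self.response_definition is not None:
            return "response_definition"
        if self.center_wavelength_nm is not None and self.fwhm_nm is not None:
            return "gaussian"
        raise ValueError("band input matches none of the supported forms")


# A band described only by center wavelength and FWHM.
nir = BandInput(band_id="nir", center_wavelength_nm=865.0, fwhm_nm=20.0)
assert nir.kind() == "gaussian"
```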
Library input
spectral-library should accept a neutral in-memory hyperspectral library
contract:
- wavelengths_nm
- spectra
- sample_ids
- optional provenance metadata
The caller should not need to write SIAC-formatted CSV files first.
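As a rough illustration of that contract, the sketch below models the library as a dataclass that validates its own shape. The class layout, the provenance field name, and the requirement that wavelengths be strictly increasing are assumptions; plain Python sequences are used here only to keep the example dependency-free.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Sequence


@dataclass(frozen=True)
class HyperspectralLibraryInput:
    """Sketch of an in-memory library: one spectrum row per sample id."""

    wavelengths_nm: Sequence[float]
    spectra: Sequence[Sequence[float]]
    sample_ids: Sequence[str]
    provenance: Mapping[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if len(self.spectra) != len(self.sample_ids):
            raise ValueError("spectra and sample_ids must have the same length")
        n_wavelengths = len(self.wavelengths_nm)
        for row in self.spectra:
            if len(row) != n_wavelengths:
                raise ValueError("each spectrum must match wavelengths_nm in length")
        # Assumed constraint: the wavelength grid is strictly increasing.
        pairs = zip(self.wavelengths_nm, self.wavelengths_nm[1:])
        if any(later <= earlier for earlier, later in pairs):
            raise ValueError("wavelengths_nm must be strictly increasing")


library = HyperspectralLibraryInput(
    wavelengths_nm=[400.0, 410.0, 420.0],
    spectra=[[0.10, 0.12, 0.15], [0.30, 0.31, 0.33]],
    sample_ids=["sample-a", "sample-b"],
)
assert len(library.sample_ids) == 2
```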
Runtime build
The runtime build surface should look like:
from pathlib import Path

from spectral_library import build_mapping_runtime

runtime = build_mapping_runtime(
    library=library_input,
    source_sensors=[source_sensor_input],
    target_sensors=[target_sensor_input],
    cache_root=Path("..."),  # optional
    output_root=Path("..."),  # optional
)
Required behavior:
- If output_root is omitted, runtime build may use an internal cache or temp directory.
- If output_root is provided, the runtime is persisted in the existing prepared-runtime format.
- The returned runtime object exposes normalized sensor ids and band ids.
- Both source and target sensors are accepted through the same public input model. The caller does not need to stage target schemas separately just so the runtime can resolve them later.
Mapping
The mapping surface should accept a runtime object plus reflectance rows in the exact source-band order declared by the caller's normalized source sensor.
The returned target columns must preserve the normalized target-band order without requiring caller-side padding or canonical subset expansion.
Proposed Design
1. Add neutral input models to spectral-library
Add public types that are independent of SIAC internals:
- BandInput
- SensorInput
- HyperspectralLibraryInput
- PreparedRuntime
These types must be neutral. SIAC should not have to pass a SIAC-specific
SensorBand or SensorConfig object into spectral-library.
2. Add a sensor coercion layer above the current schema loader
Add a public coercion entry point:
from spectral_library import coerce_sensor_input

sensor = coerce_sensor_input(
    sensor_input,
    segment_policy="center_wavelength",
)
This layer should:
- resolve canonical sensor ids through rsrf
- accept already-normalized rsrf_sensor_definition payloads
- accept band-spec and gaussian inputs and normalize them through rsrf
rsrf - assign missing band ids deterministically
- assign missing segments from center wavelength when allowed
- clip realized support to segment bounds before validation
- return one normalized runtime sensor object
3. Move segment ownership into spectral-library
The current split is:
- SIAC assigns segment from center wavelength
- spectral-library validates that support stays inside the segment
This is the wrong boundary for FWHM-only inputs. The new rule should be:
- If the caller gives an explicit segment, use it.
- Otherwise assign segment from center wavelength.
- Realize any gaussian / band_spec through rsrf.
- Clip the realized curve to the segment range.
- Validate the clipped support and continue.
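The rule above can be sketched end to end for a center/FWHM band. Several parts of this example are assumptions: gaussian_response is a stand-in for rsrf realization (whose real API is not shown here), the outer segment limits of 400 nm and 2500 nm are placeholders, and the half-open clipping interval is one possible convention. Only the 1000 nm boundary comes from this document.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Placeholder segment ranges; only the 1000 nm boundary is taken from the text.
SEGMENT_RANGES_NM = {"vnir": (400.0, 1000.0), "swir": (1000.0, 2500.0)}


def gaussian_response(
    wavelengths_nm: Sequence[float], center_nm: float, fwhm_nm: float
) -> List[float]:
    """Stand-in for rsrf realization: unit-peak Gaussian sampled on a grid."""
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((w - center_nm) / sigma) ** 2) for w in wavelengths_nm]


def realize_clipped_band(
    wavelengths_nm: Sequence[float],
    center_nm: float,
    fwhm_nm: float,
    segment: Optional[str] = None,
) -> Tuple[str, List[float]]:
    # An explicit segment wins; otherwise assign from center wavelength.
    if segment is None:
        segment = "vnir" if center_nm < 1000.0 else "swir"
    if segment not in SEGMENT_RANGES_NM:
        raise ValueError(f"unknown segment: {segment}")
    low, high = SEGMENT_RANGES_NM[segment]
    # Realize the curve, then zero everything outside [low, high).
    response = gaussian_response(wavelengths_nm, center_nm, fwhm_nm)
    clipped = [r if low <= w < high else 0.0 for w, r in zip(wavelengths_nm, response)]
    # Validate that some positive support survives the clip.
    if not any(r > 0.0 for r in clipped):
        raise ValueError("band has no positive support inside its segment")
    return segment, clipped


grid = [900.0 + 10.0 * i for i in range(21)]  # 900 nm to 1100 nm
segment, clipped = realize_clipped_band(grid, center_nm=990.0, fwhm_nm=40.0)
assert segment == "vnir"
assert all(c == 0.0 for w, c in zip(grid, clipped) if w >= 1000.0)
```

A band centered at 990 nm with a 40 nm FWHM has a tail that crosses 1000 nm; clipping before validation is what lets such a band pass the segment-support check instead of being rejected.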
This keeps rsrf responsible for realization semantics and
spectral-library responsible for mapping-segment semantics.
It also removes the current split where SIAC chooses the segment while
spectral-library only enforces support bounds afterward.
4. Add in-memory runtime build
Add a public builder that accepts in-memory library and sensor inputs. The
existing prepared-runtime layout may stay unchanged internally, but the caller
must not need to know about siac_root and srf_root.
Implementation options:
- build internally in a temp/cache directory and return PreparedRuntime
- or add a true in-memory build path and optional persistence
Either option is acceptable for SIAC as long as the on-disk staging details are
fully owned by spectral-library.
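For the first option, the directory choice could be hidden behind a small internal helper. This is a sketch under assumptions: the name resolve_runtime_dir, the cache_key parameter, and the precedence of output_root over cache_root are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_runtime_dir(
    output_root: Optional[Path] = None,
    cache_root: Optional[Path] = None,
    cache_key: str = "runtime",
) -> Path:
    """Pick where a prepared runtime is staged, without involving the caller.

    Assumed precedence: explicit output_root, then a keyed subdirectory of
    cache_root, then a fresh temporary directory.
    """
    if output_root is not None:
        runtime_dir = Path(output_root)
    elif cache_root is not None:
        runtime_dir = Path(cache_root) / cache_key
    else:
        runtime_dir = Path(tempfile.mkdtemp(prefix="spectral-runtime-"))
    runtime_dir.mkdir(parents=True, exist_ok=True)
    return runtime_dir
```

With a helper like this, SIAC would pass at most an optional path and never construct siac_root or srf_root itself.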
5. Preserve caller order and identity
For custom sensor inputs, spectral-library must guarantee:
- band order is preserved
- output columns are returned in normalized target-band order
- stable ids are returned for both source and target sensors
This removes the need for SIAC to:
- generate hashed sensor ids itself
- rename the primary NIR band itself
- pad source subsets into canonical source layouts
- remap target outputs through local band-id bookkeeping
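A minimal sketch of the identity guarantee: caller ids are kept, missing ids are filled from band position, and order is never changed. The function name resolve_band_ids and the band_NN naming scheme are assumptions; the actual deterministic scheme is left open by this document.

```python
from typing import List, Optional, Sequence


def resolve_band_ids(band_ids: Sequence[Optional[str]]) -> List[str]:
    """Preserve caller ids and order; fill gaps with position-based ids."""
    resolved: List[str] = []
    seen = set()
    for index, band_id in enumerate(band_ids):
        # Hypothetical deterministic id: derived only from the band position.
        resolved_id = band_id if band_id is not None else f"band_{index:02d}"
        if resolved_id in seen:
            raise ValueError(f"duplicate band id: {resolved_id}")
        seen.add(resolved_id)
        resolved.append(resolved_id)
    return resolved


print(resolve_band_ids(["B04", None, "B8A"]))  # ['B04', 'band_01', 'B8A']
```

Because output columns follow the same resolved order, the caller can index results by position or by id without any local remapping table.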
6. Keep the low-level file builder as an implementation detail
The current build_mapping_library(siac_root, srf_root, output_root, ...)
surface can remain as a lower-level API for maintainers or offline tooling.
The new SIAC-facing API should sit above it and own:
- in-memory input coercion
- temporary staging if needed
- prepared-runtime cache reuse
- normalization and validation
Responsibilities After The Change
| Concern | Owner after change |
|---|---|
| Canonical response-definition semantics | rsrf |
| Canonical sensor-definition lookup | spectral-library using rsrf |
| Custom sensor-definition coercion | spectral-library |
| Segment assignment and segment clipping for mapping | spectral-library |
| Prepared-runtime caching and persistence | spectral-library |
| xarray flattening and restoration | SIAC |
| Observation extraction and TOA preprocessing | SIAC |
| Surface-prior orchestration | SIAC |
SIAC Code Expected To Be Deleted
Once the new spectral-library API exists and SIAC upgrades to it, SIAC should
be able to delete or drastically shrink:
- python/siac/adapters/rsrf.py
- the RSRF fallback branch in python/siac/adapters/satellite/sentinel2.py
- custom sensor-definition and runtime-staging logic in python/siac/algorithms/surface/spectral_mapping.py
The remaining SIAC spectral-mapping code should be limited to:
- source observation flattening
- target xarray reconstruction
- uncertainty propagation
- workflow integration
Rollout Assumptions
This design does not require SIAC backward compatibility with older
spectral-library integration paths.
Assumptions:
- SIAC may raise its minimum spectral-library version to the first release that includes the new API.
- SIAC may delete the old local bridge directly once that floor is adopted.
- The old low-level file builder may remain upstream, but SIAC does not need to keep dual-path support.
Verification Plan
The upstream change is complete when the following are true:
- [ ] SIAC does not import python/siac/adapters/rsrf.py during normal spectral mapping or preprocessor sensor-config resolution.
- [ ] SIAC does not write temporary srf_root JSON.
- [ ] SIAC does not write temporary siac_root CSV exports for mapping.
- [ ] SIAC does not assign mapping segments for runtime build.
- [ ] SIAC does not have to stage target schemas separately from source schemas.
- [ ] A Sentinel-2 preprocessor path can resolve its runtime sensor metadata through spectral-library alone.
- [ ] A custom target-band set defined only by center wavelength and FWHM can be passed directly to spectral-library.
- [ ] A mixed-source mapping path preserves source-band order and target output order without caller-side padding.
- [ ] Error cases distinguish:
  - missing canonical sensor
  - missing band within a canonical sensor
  - invalid custom band input
  - invalid segment assignment
  - failure to realize a response definition
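One way to make those error cases distinguishable is a small exception hierarchy with a shared base class. All class names below, and the lookup_band helper with its dictionary registry, are hypothetical illustrations rather than an existing spectral-library or rsrf API.

```python
from typing import Any, Mapping


class SensorInputError(Exception):
    """Hypothetical base class for all sensor-input failures."""


class MissingCanonicalSensorError(SensorInputError):
    """The requested canonical sensor id is unknown."""


class MissingBandError(SensorInputError):
    """The canonical sensor exists but lacks the requested band."""


class InvalidBandInputError(SensorInputError):
    """A custom band input is malformed or incomplete."""


class InvalidSegmentError(SensorInputError):
    """A segment is unknown or inconsistent with the band support."""


class ResponseRealizationError(SensorInputError):
    """A response definition could not be realized into a sampled curve."""


def lookup_band(
    registry: Mapping[str, Mapping[str, Any]], sensor_id: str, band_id: str
) -> Any:
    """Toy lookup showing how the first two cases stay distinct."""
    if sensor_id not in registry:
        raise MissingCanonicalSensorError(sensor_id)
    bands = registry[sensor_id]
    if band_id not in bands:
        raise MissingBandError(f"{sensor_id}:{band_id}")
    return bands[band_id]


# Placeholder registry; the band payload is illustrative only.
registry = {"sentinel-2a_msi": {"B04": "placeholder-response"}}
try:
    lookup_band(registry, "sentinel-2a_msi", "B99")
except MissingBandError as error:
    print(f"missing band: {error}")
```

A shared base class lets SIAC catch every sensor-input failure in one place while tests can still assert on the specific subclass.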
Non-Goals
This document does not propose:
- redesigning SIAC runtime payloads broadly
- changing solver or correction science
- moving scene preprocessing into spectral-library
- changing the prepared-runtime file format unless needed internally
Open Questions
- Should spectral-library expose a neutral SensorInput type publicly, or accept plain mappings and dataclasses interchangeably?
- Should in-memory runtime build be a true no-files path, or is internal temp staging acceptable as long as the caller does not manage it?
- Should spectral-library own preprocessor-ready sensor objects, or only mapping-ready sensor/runtime objects?
- Should deterministic id generation be caller-visible, or treated only as an internal cache detail?