Optimization Guide
ARC-SCOPE optimisation calibrates selected SCOPE parameters against observed
targets after ARC has retrieved the canopy state. ARC outputs such as LAI,
Cab, and Cw remain fixed. The optimiser changes only the parameters you
choose in optim_config, runs SCOPE repeatedly, and keeps the parameter values
that reduce the configured loss.
Use optimisation when you have observations that correspond to SCOPE outputs:
- SIF observations for the fluorescence workflow.
- Thermal radiance or land-surface-temperature-like observations for the thermal workflow.
- SIF, thermal, or flux observations for the coupled energy-balance workflow.
What the Runner Does
When PipelineConfig.optimize=True, optim_config["enabled"] = True, or a
nested runner payload contains optim.enabled, ArcScopePipeline.run() changes
from a single forward simulation to this loop:
- Run ARC, bridge, weather, geometry, and SCOPE input preparation as usual.
- Load observations from optim_config["observations"] or observations_path.
- Choose parameters from optim_config["parameters"], parameter_preset, or the workflow default.
- Inject trial parameter values into the prepared SCOPE input dataset.
- Run SCOPE and compare selected output variables with observations.
- Minimise the scalar loss with scipy.
- Inject the optimised parameters and run one final SCOPE simulation.
- Return the optimised input, output, and OptimizationResult.
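The loop above can be sketched end to end with a toy forward model. This is only an illustration under stated assumptions: run_proxy_scope and DRIVER are hypothetical stand-ins for a real SCOPE run, and a golden-section search on log(fqe) is swapped in for the scipy minimiser so the example needs nothing beyond the standard library.

```python
import math

# Hypothetical proxy forward model: SIF scales linearly with fqe.
# A real run would inject fqe into the SCOPE input dataset and run SCOPE.
DRIVER = [120.0, 150.0, 180.0, 160.0]  # assumed per-time-step scaling


def run_proxy_scope(fqe):
    """Return a proxy SIF series for one trial value of fqe."""
    return [fqe * d for d in DRIVER]


def mse(predicted, observed):
    """Mean squared error between two equally long series."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def fit_fqe(observed, lower=0.001, upper=0.1, tol=1e-10, max_iter=200):
    """Golden-section search on log(fqe); a stand-in for scipy's minimiser."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(lower), math.log(upper)

    def loss(u):
        # Inject the trial value, run the forward model, compare with observations.
        return mse(run_proxy_scope(math.exp(u)), observed)

    for _ in range(max_iter):
        if b - a < tol:
            break
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if loss(c) < loss(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return math.exp((a + b) / 2.0)


observed = run_proxy_scope(0.018)                    # synthetic "observations"
initial_loss = mse(run_proxy_scope(0.01), observed)  # loss at the initial value
fqe_fit = fit_fqe(observed)                          # minimise the scalar loss
final_output = run_proxy_scope(fqe_fit)              # final run with the fitted value
optimized_loss = mse(final_output, observed)
print(fqe_fit, initial_loss, optimized_loss)
```

The structure mirrors the runner's steps (inject, run, compare, minimise, final run); the real pipeline replaces the proxy with SCOPE and the search with scipy.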
Target names must match SCOPE output names
target_variables are SCOPE output variable names, not generic labels.
Run a baseline simulation first and inspect result.scope_output_ds.data_vars
if you are unsure which names your installed scope-rtm version emits.
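A small pre-flight check can catch mismatched names before any optimisation starts. The sketch below is illustrative: missing_targets is a hypothetical helper, not part of ARC-SCOPE, and the list of available names is an assumed stand-in for the names in result.scope_output_ds.data_vars from a baseline run.

```python
from typing import Iterable, List


def missing_targets(target_variables: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the requested target names that the SCOPE output does not contain."""
    available_names = set(available)
    return [name for name in target_variables if name not in available_names]


# Stand-in for the names in result.scope_output_ds.data_vars (assumed values).
available = ["F685", "F740", "Loutt"]

# "LST" is a generic label here, not an emitted variable, so it is reported.
print(missing_targets(["F740", "LST"], available))
```

With a real baseline result, passing result.scope_output_ds.data_vars as the second argument should work the same way, since iterating an xarray data_vars mapping yields variable names.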
Shared Configuration Pattern
Every optimisation run needs observations and target variables:
from arc_scope.pipeline import ArcScopePipeline, PipelineConfig
config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="fluorescence",
    scope_root_path="./upstream/SCOPE",
    optim_config={
        "enabled": True,
        "observations_path": "observed_sif.nc",
        "target_variables": ["F740"],
    },
)
result = ArcScopePipeline(config).run()
summary = result.optimization_result
observations_path should point to a NetCDF dataset with variables matching
target_variables. You can also pass an in-memory xarray.Dataset:
import xarray as xr
observations = xr.Dataset(
    {"F740": ("time", observed_f740_values)},
    coords={"time": observation_times},
)
config.optim_config["observations"] = observations
ARC-SCOPE compares finite values after flattening each target variable. The current objective aligns by common array length, so you should prepare observations on the same time/space subset as the model output.
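To see why the subset matters, here is a minimal sketch of positional alignment. It assumes the comparison truncates both flattened series to the shorter length and then drops any pair with a non-finite value; the exact order of those steps inside ARC-SCOPE is an assumption, and align_finite and mse are hypothetical names.

```python
import math
from typing import Iterable, List, Tuple


def align_finite(
    predicted: Iterable[float], observed: Iterable[float]
) -> Tuple[List[float], List[float]]:
    """Truncate both flattened series to the common length and keep finite pairs."""
    pairs = [
        (p, o)
        for p, o in zip(predicted, observed)  # zip stops at the shorter series
        if math.isfinite(p) and math.isfinite(o)
    ]
    return [p for p, _ in pairs], [o for _, o in pairs]


def mse(predicted: Iterable[float], observed: Iterable[float]) -> float:
    """Mean squared error over the aligned, finite pairs."""
    pred, obs = align_finite(predicted, observed)
    if not pred:
        raise ValueError("no finite overlapping values to compare")
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)


model = [1.0, 2.0, 3.0, 4.0]
obs = [1.5, float("nan"), 2.0]  # one gap, and one step shorter than the model
print(mse(model, obs))
```

Pairing is by position, not by timestamp. If the observations skip a time step that the model output contains, every later pair is shifted, which is why both sides should cover the same time/space subset.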
Choosing Parameters
If parameters is omitted, ARC-SCOPE uses a workflow default:
| Workflow | Default parameters | Typical observation target |
|---|---|---|
| fluorescence | fqe | SIF, e.g. F740 or F685 |
| thermal | rss, rbs | thermal output, e.g. Loutt or Lot_ |
| energy-balance | fqe, rss, rbs, Cd, rwc | mixed SIF, thermal, and flux outputs |
You can override the defaults with explicit parameter specs:
"parameters": [
    {
        "name": "fqe",
        "initial": 0.01,
        "lower": 0.001,
        "upper": 0.1,
        "transform": "log",
    },
    {
        "name": "rss",
        "initial": 500.0,
        "lower": 10.0,
        "upper": 5000.0,
        "transform": "log",
        "optimize": False,
    },
]
transform controls how scipy sees the parameter:
- identity: optimise directly, clipped to bounds.
- log: optimise positive parameters such as fqe, rss, rbs, and Cd.
- logit: optimise values bounded between lower and upper, such as rwc.
optimize=False keeps a parameter fixed while still injecting it into the SCOPE
input dataset.
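The three transforms can be illustrated with a pair of mapping functions. This is a sketch of the usual definitions, not ARC-SCOPE's code: to_unbounded and to_physical are hypothetical names, the rwc bounds of 0.0 and 1.0 are assumed for the example, and only identity is clipped here because the source states clipping only for that transform.

```python
import math


def to_unbounded(value: float, lower: float, upper: float, transform: str) -> float:
    """Map a physical parameter value to the space the optimiser works in."""
    if transform == "identity":
        return value
    if transform == "log":
        return math.log(value)  # requires value > 0
    if transform == "logit":
        frac = (value - lower) / (upper - lower)  # position within the bounds
        return math.log(frac / (1.0 - frac))
    raise ValueError(f"unknown transform: {transform}")


def to_physical(u: float, lower: float, upper: float, transform: str) -> float:
    """Map an optimiser-space value back to a physical parameter value."""
    if transform == "identity":
        return min(max(u, lower), upper)  # clipped to bounds
    if transform == "log":
        return math.exp(u)  # always positive
    if transform == "logit":
        return lower + (upper - lower) / (1.0 + math.exp(-u))  # stays inside bounds
    raise ValueError(f"unknown transform: {transform}")


# Round trips for two parameters from the default presets.
u_fqe = to_unbounded(0.01, 0.001, 0.1, "log")
u_rwc = to_unbounded(0.5, 0.0, 1.0, "logit")  # rwc bounds assumed for illustration
print(to_physical(u_fqe, 0.001, 0.1, "log"))
print(to_physical(u_rwc, 0.0, 1.0, "logit"))
```

The log transform lets the optimiser take equal-sized steps across orders of magnitude, which suits parameters such as rss whose bounds span 10.0 to 5000.0.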
Generated Example Artifacts
The figures and numeric outputs below are generated by a reproducible script:
PYTHONPATH=src python -m arc_scope.experiments.optimization_examples \
--output-dir docs/assets/optimization
The same generator is exposed through the installed console script
arcope-optimization-examples, the local example
python examples/07_optimization_examples.py, and the Pixi task
pixi run optimization-examples.
The generator needs matplotlib for SVG output; install arcope[docs] or use
the repository Pixi environment.
The script uses deterministic proxy SCOPE runners so the documentation can be
rebuilt without live ARC retrieval, ERA5 credentials, or scope-rtm runtime
assets. It still exercises ARC-SCOPE's real optimisation path:
run_pipeline_optimization(), ScopeObjective, ParameterSet, and
ScipyOptimizer.
Generated files:
Proxy examples vs scientific production runs
These generated examples are real optimisation outputs from runnable code,
but their forward model is a deterministic proxy. For scientific outputs,
use the same configuration pattern with ArcScopePipeline.run() and real
SCOPE target variables from your installed scope-rtm workflow.
Example 1: Fit SIF by Tuning fqe
Use this for experiments where SIF is the scientific target. The fluorescence
workflow defaults to the SIF preset, so the explicit parameters block below is
optional but recommended for auditability.
config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="fluorescence",
    scope_root_path="./upstream/SCOPE",
    save_scope_netcdf=True,
    optim_config={
        "enabled": True,
        "observations_path": "observed_sif.nc",
        "target_variables": ["F740"],
        "parameters": [
            {
                "name": "fqe",
                "initial": 0.01,
                "lower": 0.001,
                "upper": 0.1,
                "transform": "log",
            }
        ],
        "optimizer": {"type": "scipy", "method": "L-BFGS-B"},
        "max_iter": 50,
        "tol": 1e-6,
    },
)
result = ArcScopePipeline(config).run()
fit = result.optimization_result
print(fit.parameters_initial)
print(fit.parameters_optimized)
print(fit.initial_loss, fit.optimized_loss)
Generated output from docs/assets/optimization/summary.json:
{'fqe': 0.01}
{'fqe': 0.0180}
10.423289 0.000047
Interpretation:
- fqe increased from 0.01 to 0.0180, so the initial fluorescence yield was too low for the observed SIF.
- optimized_loss < initial_loss, so the fitted SCOPE output is closer to the observations under the configured MSE objective.
- The final result.scope_output_ds is the SCOPE run with the fitted fqe, not the original forward simulation.
Example 2: Fit Thermal Outputs by Tuning Soil Resistances
Use this when the target is thermal radiance or a thermal product derived from
the thermal workflow. The default thermal preset tunes rss and rbs.
config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="thermal",
    scope_root_path="./upstream/SCOPE",
    optim_config={
        "enabled": True,
        "observations_path": "observed_thermal.nc",
        "target_variables": ["Loutt"],
        "parameter_preset": "thermal",
        "optimizer": "scipy",
        "max_iter": 80,
        "tol": 1e-6,
    },
)
result = ArcScopePipeline(config).run()
fit = result.optimization_result
Generated output from docs/assets/optimization/summary.json:
fit.parameters_initial
# {'rss': 500.0, 'rbs': 10.0}
fit.parameters_optimized
# {'rss': 838.5037, 'rbs': 18.6914}
fit.initial_loss, fit.optimized_loss
# (3.557060, 0.000168)
Interpretation:
- Higher rss means the fitted run needed greater soil surface resistance than the default.
- Higher rbs means the fitted run needed stronger boundary-layer resistance.
- These values are calibration parameters for the configured observations and period. They are not direct ARC retrieval products.
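The size of these shifts is easier to judge as ratios. The sketch below uses the generated thermal values quoted above; summarise_fit is a hypothetical helper, not part of ARC-SCOPE.

```python
def summarise_fit(initial, optimized, initial_loss, optimized_loss):
    """Return per-parameter fitted/initial ratios and the fractional loss reduction."""
    ratios = {name: optimized[name] / initial[name] for name in initial}
    loss_reduction = 1.0 - optimized_loss / initial_loss
    return ratios, loss_reduction


# Values copied from the generated thermal example output.
ratios, loss_reduction = summarise_fit(
    {"rss": 500.0, "rbs": 10.0},
    {"rss": 838.5037, "rbs": 18.6914},
    3.557060,
    0.000168,
)
for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.2f}")
print(f"loss reduced by {loss_reduction:.4%}")
```

Both resistances rose by well under a factor of two, so neither fitted value is close to the edge of a wide log-scaled range.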
If your SCOPE output uses a different thermal variable name, change
target_variables. Common candidates are Loutt, Lot_, Eoutt, or a
postprocessed LST variable if you add one to the output dataset.
Example 3: Fit Coupled Energy Balance
Use energy-balance when the run should fit fluorescence and thermal/flux
behavior in the coupled branch. The default preset tunes:
- fqe: fluorescence quantum efficiency
- rss: soil surface resistance
- rbs: boundary-layer resistance
- Cd: drag coefficient
- rwc: relative water content
config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="energy-balance",
    scope_root_path="./upstream/SCOPE",
    scope_options={"soil_heat_method": 2},
    optim_config={
        "enabled": True,
        "observations_path": "observed_energy_balance.nc",
        "target_variables": ["F740", "Loutt", "LE"],
        "parameter_preset": "energy-balance",
        "optimizer": {"type": "scipy", "method": "L-BFGS-B"},
        "max_iter": 120,
        "tol": 1e-6,
    },
)
result = ArcScopePipeline(config).run()
fit = result.optimization_result
Generated output from docs/assets/optimization/summary.json:
fit.parameters_initial
# {'fqe': 0.01, 'rss': 500.0, 'rbs': 10.0, 'Cd': 0.2, 'rwc': 0.5}
fit.parameters_optimized
# {'fqe': 0.0162, 'rss': 708.5325, 'rbs': 14.4040, 'Cd': 0.3202, 'rwc': 0.5902}
fit.initial_loss, fit.optimized_loss
# (60.651931, 0.000868)
Interpretation:
- The optimiser is fitting all targets with one scalar loss.
- The final parameter set is a compromise across SIF, thermal, and flux targets.
- If one target has much larger numeric units than another, the default MSE can be dominated by that target.
For mixed-unit fitting, normalise observations before fitting or provide a
custom loss_fn that returns comparable magnitudes for each target. The current
runner calls loss_fn(predicted, observed) once per target and sums the returned
losses.
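One way to get comparable magnitudes is to divide each target's MSE by the mean square of its observations. The sketch below follows the loss_fn(predicted, observed) call shape described above; normalised_mse is a hypothetical name, the numbers are made up for illustration, and the function assumes both inputs arrive as flat sequences of finite numbers (how the runner accepts the function, and what array type it passes, is not specified here).

```python
def normalised_mse(predicted, observed) -> float:
    """MSE divided by the mean square of the observations (dimensionless)."""
    pred = [float(p) for p in predicted]
    obs = [float(o) for o in observed]
    n = min(len(pred), len(obs))  # compare over the common length
    scale = sum(o * o for o in obs[:n]) / n
    if scale == 0.0:
        raise ValueError("observations are all zero; cannot normalise")
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / n / scale


# Illustrative values only: a SIF-like target near 1 and a flux-like target near 300.
sif_obs = [0.8, 1.0, 1.2]
le_obs = [250.0, 300.0, 350.0]
sif_pred = [1.1 * o for o in sif_obs]  # 10 % too high
le_pred = [1.1 * o for o in le_obs]    # 10 % too high

print(normalised_mse(sif_pred, sif_obs), normalised_mse(le_pred, le_obs))
```

With a plain MSE the flux-like target would contribute a loss several orders of magnitude larger for the same relative error; after normalisation the two contributions are equal, so the summed loss weights both targets alike.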
Example 4: Fit Only One Parameter in a Larger Workflow
You can run the energy-balance workflow while fitting only SIF yield and holding thermal parameters fixed:
optim_config = {
    "enabled": True,
    "observations_path": "observed_sif_for_energy_balance.nc",
    "target_variables": ["F740"],
    "parameters": [
        {"name": "fqe", "initial": 0.01, "lower": 0.001, "upper": 0.1, "transform": "log"},
        {"name": "rss", "initial": 500.0, "lower": 10.0, "upper": 5000.0, "transform": "log", "optimize": False},
        {"name": "rbs", "initial": 10.0, "lower": 1.0, "upper": 100.0, "transform": "log", "optimize": False},
    ],
}
This is useful when you want the coupled model physics but only trust one class of observations for calibration.
The generated parameter summary compares initial values, fitted values, and the known generating values used by the proxy examples:
Reading the Outcome
result.optimization_result contains:
| Field | Meaning |
|---|---|
| status | "optimized" when the optimisation branch completed. |
| target_variables | SCOPE output variables used in the loss. |
| initial_loss | Loss from the initial parameter values before fitting. |
| optimized_loss | Loss from the final fitted parameter values. |
| parameters_initial | Physical parameter values used at the start. |
| parameters_optimized | Physical parameter values after fitting. |
| optimizer | Optimiser implementation and method, e.g. scipy:L-BFGS-B. |
| converged | Whether scipy reported convergence. A non-converged fit may still improve the loss but should be reviewed. |
| metadata | Workflow and parameter-count metadata. |
The optimised datasets also include attributes for manifest writers:
result.scope_output_ds.attrs["arc_scope_optimization_status"]
result.scope_output_ds.attrs["arc_scope_optimization_parameters_optimized"]
result.scope_output_ds.attrs["arc_scope_optimization_initial_loss"]
result.scope_output_ds.attrs["arc_scope_optimization_optimized_loss"]
These attributes are intended to prevent optimised and non-optimised runs from looking identical in downstream artifacts.
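A downstream tool could use the status attribute to tell the two kinds of run apart. In this sketch, is_optimised is a hypothetical helper, plain dictionaries stand in for result.scope_output_ds.attrs, and the attribute value "optimized" is assumed to match the status field described in the table above.

```python
from typing import Any, Mapping

STATUS_ATTR = "arc_scope_optimization_status"


def is_optimised(attrs: Mapping[str, Any]) -> bool:
    """Return True when the dataset attributes mark an optimised run."""
    # Assumption: the attribute holds the same "optimized" string as the status field.
    return attrs.get(STATUS_ATTR) == "optimized"


# Stand-ins for result.scope_output_ds.attrs from two different runs.
baseline_attrs = {}
fitted_attrs = {
    STATUS_ATTR: "optimized",
    "arc_scope_optimization_initial_loss": 10.423289,
    "arc_scope_optimization_optimized_loss": 0.000047,
}

print(is_optimised(baseline_attrs), is_optimised(fitted_attrs))
```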
Practical Checks
Before treating a fit as scientific output, check:
- optimized_loss is lower than initial_loss.
- converged is True, or the non-convergence reason is understood from scipy logs.
- Fitted parameters are not pinned exactly at lower or upper bounds.
- Target variables and observations use compatible units.
- A hold-out period or independent observation source gives similar behavior.
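The first three checks are mechanical and can be scripted. The sketch below is a hedged illustration: review_fit is a hypothetical helper, a plain dictionary stands in for the OptimizationResult fields, the bounds are supplied separately from your parameter specs, and the "pinned" record uses made-up values. Unit compatibility and hold-out validation still need human judgement.

```python
import math
from typing import Dict, List, Tuple


def review_fit(
    fit: dict, bounds: Dict[str, Tuple[float, float]], rel_tol: float = 1e-6
) -> List[str]:
    """Return human-readable warnings for a fit; an empty list means no flags."""
    warnings = []
    if not fit["optimized_loss"] < fit["initial_loss"]:
        warnings.append("loss did not decrease")
    if not fit["converged"]:
        warnings.append("optimiser did not report convergence")
    for name, value in fit["parameters_optimized"].items():
        lower, upper = bounds[name]
        at_lower = math.isclose(value, lower, rel_tol=rel_tol)
        at_upper = math.isclose(value, upper, rel_tol=rel_tol)
        if at_lower or at_upper:
            warnings.append(f"{name} is pinned at a bound")
    return warnings


bounds = {"fqe": (0.001, 0.1)}

# Values from the generated SIF example; converged is assumed True here.
good = {
    "initial_loss": 10.423289,
    "optimized_loss": 0.000047,
    "converged": True,
    "parameters_optimized": {"fqe": 0.0180},
}
# Hypothetical problem case: no convergence and fqe stuck at the upper bound.
pinned = {
    "initial_loss": 1.0,
    "optimized_loss": 0.9,
    "converged": False,
    "parameters_optimized": {"fqe": 0.1},
}

print(review_fit(good, bounds))
print(review_fit(pinned, bounds))
```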
Optimisation improves agreement with the supplied observations. It does not prove that a parameter is uniquely identifiable or physically true.