Optimization Guide¶

ARC-SCOPE optimisation calibrates selected SCOPE parameters against observed targets after ARC has retrieved the canopy state. ARC outputs such as LAI, Cab, and Cw remain fixed. The optimiser changes only the parameters you choose in optim_config, runs SCOPE repeatedly, and keeps the parameter values that reduce the configured loss.

Use optimisation when you have observations that correspond to SCOPE outputs:

SIF observations for the fluorescence workflow.
Thermal radiance or land-surface-temperature-like observations for the thermal workflow.
SIF, thermal, or flux observations for the coupled energy-balance workflow.

What the Runner Does¶

When PipelineConfig.optimize is True, optim_config["enabled"] is True, or a nested runner payload sets optim.enabled, ArcScopePipeline.run() switches from a single forward simulation to this loop:

Run ARC, bridge, weather, geometry, and SCOPE input preparation as usual.
Load observations from optim_config["observations"] or observations_path.
Choose parameters from optim_config["parameters"], parameter_preset, or the workflow default.
Inject trial parameter values into the prepared SCOPE input dataset.
Run SCOPE and compare selected output variables with observations.
Minimise the scalar loss with scipy.
Inject the optimised parameters and run one final SCOPE simulation.
Return the optimised input, output, and OptimizationResult.
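
The loop above can be sketched with a toy forward model. This is a minimal, self-contained illustration and not ARC-SCOPE's implementation: proxy_scope is a hypothetical stand-in for a SCOPE run, the driver values are made up, and a golden-section search in log space stands in for the scipy minimiser so that the snippet needs only the standard library.

```python
import math

def proxy_scope(fqe, par):
    """Hypothetical stand-in for a SCOPE run: SIF scales linearly with fqe."""
    return [fqe * 0.05 * p for p in par]

def mse(predicted, observed):
    """Mean squared error over paired values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def fit_fqe(par, observed, lower=0.001, upper=0.1, max_iter=60, tol=1e-10):
    """Minimise the loss over log(fqe) with a golden-section search."""
    def loss(u):
        # Inject the trial value, run the forward model, compare with observations.
        return mse(proxy_scope(math.exp(u), par), observed)

    a, b = math.log(lower), math.log(upper)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(max_iter):
        if b - a < tol:
            break
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if loss(c) < loss(d):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2.0)

par = [400.0, 800.0, 1200.0, 1600.0]        # made-up driver values
observed = proxy_scope(0.018, par)          # synthetic "observations"
initial_fqe = 0.01
initial_loss = mse(proxy_scope(initial_fqe, par), observed)
fitted_fqe = fit_fqe(par, observed)
optimized_loss = mse(proxy_scope(fitted_fqe, par), observed)
final_output = proxy_scope(fitted_fqe, par)  # the one final run with fitted values
```

The structure mirrors the steps listed: prepare inputs once, evaluate the loss for each trial value, and rerun the forward model with the fitted value at the end.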

Target names must match SCOPE output names

target_variables are SCOPE output variable names, not generic labels. Run a baseline simulation first and inspect result.scope_output_ds.data_vars if you are unsure which names your installed scope-rtm version emits.
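
A quick way to catch a mismatched name before starting a long fit is to compare the requested targets with the names in a baseline output. The helper below is a hypothetical sketch, not part of ARC-SCOPE; the list of available names is made up and stands in for list(result.scope_output_ds.data_vars).

```python
def missing_targets(target_variables, available_names):
    """Return requested targets that the output does not contain, in request order."""
    available = set(available_names)
    return [name for name in target_variables if name not in available]

# Made-up names standing in for list(result.scope_output_ds.data_vars)
available_names = ["F685", "F740", "Loutt"]
unknown = missing_targets(["F740", "LST"], available_names)
print(unknown)
```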

Shared Configuration Pattern¶

Every optimisation run needs observations and target variables:

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="fluorescence",
    scope_root_path="./upstream/SCOPE",
    optim_config={
        "enabled": True,
        "observations_path": "observed_sif.nc",
        "target_variables": ["F740"],
    },
)

result = ArcScopePipeline(config).run()
summary = result.optimization_result

observations_path should point to a NetCDF dataset with variables matching target_variables. You can also pass an in-memory xarray.Dataset:

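import xarray as xr

# observed_f740_values and observation_times are placeholders for your own
# equal-length 1-D sequences of SIF values and timestamps.
observations = xr.Dataset(
    {"F740": ("time", observed_f740_values)},
    coords={"time": observation_times},
)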

config.optim_config["observations"] = observations

ARC-SCOPE flattens each target variable and compares only finite values. The current objective aligns model output and observations by common array length, so prepare observations on the same time/space subset, and in the same order, as the model output.
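
To see why matching subsets matter, here is a minimal sketch of a length-aligned, finite-only comparison. The function names are hypothetical, and the order of operations (truncate to the common length first, then drop non-finite pairs) is an assumption about one plausible reading of the behaviour described above, not a copy of ARC-SCOPE's objective.

```python
import math

def paired_finite(predicted, observed):
    """Truncate to the common length, then drop pairs with a non-finite value."""
    n = min(len(predicted), len(observed))
    return [
        (p, o)
        for p, o in zip(predicted[:n], observed[:n])
        if math.isfinite(p) and math.isfinite(o)
    ]

def masked_mse(predicted, observed):
    """MSE over the finite pairs; NaN when no pair survives."""
    pairs = paired_finite(predicted, observed)
    if not pairs:
        return float("nan")
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

predicted = [1.0, 2.0, 3.0, 4.0]
observed = [1.0, float("nan"), 5.0]   # shorter, with one gap
loss = masked_mse(predicted, observed)
```

Pairing is purely positional here, so an observation series that starts on a different date than the model output would be compared against the wrong time steps without raising an error.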

Choosing Parameters¶

If parameters is omitted, ARC-SCOPE uses a workflow default:

Workflow	Default parameters	Typical observation target
`fluorescence`	`fqe`	SIF, e.g. `F740` or `F685`
`thermal`	`rss`, `rbs`	thermal output, e.g. `Loutt` or `Lot_`
`energy-balance`	`fqe`, `rss`, `rbs`, `Cd`, `rwc`	mixed SIF, thermal, and flux outputs

You can override the defaults with explicit parameter specs:

"parameters": [
    {
        "name": "fqe",
        "initial": 0.01,
        "lower": 0.001,
        "upper": 0.1,
        "transform": "log",
    },
    {
        "name": "rss",
        "initial": 500.0,
        "lower": 10.0,
        "upper": 5000.0,
        "transform": "log",
        "optimize": False,
    },
]

transform controls how scipy sees the parameter:

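identity: optimise the physical value directly, clipped to bounds.
log: optimise the logarithm of the value; suits strictly positive parameters such as fqe, rss, rbs, and Cd.
logit: optimise a logit-scaled value that maps back into the interval between lower and upper; suits bounded parameters such as rwc.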
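
The three transforms can be written as a pair of mappings between the physical value and the value the optimiser works on. This is a sketch of the standard formulas under the names used in this guide; the function names are hypothetical, and the rwc bounds of 0 and 1 in the usage lines are assumed for illustration.

```python
import math

def to_unconstrained(value, lower, upper, transform):
    """Map a physical value to the scale the optimiser works on."""
    if transform == "identity":
        return value
    if transform == "log":
        return math.log(value)
    if transform == "logit":
        fraction = (value - lower) / (upper - lower)
        return math.log(fraction / (1.0 - fraction))
    raise ValueError(f"unknown transform: {transform}")

def to_physical(u, lower, upper, transform):
    """Map an optimiser value back to the physical scale."""
    if transform == "identity":
        return min(max(u, lower), upper)   # clip to bounds
    if transform == "log":
        return math.exp(u)
    if transform == "logit":
        return lower + (upper - lower) / (1.0 + math.exp(-u))
    raise ValueError(f"unknown transform: {transform}")

# Round trips: fqe on a log scale, rwc on a logit scale (bounds assumed).
fqe_back = to_physical(to_unconstrained(0.01, 0.001, 0.1, "log"), 0.001, 0.1, "log")
rwc_back = to_physical(to_unconstrained(0.5, 0.0, 1.0, "logit"), 0.0, 1.0, "logit")
clipped = to_physical(2.0, 0.0, 1.0, "identity")
```

With the logit mapping, any real optimiser value lands strictly inside the interval, which is why it suits parameters that must never reach or cross their bounds.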

optimize=False keeps a parameter fixed while still injecting it into the SCOPE input dataset.

Generated Example Artifacts¶

The figures and numeric outputs below are generated by a reproducible script:

PYTHONPATH=src python -m arc_scope.experiments.optimization_examples \
  --output-dir docs/assets/optimization

The same generator is exposed through the installed console script arcope-optimization-examples, the local example python examples/07_optimization_examples.py, and the Pixi task pixi run optimization-examples. The generator needs matplotlib for SVG output; install arcope[docs] or use the repository Pixi environment.

The script uses deterministic proxy SCOPE runners so the documentation can be rebuilt without live ARC retrieval, ERA5 credentials, or scope-rtm runtime assets. It still exercises ARC-SCOPE's real optimisation path: run_pipeline_optimization(), ScopeObjective, ParameterSet, and ScipyOptimizer.

Generated files:

Proxy examples vs scientific production runs

These generated examples are real optimisation outputs from runnable code, but their forward model is a deterministic proxy. For scientific outputs, use the same configuration pattern with ArcScopePipeline.run() and real SCOPE target variables from your installed scope-rtm workflow.

Example 1: Fit SIF by Tuning `fqe`¶

Use this for experiments where SIF is the scientific target. The fluorescence workflow defaults to the SIF preset, so the explicit parameters block below is optional but recommended for auditability.

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="fluorescence",
    scope_root_path="./upstream/SCOPE",
    save_scope_netcdf=True,
    optim_config={
        "enabled": True,
        "observations_path": "observed_sif.nc",
        "target_variables": ["F740"],
        "parameters": [
            {
                "name": "fqe",
                "initial": 0.01,
                "lower": 0.001,
                "upper": 0.1,
                "transform": "log",
            }
        ],
        "optimizer": {"type": "scipy", "method": "L-BFGS-B"},
        "max_iter": 50,
        "tol": 1e-6,
    },
)

result = ArcScopePipeline(config).run()
fit = result.optimization_result
print(fit.parameters_initial)
print(fit.parameters_optimized)
print(fit.initial_loss, fit.optimized_loss)

Generated output from docs/assets/optimization/summary.json:

{'fqe': 0.01}
{'fqe': 0.0180}
10.423289 0.000047

Figure: Generated SIF optimisation fit.

Interpretation:

fqe increased from 0.01 to 0.0180, so the initial fluorescence yield was too low for the observed SIF.
optimized_loss < initial_loss, so the fitted SCOPE output is closer to the observations under the configured MSE objective.
The final result.scope_output_ds is the SCOPE run with the fitted fqe, not the original forward simulation.

Example 2: Fit Thermal Outputs by Tuning Soil Resistances¶

Use this when the target is thermal radiance or a thermal product derived from the thermal workflow. The default thermal preset tunes rss and rbs.

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="thermal",
    scope_root_path="./upstream/SCOPE",
    optim_config={
        "enabled": True,
        "observations_path": "observed_thermal.nc",
        "target_variables": ["Loutt"],
        "parameter_preset": "thermal",
        "optimizer": "scipy",
        "max_iter": 80,
        "tol": 1e-6,
    },
)

result = ArcScopePipeline(config).run()
fit = result.optimization_result

Generated output from docs/assets/optimization/summary.json:

fit.parameters_initial
# {'rss': 500.0, 'rbs': 10.0}

fit.parameters_optimized
# {'rss': 838.5037, 'rbs': 18.6914}

fit.initial_loss, fit.optimized_loss
# (3.557060, 0.000168)

Figure: Generated thermal optimisation fit.

Interpretation:

Higher rss means the fitted run needed greater soil surface resistance than the initial value.
Higher rbs means the fitted run needed greater boundary-layer resistance than the initial value.
These values are calibration parameters for the configured observations and period. They are not direct ARC retrieval products.

If your SCOPE output uses a different thermal variable name, change target_variables. Common candidates are Loutt, Lot_, Eoutt, or a postprocessed LST variable if you add one to the output dataset.

Example 3: Fit Coupled Energy Balance¶

Use the energy-balance workflow when a single run should fit fluorescence together with thermal and flux behaviour in the coupled branch. The default preset tunes:

fqe: fluorescence quantum efficiency
rss: soil surface resistance
rbs: boundary-layer resistance
Cd: drag coefficient
rwc: relative water content

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="energy-balance",
    scope_root_path="./upstream/SCOPE",
    scope_options={"soil_heat_method": 2},
    optim_config={
        "enabled": True,
        "observations_path": "observed_energy_balance.nc",
        "target_variables": ["F740", "Loutt", "LE"],
        "parameter_preset": "energy-balance",
        "optimizer": {"type": "scipy", "method": "L-BFGS-B"},
        "max_iter": 120,
        "tol": 1e-6,
    },
)

result = ArcScopePipeline(config).run()
fit = result.optimization_result

Generated output from docs/assets/optimization/summary.json:

fit.parameters_initial
# {'fqe': 0.01, 'rss': 500.0, 'rbs': 10.0, 'Cd': 0.2, 'rwc': 0.5}

fit.parameters_optimized
# {'fqe': 0.0162, 'rss': 708.5325, 'rbs': 14.4040, 'Cd': 0.3202, 'rwc': 0.5902}

fit.initial_loss, fit.optimized_loss
# (60.651931, 0.000868)

Figure: Generated coupled energy-balance optimisation fit.

Interpretation:

The optimiser is fitting all targets with one scalar loss.
The final parameter set is a compromise across SIF, thermal, and flux targets.
If one target has much larger numeric units than another, the default MSE can be dominated by that target.

For mixed-unit fitting, normalise observations before fitting or provide a custom loss_fn that returns comparable magnitudes for each target. The current runner calls loss_fn(predicted, observed) once per target and sums the returned losses.
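
As an illustration of a loss that returns comparable magnitudes, the sketch below divides each target's MSE by the mean square of its observations. Only the loss_fn(predicted, observed) call shape comes from the text above; the function names, the plain-list inputs, and the numbers are assumptions, and how the function is registered with the runner is not shown here.

```python
def plain_mse(predicted, observed):
    """Unscaled mean squared error, in squared target units."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def normalised_mse(predicted, observed):
    """MSE divided by the mean square of the observations (unitless)."""
    scale = sum(o ** 2 for o in observed) / len(observed)
    error = plain_mse(predicted, observed)
    return error / scale if scale > 0.0 else error

# Made-up values: both targets are 10 % too low.
targets = {
    "F740": ([0.9, 1.8], [1.0, 2.0]),          # (predicted, observed)
    "LE": ([180.0, 270.0], [200.0, 300.0]),
}
# Mirror the runner's behaviour: one call per target, then sum.
plain_total = sum(plain_mse(p, o) for p, o in targets.values())
normalised_total = sum(normalised_mse(p, o) for p, o in targets.values())
```

In this toy case the plain sum is almost entirely the LE term, while the normalised loss gives both targets the same weight because their relative errors are equal.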

Example 4: Fit Only One Parameter in a Larger Workflow¶

You can run the energy-balance workflow while fitting only SIF yield and holding thermal parameters fixed:

optim_config = {
    "enabled": True,
    "observations_path": "observed_sif_for_energy_balance.nc",
    "target_variables": ["F740"],
    "parameters": [
        {"name": "fqe", "initial": 0.01, "lower": 0.001, "upper": 0.1, "transform": "log"},
        {"name": "rss", "initial": 500.0, "lower": 10.0, "upper": 5000.0, "transform": "log", "optimize": False},
        {"name": "rbs", "initial": 10.0, "lower": 1.0, "upper": 100.0, "transform": "log", "optimize": False},
    ],
}

This is useful when you want the coupled model physics but only trust one class of observations for calibration.

The generated parameter summary compares initial values, fitted values, and the known generating values used by the proxy examples:

Figure: Generated optimisation parameter summary.

Reading the Outcome¶

result.optimization_result contains:

Field	Meaning
`status`	`"optimized"` when the optimisation branch completed.
`target_variables`	SCOPE output variables used in the loss.
`initial_loss`	Loss from the initial parameter values before fitting.
`optimized_loss`	Loss from the final fitted parameter values.
`parameters_initial`	Physical parameter values used at the start.
`parameters_optimized`	Physical parameter values after fitting.
`optimizer`	Optimiser implementation and method, e.g. `scipy:L-BFGS-B`.
`converged`	Whether scipy reported convergence. A non-converged fit may still improve the loss but should be reviewed.
`metadata`	Workflow and parameter-count metadata.

The optimised datasets also include attributes for manifest writers:

result.scope_output_ds.attrs["arc_scope_optimization_status"]
result.scope_output_ds.attrs["arc_scope_optimization_parameters_optimized"]
result.scope_output_ds.attrs["arc_scope_optimization_initial_loss"]
result.scope_output_ds.attrs["arc_scope_optimization_optimized_loss"]

These attributes are intended to prevent optimised and non-optimised runs from looking identical in downstream artifacts.
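
A manifest writer could collect these attributes by their shared prefix. The sketch below works on a plain dict standing in for result.scope_output_ds.attrs; the helper names are hypothetical, and treating a status of "optimized" as the marker of an optimised run is an assumption based on the status field described in the table above.

```python
PREFIX = "arc_scope_optimization_"

def optimization_attrs(attrs):
    """Return the optimisation attributes with the shared prefix stripped."""
    return {
        key[len(PREFIX):]: value
        for key, value in attrs.items()
        if key.startswith(PREFIX)
    }

def is_optimised_run(attrs):
    """True when the status attribute reports a completed optimisation."""
    return attrs.get(PREFIX + "status") == "optimized"

# Made-up attribute dict standing in for result.scope_output_ds.attrs
attrs = {
    "arc_scope_optimization_status": "optimized",
    "arc_scope_optimization_initial_loss": 10.4,
    "title": "example",
}
summary = optimization_attrs(attrs)
```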

Practical Checks¶

Before treating a fit as scientific output, check:

optimized_loss is lower than initial_loss.
converged is True, or the non-convergence reason is understood from scipy logs.
Fitted parameters are not pinned exactly at lower or upper bounds.
Target variables and observations use compatible units.
A hold-out period or independent observation source gives similar behaviour.

Optimisation improves agreement with the supplied observations. It does not prove that a parameter is uniquely identifiable or physically true.
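
The first three checks are mechanical and can be scripted. The helper below is a hypothetical sketch that takes plain values rather than an OptimizationResult, and the bounds dictionary is something you would assemble from your own parameter specs. Unit compatibility and hold-out behaviour still need judgement.

```python
def review_fit(initial_loss, optimized_loss, converged, optimized, bounds, rel_tol=1e-6):
    """Return warnings for the scripted checks; an empty list means none triggered."""
    warnings = []
    if not optimized_loss < initial_loss:
        warnings.append("loss did not decrease")
    if not converged:
        warnings.append("optimiser did not report convergence")
    for name, value in optimized.items():
        lower, upper = bounds[name]
        margin = rel_tol * (upper - lower)
        # Flag values that sit on, or within a tiny margin of, either bound.
        if value - lower <= margin or upper - value <= margin:
            warnings.append(f"{name} is at a bound")
    return warnings

bounds = {"fqe": (0.001, 0.1), "rss": (10.0, 5000.0)}
ok = review_fit(10.42, 0.00005, True, {"fqe": 0.018, "rss": 838.5}, bounds)
pinned = review_fit(10.42, 9.8, False, {"fqe": 0.1, "rss": 838.5}, bounds)
```

A parameter flagged at a bound usually means the bound, not the data, decided the fitted value, so widening the range or fixing the parameter with optimize=False is worth considering.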
