Optimization Guide

ARC-SCOPE optimisation calibrates selected SCOPE parameters against observed targets after ARC has retrieved the canopy state. ARC outputs such as LAI, Cab, and Cw remain fixed. The optimiser changes only the parameters you choose in optim_config, runs SCOPE repeatedly, and keeps the parameter values that reduce the configured loss.

Use optimisation when you have observations that correspond to SCOPE outputs:

  • SIF observations for the fluorescence workflow.
  • Thermal radiance or land-surface-temperature (LST) style observations for the thermal workflow.
  • SIF, thermal, or flux observations for the coupled energy-balance workflow.

What the Runner Does

When PipelineConfig.optimize=True, optim_config["enabled"] = True, or a nested runner payload contains optim.enabled, ArcScopePipeline.run() changes from a single forward simulation to this loop:

  1. Run ARC, bridge, weather, geometry, and SCOPE input preparation as usual.
  2. Load observations from optim_config["observations"] or observations_path.
  3. Choose parameters from optim_config["parameters"], parameter_preset, or the workflow default.
  4. Inject trial parameter values into the prepared SCOPE input dataset.
  5. Run SCOPE and compare selected output variables with observations.
  6. Minimise the scalar loss with scipy.
  7. Inject the optimised parameters and run one final SCOPE simulation.
  8. Return the optimised input, output, and OptimizationResult.
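Steps 4 to 7 can be sketched with a toy forward model. This is not ARC-SCOPE's implementation: `toy_scope`, `mse`, and `golden_section_minimize` are hypothetical stand-ins, the linear model replaces a real SCOPE run, and a golden-section search replaces the scipy optimiser so that the sketch needs only the standard library.

```python
import math

def toy_scope(fqe, drivers):
    """Hypothetical stand-in for a SCOPE run: output scales linearly with fqe."""
    return [fqe * d for d in drivers]

def mse(predicted, observed):
    """Mean squared error between two equal-length sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def golden_section_minimize(loss, lower, upper, tol=1e-10, max_iter=200):
    """Minimise a unimodal scalar function on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    for _ in range(max_iter):
        if abs(b - a) < tol:
            break
        if loss(c) < loss(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Synthetic drivers and observations; the "true" fqe is 0.018 by construction.
drivers = [50.0, 80.0, 120.0, 90.0]
observed = toy_scope(0.018, drivers)

def objective(fqe):
    # Steps 4-5: inject a trial value, run the model, compare with observations.
    return mse(toy_scope(fqe, drivers), observed)

initial_fqe = 0.01
initial_loss = objective(initial_fqe)

# Step 6: minimise the scalar loss within the parameter bounds.
fitted_fqe = golden_section_minimize(objective, 0.001, 0.1)

# Step 7: one final run with the fitted value.
final_output = toy_scope(fitted_fqe, drivers)
optimized_loss = mse(final_output, observed)
```

The structure mirrors the loop above: the objective wraps one forward run per trial value, and the final run is repeated with the fitted parameters so the returned output corresponds to the optimised state.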

Target names must match SCOPE output names

target_variables are SCOPE output variable names, not generic labels. Run a baseline simulation first and inspect result.scope_output_ds.data_vars if you are unsure which names your installed scope-rtm version emits.

Shared Configuration Pattern

Every optimisation run needs observations and target variables:

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="fluorescence",
    scope_root_path="./upstream/SCOPE",
    optim_config={
        "enabled": True,
        "observations_path": "observed_sif.nc",
        "target_variables": ["F740"],
    },
)

result = ArcScopePipeline(config).run()
summary = result.optimization_result

observations_path should point to a NetCDF dataset with variables matching target_variables. You can also pass an in-memory xarray.Dataset:

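import xarray as xr

# observed_f740_values and observation_times are placeholders for your own
# equal-length 1-D arrays of SIF values and timestamps.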
observations = xr.Dataset(
    {"F740": ("time", observed_f740_values)},
    coords={"time": observation_times},
)

config.optim_config["observations"] = observations

ARC-SCOPE compares finite values after flattening each target variable. The current objective aligns by common array length, so you should prepare observations on the same time/space subset as the model output.
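One possible reading of "aligns by common array length" is that both flattened arrays are truncated to the shorter length before non-finite pairs are dropped. The sketch below illustrates that reading only. `aligned_mse` is a hypothetical helper, not the ARC-SCOPE objective, and the actual alignment rule should be confirmed against the installed source.

```python
import math

def aligned_mse(predicted, observed):
    """MSE over finite pairs after truncating both inputs to a common length.

    Assumption: alignment is positional, so element i of the model output is
    paired with element i of the observations regardless of coordinates.
    """
    n = min(len(predicted), len(observed))
    pairs = [
        (p, o)
        for p, o in zip(predicted[:n], observed[:n])
        if math.isfinite(p) and math.isfinite(o)
    ]
    if not pairs:
        raise ValueError("no finite pairs to compare")
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

predicted = [1.0, 2.0, 3.0, 4.0]        # flattened model output (illustrative)
observed = [1.5, float("nan"), 2.0]     # shorter, with one missing value
loss = aligned_mse(predicted, observed)
```

Under this reading, a length mismatch would not raise an error: the fourth model value is simply ignored. That is why observations should be prepared on exactly the same time/space subset as the model output.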

Choosing Parameters

If parameters is omitted, ARC-SCOPE uses a workflow default:

Workflow       | Default parameters     | Typical observation target
---------------|------------------------|-------------------------------------
fluorescence   | fqe                    | SIF, e.g. F740 or F685
thermal        | rss, rbs               | thermal output, e.g. Loutt or Lot_
energy-balance | fqe, rss, rbs, Cd, rwc | mixed SIF, thermal, and flux outputs

You can override the defaults with explicit parameter specs:

"parameters": [
    {
        "name": "fqe",
        "initial": 0.01,
        "lower": 0.001,
        "upper": 0.1,
        "transform": "log",
    },
    {
        "name": "rss",
        "initial": 500.0,
        "lower": 10.0,
        "upper": 5000.0,
        "transform": "log",
        "optimize": False,
    },
]

transform controls how scipy sees the parameter:

  • identity: optimise directly, clipped to bounds.
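  • log: optimise the logarithm of the value; suited to strictly positive parameters such as fqe, rss, rbs, and Cd.
  • logit: optimise a logit-scaled value so the parameter stays between lower and upper; suited to bounded parameters such as rwc.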
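Each transform can be pictured as a pair of maps between the physical value and the value handed to the optimiser. The sketch below uses the textbook log and scaled-logit formulas. The function names are hypothetical, the rwc bounds of 0.0 and 1.0 are assumed for illustration, and ARC-SCOPE's ParameterSet may differ in detail.

```python
import math

def to_unconstrained(value, lower, upper, transform):
    """Map a physical parameter value to the value the optimiser would see."""
    if transform == "identity":
        return value
    if transform == "log":
        return math.log(value)  # requires value > 0
    if transform == "logit":
        frac = (value - lower) / (upper - lower)  # requires lower < value < upper
        return math.log(frac / (1.0 - frac))
    raise ValueError(f"unknown transform: {transform}")

def to_physical(z, lower, upper, transform):
    """Map an optimiser-space value back to a physical parameter value."""
    if transform == "identity":
        return min(max(z, lower), upper)  # clip to bounds
    if transform == "log":
        return math.exp(z)  # always positive
    if transform == "logit":
        # Numerically stable sigmoid, then rescale to [lower, upper].
        if z >= 0:
            sig = 1.0 / (1.0 + math.exp(-z))
        else:
            ez = math.exp(z)
            sig = ez / (1.0 + ez)
        return lower + (upper - lower) * sig
    raise ValueError(f"unknown transform: {transform}")

# Round trip for fqe in log space.
z_fqe = to_unconstrained(0.01, 0.001, 0.1, "log")
fqe_back = to_physical(z_fqe, 0.001, 0.1, "log")

# rwc with assumed bounds 0.0 and 1.0: the midpoint maps to zero in logit space.
z_rwc = to_unconstrained(0.5, 0.0, 1.0, "logit")
rwc_far = to_physical(50.0, 0.0, 1.0, "logit")  # large step, still within bounds

# Identity only clips: a trial value outside the bounds is pulled back.
clipped = to_physical(0.5, 0.001, 0.1, "identity")
```

Working in log space makes a step from 0.001 to 0.01 the same size as a step from 0.01 to 0.1, which suits parameters that span orders of magnitude.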

optimize=False keeps a parameter fixed while still injecting it into the SCOPE input dataset.

Generated Example Artifacts

The figures and numeric outputs below are generated by a reproducible script:

PYTHONPATH=src python -m arc_scope.experiments.optimization_examples \
  --output-dir docs/assets/optimization

The same generator is exposed through the installed console script arcope-optimization-examples, the local example python examples/07_optimization_examples.py, and the Pixi task pixi run optimization-examples. The generator needs matplotlib for SVG output; install arcope[docs] or use the repository Pixi environment.

The script uses deterministic proxy SCOPE runners so the documentation can be rebuilt without live ARC retrieval, ERA5 credentials, or scope-rtm runtime assets. It still exercises ARC-SCOPE's real optimisation path: run_pipeline_optimization(), ScopeObjective, ParameterSet, and ScipyOptimizer.

Generated files:

Proxy examples vs scientific production runs

These generated examples are real optimisation outputs from runnable code, but their forward model is a deterministic proxy. For scientific outputs, use the same configuration pattern with ArcScopePipeline.run() and real SCOPE target variables from your installed scope-rtm workflow.

Example 1: Fit SIF by Tuning fqe

Use this for experiments where SIF is the scientific target. The fluorescence workflow defaults to the SIF preset, so the explicit parameters block below is optional but recommended for auditability.

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="fluorescence",
    scope_root_path="./upstream/SCOPE",
    save_scope_netcdf=True,
    optim_config={
        "enabled": True,
        "observations_path": "observed_sif.nc",
        "target_variables": ["F740"],
        "parameters": [
            {
                "name": "fqe",
                "initial": 0.01,
                "lower": 0.001,
                "upper": 0.1,
                "transform": "log",
            }
        ],
        "optimizer": {"type": "scipy", "method": "L-BFGS-B"},
        "max_iter": 50,
        "tol": 1e-6,
    },
)

result = ArcScopePipeline(config).run()
fit = result.optimization_result
print(fit.parameters_initial)
print(fit.parameters_optimized)
print(fit.initial_loss, fit.optimized_loss)

Generated output from docs/assets/optimization/summary.json:

{'fqe': 0.01}
{'fqe': 0.0180}
10.423289 0.000047

Figure: Generated SIF optimisation fit

Interpretation:

  • fqe increased from 0.01 to 0.0180, so the initial fluorescence yield was too low for the observed SIF.
  • optimized_loss < initial_loss, so the fitted SCOPE output is closer to the observations under the configured MSE objective.
  • The final result.scope_output_ds is the SCOPE run with the fitted fqe, not the original forward simulation.

Example 2: Fit Thermal Outputs by Tuning Soil Resistances

Use this when the target is thermal radiance or a thermal product derived from the thermal workflow. The default thermal preset tunes rss and rbs.

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="thermal",
    scope_root_path="./upstream/SCOPE",
    optim_config={
        "enabled": True,
        "observations_path": "observed_thermal.nc",
        "target_variables": ["Loutt"],
        "parameter_preset": "thermal",
        "optimizer": "scipy",
        "max_iter": 80,
        "tol": 1e-6,
    },
)

result = ArcScopePipeline(config).run()
fit = result.optimization_result

Generated output from docs/assets/optimization/summary.json:

fit.parameters_initial
# {'rss': 500.0, 'rbs': 10.0}

fit.parameters_optimized
# {'rss': 838.5037, 'rbs': 18.6914}

fit.initial_loss, fit.optimized_loss
# (3.557060, 0.000168)

Figure: Generated thermal optimisation fit

Interpretation:

  • Higher rss means the fitted run needed greater soil surface resistance than the default.
  • Higher rbs means the fitted run needed stronger boundary-layer resistance.
  • These values are calibration parameters for the configured observations and period. They are not direct ARC retrieval products.

If your SCOPE output uses a different thermal variable name, change target_variables. Common candidates are Loutt, Lot_, Eoutt, or a postprocessed LST variable if you add one to the output dataset.

Example 3: Fit Coupled Energy Balance

Use energy-balance when the run should fit fluorescence and thermal/flux behaviour in the coupled branch. The default preset tunes:

  • fqe: fluorescence quantum efficiency
  • rss: soil surface resistance
  • rbs: boundary-layer resistance
  • Cd: drag coefficient
  • rwc: relative water content

from arc_scope.pipeline import ArcScopePipeline, PipelineConfig

config = PipelineConfig(
    geojson_path="field.geojson",
    start_date="2021-05-15",
    end_date="2021-10-01",
    crop_type="wheat",
    start_of_season=170,
    year=2021,
    scope_workflow="energy-balance",
    scope_root_path="./upstream/SCOPE",
    scope_options={"soil_heat_method": 2},
    optim_config={
        "enabled": True,
        "observations_path": "observed_energy_balance.nc",
        "target_variables": ["F740", "Loutt", "LE"],
        "parameter_preset": "energy-balance",
        "optimizer": {"type": "scipy", "method": "L-BFGS-B"},
        "max_iter": 120,
        "tol": 1e-6,
    },
)

result = ArcScopePipeline(config).run()
fit = result.optimization_result

Generated output from docs/assets/optimization/summary.json:

fit.parameters_initial
# {'fqe': 0.01, 'rss': 500.0, 'rbs': 10.0, 'Cd': 0.2, 'rwc': 0.5}

fit.parameters_optimized
# {'fqe': 0.0162, 'rss': 708.5325, 'rbs': 14.4040, 'Cd': 0.3202, 'rwc': 0.5902}

fit.initial_loss, fit.optimized_loss
# (60.651931, 0.000868)

Figure: Generated coupled energy-balance optimisation fit

Interpretation:

  • The optimiser is fitting all targets with one scalar loss.
  • The final parameter set is a compromise across SIF, thermal, and flux targets.
  • If one target has much larger numeric units than another, the default MSE can be dominated by that target.

For mixed-unit fitting, normalise observations before fitting or provide a custom loss_fn that returns comparable magnitudes for each target. The current runner calls loss_fn(predicted, observed) once per target and sums the returned losses.
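A custom loss with the `loss_fn(predicted, observed)` signature described above could divide each target's MSE by the variance of its observations, which makes the per-target losses dimensionless before they are summed. The sketch below is one such option, not a built-in ARC-SCOPE loss. `normalized_mse` is a hypothetical name, the inputs are assumed to be flat sequences of floats, and how the function is registered with the runner is not covered here.

```python
import math

def normalized_mse(predicted, observed):
    """MSE divided by the variance of the finite observations (dimensionless)."""
    pairs = [
        (float(p), float(o))
        for p, o in zip(predicted, observed)
        if math.isfinite(p) and math.isfinite(o)
    ]
    if not pairs:
        raise ValueError("no finite pairs to compare")
    obs = [o for _, o in pairs]
    mean_obs = sum(obs) / len(obs)
    variance = sum((o - mean_obs) ** 2 for o in obs) / len(obs)
    scale = variance if variance > 0.0 else 1.0  # avoid dividing by zero
    mse = sum((p - o) ** 2 for p, o in pairs) / len(pairs)
    return mse / scale

# Illustrative values only: both targets carry the same 10 % relative error,
# but the second has numeric values 100 times larger.
sif_obs, sif_pred = [1.0, 2.0, 3.0], [1.1, 2.2, 3.3]
le_obs, le_pred = [100.0, 200.0, 300.0], [110.0, 220.0, 330.0]

sif_loss = normalized_mse(sif_pred, sif_obs)
le_loss = normalized_mse(le_pred, le_obs)
total_loss = sif_loss + le_loss  # mirrors the per-target summation
```

With a plain MSE the second target would contribute 10,000 times more to the total than the first. After normalisation the two contributions are equal.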

Example 4: Fit Only One Parameter in a Larger Workflow

You can run the energy-balance workflow while fitting only SIF yield and holding thermal parameters fixed:

optim_config = {
    "enabled": True,
    "observations_path": "observed_sif_for_energy_balance.nc",
    "target_variables": ["F740"],
    "parameters": [
        {"name": "fqe", "initial": 0.01, "lower": 0.001, "upper": 0.1, "transform": "log"},
        {"name": "rss", "initial": 500.0, "lower": 10.0, "upper": 5000.0, "transform": "log", "optimize": False},
        {"name": "rbs", "initial": 10.0, "lower": 1.0, "upper": 100.0, "transform": "log", "optimize": False},
    ],
}

This is useful when you want the coupled model physics but only trust one class of observations for calibration.
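To audit which parameters such a spec list leaves free, the specs can be partitioned by their optimize flag. This is a minimal sketch that assumes a missing optimize key means the parameter is fitted, consistent with the fqe entry above. `split_specs` is a hypothetical helper, not part of ARC-SCOPE.

```python
def split_specs(specs):
    """Return (free parameter names, fixed name -> value) from parameter specs.

    Assumption: "optimize" defaults to True when the key is omitted.
    """
    free = [s["name"] for s in specs if s.get("optimize", True)]
    fixed = {s["name"]: s["initial"] for s in specs if not s.get("optimize", True)}
    return free, fixed

specs = [
    {"name": "fqe", "initial": 0.01, "lower": 0.001, "upper": 0.1, "transform": "log"},
    {"name": "rss", "initial": 500.0, "lower": 10.0, "upper": 5000.0, "transform": "log", "optimize": False},
    {"name": "rbs", "initial": 10.0, "lower": 1.0, "upper": 100.0, "transform": "log", "optimize": False},
]

free, fixed = split_specs(specs)
```

Here only fqe is free, while rss and rbs are still injected at their initial values.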

The generated parameter summary compares initial values, fitted values, and the known generating values used by the proxy examples:

Figure: Generated optimisation parameter summary

Reading the Outcome

result.optimization_result contains:

Field                | Meaning
---------------------|--------------------------------------------------------------
status               | "optimized" when the optimisation branch completed.
target_variables     | SCOPE output variables used in the loss.
initial_loss         | Loss from the initial parameter values before fitting.
optimized_loss       | Loss from the final fitted parameter values.
parameters_initial   | Physical parameter values used at the start.
parameters_optimized | Physical parameter values after fitting.
optimizer            | Optimiser implementation and method, e.g. scipy:L-BFGS-B.
converged            | Whether scipy reported convergence. A non-converged fit may still improve the loss but should be reviewed.
metadata             | Workflow and parameter-count metadata.

The optimised datasets also include attributes for manifest writers:

result.scope_output_ds.attrs["arc_scope_optimization_status"]
result.scope_output_ds.attrs["arc_scope_optimization_parameters_optimized"]
result.scope_output_ds.attrs["arc_scope_optimization_initial_loss"]
result.scope_output_ds.attrs["arc_scope_optimization_optimized_loss"]

These attributes are intended to prevent optimised and non-optimised runs from looking identical in downstream artifacts.

Practical Checks

Before treating a fit as scientific output, check:

  1. optimized_loss is lower than initial_loss.
  2. converged is True, or the non-convergence reason is understood from scipy logs.
  3. Fitted parameters are not pinned exactly at lower or upper bounds.
  4. Target variables and observations use compatible units.
  5. A hold-out period or independent observation source gives similar behaviour.
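Checks 1 to 3 are mechanical and can be scripted. The sketch below uses a hypothetical `FitSummary` dataclass that mirrors a few OptimizationResult field names so it runs without ARC-SCOPE installed. `review_fit` and the example values are illustrative, not library code or real results. Checks 4 and 5 still need human judgement.

```python
from dataclasses import dataclass

@dataclass
class FitSummary:
    """Hypothetical stand-in mirroring selected OptimizationResult fields."""
    initial_loss: float
    optimized_loss: float
    converged: bool
    parameters_optimized: dict

def review_fit(fit, bounds, rel_tol=1e-6):
    """Return a list of warnings for checks 1-3; an empty list means none fired."""
    warnings = []
    if not fit.optimized_loss < fit.initial_loss:
        warnings.append("loss did not decrease")
    if not fit.converged:
        warnings.append("optimiser did not report convergence")
    for name, value in fit.parameters_optimized.items():
        lower, upper = bounds[name]
        span = upper - lower
        # Treat a value within rel_tol of either bound as pinned.
        if value - lower <= rel_tol * span or upper - value <= rel_tol * span:
            warnings.append(f"{name} is pinned at a bound")
    return warnings

# Illustrative fit: the loss improved, but fqe ended on its upper bound
# and the optimiser did not report convergence.
fit = FitSummary(
    initial_loss=10.0,
    optimized_loss=2.0,
    converged=False,
    parameters_optimized={"fqe": 0.1},
)
warnings = review_fit(fit, bounds={"fqe": (0.001, 0.1)})
```

A parameter pinned at a bound often means the bound is too tight or the observations do not constrain that parameter, so it is worth widening the bounds or fixing the parameter and refitting.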

Optimisation improves agreement with the supplied observations. It does not prove that a parameter is uniquely identifiable or physically true.