Unguarded inputs allow invalid data into calibration #232

@bdestombe

Four places where invalid, non-physical, or missing data flows silently into calibration with no warning to the user. All four are silent failure modes — the code runs to completion and produces a result that looks reasonable.

1. parse_st_var validation is dead code

src/dtscalibration/calibrate_utils.py:25-36:

def parse_st_var(st, st_var):
    ...
    st_var_sec = st_var(st) if callable(st_var) else xr.ones_like(st) * st_var
    return st_var_sec    # <-- function returns here

    assert np.all(np.isfinite(st_var_sec)), "NaN/inf values detected ..."
    assert np.all(st_var_sec > 0.0), "Negative values detected ..."
    return st_var_sec

The early return on line 26 makes everything after it unreachable, so the two assert statements never execute. A caller passing st_var=-1.0, or a callable that returns NaN where the Stokes signal is near zero (e.g. a Poisson model), proceeds silently into the WLS solver.

Fix: delete the duplicate early return.
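
A minimal sketch of what the repaired function could look like. It is not the library's code: the name parse_st_var_checked is hypothetical, plain NumPy arrays stand in for the xarray objects used in dtscalibration, and the asserts are swapped for explicit exceptions so the checks are not stripped when Python runs with -O.

```python
import numpy as np


def parse_st_var_checked(st, st_var):
    """Hypothetical variant of parse_st_var: validate first, return once."""
    st = np.asarray(st, dtype=float)
    # Evaluate a callable variance model, or broadcast a scalar/array to st's shape.
    st_var_sec = st_var(st) if callable(st_var) else np.ones_like(st) * st_var
    st_var_sec = np.asarray(st_var_sec, dtype=float)

    # Explicit exceptions (an assumption of this sketch) survive `python -O`.
    if not np.all(np.isfinite(st_var_sec)):
        raise ValueError("NaN/inf values detected in st_var")
    if not np.all(st_var_sec > 0.0):
        raise ValueError("Non-positive values detected in st_var")
    return st_var_sec
```

With this shape, parse_st_var_checked(st, -1.0) and a callable returning NaN both raise instead of reaching the solver.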

2. var_fun can return negative values for low-intensity Stokes

src/dtscalibration/variance_helpers.py:173:

return lambda stokes: slope * stokes + offset

The OLS fit at :151-158 is unconstrained, so offset can be negative. The function emits a warning in that case but still returns the fitted model. When offset < 0 and a downstream stokes value is small enough, var_fun(stokes) is negative. A negative variance fed to the WLS calibration becomes a negative weight, which makes the weighted least-squares problem ill-posed.

Fix: clip the returned variance at a small positive value (or fit slope, offset under a non-negativity constraint).
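
The clipping option could be sketched as follows. The factory name make_clipped_var_fun and the floor min_var are assumptions for illustration. A sensible floor depends on the instrument's noise level and is not specified in this issue.

```python
import numpy as np


def make_clipped_var_fun(slope, offset, min_var=1e-12):
    """Return a linear variance model whose output never drops below min_var.

    min_var is a hypothetical floor; pick a value that suits the data.
    """
    def var_fun(stokes):
        var = slope * np.asarray(stokes, dtype=float) + offset
        # np.maximum propagates NaN, so NaN inputs still need a separate check.
        return np.maximum(var, min_var)

    return var_fun


# A fit with a negative offset no longer yields negative variance near zero signal.
var_fun = make_clipped_var_fun(slope=0.5, offset=-2.0)
low_signal_var = var_fun(np.array([0.0, 2.0, 10.0]))
```

Clipping hides a poor fit rather than fixing it, so keeping the existing warning alongside the clip seems reasonable.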

3. NaN bypasses Stokes positivity check

src/dtscalibration/dts_accessor.py:624-631 and :1021-1036:

assert not np.any(self.st.isel(x=ix_sec) <= 0.0)

NaN <= 0 evaluates to False, so this assertion silently passes when reference-section Stokes contains NaN. The contaminated values then flow into log(st/ast) (= NaN) and the calibration solver returns garbage parameters. The user sees no error.

Fix: combine with np.isfinite, e.g. assert np.all(np.isfinite(st)) and np.all(st > 0.0).
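
The difference between the two checks can be reproduced with plain NumPy. The array below is made-up data standing in for self.st.isel(x=ix_sec).

```python
import numpy as np

# Made-up reference-section Stokes values with one NaN pixel.
st = np.array([1200.0, np.nan, 950.0])

# Original-style check: NaN <= 0.0 is False, so np.any(...) is False and the check passes.
old_check_passes = not np.any(st <= 0.0)

# Combined check: np.isfinite rejects NaN and +/-inf before positivity is tested.
new_check_passes = bool(np.all(np.isfinite(st)) and np.all(st > 0.0))
```

Here old_check_passes is True and new_check_passes is False. Note that np.all(st > 0.0) on its own already fails for NaN; the np.isfinite term additionally rejects +inf.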

4. merge_double_ended silently drops mismatched x

src/dtscalibration/dts_accessor_utils.py:731-740:

ds_bw = ds_bw.reindex({"x": ds["x"]}, method="nearest", tolerance=0.99 * x_resolution)
...
ds = ds.dropna(dim=\"x\")

The reindex fills every forward-grid x that has no backward-grid point within 0.99·Δx with NaN, and dropna(dim='x') then drops every x that has a NaN in any variable, including pre-existing NaNs in unrelated channels (e.g. tmp). A user feeding two non-aligned single-ended files can end up with a much shorter cable than expected, with no count or warning of how many points were dropped.

Fix: report (warn or raise) the number of dropped points; consider scoping the dropna to the merge variables only.
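
One way to obtain the count before merging, sketched with NumPy arrays in place of the xarray coordinates. The helper names count_unmatched_x and warn_on_dropped_x are hypothetical. The sketch only covers points lost through the reindex tolerance, not pre-existing NaNs in other variables.

```python
import warnings

import numpy as np


def count_unmatched_x(x_fw, x_bw, x_resolution, rel_tol=0.99):
    """Count forward-grid points with no backward point within rel_tol * x_resolution."""
    x_fw = np.asarray(x_fw, dtype=float)
    x_bw = np.asarray(x_bw, dtype=float)
    # Distance from each forward point to its nearest backward point.
    nearest = np.abs(x_fw[:, None] - x_bw[None, :]).min(axis=1)
    return int(np.sum(nearest > rel_tol * x_resolution))


def warn_on_dropped_x(x_fw, x_bw, x_resolution):
    """Warn with the number of forward-grid points the merge would lose."""
    n_dropped = count_unmatched_x(x_fw, x_bw, x_resolution)
    if n_dropped:
        warnings.warn(
            f"{n_dropped} of {len(x_fw)} x-points have no backward match "
            "within tolerance and would be dropped"
        )
    return n_dropped
```

The pairwise distance matrix is fine for a sketch but scales with len(x_fw) * len(x_bw); a sorted-search approach would suit long cables better.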

Suggested tests

  • Pass st_var=-1.0 to calibrate_single_ended; assert it raises (currently passes silently).
  • Pass st_var=lambda st: np.full_like(st, np.nan); assert it raises.
  • Construct a variance_stokes_linear fit where the reference data forces offset < 0; assert var_fun(0.0) >= 0 (currently fails).
  • Inject np.nan into one reference-section pixel of ds.st; assert the calibration call raises (currently silently produces garbage).
  • Build ds_fw and ds_bw with intentionally non-aligned x (offset by 0.5·Δx); assert merge_double_ended either warns or raises with a count of dropped points.
