Skip to content

Minor cleanup: trans_att boundary off-by-one, dead get_taf_values branch, plot mean(sqrt(var)), linear-stokes ddof=0 #239

@bdestombe

Description

@bdestombe

Four small standalone hygiene fixes bundled to reduce issue noise. Each is independently trivial. None of them reaches the threshold of a dedicated issue, but together they're a natural housekeeping PR.

1. trans_att == x_sec[-1] 1-pixel boundary asymmetry

Location: src/dtscalibration/calibrate_utils.py:296-301, 1780-1785, 1935-1940 (three identical guards)

Convention everywhere else is "x ≥ trans_att gets the forward TA contribution" (e.g. calibrate_utils.py:2219, dts_accessor_utils.py:363-364). The lower-boundary case trans_att = x_sec[0] is handled correctly by the elif branch (sets ix_sec_ta_ix0 = 0, all reference pixels include TA). But the upper-boundary case trans_att = x_sec[-1] enters the if branch and sets ix_sec_ta_ix0 = nx, so zero reference pixels include the TA contribution — even though by the >= convention, x_sec[-1] itself should.

Fix: replace each if transient_att_xi >= x_sec[-1]: with if transient_att_xi > x_sec[-1]:. Same form for the matching-sections version (> x[-1]) and the double-ended version. This makes the upper boundary symmetric with the lower one.

2. ParameterIndexSingleEnded.get_taf_values(axis="x") is dead code with an inconsistent assertion

Location: src/dtscalibration/dts_accessor_utils.py:367-369

If any caller ever passes axis="x" (with pval of shape (nx, npar)), the assertion arr.shape == (self.nta, self.nx, self.nt) fails (when nx != nt) because get_taf_pars returns shape (nta, nt, nx) — the np.stack(..., axis=-1) at line 340 puts the row dimension last. Even past the assertion, the loop arr_out[x >= taxi] += tai[x >= taxi] would index the wrong axis.

The bug is dead in production — every call site (lines 629, 639, 648, 658, 667) uses axis="" or axis="time". The double-ended counterpart at :195-202 does the right thing via np.transpose(..., axes=(1, 2, 0)).

Fix: add the matching np.transpose(arr, axes=(0, 2, 1)) after get_taf_pars in the axis="x" branch (preferred — symmetric with the double-ended class) or delete the dead branch.

3. plot_sigma_report averages σ over time as mean(√Var), not √mean(Var)

Location: src/dtscalibration/plot.py:533, 642

Both the projected-accuracy and the projected-precision lines build the per-x σ as np.sqrt(...).mean(dim="time"). By Jensen's inequality (sqrt is concave), E_t[√Var(x,t)] ≤ √E_t[Var(x,t)]. For a noise field whose variance changes through time (laser power drift, time-varying reference temperature), the plotted curve underestimates the σ a reader would compute by averaging variances first.

The convention used elsewhere in the package (variance_helpers.py, the WLS weighting) treats variances as the additive primitive — consistent with pooling variances first.

Fix: swap operator order — np.sqrt(ds[temp_var_acc_label].mean(dim="time")) (and the precision counterpart). Or, if the existing mean(√Var) is intentional, update the surrounding labels to say so.

4. variance_stokes_linear_helper per-bin variance uses ddof=0

Location: src/dtscalibration/variance_helpers.py:146

st_sort_var = resid_sec[isort].reshape((nbin, -1)).var(axis=1)

np.var defaults to the biased estimator. With small nperbin = st_sec.size / nbin (e.g. 5), the bias is 25%; with typical thousands of points per bin, ~0.1%.

Distinct from #231 (which is about ddof in variance_stokes_constant_helper:35 and variance_stokes_exponential_helper:129 — wrong direction, ddof=1 should be ddof=npar). This is the opposite issue — default ddof=0 should be at least ddof=1.

Fix: .var(axis=1, ddof=1).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions