
v1 issue 8 #115

Draft
aclerc wants to merge 2 commits into v1 from v1-issue8


aclerc commented Jul 2, 2026

Contributor

Issue 8 — Cross-prediction bias-cancellation for shrinkage-driven conditional bias (power_model)

Goal: remove the counterfactual model's conditional (shrinkage) bias — the F5 root
cause — by cancelling it between two symmetric train/predict directions on weather-matched
data. Applies to both the overall and per-condition estimates.

Scope

  • Matching-variable analysis (one-off, do first). On Hill of Towie, run a feature-
    importance analysis to choose which ERA5 variables to match on (likely wind speed +
    direction, possibly more). Record the chosen set + rationale in docs/v1/findings.md
    and hard-code it as the default matching set. ERA5 is preferred for its full coverage
    and temporal stability; the set may later be tuned per wind farm.
  • ERA5 coarsened-exact matching (CEM). New utility: bin baseline vs upgraded rows on
    the chosen (synced) ERA5 variables; within each cell subsample the larger side to the
    smaller side's count (seeded); drop one-sided cells. Yields equal-count, weather-matched
    baseline/upgraded sets. The matching axis (ERA5) is distinct from the reporting/binning
    axis (test-turbine ws/TI, kept as today so bins match ground truth).
  • Two directions + geometric combine. Forward: train on matched baseline, predict
    matched upgraded → r_fwd (overall and per bin via energy_ratio_by_bin). Reverse:
    train on matched upgraded, predict matched baseline → r_rev. Combine
    uplift = sqrt((1+r_fwd)/(1+r_rev)) − 1 (exact under a common per-bin multiplicative
    shrinkage); also emit implied bias 1/sqrt((1+r_fwd)(1+r_rev)) as a diagnostic. Guard
    non-positive (1+r) and empty/sparse bins.
  • Opt-in flag (e.g. bias_correct: bool = False) so the corrected overall and
    conditional estimates can be A/B-tested against current behaviour before any default flips.
  • Reuse the existing outcome model factory (make_outcome_model) and CONDITION_BINS;
    extend diagnostics to overlay corrected vs current conditional curves against truth.
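
The CEM step in the Scope list can be sketched as follows. This is a minimal illustration, not the utility added in matching.py: the function name cem_match_sketch, its signature, and the assumption of unique row labels are hypothetical. It bins both sides on the same edges, drops one-sided cells, and subsamples the larger side of each shared cell with a seeded generator.

```python
import numpy as np
import pandas as pd


def cem_match_sketch(baseline, upgraded, bin_edges, seed=0):
    """Return (baseline_labels, upgraded_labels) with equal counts in every shared cell.

    Hypothetical helper for illustration; assumes both frames have unique row labels.
    """
    rng = np.random.default_rng(seed)

    def rows_by_cell(frame):
        # One integer bin code per matching variable. NaN inputs and values outside
        # the edges get NaN codes and are dropped before cells are formed.
        codes = pd.DataFrame(
            {var: pd.cut(frame[var], bins=edges, labels=False) for var, edges in bin_edges.items()},
            index=frame.index,
        ).dropna()
        cells = {}
        for row in codes.itertuples(index=True):
            cells.setdefault(tuple(row[1:]), []).append(row[0])
        return cells

    base_cells = rows_by_cell(baseline)
    upg_cells = rows_by_cell(upgraded)

    keep_base, keep_upg = [], []
    # Taking the intersection drops cells that exist on only one side.
    for cell in sorted(set(base_cells) & set(upg_cells)):
        n = min(len(base_cells[cell]), len(upg_cells[cell]))
        # Subsample both sides to the smaller count; the smaller side is kept whole.
        keep_base += rng.choice(np.asarray(base_cells[cell]), size=n, replace=False).tolist()
        keep_upg += rng.choice(np.asarray(upg_cells[cell]), size=n, replace=False).tolist()
    return sorted(keep_base), sorted(keep_upg)
```

Sorting the shared cells before sampling keeps the random draws in a fixed order, so the same seed reproduces the same matched sets.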

Done when: with the flag on, study_power_model_compare.py (Issue 7) shows the
ti_dependent_cp / ws_dependent_cp conditional curves materially flatter toward truth
and the overall P50 no worse than today; a regression test recovers a known condition-
dependent uplift more accurately with correction on than off; findings.md updated.
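
The regression criterion relies on the geometric combine being exact under a common multiplicative shrinkage. A small synthetic check of that property is sketched below. It assumes each ratio is defined as actual energy over counterfactual energy minus one, and that the model under-predicts by the same factor in both directions; the function name combine_two_directions is hypothetical and not taken from the PR.

```python
import math


def combine_two_directions(r_fwd, r_rev):
    """Return (uplift, implied_bias), or (nan, nan) if either (1 + r) is non-positive."""
    a, b = 1.0 + r_fwd, 1.0 + r_rev
    if not (a > 0.0 and b > 0.0):  # this comparison also rejects NaN inputs
        return math.nan, math.nan
    uplift = math.sqrt(a / b) - 1.0
    implied_bias = 1.0 / math.sqrt(a * b)
    return uplift, implied_bias


# Synthetic case (assumed numbers): true uplift 3 %, predictions shrunk by a factor 0.98.
true_uplift, shrink = 0.03, 0.98
r_fwd = (1.0 + true_uplift) / shrink - 1.0          # actual upgraded / predicted baseline
r_rev = 1.0 / ((1.0 + true_uplift) * shrink) - 1.0  # actual baseline / predicted upgraded
uplift, bias = combine_two_directions(r_fwd, r_rev)
```

Under these assumptions the shrinkage factor cancels in the quotient, so uplift recovers true_uplift while the uncorrected r_fwd overstates it, and bias recovers shrink.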

Copilot AI left a comment

Contributor


Pull request overview

Implements Issue 8’s opt-in cross-prediction bias-cancellation for power_model. Baseline and upgraded periods are weather-matched via ERA5 coarsened exact matching (CEM), the forward and reverse ratios are combined, and the corrected conditional decomposition is re-leveled onto the unchanged full-data overall headline.

Changes:

  • Add bias_correct path to PowerModelMethod with ERA5 CEM matching, two-direction combine, re-leveling, and new diagnostics outputs (overall/by-bin shrinkage + CEM balance/cells).
  • Extend energy_ratio_by_bin to expose per-bin energy sums (sum_actual, sum_counterfactual) and update/add unit tests accordingly.
  • Add one-off ERA5 matching-variable importance analysis script and update docs/findings; wire --bias-correct flag into study_power_model_compare.
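
To illustrate why exposing per-bin energy sums is useful, here is a minimal sketch of a per-bin energy ratio that returns the sums alongside the ratio. It is not the repository's energy_ratio_by_bin: the function name, argument names, and the choice to report empty bins as zero sums with a NaN ratio are assumptions made for this example.

```python
import numpy as np
import pandas as pd


def energy_ratio_by_bin_sketch(actual, counterfactual, condition, edges):
    """Per-bin sums and ratio (sum_actual / sum_counterfactual - 1); illustrative only."""
    frame = pd.DataFrame(
        {
            "bin": pd.cut(np.asarray(condition, dtype=float), bins=edges),
            "actual": np.asarray(actual, dtype=float),
            "counterfactual": np.asarray(counterfactual, dtype=float),
        }
    )
    # observed=False keeps empty bins in the output (their sums are 0).
    out = frame.groupby("bin", observed=False)[["actual", "counterfactual"]].sum()
    out.columns = ["sum_actual", "sum_counterfactual"]
    # Guard the division: a non-positive denominator yields NaN instead of inf.
    denominator = out["sum_counterfactual"].where(out["sum_counterfactual"] > 0)
    out["ratio"] = out["sum_actual"] / denominator - 1.0
    return out
```

With the sums available, forward and reverse ratios can be recomputed or re-aggregated per bin without rerunning the model.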

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/benchmarking/harness/test_conditions.py | Adds coverage for new per-bin sum outputs from energy_ratio_by_bin. |
| tests/benchmarking/baselines/test_study_power_model_compare.py | Verifies _make_power_model propagates bias_correct and defaults matching vars. |
| tests/benchmarking/baselines/test_power_model_method.py | Adds unit + regression tests for two-direction combine, re-leveling, and bias-correct flow/diagnostics. |
| tests/benchmarking/baselines/test_power_model_matching.py | New unit tests for CEM matching utility (cells, balance, seeded subsampling). |
| docs/v1/issues.md | Notes follow-up idea for derived ERA5 atmospheric features. |
| docs/v1/findings.md | Records Issue 8 findings (matching vars, estimator, A/B results) and rationale. |
| benchmarking/harness/conditions.py | Extends energy_ratio_by_bin to include per-bin sums and defines empty-bin sum behavior. |
| benchmarking/baselines/study_power_model_compare.py | Adds _make_power_model, --bias-correct flag, and A/B-safe baseline update guard. |
| benchmarking/baselines/power_model/method.py | Implements bias_correct estimation, default matching vars/edges, re-leveling helpers, and per-bin diagnostics plumbing. |
| benchmarking/baselines/power_model/matching.py | Adds pure CEM utility returning matched positions + balance/per-cell diagnostics. |
| benchmarking/baselines/power_model/diagnostics.py | Writes bias-correction CSVs and implied-shrinkage plot utility. |
| benchmarking/baselines/inspect_prepost_hard_case.py | Runs and overlays uncorrected vs bias-corrected power model outputs for inspection. |
| benchmarking/baselines/inspect_era5_matching_importance.py | Adds one-off ERA5 feature-importance analysis for selecting matching variables. |


Comment on lines +130 to +132
for var, edges in bin_edges.items():
    cut = pd.cut(matching_frame[var], bins=edges, labels=False)  # NaN outside edges / on NaN input
    columns.append(cut.to_numpy(dtype=float))
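
The binning behaviour that the inline comment in this snippet relies on can be checked in isolation. The sketch below uses made-up values and edges. It shows that, with the pandas defaults (right-closed intervals, include_lowest=False), a value exactly equal to the lowest edge also falls outside the bins, which may or may not matter for the edges chosen in the PR.

```python
import numpy as np
import pandas as pd

# Illustrative values only: below range, on the lowest edge, inside, on an inner
# edge, inside the second bin, and missing.
values = pd.Series([-1.0, 0.0, 2.5, 5.0, 7.0, np.nan])
edges = [0.0, 5.0, 10.0]

# Default intervals are (0, 5] and (5, 10]; anything else becomes NaN.
codes = pd.cut(values, bins=edges, labels=False).to_numpy(dtype=float)
# codes -> [nan, nan, 0.0, 0.0, 1.0, nan]

# include_lowest=True makes the first interval closed on the left, so 0.0 lands in bin 0.
codes_lowest = pd.cut(values, bins=edges, labels=False, include_lowest=True).to_numpy(dtype=float)
# codes_lowest -> [nan, 0.0, 0.0, 0.0, 1.0, nan]
```
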
Comment on lines +52 to +58
def test_equal_counts_within_every_retained_cell(self) -> None:
    result = _match()
    retained = result.per_cell[result.per_cell["n_matched"] > 0]
    # after matching each retained cell has the same count on both sides
    assert (retained["n_matched"] == retained["n_matched"]).all()
    # cell A keeps 1/side, cell B keeps 2/side
    assert sorted(retained["n_matched"].tolist()) == [1, 2]
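
Note that the first assertion in this snippet compares retained["n_matched"] with itself, so it holds for any input and does not test the two sides against each other. A sketch of a check that does compare both sides is given below. The per-side column names n_baseline_matched and n_upgraded_matched are hypothetical, since the actual per_cell schema is not shown in this conversation.

```python
import pandas as pd

# Hypothetical per-cell diagnostics; the column names are assumptions for illustration.
per_cell = pd.DataFrame(
    {
        "cell": ["A", "B", "C"],
        "n_baseline_matched": [1, 2, 0],
        "n_upgraded_matched": [1, 2, 0],
    }
)
retained = per_cell[per_cell["n_baseline_matched"] > 0]

# Compare the two sides against each other rather than a column against itself.
sides_equal = bool((retained["n_baseline_matched"] == retained["n_upgraded_matched"]).all())
assert sides_equal
# Cell A keeps 1 per side, cell B keeps 2 per side.
assert sorted(retained["n_baseline_matched"].tolist()) == [1, 2]
```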


2 participants