Fix ICON unstructured-grid GRIB loading by bypassing earthkit grid en…#197
Open
icedoom888 wants to merge 1 commit into
…gine
ICON-CH1/CH2 native GRIB is on an unstructured (triangular) grid. On this
cluster earthkit-data v1.0 cannot build its TensorGrid for such fields: its
grid engine requires the GRIB gridSpec, but the bundled eckit-geo codec cannot
parse ECMWF's ICON .ek grid file (eckit::codec::InvalidRecord: version not
found) and the auto-downloader is unavailable. Since SingleDatasetBuilder.build
always constructs the grid, no to_xarray profile flag can avoid the crash; the
real ValueError gets masked as a confusing "'ValueError' object is not callable"
TypeError.
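How a `ValueError` can end up reported as this `TypeError` is easier to see in isolation. The sketch below is not earthkit code. It assumes a hypothetical `LazyHandler` that stores a caught exception instance and later calls it as if it were a function, which is one generic way Python produces exactly this message.

```python
class LazyHandler:
    """Hypothetical lazily resolved callable (illustration only, not earthkit's API)."""

    def __init__(self, factory):
        self._factory = factory
        self._resolved = None

    def resolve(self):
        if self._resolved is None:
            try:
                self._resolved = self._factory()
            except ValueError as exc:
                # The bug being illustrated: the exception is stored, not re-raised.
                self._resolved = exc
        return self._resolved

    def __call__(self, *args):
        # Calling the stored exception instance raises TypeError and hides the cause.
        return self.resolve()(*args)


def build_grid_handler():
    # Stands in for a grid build that fails because no grid specification is available.
    raise ValueError("gridSpec not available")


def call_and_report(handler):
    """Return the message of the TypeError raised by calling the handler, if any."""
    try:
        handler()
    except TypeError as exc:
        return str(exc)
    return "no error"


message = call_and_report(LazyHandler(build_grid_handler))
print(message)
```

The original `ValueError` text never reaches the caller in this pattern, which matches the symptom described above.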
Detect unstructured fields via gridType and route them through a new
_unstructured_fieldlist_to_xarray helper that assembles the Dataset directly
from per-field values/metadata and attaches latitude/longitude from the local
ICON grid file (meteodatalab.icon_grid.load_grid_from_balfrin, imported lazily).
Output is structurally equivalent to the regular earthkit path (flat `values`
spatial dim, forecast_reference_time/step dims, valid_time coord, pressure
levels split to {param}_{level}), so all downstream verification code and
TOT_PREC de-accumulation work unchanged. Regular grids keep the existing path.
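The output layout described above can be sketched without any GRIB library. This is a minimal illustration, not the helper from this commit: fields are plain `(metadata, values)` pairs, and the metadata keys (`param`, `typeOfLevel`, `level`, `forecast_reference_time`, `step`) and the function names are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def variable_name(meta):
    """Pressure-level fields become {param}_{level}; all others keep the param name."""
    if meta.get("typeOfLevel") == "isobaricInhPa":
        return f"{meta['param']}_{meta['level']}"
    return meta["param"]


def assemble(fields):
    """Group fields by variable and by (forecast_reference_time, step).

    Returns (data, valid_time): data maps variable -> {(ref, step): flat values},
    valid_time maps (ref, step) -> ref + step.
    """
    data = defaultdict(dict)
    valid_time = {}
    for meta, values in fields:
        key = (meta["forecast_reference_time"], meta["step"])
        data[variable_name(meta)][key] = list(values)  # flat `values` dimension
        valid_time[key] = key[0] + key[1]
    return dict(data), valid_time


ref = datetime(2025, 1, 1)
step = timedelta(hours=6)
fields = [
    ({"param": "T", "typeOfLevel": "isobaricInhPa", "level": 850,
      "forecast_reference_time": ref, "step": step}, [271.0, 272.5, 270.1]),
    ({"param": "T_2M", "typeOfLevel": "heightAboveGround", "level": 2,
      "forecast_reference_time": ref, "step": step}, [280.0, 281.2, 279.9]),
]
data, valid_time = assemble(fields)
print(sorted(data), valid_time[(ref, step)])
```

In an xarray-based implementation the nested dictionaries would correspond to data variables with `forecast_reference_time`, `step` and `values` dimensions, with `valid_time` as a coordinate.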
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What this fixes
Reading ICON-CH1/CH2 GRIB (baselines and ML-forecaster output) crashed during verification with a misleading `TypeError: 'ValueError' object is not callable`. This makes `verification_metrics_baseline` and forecaster verification work again.
Root cause
ICON GRIB is on an unstructured (triangular) grid (`gridType=unstructured_grid`, GRIB2 template 101). earthkit-data v1.0's `to_xarray` unconditionally builds a `TensorGrid`, which for unstructured fields needs the GRIB `gridSpec`. On our cluster `gridSpec` can't be obtained: the bundled eckit-geo codec can't parse ECMWF's ICON `.ek` grid file (`eckit::codec::InvalidRecord: version not found`) and the auto-downloader is broken. The underlying `ValueError` is then masked by an earthkit lazy-handler bug, surfacing as the confusing `TypeError`. No `to_xarray` profile flag avoids the grid build, and `meteodatalab.grib_decoder` is import-incompatible with this earthkit version.
Fix
Detect unstructured fields by `gridType` and route only those through a new `_unstructured_fieldlist_to_xarray` helper that builds the Dataset directly from field values + metadata and attaches `latitude`/`longitude` from the local ICON grid file (`meteodatalab.icon_grid.load_grid_from_balfrin`, imported lazily). Regular grids keep the existing earthkit path, untouched.
Key property: coordinates come from the local grid file, so there is no runtime network dependency and no eckit-geo involvement, which is what makes it robust on compute nodes.
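The routing decision can be sketched as follows. This is an illustration under stated assumptions rather than the code in this PR: fields are represented as plain dicts carrying a `gridType` entry, the two loaders are passed in as callables, and the names `is_unstructured` and `load_fieldlist` are hypothetical. Whether the real check inspects one field or all of them is not stated here, so the sketch uses `any`.

```python
UNSTRUCTURED = "unstructured_grid"


def is_unstructured(fields):
    """True if any field reports an unstructured gridType (assumed criterion)."""
    return any(f.get("gridType") == UNSTRUCTURED for f in fields)


def load_fieldlist(fields, regular_loader, unstructured_loader):
    """Send unstructured field lists to the dedicated helper, all others to the regular path."""
    if is_unstructured(fields):
        # In the real helper the ICON grid loader would be imported lazily here,
        # so regular-grid runs never need that dependency.
        return unstructured_loader(fields)
    return regular_loader(fields)


icon_fields = [{"param": "T_2M", "gridType": "unstructured_grid"}]
rotated_fields = [{"param": "T_2M", "gridType": "rotated_ll"}]

icon_route = load_fieldlist(icon_fields, lambda f: "earthkit", lambda f: "unstructured")
rotated_route = load_fieldlist(rotated_fields, lambda f: "earthkit", lambda f: "unstructured")
print(icon_route, rotated_route)
```

Keeping the branch at the top of the loader means the regular path stays byte-for-byte the same, which is the property the regression check on `rotated_ll` input relies on.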
The unstructured output is structurally identical to the regular path (flat `values` dim, `forecast_reference_time`/`step` dims, `valid_time` coord, `{param}_{level}` naming for pressure levels, global attrs), so all downstream code (`map_forecast_to_truth`, the metrics, and TOT_PREC de-accumulation) is unchanged.
Scope
`src/data_input/__init__.py` (`load_from_grib_file` + new helpers). `workflow/scripts/data_extract_baseline.py` has separate, unrelated breakage on this earthkit version and is intentionally left out of scope; the new helper is reusable for a follow-up.
Testing
Validated on balfrin against a full `evalml experiment` run (ICON-CH1/CH2 hindcast, Jan–Jun 2025):
- `verification_metrics_baseline` job crashed on ICON-CH1/CH2 GRIB with the masked `TypeError`.
- … `latitude`/`longitude` and physically sensible fields; end-to-end `verification_metrics.py` against KENDA-CH1 yields the expected lead-time skill degradation (e.g. T_2M MAE 0.48 K @ step 0 → 1.42 K @ step 120, CORR 0.985 → 0.923); TOT_PREC de-accumulates with no spurious negatives.
- `rotated_ll` GRIB still routes through the unchanged earthkit path; the existing tests in `tests/unit/test_data_input.py` pass.
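The "no spurious negatives" check on TOT_PREC can be reproduced in a few lines. The sketch below assumes precipitation is accumulated since forecast start and de-accumulated by differencing consecutive steps, with the first step kept as is; the repository's actual de-accumulation may differ, and the function names and sample numbers are illustrative only.

```python
def deaccumulate(accumulated):
    """Convert accumulated totals (steps x grid points) into per-step increments."""
    increments = [list(accumulated[0])]  # first step kept unchanged (assumption)
    for previous, current in zip(accumulated, accumulated[1:]):
        increments.append([c - p for p, c in zip(previous, current)])
    return increments


def count_negatives(increments, tolerance=0.0):
    """Count increments below -tolerance, which would indicate a broken step order."""
    return sum(1 for row in increments for value in row if value < -tolerance)


# Three steps, two grid points, accumulated precipitation in mm (made-up values).
accumulated = [[0.0, 0.0], [1.5, 0.0], [1.5, 2.0]]
increments = deaccumulate(accumulated)
negatives = count_negatives(increments)
print(increments, negatives)
```

Negative increments typically appear when steps are misordered or fields from different reference times are mixed, so a zero count is a cheap sanity check on the assembled `step` dimension.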