Yet another alternative implementation for working with derived variables#3053

Draft
bouweandela wants to merge 2 commits into main from derived-alternative-2

@bouweandela
Member

@bouweandela bouweandela commented Apr 24, 2026

Description

Yet another alternative implementation to #2777 and #3051 for finding input datasets for derived variables and loading them. Adds a new class method `esmvalcore.dataset.DerivedDataset` that can be used to work with derived variables.

Note that derived variables with optional input datasets are not yet supported. The only derived variable that uses this feature is amoc at the moment, so the impact of that is limited.

Link to documentation: https://esmvaltool--3053.org.readthedocs.build/projects/ESMValCore/en/3053/notebooks/derived-variables.html


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@bouweandela bouweandela changed the title Alternative implementation for finding input datasets for derived var… Yet another alternative implementation for working with derived variables Apr 24, 2026
@codecov

codecov Bot commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 37.50000% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.00%. Comparing base (5ca0b1d) to head (de9351d).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| esmvalcore/dataset.py | 37.50% | 25 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3053      +/-   ##
==========================================
- Coverage   96.15%   96.00%   -0.15%     
==========================================
  Files         270      270              
  Lines       15805    15844      +39     
==========================================
+ Hits        15197    15211      +14     
- Misses        608      633      +25     
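
The percentages in the coverage diff follow directly from the hit and line counts it reports. A minimal sketch that recomputes them (the variable names are illustrative, and the patch totals are inferred from "37.50% with 25 lines missing", not taken from Codecov output):

```python
# Counts copied from the coverage diff above.
base_hits, base_lines = 15197, 15805
head_hits, head_lines = 15211, 15844

base_cov = 100 * base_hits / base_lines
head_cov = 100 * head_hits / head_lines
delta = head_cov - base_cov

# Patch coverage: 25 missing lines at 37.5% coverage implies
# 25 / (1 - 0.375) = 40 changed lines, of which 15 are covered.
patch_missing = 25
patch_lines = round(patch_missing / (1 - 0.375))
patch_cov = 100 * (patch_lines - patch_missing) / patch_lines

print(f"base {base_cov:.2f}%  head {head_cov:.2f}%  delta {delta:+.2f}%")
print(f"patch {patch_cov:.2f}% over {patch_lines} lines")
```

The project-level drop is small because the 25 uncovered lines are spread over roughly 15,800 total lines.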


@bouweandela bouweandela requested a review from schlunma April 24, 2026 16:20
Contributor

@schlunma schlunma left a comment


Thanks Bouwe, looks good! I prefer this approach over #3051. This will certainly be very helpful when using the ESMValCore Python API.

I am not a very big fan of adding a feature to the Dataset class but not using it when parsing recipes, but in this case it's much easier to separate this. Incorporating this cleanly into the existing code was by far the hardest problem I had in #3051.

Will look into this in more detail once the PR is ready for review! Thanks so much!

Comment thread esmvalcore/dataset.py

def __init__(self, **facets: FacetValue) -> None:
    self.facets: Facets = facets
    self._input_datasets: tuple[Dataset, ...] = ()
Contributor


Looks like this is not used anywhere

Comment thread esmvalcore/dataset.py
standard_name=var_info.standard_name, # type: ignore[union-attr]
)

def from_files(self) -> Iterator[Self]:
Contributor


Suggested change
- def from_files(self) -> Iterator[Self]:
+ def from_files(self) -> Generator[Self]:

I think a Generator is more accurate (every Generator is an Iterator, but not vice versa).
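
To illustrate the distinction, here is a small standalone sketch that is not taken from the PR. `count_up` is a hypothetical function, and the three-parameter `Generator[int, None, None]` spelling is used because it is accepted on older Python versions than the single-parameter form:

```python
from collections.abc import Generator, Iterator


def count_up(limit: int) -> Generator[int, None, None]:
    """Yield 0, 1, ..., limit - 1."""
    for i in range(limit):
        yield i


gen = count_up(3)
list_iter = iter([0, 1, 2])

# A generator object satisfies both abstract base classes.
gen_is_iterator = isinstance(gen, Iterator)
gen_is_generator = isinstance(gen, Generator)

# A plain list iterator is an Iterator, but it has no send/throw/close,
# so it is not a Generator.
list_iter_is_iterator = isinstance(list_iter, Iterator)
list_iter_is_generator = isinstance(list_iter, Generator)

values = list(gen)
```

Annotating with `Generator` tells callers they get the full generator protocol, while `Iterator` promises only `__iter__` and `__next__`.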

2 participants