[ENH] Reduce complexity of `run_flow_on_task` func by Omswastik-11 · Pull Request #1596 · openml/openml-python

Omswastik-11 · 2026-01-04T12:23:08Z

Summary

This PR refactors run_flow_on_task, which had grown to ~160 lines with high cyclomatic complexity, by extracting small helper functions with clear, single responsibilities. The main function is now a readable orchestrator with clearly defined steps.

Changes

Extracted helper functions

_validate_flow_and_task_inputs: handles input validation and backward-compatible argument handling
_sync_flow_with_server: synchronizes the flow with the server and checks for duplicate runs
_prepare_run_environment: prepares environment information and run tags
_create_run_from_results: builds the OpenMLRun object from execution results

Internal structure improvements

Introduced a _RunResults NamedTuple to bundle execution outputs (data_content, trace, evaluations) and reduce long parameter lists

Type Safety Improvements

Replaced assert statements with explicit ValueError / TypeError exceptions
Added type guards before accessing attributes that may be None
Made boolean parameters keyword-only where appropriate
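
As an illustration of the three points above, here is a minimal sketch. `check_task_id` and its `strict` flag are hypothetical names chosen for this example and are not taken from openml-python; the sketch only shows the general pattern of an explicit exception in place of an `assert`, a `None` guard before use, and a keyword-only boolean.

```python
from __future__ import annotations


def check_task_id(task_id: int | None, *, strict: bool = True) -> int:
    """Hypothetical validator; not part of openml-python."""
    # An `assert task_id is not None` would be skipped under `python -O`;
    # an explicit exception is always raised.
    if task_id is None:
        raise ValueError("task_id must not be None")
    # bool is a subclass of int, so it is rejected separately.
    if strict and (isinstance(task_id, bool) or not isinstance(task_id, int)):
        raise TypeError(f"task_id must be an int, got {type(task_id).__name__}")
    return task_id


# `strict` follows the bare `*`, so it can only be passed by keyword.
validated = check_task_id(3, strict=True)
```

Because `strict` is keyword-only, a call such as `check_task_id(3, True)` fails with a `TypeError` instead of silently binding an unlabeled boolean.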

Fixes #1580

codecov-commenter · 2026-01-06T12:43:32Z

Codecov Report

❌ Patch coverage is 9.37500% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.53%. Comparing base (e653ef6) to head (6c7a996).

Files with missing lines	Patch %	Lines
openml/runs/functions.py	9.37%	58 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1596      +/-   ##
==========================================
- Coverage   54.67%   54.53%   -0.15%     
==========================================
  Files          63       63              
  Lines        5108     5129      +21     
==========================================
+ Hits         2793     2797       +4     
- Misses       2315     2332      +17

geetu040

Looks really nice, I have left a few comments with only minor changes requested.

Omswastik-11 · 2026-01-13T09:25:29Z

Added `cast(int, task_id)` because of a mypy pre-commit error.
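
For context, a minimal sketch of the kind of situation where mypy needs such a cast. The `Task` class and both functions are hypothetical stand-ins, not the openml-python code; the assumption is that the `None` check lives in a separate helper, which mypy does not use for narrowing.

```python
from __future__ import annotations

from typing import cast


class Task:
    """Hypothetical stand-in for a task whose id may be unset."""

    def __init__(self, task_id: int | None = None) -> None:
        self.task_id = task_id


def ensure_has_id(task: Task) -> None:
    # The runtime check lives here, so mypy cannot narrow the type at the call site.
    if task.task_id is None:
        raise ValueError("task has no id")


def describe(task: Task) -> str:
    ensure_has_id(task)
    # cast() only informs the type checker and does nothing at runtime;
    # the check in ensure_has_id is what actually guards against None.
    task_id = cast(int, task.task_id)
    return f"task {task_id}"
```

Since `cast` performs no runtime check, it is only safe when a real guard has already run.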

geetu040

Nicely refactored, LGTM.

CC: @fkiraly, @SimonBlanke for review/merge.

SimonBlanke

@Omswastik-11 Do you see a way to increase the test coverage here? This is not a hard requirement.

geetu040

Actually, there is no unit test for the function openml.runs.run_flow_on_task. Could you please add one? There are some tests that use openml.runs.run_flow_on_task internally, but it would be nice to have an independent test that checks only this functionality. You can add this test in tests/test_runs.
Also, if the helper functions in openml.runs.run_flow_on_task can be unit-tested (suggested in #1596 (review)), that would be nice, but again, it's not a hard requirement.
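
A rough sketch of what an isolated helper test could look like, using `unittest.mock` so that no server is contacted. `sync_flow` is a hypothetical stand-in written for this example; the real `_sync_flow_with_server` may have a different signature and behavior.

```python
from unittest import mock


def sync_flow(flow, *, upload: bool):
    """Hypothetical stand-in: publish the flow if it has no id yet."""
    if upload and flow.flow_id is None:
        flow.publish()
    return flow.flow_id


def test_sync_flow_publishes_unpublished_flow():
    # Mock(flow_id=None) creates a mock whose flow_id attribute is None.
    flow = mock.Mock(flow_id=None)
    sync_flow(flow, upload=True)
    flow.publish.assert_called_once_with()


def test_sync_flow_skips_publish_when_disabled():
    flow = mock.Mock(flow_id=None)
    assert sync_flow(flow, upload=False) is None
    flow.publish.assert_not_called()
```

Tests written this way exercise a single helper at a time, which should also help with the patch coverage reported above.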

geetu040

LGTM!
@fkiraly please review/merge.

fkiraly

Nice! I left some recommendations on how to further simplify the code flow.

geetu040

Please resolve the conflicts with main.

Copilot

Pull request overview

This PR refactors the run_flow_on_task function to reduce complexity and improve maintainability. The function, which had grown to ~160 lines with high cyclomatic complexity, is now a clear orchestrator that delegates to well-defined helper functions.

Changes:

Extracted four helper functions with single responsibilities: input validation, server synchronization, environment preparation, and run creation
Improved type safety by replacing assert statements with explicit ValueError/TypeError exceptions
Made boolean parameters keyword-only in helper functions for better API clarity
Added unit tests for the new helper functions

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
openml/runs/functions.py	Refactored `run_flow_on_task` by extracting helper functions: `_validate_flow_and_task_inputs`, `_sync_flow_with_server`, `_prepare_run_environment`, and `_create_run_from_results`
tests/test_runs/test_run_functions.py	Added unit tests for `_sync_flow_with_server` and `_create_run_from_results` helper functions, and imported `OrderedDict` for test data

Copilot · 2026-02-25T11:01:10Z

+    data_content, trace, fold_evaluations, sample_evaluations = _run_task_get_arffcontent(
        model=flow.model,
        task=task,
        extension=flow.extension,
        add_local_measures=add_local_measures,
        n_jobs=n_jobs,
    )


The PR description mentions introducing a _RunResults NamedTuple to bundle execution outputs and reduce long parameter lists, but this NamedTuple is not present in the actual implementation. The function _run_task_get_arffcontent still returns a tuple that is unpacked directly in line 486. If the NamedTuple was intended but not implemented, consider either updating the PR description to match the implementation or implementing the NamedTuple as described.

@Omswastik-11 update the PR description to remove this part.
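
For reference, a sketch of what a NamedTuple of the kind described could look like. The class name `RunResults`, the field types, and the stub function are assumptions made for illustration; only the four field names come from the unpacking in the snippet above.

```python
from __future__ import annotations

from typing import Any, NamedTuple


class RunResults(NamedTuple):
    """Hypothetical bundle of execution outputs; not in the merged code."""

    data_content: list[list[Any]]
    trace: Any | None
    fold_evaluations: dict[str, Any]
    sample_evaluations: dict[str, Any]


def run_task_stub() -> RunResults:
    # Stand-in for the real execution step; returns fixed placeholder values.
    return RunResults(
        data_content=[[0, 0, 1]],
        trace=None,
        fold_evaluations={},
        sample_evaluations={},
    )


results = run_task_stub()
# A NamedTuple is still a tuple, so existing positional unpacking keeps working.
data_content, trace, fold_evaluations, sample_evaluations = results
```

Callers could then pass `results` as a single argument and read `results.trace` by name, which is the "reduce long parameter lists" benefit the description mentions.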

geetu040

@fkiraly, can we merge this now? comments from your last review #1596 (review) are resolved.

Omswastik-11 added 2 commits January 4, 2026 17:51

reduce complexity of un_on_flow func

35b0977

Merge branch 'main' into issue-1580

5a4a089

Omswastik-11 marked this pull request as ready for review January 5, 2026 14:14

Merge branch 'main' into pr/1596

1a006fb

geetu040 suggested changes Jan 12, 2026

Comment thread openml/runs/functions.py Outdated

Comment thread openml/runs/functions.py Outdated

Comment thread openml/runs/functions.py

Comment thread openml/runs/functions.py Outdated

Comment thread openml/runs/functions.py Outdated

Comment thread openml/runs/functions.py

Omswastik-11 added 3 commits January 13, 2026 12:30

refactor the helping functions for un_on_flow func

93aa877

Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>

Merge branch 'issue-1580' of https://github.com/Omswastik-11/openml-p…

331b4be

…ython into issue-1580

remove redudandent checkings

6771fb4

Omswastik-11 requested a review from geetu040 January 13, 2026 09:23

geetu040 approved these changes Jan 13, 2026

Omswastik-11 and others added 4 commits January 14, 2026 16:44

Merge branch 'main' into issue-1580

22f52a8

[pre-commit.ci] auto fixes from pre-commit.com hooks

04a6e0f

for more information, see https://pre-commit.ci

Merge branch 'main' into issue-1580

d3460d0

Merge branch 'main' into issue-1580

1ec0302

SimonBlanke suggested changes Jan 17, 2026

Comment thread openml/runs/functions.py Outdated

geetu040 assigned Omswastik-11 Jan 19, 2026

Omswastik-11 added 2 commits January 26, 2026 15:42

Merge branch 'main' into issue-1580

df9a36a

Merge branch 'main' into issue-1580

cefae00

geetu040 suggested changes Feb 4, 2026

Omswastik-11 requested a review from geetu040 February 12, 2026 11:22

Omswastik-11 added 2 commits February 12, 2026 17:02

added the tests

8d05331

correct the tests

5a683da

geetu040 approved these changes Feb 12, 2026
