[ENH] Reduce complexity of run_flow_on_task func #1596

Open
Omswastik-11 wants to merge 22 commits into openml:main from Omswastik-11:issue-1580

Conversation

Contributor

@Omswastik-11 Omswastik-11 commented Jan 4, 2026

Summary

This PR refactors run_flow_on_task, which had grown to ~160 lines with high cyclomatic complexity, by extracting small helper functions with clear, single responsibilities. The main function is now a readable orchestrator with clearly defined steps.


Changes

Extracted helper functions

  • _validate_flow_and_task_inputs
    Handles input validation and backward-compatible argument handling

  • _sync_flow_with_server
    Synchronizes the flow with the server and checks for duplicate runs

  • _prepare_run_environment
    Prepares environment information and run tags

  • _create_run_from_results
    Builds the OpenMLRun object from execution results
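
To illustrate the orchestrator shape described above, here is a minimal sketch. Every function name, signature, and body below is a hypothetical stand-in that only mirrors the four responsibilities listed; none of it is taken from openml/runs/functions.py.

```python
from typing import Any, Dict, List, Optional


def validate_inputs(flow: Optional[Dict[str, Any]], task: Optional[Dict[str, Any]]) -> None:
    """Step 1 (hypothetical): fail fast on unusable arguments."""
    if flow is None or task is None:
        raise ValueError("flow and task must both be provided")


def sync_flow(flow: Dict[str, Any]) -> int:
    """Step 2 (hypothetical): stand-in for the server round trip."""
    # A real implementation would query the server; here an id is simply assigned.
    return flow.setdefault("flow_id", 1)


def prepare_environment(tags: Optional[List[str]]) -> List[str]:
    """Step 3 (hypothetical): normalise the run tags."""
    return sorted(set(tags or []) | {"sketch"})


def create_run(
    flow_id: int, task: Dict[str, Any], tags: List[str], predictions: List[int]
) -> Dict[str, Any]:
    """Step 4 (hypothetical): bundle everything into a run-like record."""
    return {
        "flow_id": flow_id,
        "task_id": task["task_id"],
        "tags": tags,
        "predictions": predictions,
    }


def run_flow_on_task_sketch(flow, task, *, tags=None):
    """Orchestrator: each line is one clearly named step."""
    validate_inputs(flow, task)
    flow_id = sync_flow(flow)
    run_tags = prepare_environment(tags)
    predictions = [0, 1, 1]  # placeholder for executing the model on the task
    return create_run(flow_id, task, run_tags, predictions)
```

The point of the layout is that each step can be read, and tested, without the others.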

Internal structure improvements

  • Introduced a _RunResults NamedTuple to bundle execution outputs
    (data_content, trace, evaluations) and reduce long parameter lists
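
As a rough sketch of what such a bundle could look like: the three field names come from the description above, while the class name, the field types, and the `summarize` helper are assumptions made for illustration only.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class RunResultsSketch(NamedTuple):
    """Hypothetical bundle of execution outputs; field types are assumed."""

    data_content: List[List[Any]]
    trace: Optional[Any]
    evaluations: Dict[str, float]


def summarize(results: RunResultsSketch) -> str:
    """One parameter instead of three separate ones."""
    has_trace = "yes" if results.trace is not None else "no"
    return f"{len(results.data_content)} rows, trace={has_trace}"


results = RunResultsSketch(
    data_content=[[0, 0, 1]], trace=None, evaluations={"accuracy": 1.0}
)

# A NamedTuple is still a tuple, so existing positional unpacking keeps working.
data_content, trace, evaluations = results
```

Because a NamedTuple unpacks like a plain tuple, callers that destructure the return value would not need to change.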

Type Safety Improvements

  • Replaced assert statements with explicit ValueError / TypeError exceptions
  • Added type guards before accessing attributes that may be None
  • Made boolean parameters keyword-only where appropriate
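
A minimal sketch of these three patterns together, assuming a hypothetical `check_task_id` helper that is not part of the PR. One motivation for dropping assert is that assert statements are skipped when Python runs with the -O flag, whereas explicit exceptions always fire.

```python
from typing import Optional


def check_task_id(task_id: Optional[int], *, allow_missing: bool = False) -> Optional[int]:
    """Hypothetical validator: explicit exceptions instead of assert.

    `allow_missing` is keyword-only (note the bare `*`), so a call site
    must spell out what the boolean means.
    """
    # Type guard: handle None before using the value as an int.
    if task_id is None:
        if allow_missing:
            return None
        raise ValueError("task_id must not be None")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(task_id, bool) or not isinstance(task_id, int):
        raise TypeError(f"task_id must be an int, got {type(task_id).__name__}")
    return task_id
```

With this signature, `check_task_id(None, allow_missing=True)` reads unambiguously, while `check_task_id(None, True)` is rejected by Python itself.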

Fixes #1580

@Omswastik-11 Omswastik-11 marked this pull request as ready for review January 5, 2026 14:14

codecov-commenter commented Jan 6, 2026

Codecov Report

❌ Patch coverage is 9.37500% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.53%. Comparing base (e653ef6) to head (6c7a996).

Files with missing lines | Patch % | Lines
--- | --- | ---
openml/runs/functions.py | 9.37% | 58 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1596      +/-   ##
==========================================
- Coverage   54.67%   54.53%   -0.15%     
==========================================
  Files          63       63              
  Lines        5108     5129      +21     
==========================================
+ Hits         2793     2797       +4     
- Misses       2315     2332      +17     
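
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below assumes that patch coverage is defined as covered patch lines divided by all tracked patch lines; how Codecov rounds or truncates the displayed percentages is not verified here.

```python
# Hits and line counts copied from the coverage diff above.
base_hits, base_lines = 2793, 5108
head_hits, head_lines = 2797, 5129

base_pct = 100 * base_hits / base_lines  # reported as 54.67%
head_pct = 100 * head_hits / head_lines  # reported as 54.53%
delta_pct = head_pct - base_pct          # reported as -0.15%

# Patch coverage: 58 missing lines at 9.375% coverage.
patch_missing = 58
patch_pct = 9.375
patch_lines = round(patch_missing / (1 - patch_pct / 100))
patch_hits = patch_lines - patch_missing

print(f"base {base_pct:.3f}%  head {head_pct:.3f}%  delta {delta_pct:+.3f}%")
print(f"patch: {patch_hits} of {patch_lines} tracked lines covered")
```

Under that assumption the patch contains 64 tracked lines, of which 6 are covered.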


Collaborator

@geetu040 geetu040 left a comment

Looks really nice. I have left a few comments requesting only minor changes.

Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py
Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py
@Omswastik-11 Omswastik-11 requested a review from geetu040 January 13, 2026 09:23
Contributor Author

Added cast(int, task_id) because of a mypy pre-commit error.
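
For context, a small sketch of the kind of situation where typing.cast is typically needed. The `TaskSketch` class and both helper functions are hypothetical and are not the code in this PR; the assumption is that the None check happens in a separate function, where mypy cannot use it to narrow Optional[int] to int.

```python
from typing import Optional, cast


class TaskSketch:
    """Hypothetical task whose id is only known after publishing."""

    def __init__(self, task_id: Optional[int]) -> None:
        self.task_id = task_id


def ensure_has_id(task: TaskSketch) -> None:
    """Runtime validation that lives in a separate helper."""
    if task.task_id is None:
        raise ValueError("task has no task_id")


def describe_task(task: TaskSketch) -> str:
    ensure_has_id(task)
    # mypy does not know that ensure_has_id ruled out None, so the attribute
    # is still Optional[int] here. cast() does nothing at runtime; it only
    # tells the type checker to treat the value as int.
    task_id = cast(int, task.task_id)
    return f"task {task_id + 0}"
```

Since cast performs no runtime check, it is only safe when a real validation, like the one in `ensure_has_id`, has already run.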

Collaborator

@geetu040 geetu040 left a comment

Nicely refactored, LGTM.

CC: @fkiraly, @SimonBlanke for review/merge.

Collaborator

@SimonBlanke SimonBlanke left a comment

@Omswastik-11 Do you see a way to increase the test coverage here? This is not a hard requirement.

Comment thread openml/runs/functions.py Outdated
Collaborator

@geetu040 geetu040 left a comment

Actually, there is no unit test for openml.runs.run_flow_on_task itself; could you please add one? Some tests call openml.runs.run_flow_on_task internally, but it would be good to have an independent test that checks only this function. You can add it in tests/test_runs.
It would also be nice if the helper functions used by openml.runs.run_flow_on_task could be unit-tested (as suggested in #1596 (review)), but again, this is not a hard requirement.
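
One way such an isolated test could look, sketched with unittest.mock. The `sync_flow_sketch` helper and its `lookup` parameter are hypothetical and stand in for a function that would otherwise contact the server; the real helpers in this PR may have different signatures.

```python
from typing import Any, Callable, Dict
from unittest import mock


def sync_flow_sketch(flow: Dict[str, Any], *, lookup: Callable[[str], int]) -> int:
    """Hypothetical helper: resolve the server id for a flow.

    `lookup` stands in for the server call, so tests can replace it.
    """
    server_id = lookup(flow["name"])
    local_id = flow.get("flow_id")
    if local_id is not None and local_id != server_id:
        raise ValueError(f"local flow_id {local_id} != server flow_id {server_id}")
    return server_id


def test_sync_returns_server_id() -> None:
    lookup = mock.Mock(return_value=42)
    assert sync_flow_sketch({"name": "f", "flow_id": None}, lookup=lookup) == 42
    # The fake server was queried exactly once, with the flow name.
    lookup.assert_called_once_with("f")


def test_sync_rejects_mismatched_id() -> None:
    lookup = mock.Mock(return_value=42)
    try:
        sync_flow_sketch({"name": "f", "flow_id": 7}, lookup=lookup)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for mismatched ids")
```

Injecting the server call as a parameter is only one option; patching the real call with mock.patch would test the same behaviour without changing the signature.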

Collaborator

@geetu040 geetu040 left a comment

LGTM!
@fkiraly please review/merge.

Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py
Comment thread openml/runs/functions.py Outdated
Comment thread openml/runs/functions.py
Collaborator

@fkiraly fkiraly left a comment

Nice! I left some recommendations on how to further simplify the code flow.

@Omswastik-11 Omswastik-11 requested a review from fkiraly February 17, 2026 08:53
Collaborator

@geetu040 geetu040 left a comment

Please resolve the conflicts with main.

Copilot AI review requested due to automatic review settings February 25, 2026 10:57
Copilot AI left a comment

Pull request overview

This PR refactors the run_flow_on_task function to reduce complexity and improve maintainability. The function, which had grown to ~160 lines with high cyclomatic complexity, is now a clear orchestrator that delegates to well-defined helper functions.

Changes:

  • Extracted four helper functions with single responsibilities: input validation, server synchronization, environment preparation, and run creation
  • Improved type safety by replacing assert statements with explicit ValueError/TypeError exceptions
  • Made boolean parameters keyword-only in helper functions for better API clarity
  • Added unit tests for the new helper functions

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
--- | ---
openml/runs/functions.py | Refactored run_flow_on_task by extracting helper functions: _validate_flow_and_task_inputs, _sync_flow_with_server, _prepare_run_environment, and _create_run_from_results
tests/test_runs/test_run_functions.py | Added unit tests for _sync_flow_with_server and _create_run_from_results helper functions, and imported OrderedDict for test data

Comment thread openml/runs/functions.py
Comment on lines +486 to 492
data_content, trace, fold_evaluations, sample_evaluations = _run_task_get_arffcontent(
    model=flow.model,
    task=task,
    extension=flow.extension,
    add_local_measures=add_local_measures,
    n_jobs=n_jobs,
)
Copilot AI Feb 25, 2026

The PR description mentions introducing a _RunResults NamedTuple to bundle execution outputs and reduce long parameter lists, but this NamedTuple is not present in the actual implementation. The function _run_task_get_arffcontent still returns a tuple that is unpacked directly in line 486. If the NamedTuple was intended but not implemented, consider either updating the PR description to match the implementation or implementing the NamedTuple as described.

Collaborator

@Omswastik-11 please update the PR description to remove this part.

Collaborator

@geetu040 geetu040 left a comment

@fkiraly, can we merge this now? The comments from your last review, #1596 (review), are resolved.

Copilot AI review requested due to automatic review settings April 26, 2026 15:49

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Development

Successfully merging this pull request may close these issues.

[MNT] Refactor run_flow_on_task-function

6 participants