Fix: exit non-zero when all slides fail in aggregate_slide_features_batch#132
Closed
raylim wants to merge 1 commit into
Conversation
…atch

Previously, when TITAN slide aggregation failed for every slide in a batch (e.g. CUDA OOM), extract_features would exit 0 with no output files. Nextflow then showed confusing 'exit:0 Failed Tasks' and spuriously retried the tasks using the cluster-profile retry policy.

Fix (two locations):
1. feature_extract.py aggregate_slide_features_batch: raise RuntimeError when len(failed_slides) == num_slides, for both model and non-model paths.
2. extract_features.py _main_batch: defense-in-depth check after aggregate_slide_features_batch; raise if all output .pt files are missing.

With errorStrategy='ignore' in nextflow.config, exit:1 is properly ignored; WDS coverage check resets the slides to PENDING for retry in the next batch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
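The first fix location can be illustrated with a minimal sketch. This is not the code from feature_extract.py: the function name aggregate_all_or_raise, the aggregate_one callback, and the empty-batch guard are assumptions made for illustration. Only the condition len(failed_slides) == num_slides and the RuntimeError come from the commit message.

```python
from typing import Callable, Dict, List, Sequence


def aggregate_all_or_raise(
    slide_ids: Sequence[str],
    aggregate_one: Callable[[str], object],  # hypothetical per-slide aggregator
) -> Dict[str, object]:
    """Aggregate each slide; raise RuntimeError if every slide in the batch failed."""
    results: Dict[str, object] = {}
    failed_slides: List[str] = []

    for slide_id in slide_ids:
        try:
            results[slide_id] = aggregate_one(slide_id)
        except Exception as exc:  # e.g. a CUDA OOM surfaced as an exception
            failed_slides.append(slide_id)
            print(f"aggregation failed for {slide_id}: {exc}")

    num_slides = len(slide_ids)
    # Assumption: an empty batch is not treated as a total failure.
    if num_slides > 0 and len(failed_slides) == num_slides:
        raise RuntimeError(
            f"all {num_slides} slides failed aggregation: {failed_slides}"
        )
    return results
```

If the RuntimeError is left uncaught, the Python interpreter exits with status 1, which is what lets the workflow engine see a failed task instead of exit 0. A partial failure still returns the results for the slides that succeeded.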
Duplicate of #131, which has been merged.
Summary
When TITAN slide aggregation fails for every slide in a batch (e.g. CUDA OOM),
extract_features was silently exiting 0 with no output files. Nextflow showed confusing exit:0 Failed Tasks and spuriously retried tasks.
Changes
1. feature_extract.py aggregate_slide_features_batch: raise RuntimeError when all slides fail (both model and non-model paths).
2. extract_features.py _main_batch (defense-in-depth): raise if all output .pt files are missing after aggregation.
With errorStrategy='ignore' in nextflow.config, exit:1 is cleanly ignored; WDS coverage check resets slides to PENDING.
Testing
pytest tests/mussel/utils/test_feature_extract.py tests/mussel/cli/test_extract_features.py
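For reference, the defense-in-depth check listed under Changes might look roughly like the sketch below. The helper name check_outputs_exist and its return value are hypothetical and not taken from extract_features.py. It only illustrates the idea of raising when none of the expected .pt outputs exist after aggregation.

```python
from pathlib import Path
from typing import Iterable, List


def check_outputs_exist(output_paths: Iterable[Path]) -> List[Path]:
    """Return the missing output paths; raise if every expected output is missing."""
    paths = list(output_paths)
    missing = [p for p in paths if not p.is_file()]

    # Only a total failure is an error; partial output is reported, not raised.
    if paths and len(missing) == len(paths):
        raise RuntimeError(
            f"no output files were written; expected {len(paths)} .pt files"
        )
    return missing
```

Returning the missing paths, instead of raising on any gap, keeps partial batches usable while still turning the all-missing case into a non-zero exit.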