Restrict closed/open training workloads to {unet3d, retinanet} (depends on #432) by FileSystemGuy · Pull Request #439 · mlcommons/storage

FileSystemGuy · 2026-06-13T04:19:50Z

Summary

Update the official training-workload set for closed/open submission from {unet3d, resnet50, cosmoflow} to {unet3d, retinanet} (Rules.md 2.1.11). Propagate the new set through the runtime validator, submission validator, dataset constants, error suggestions, result directory examples, and unit tests.

The whatif exploration mode is unchanged and still accepts the full superset: unet3d, retinanet, cosmoflow, resnet50, dlrm, flux.

Dependency

This PR depends on #432 and is stacked on top of FileSystemGuy-rules-validator. The base will be retargeted to main once #432 merges. If you review before then, please review only the diff specific to this PR (the GitHub UI shows this automatically while #432 is open).

Marked draft until #432 is merged.

Changes

Rules.md — 2.1.11 text rule + Closed/Open result tree examples reduced to the two-workload set.
mlpstorage_py/submission_checker/
- constants.py: NUM_DATASET_{TRAIN,EVAL}_{FILES,FOLDERS} keyed by {unet3d, retinanet}.
- checks/submission_structure_checks.py: _VALID_TRAINING_WORKLOADS, STRUCT-11 docstring, violation message.
- checks/training_checks.py (3.3.1): updated comments / skip-diagnostic message.
- configuration/configuration.py: updated comment example.
mlpstorage_py/rules/submission_checkers/training.py: supported_models = MODELS_CLOSED (was MODELS, the whatif superset).
mlpstorage_py/validation_helpers.py: --model suggestion now mentions (unet3d or retinanet).
mlpstorage_py/reporting/directory_validator.py: example workload list.
mlpstorage_py/rules_legacy.py: docstring example.
tests/unit/test_rules_checkers.py: updated TrainingSubmissionRulesChecker.supported_models test to assert the new closed/open set and explicitly reject resnet50/cosmoflow.
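
The supported_models change and its updated unit test can be pictured with the minimal sketch below. It is a stand-in, not the repository's code: the class body, the use of plain lists, and the test function name are assumptions. Only the names MODELS, MODELS_CLOSED, supported_models, and TrainingSubmissionRulesChecker come from this PR.

```python
# Illustrative stand-ins; the real constants live in mlpstorage_py and may
# be defined or typed differently.
MODELS = ["cosmoflow", "resnet50", "unet3d", "dlrm", "retinanet", "flux"]  # whatif superset
MODELS_CLOSED = ["unet3d", "retinanet"]  # closed/open set per Rules.md 2.1.11


class TrainingSubmissionRulesChecker:
    """Hypothetical skeleton: only the attribute touched by this PR is shown."""

    supported_models = MODELS_CLOSED  # was MODELS, the whatif superset


def test_supported_models_is_closed_open_set():
    # Hypothetical test name; mirrors the assertion described in the PR.
    supported = set(TrainingSubmissionRulesChecker.supported_models)
    assert supported == {"unet3d", "retinanet"}
    # The workloads dropped from closed/open must be rejected explicitly.
    for removed in ("resnet50", "cosmoflow"):
        assert removed not in supported


test_supported_models_is_closed_open_set()
```

Asserting on the set rather than the list keeps the test independent of ordering, which this PR does not specify.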

Out of scope (intentionally unchanged)

MODELS = [cosmoflow, resnet50, unet3d, dlrm, retinanet, flux] constant — still the full whatif superset.
whatif CLI choices and the whatif test parameterizations.
training/README.md and the top-level README.md — already reflect the new set.
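
As a rough illustration of how the two sets coexist, the sketch below selects a workload list by mode and builds a suggestion message. The helper names allowed_models and check_model, the mode strings, and the message wording are all hypothetical; only MODELS, MODELS_CLOSED, and the "(unet3d or retinanet)" suggestion appear in this PR.

```python
from typing import List, Optional

# Stand-in constants; values taken from the PR description.
MODELS = ["cosmoflow", "resnet50", "unet3d", "dlrm", "retinanet", "flux"]
MODELS_CLOSED = ["unet3d", "retinanet"]


def allowed_models(mode: str) -> List[str]:
    """Hypothetical helper: return the workload set valid for a run mode."""
    if mode in ("closed", "open"):
        return MODELS_CLOSED
    if mode == "whatif":
        return MODELS  # exploration mode keeps the full superset
    raise ValueError(f"unknown mode: {mode!r}")


def check_model(mode: str, model: str) -> Optional[str]:
    """Return None if the model is allowed, else a suggestion string.

    The wording is an assumption, not the project's actual message.
    """
    allowed = allowed_models(mode)
    if model in allowed:
        return None
    options = " or ".join(allowed)
    return f"'{model}' is not valid for {mode} training; use --model ({options})"


print(check_model("closed", "resnet50"))
print(check_model("whatif", "resnet50"))
```

Keeping MODELS untouched and narrowing only the closed/open lookup matches the scoping described above: whatif behavior does not change.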

Test plan

pytest tests/unit against changed-area suites: 257 passed (rules_checkers, config, validation_helpers, cli_parser, parser_modes, help_behavior).
Full unit-test pass (skipping modules requiring pyarrow/numpy not installed locally): 1172 passed, 2 skipped, 1 unrelated pre-existing failure (test_version_lookup_uses_correct_distribution_name — needs pip install -e .).
CI to re-run on full env.
Manual: run mlpstorage closed training --help → shows unet3d | retinanet; mlpstorage whatif training --help → still shows all six.

Rules.md 2.1.11 now enumerates only "unet3d" and "retinanet" as the official training workloads for closed/open submission. Propagate the new set through the runtime validator, submission validator, dataset constants, error suggestions, and result directory examples. The whatif mode keeps the full six-model exploration set unchanged (unet3d, retinanet, cosmoflow, resnet50, dlrm, flux).

- Rules.md: text rule + Closed/Open result tree examples
- submission_checker:
  - constants: NUM_DATASET_{TRAIN,EVAL}_{FILES,FOLDERS} keyed by {unet3d, retinanet}
  - submission_structure_checks STRUCT-11: valid workload set, docstring, violation message
  - training_checks 3.3.1: comments / skip-diagnostic message
  - configuration: comment example
- rules/submission_checkers/training: supported_models now MODELS_CLOSED instead of full MODELS superset
- validation_helpers: --model suggestion text
- reporting/directory_validator: example list
- rules_legacy: docstring example
- tests: update TrainingSubmissionRulesChecker supported_models test to assert the new closed/open set

github-actions · 2026-06-13T04:19:59Z

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

The illustrative cross-workload example referred to "ResNet-50 training task" → "3D-Unet training task". ResNet-50 is no longer in the closed/open training workload set (Rules.md 2.1.11, {unet3d, retinanet}). Replace with RetinaNet → 3D-Unet so the example uses current workloads.

FileSystemGuy changed the title ~~Restrict closed/open training workloads to {unet3d, retinanet}~~ Restrict closed/open training workloads to {unet3d, retinanet} (depends on #432) Jun 13, 2026

FileSystemGuy wants to merge 2 commits into FileSystemGuy-rules-validator from FileSystemGuy-training-to-new-models
