refactor(workflows/spatialxe): extract params.* to take: inputs#150
Open
an-altosian wants to merge 10 commits intonf-core:devfrom
Open
refactor(workflows/spatialxe): extract params.* to take: inputs#150an-altosian wants to merge 10 commits intonf-core:devfrom
an-altosian wants to merge 10 commits intonf-core:devfrom
Conversation
Add inline comment documenting the semantics of each exit-code range in the errorStrategy retry list. Notably, 2147483647 is Integer.MAX_VALUE — Nextflow's sentinel for tasks that died before writing .exitcode (e.g. AWS Batch spot reclamation, kubernetes preemption, grid-scheduler cancellations) — not an AWS-specific exit code. Cite Nextflow docs/aws.md for the AWS case. Addresses PR nf-core#139 review comment r3165322845.
…m() for STARDIST ext.args
Replaces the Groovy-fallthrough `findAll()` (no closure) with the documented
stdlib `join` + `trim` chain. Internal double-spaces from empty-string entries
get collapsed by bash word-splitting at `${args}` interpolation; `.trim()`
removes leading/trailing whitespace.
Matches the existing project idiom used in
subworkflows/local/utils_nfcore_spatialxe_pipeline/main.nf:341,352
(toolCitationText / toolBibliographyText).
Addresses PR nf-core#139 review comment r3165345526.
This Dockerfile was unreferenced by any module: a repo-wide grep for "modules/local/utility/Dockerfile" returns zero hits. Each utility submodule (convert_mask_uint32, extract_dapi, resize_tif, segger2xr, etc.) declares its own container directly via `container "..."`. The associated `conda.yml` it expected (per the COPY instruction) does not exist next to it. Leftover from an earlier development phase; no longer needed. Addresses PR nf-core#139 review comment r3165438419.
…hema_input.json Add nf-schema `format` keys so the validator confirms paths exist: - bundle: format: directory-path - image: format: file-path These validate path existence at runtime, not name conventions — so the team's concern about varying 10x bundle/image naming doesn't conflict. The format key is documented at https://nextflow-io.github.io/nf-schema/latest/nextflow_schema/nextflow_schema_specification/ Addresses PR nf-core#139 review comments r3165291219 (C3) and r3165292878 (C4).
…ll via ext.args
Removes the inline `params.baysor_tiling_min_transcripts_per_cell` reference
from the stitch process script. Instead:
- The XENIUM_PATCH_STITCH withName: block in conf/modules.config sets the
flag via ext.args
- The .nf script declares `def args = task.ext.args ?: ''` and interpolates
`${args}` into the command
Per Florian's narrowing directive, only the cited param at line 40 is fixed.
Other params references in the file (if any) remain as-is for follow-up work.
Addresses PR nf-core#139 review comment r3165442902.
Replaces the three inline `python3 - <<'PYEOF'` heredoc blocks (parquet
column statistics, tile-split workaround, NaN bd.x fix) plus the bundle
symlink shell loop with a single module binary at
resources/usr/bin/run_create_dataset.py.
The .nf script block shrinks from ~190 lines to ~15. The container
(quay.io/dongzehe/segger:1.0.14) stays general — pipeline-specific
workarounds for upstream segger bugs now live in the module binary
where they can be reviewed, tested, and removed when fixed upstream.
Each WORKAROUND function in the binary has a section comment naming
the upstream bug and removal condition.
Pattern matches existing baysor reference modules
(modules/local/baysor/{create_dataset,preprocess}/) which use
resources/usr/bin/ scripts called without ${moduleDir}/ prefix.
Addresses PR nf-core#139 review comment r3165415041 (segger create_dataset only;
predict refactor in next commit; train kept as-is per single-line debug
P2 carve-out).
Replaces inline GPU enumeration (`python3 -c` for torch.cuda.device_count), predict_parquet.py path resolver, and two `sed -i` runtime patches against installed segger source with a single module binary at resources/usr/bin/run_predict.py. The .nf script block shrinks from ~50 lines to ~15. The two sed patches (torch.no_grad() for VRAM savings, deterministic GPU seed) move into a named `patch_predict_parquet()` function with a clear "remove once upstreamed to segger" comment. Container (quay.io/dongzehe/segger:1.0.14) stays general — patches live in the pipeline. Addresses PR nf-core#139 review comment r3165415041 (segger predict).
…: inputs Per the project rule that params.* should only be read in the entry workflow (main.nf), the SPATIALXE named workflow now declares formal `take:` inputs for all 23 unique params.* references that were inline in the workflow body. The 23 unique params (50 total reference sites in the workflow body): baysor_prior, baysor_tiling, buffer_samples, buffer_size, cellpose_model, features, gene_panel, gene_synonyms, method, mode, multiqc_config, multiqc_logo, multiqc_methods_description, offtarget_probe_tracking, outdir, probes_fasta, qupath_polygons, reference_annotations, relabel_genes, run_qc, segmentation_mask, tiling, xeniumranger_only Addresses PR nf-core#139 review comment r3165543921 thoroughly — full SPATIALXE-level sweep instead of just the cited buffer_samples / buffer_size, matching the project preference for clean workflow interfaces. Subworkflow-level params.* extractions (~82 references across local subworkflows) are out of scope and tracked separately.
… with tar.gz test data The format: directory-path / file-path validators added in d59fac3 break CI because: - bundle field accepts EITHER a directory path (real Xenium bundle) OR a .tar.gz archive URL (test data, extracted at runtime by UNTAR). Neither 'directory-path' nor 'file-path' fits both, and both fail against remote URLs. - image field is similar — can be a local OME-TIFF path, a URL, or empty. The reviewer's suggestion in r3165291219 / r3165292878 was reasonable in isolation but conflicts with this pipeline's dual-purpose input semantics where the same field can be a path or a URL or an archive. Reverts the format keys; keeps the type / pattern / errorMessage entries.
This was referenced Apr 30, 2026
…dict The heredocs were inadvertently introduced when extracting the inline Python to module binaries. Project convention (CLAUDE.md "Version Reporting") is topic-channel only — no `versions.yml` files. Topic channels (versions_segger, versions_python, etc.) emit the actual versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked going stale. No other modules/local/* writes a versions.yml file. Aligns the segger modules with the rest of the pipeline.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Addresses PR #139 review comment r3165543921 — "params used outside the main entry workflow ... it would be better if these were extracted to workflow inputs and passed in at the entry workflow stage".
The
SPATIALXEnamed workflow now declares formaltake:inputs for all 23 uniqueparams.*references (50 reference sites) that were inline in the workflow body.params.*is now read only in the entry workflow (main.nf).The 23 unique params extracted (alphabetical):
Why a full sweep instead of just the cited line
The reviewer's comment cited line 124 (
params.buffer_samples,params.buffer_size) but framed it as a general principle — "these were extracted to workflow inputs" (plural). Florian's narrowing directive ("stick to the params that are mentioned in the comments") applied to the release PR (#148) scope only. Since this PR is a dedicated refactor PR with no review-process pressure, the full sweep is in scope and produces a clean architectural boundary at the SPATIALXE workflow.Out of scope here (~82
params.*references in local subworkflows) — tracked separately as future cleanup.Notes for reviewers
main.nfmatches thetake:block ordering exactly (alphabetical). Verify with a side-by-side read.nextflow config -profile testparses without errors after the change.devwith PR Address PR #139 review: segger refactor + 5 targeted fixes #148's commits in their base.Test plan
nextflow config -profile testpassesgrep -oE 'params\.[a-zA-Z_]+' workflows/spatialxe.nfreturns zero results