fix(workflows): auto-wrap bare workflow definitions in {"specification": ...}#460
Open
imbgar-roboflow wants to merge 2 commits intoroboflow:mainfrom
Open
fix(workflows): auto-wrap bare workflow definitions in {"specification": ...}#460imbgar-roboflow wants to merge 2 commits intoroboflow:mainfrom
imbgar-roboflow wants to merge 2 commits intoroboflow:mainfrom
Conversation
…n": ...}
`roboflow workflow create --definition <file>` and
`Workspace.create_workflow(name, definition)` previously dumped the user's
JSON verbatim into the backend's `config` field. That breaks execution via
the Roboflow Inference server, which parses `config` and then does:
workflow_config = json.loads(response["workflow"]["config"])
specification = workflow_config["specification"]
# -> KeyError -> MalformedWorkflowResponseError -> HTTP 502
User-facing Workflows JSON — as published at
https://inference.roboflow.com/workflows/create_and_run/, in
`inference/development/workflows_examples/*`, and in the in-repo
integration tests (e.g. LEGACY_DETECTION_PLUS_CLASSIFICATION_WORKFLOW) —
is the flat shape {"version", "inputs", "steps", "outputs"}. The web app
silently wraps it in {"specification": ...} before POST so that the
inference server can unwrap it. Nothing in the SDK/CLI did the same,
so every CLI user who pasted the documented shape and then invoked
`/infer/workflows/...` hit the 502 on first execution.
Reproduced in EU staging (roboflow-eu-staging) against a freshly trained
yolov11n model: CLI-created workflow returned
`{"message": "Internal error. Request to Roboflow API failed."}` at the
inference endpoint; inference-server logs showed
`MalformedWorkflowResponseError: Workflow specification not found in
Roboflow API response` at roboflow_api.py:749. Re-saving the same
definition wrapped in {"specification": ...} made execution work end-to-
end (15/15 test-set images, p50 ~1.2s).
Fix is limited to the SDK write path and is backward compatible:
- New helper `_normalize_workflow_config(config)` on `rfapi` wraps a
bare workflow spec (dict or JSON string) in {"specification": ...}.
Dicts already containing a top-level `specification` key are passed
through unchanged; dicts without any of version/inputs/steps/outputs
at the top level are left alone so callers sending custom payloads
aren't second-guessed; non-JSON strings pass through as-is; None
continues to map to "{}" (preserving the "create empty workflow"
default behaviour exercised by test_defaults_config_and_template).
- `create_workflow` and `update_workflow` in `roboflow.adapters.rfapi`
now call the helper on `config`.
New tests cover bare-dict wrap, bare-JSON-string wrap, already-wrapped
no-op, non-workflow-dict passthrough, and the empty/None/non-JSON
edge cases in `_normalize_workflow_config` directly. Existing tests
pass unchanged — including the byte-for-byte string passthrough for
non-workflow JSON.
Addresses reviewer feedback on the workflow-spec auto-wrap:
1. Strip a leading UTF-8 BOM (``\ufeff``) before ``json.loads`` so
configs saved from Windows editors are recognized as bare workflow
specs and get wrapped. Previously they fell through the JSON-parse
``except`` branch and shipped to the backend unchanged, reproducing
the original 502.
2. Serialize wrapped output with compact separators (``(",", ":")``)
to match the shape the web app writes via ``JSON.stringify``.
Downstream audit/diff tooling sees SDK- and UI-written rows as
byte-identical when the logical content matches.
3. Additional unit tests cover the boundaries reviewers flagged:
already-wrapped JSON string passthrough, partial workflow dicts
(single workflow-shaped key), JSON array / scalar input
(guards the ``isinstance(parsed, dict)`` check), UTF-8 BOM
auto-wrap, and the compact-separators invariant.
4. Rename ``test_empty_dict_preserved`` to
``test_empty_dict_serialized_to_empty_json`` with a comment noting
the behavior is ``json.dumps({}) == "{}"`` rather than an explicit
preservation branch.
5. Hoist ``_normalize_workflow_config`` and ``json`` imports to
module level in the test file; individual tests no longer repeat
the import.
475 passed, 1 skipped.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
roboflow workflow create --definition <file>andWorkspace.create_workflow(name, definition)dump the user's JSON verbatim into the backend'sconfigfield. That breaks execution via the Roboflow Inference server, which parsesconfigand then does:User-facing Workflows JSON — as published at the Workflows docs, in
inference/development/workflows_examples/*, and in the in-repo integration tests (e.g.LEGACY_DETECTION_PLUS_CLASSIFICATION_WORKFLOW) — is the flat shape{"version", "inputs", "steps", "outputs"}. The web app silently wraps it in{"specification": ...}before POST so the inference server can unwrap it. Nothing in the SDK/CLI did the same, so every CLI user who pasted the documented shape and then invoked/infer/workflows/...hit the 502 on first execution.Reproduced on latest
roboflow==1.3.3against a fresh Roboflow hosted deployment (EU staging). CLI-created workflow returned:…and the inference server's own logs showed:
Re-saving the same definition wrapped in
{"specification": ...}made execution work end-to-end (15/15 test images, p50 ~1.2s).What
_normalize_workflow_config(config)onrfapithat:{"specification": ...}.specificationkey through unchanged.version/inputs/steps/outputsat the top level alone, so callers sending custom payloads aren't second-guessed.None -> "{}"(the existing "create empty workflow" default).create_workflowandupdate_workflowinroboflow.adapters.rfapinow call the helper onconfig.Back-compat: no existing public signature changed. All 469 existing tests still pass. New unit tests cover bare-dict wrap, bare-JSON-string wrap, already-wrapped no-op, non-workflow-dict passthrough, and the empty / None / non-JSON edge cases on
_normalize_workflow_configdirectly.Not included (intentional)
This PR only fixes the write path. Workflows stored with the bare shape before this fix will still 502 until re-saved. The read-side symmetry (
inference.core.roboflow_api.get_workflow_specificationaccepting either shape) is orthogonal and belongs inroboflow/inference. Skipped here because the set of already-broken flat workflows in the wild is small and a one-lineworkflow updateper affected workflow is sufficient.We should also have schema validation for any objects we're allowing to pass through our CLI to the database but that can come as a later refinement.
Test plan
pytest tests/ -q --ignore=tests/manual— 469 passed, 1 skipped.roboflow workflow create --definition <bare spec>→ workflow stored with{"specification": ...}wrapper →POST /infer/workflows/<ws>/<name>returns 200 with predictions.