
v2026.7.1 #502

Merged
duguwanglong merged 260 commits into main from dev on Jul 1, 2026

Conversation

@stephamie7

Contributor

No description provided.

xiami762 and others added 30 commits June 5, 2026 15:48
feat(device): add multi-room support and i18n for device integration
When opening a tool's test panel from within a specific device's
configuration panel, the device UUID (generated at onboarding time)
is now pre-filled as `device_id` in the test parameters JSON.

This resolves the error "当前存在多台同类型设备,调用前必须显式传入
`device_id`" ("multiple devices of the same type exist; `device_id` must
be passed explicitly before the call"), which appeared whenever multiple
instances of the same device type were registered, because the registry's
`_resolve_device_target` could not auto-select a target without an
explicit `device_id`.

Changes:
- ToolDetailModal: add optional `deviceId` prop; inject it as the first
  key in the template produced by `buildParamsTemplate`, skipping any
  duplicate `device_id` param entry from the tool's parameter list only
  when the prop is already provided (avoids regression for tools whose
  YAML explicitly declares a required `device_id` parameter)
- ToolDetailModal: show a contextual hint below the params label when
  `deviceId` is pre-filled, so the user knows where the value came from
- DeviceConfigPanel: pass `device?.id` as `deviceId` when opening
  ToolDetailModal from the per-device tools tab
- buildParamsTemplate: fix number/boolean default values to emit actual
  `0` / `false` instead of the string literals `"0"` / `"false"`

Co-authored-by: Cursor <cursoragent@cursor.com>
…evice-id

feat(device): auto-inject device_id into tool test params
Clicking a fixture called setTestParams with the fixture's raw params,
silently overwriting the pre-filled device_id injected by the deviceId
prop. Merge device_id as the first key so it is never lost regardless
of which fixture the user selects.

Co-authored-by: Cursor <cursoragent@cursor.com>

setToolModal(tool) passed the raw tool object whose enabled field
reflects the initial load, ignoring any per-device toggle the user
applied in the same session. Replace with { ...tool, enabled: isOn }
so the modal always receives the current effective enabled state.

Co-authored-by: Cursor <cursoragent@cursor.com>
…l reveal (#349)

* feat(webui): add custom device access wizard (API/WebCLI/Syslog)

Extend Device Integration with custom device onboarding for API, WebCLI,
and syslog modes, including types, panel UI, and tests. Document a third
web2cli capture path when CLI requirements are already specified.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(webui,web2cli): align WebCLI custom device with device plugin flow

Require skill integration first, then optional device plugin packaging for
security devices. Add cli-in-device reference, tighten session prompts and
provider description guidance, and update custom device access UI/tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(device,web2cli): reveal persisted secrets and streamline CLI docs

Add an explicit credentials endpoint so the device UI can show full
masked secrets on demand. Update web2cli to document cookie/auth-state
defaults and replace the spec-generation step with cli-requirements.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(web2cli,webui): document optional auth recovery credentials

Allow optional username/password for browser-based cookie recovery
after auth-state expires, and align custom device session prompts
with the updated cli-in-device guidance.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(device,webui): scope credential reveal and add audit trail

Switch credentials reveal to POST with per-field requests, emit
device.credentials_reveal audit events without logging secrets, and
localize the custom device access UI.
…E_ID handlers

_config_override_service was set to the bare service_id (e.g. "sangfor_af")
derived by storage_key_to_service_id(), but several device handlers declare
SERVICE_ID as the full versioned storage key (e.g. "sangfor_af_v8_0_48").
get_config_override() performed an exact match, so these handlers always
got None and silently fell back to the global default config — using the
plugin's hardcoded DEFAULT_BASE_URL (192.168.1.1) instead of the device's
configured IP (e.g. 10.201.255.17).

Fix:
- _build_overrides() now returns a 4-tuple including storage_key
- activate_device_credentials() stores storage_key in a new ContextVar
  _config_override_storage_key alongside the existing service_id var
- get_config_override() accepts a match on either bare service_id OR
  full storage_key, so all handler SERVICE_ID conventions work correctly
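
A minimal sketch of the dual-key match described above, assuming three module-level ContextVars. The variable names follow the commit message where it gives them; the override payload (`{"host": ...}`) and the exact function body are illustrative assumptions, not the project's actual code.

```python
from contextvars import ContextVar
from typing import Optional

# Hypothetical module state; the real definitions may differ.
_config_override: ContextVar[Optional[dict]] = ContextVar(
    "config_override", default=None
)
_config_override_service: ContextVar[Optional[str]] = ContextVar(
    "config_override_service", default=None
)
_config_override_storage_key: ContextVar[Optional[str]] = ContextVar(
    "config_override_storage_key", default=None
)


def get_config_override(service_id: Optional[str]) -> Optional[dict]:
    """Return the active override if service_id names the active device."""
    override = _config_override.get()
    if override is None or service_id is None:
        return None
    expected_service = _config_override_service.get()
    expected_storage = _config_override_storage_key.get()
    # Check for None explicitly so an unset key can never match "".
    matches_service = expected_service is not None and service_id == expected_service
    matches_storage = expected_storage is not None and service_id == expected_storage
    return override if (matches_service or matches_storage) else None


# Illustrative activation: bare service id plus versioned storage key.
_config_override.set({"host": "10.201.255.17"})
_config_override_service.set("sangfor_af")
_config_override_storage_key.set("sangfor_af_v8_0_48")

print(get_config_override("sangfor_af"))          # bare-id handler
print(get_config_override("sangfor_af_v8_0_48"))  # versioned handler
print(get_config_override("unrelated_service"))   # no match
```

With an exact match on the bare id only, the second call would have returned `None`, which is the fallback-to-default behaviour the commit describes.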

Affected handlers (non-exhaustive):
  sangfor_af_v8_0_48, sangfor_af_v8_0_85, sangfor_af_v8_0_106

Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(session): add write file links and concise workflow tool output

Append clickable local-file Markdown links to write tool results in session
output, omit verbose workflow execution history from default run_workflow text
while keeping it in metadata, and strip reasoning block whitespace on both ends.

Co-authored-by: Cursor <cursoragent@cursor.com>

* refactor(session): drop runner-side write tool file link formatting

Remove clickable Markdown link injection from session runner write output;
workflow concise output and reasoning strip changes remain unchanged.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(workflow): show live stage in WebUI and forward session cancel

Expose workflow_name and total_nodes in run_workflow metadata, forward the
session abort flag to the workflow runtime, and render a compact running-stage
summary in the session tool header.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(session): propagate abort to tools and guard late metadata updates

Forward the session abort event into StreamProcessor tool execution,
mark metadata callbacks finished in a finally block, ignore stale running
updates after completion, and persist interrupted tool state on cancel.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(session): cancel inflight metadata tasks and localize workflow header

Cancel pending running-metadata publish/persist tasks when a tool completes,
and use i18n labels for run_workflow stage summaries in the WebUI header.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(workflow): make llm.ask cancellable and show cancelling UI state

Propagate workflow cancel checks through LLM nodes, lazy llm helpers,
and provider calls, then surface a cancelling phase in workflow detail
run/history views with localized status messaging.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(workflow): add cancel_checker to sandbox python runtime

Declare cancel_checker on SandboxPythonExecRuntime and update sandbox
tests to match the keyword-aware get_lazy_llm factory.

B1 (ToolDetailModal): { device_id: deviceId, ...fx.params } had the spread
order backwards — a fixture that already carries a device_id key would
overwrite the prop-injected value, exactly undoing the fix. Swap to
{ ...fx.params, device_id: deviceId } so the parent-supplied deviceId
always takes precedence regardless of fixture contents.
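
To illustrate why the order matters, here is a Python analogue of the two spreads. The original code is TypeScript; Python dict unpacking follows the same last-key-wins rule. The function names and sample values are hypothetical.

```python
def merge_device_first(fixture_params: dict, device_id: str) -> dict:
    # Analogue of { device_id: deviceId, ...fx.params }: later keys win,
    # so a device_id inside the fixture overwrites the injected one.
    return {"device_id": device_id, **fixture_params}


def merge_device_last(fixture_params: dict, device_id: str) -> dict:
    # Analogue of { ...fx.params, device_id: deviceId }: the injected value
    # is written last, so it always survives.
    return {**fixture_params, "device_id": device_id}


fixture = {"device_id": "stale-fixture-id", "port": 443}  # illustrative data
print(merge_device_first(fixture, "dev-123")["device_id"])
print(merge_device_last(fixture, "dev-123")["device_id"])
```

The first call keeps the fixture's stale id; the second keeps the parent-supplied one.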

S1 (credential_context): 'and' binds tighter than 'or', so the guard
  if secret_ovr is None and config_ovr is None or service_id is None
was already evaluated correctly, but was easy to misread as
  if secret_ovr is None and (config_ovr is None or service_id is None).
Add explicit parentheses to match the intended semantics at a glance.
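
The precedence claim can be checked exhaustively with a small sketch. The three functions below are illustrative stand-ins for the guard, not the project's code; each argument is either `None` or a placeholder value.

```python
from itertools import product


def guard_implicit(secret_ovr, config_ovr, service_id):
    # Original spelling: relies on 'and' binding tighter than 'or'.
    return secret_ovr is None and config_ovr is None or service_id is None


def guard_explicit(secret_ovr, config_ovr, service_id):
    # Same semantics, with the grouping spelled out.
    return (secret_ovr is None and config_ovr is None) or service_id is None


def guard_misread(secret_ovr, config_ovr, service_id):
    # The grouping a reader might wrongly assume.
    return secret_ovr is None and (config_ovr is None or service_id is None)


cases = list(product([None, "x"], repeat=3))
same = all(guard_implicit(*c) == guard_explicit(*c) for c in cases)
differs = [c for c in cases if guard_explicit(*c) != guard_misread(*c)]
print(same)
print(differs)
```

The implicit and explicit forms agree on all eight combinations. The misreading differs exactly when `secret_ovr` is set and `service_id` is `None`.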

Co-authored-by: Cursor <cursoragent@cursor.com>

S2 — Add unit tests for get_config_override dual-key matching
  New test file tests/tool/test_credential_context_config_override.py
  covers 7 scenarios without touching the DB or ToolRegistry:
  · bare service_id match (existing behaviour)
  · versioned storage_key match (new behaviour, regression target)
  · unrelated service_id → None
  · no active override → None
  · storage_key=None does not match empty string (falsy trap)
  · service_id=None does not match empty string (falsy trap)
  · identical service_id and storage_key (no version suffix)

S3 — Improve readability of the dual-key match in get_config_override
  Replace 'service_id in (expected_service, expected_storage)' with
  explicit named booleans (matches_service / matches_storage) and an
  expanded docstring that explains the two naming conventions handlers
  use, so future readers do not mistake either branch as redundant.

S4 — _build_overrides return type → _DeviceOverrides NamedTuple
  Positional 4-tuples are fragile: a caller adding or reordering fields
  silently shifts every unpack site. Introducing _DeviceOverrides makes
  field access self-documenting (.service_id, .storage_key, …) and lets
  type-checkers catch missing fields at import time.
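
A sketch of the idea, assuming a four-field layout. Only `service_id` and `storage_key` are named in the commit message; the other two field names and the sample values are hypothetical.

```python
from typing import NamedTuple, Optional


class _DeviceOverrides(NamedTuple):
    secret_ovr: Optional[dict]   # hypothetical field name
    config_ovr: Optional[dict]   # hypothetical field name
    service_id: Optional[str]
    storage_key: Optional[str]


overrides = _DeviceOverrides(
    secret_ovr={"token": "redacted"},
    config_ovr={"host": "10.201.255.17"},
    service_id="sangfor_af",
    storage_key="sangfor_af_v8_0_48",
)

# Named access does not depend on field position.
print(overrides.storage_key)

# Existing positional unpack sites keep working during a migration.
secret, config, service_id, storage_key = overrides
```

Because a NamedTuple is still a tuple, callers can move from positional unpacking to attribute access one site at a time.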

Co-authored-by: Cursor <cursoragent@cursor.com>

- Add autouse fixture _reset_context_vars that clears the three
  ContextVars before and after every test; without it a test that sets
  a var leaks state into the next test in the same thread
- Remove bare 'import pytest' that was unused before the fixture was
  added (would have triggered a lint warning)

Co-authored-by: Cursor <cursoragent@cursor.com>
…-drops-device-id

fix/device tool test fixture drops device
…ingtalk, telegram

Apply PR #190 (wecom inbound + outbound file attachments) and extend the
same pattern to dingtalk and telegram. Refactor the inbound dispatcher
into a per-channel hook registry so new channels no longer need to
modify dispatcher.py.

Inbound (download to local FilePart):
- wecom: AES-256-CBC decrypt via wecom_aibot_sdk, 30MB cap, nested
  mixed-message aeskey, Content-Disposition filename extraction
- dingtalk: exchange download_code via OAPI /v1.0/robot/messageFiles/download,
  20MB cap, separate exchange/download error classification
- telegram: resolve telegram://<kind>/<file_id> via getFile +
  https://api.telegram.org/file/bot<token>/<file_path> download

Outbound (send_media per channel):
- wecom: SDK upload_media → send_media_message / reply_media; companion
  text sent as a follow-up markdown message
- dingtalk: OAPI multipart upload → msgKey=file (downloadCode+fileName)
  for any type; msgKey=image (photoURL) for remote image URLs
- telegram: route to sendPhoto / sendDocument / sendVideo / sendAudio /
  sendVoice / sendAnimation based on inferred kind; agent can force
  document via telegram:document:<url> prefix
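
A minimal sketch of the Telegram outbound routing described in the last bullet. The method names are the Bot API methods listed in the commit. The extension table, the fallback to `sendDocument` for unknown types, and the function name are assumptions.

```python
import os
from typing import Tuple
from urllib.parse import urlsplit

# Assumed extension-to-kind table; the real inference may use MIME types.
_EXT_TO_KIND = {
    ".png": "photo", ".jpg": "photo", ".jpeg": "photo",
    ".gif": "animation",
    ".mp4": "video",
    ".mp3": "audio",
    ".ogg": "voice",
    ".pdf": "document",
}
_KIND_TO_METHOD = {
    "photo": "sendPhoto",
    "document": "sendDocument",
    "video": "sendVideo",
    "audio": "sendAudio",
    "voice": "sendVoice",
    "animation": "sendAnimation",
}
_FORCE_DOCUMENT_PREFIX = "telegram:document:"


def choose_method(media: str) -> Tuple[str, str]:
    """Return (Bot API method, url) for an outbound media reference."""
    if media.startswith(_FORCE_DOCUMENT_PREFIX):
        # The agent explicitly asked for a document upload.
        return "sendDocument", media[len(_FORCE_DOCUMENT_PREFIX):]
    ext = os.path.splitext(urlsplit(media).path)[1].lower()
    kind = _EXT_TO_KIND.get(ext, "document")  # unknown types go as documents
    return _KIND_TO_METHOD[kind], media


print(choose_method("https://example.com/chart.png"))
print(choose_method("telegram:document:https://example.com/chart.png"))
```

The prefix lets a caller send an image as an uncompressed file instead of a photo.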

Dispatcher:
- register_inbound_media_downloader() + _DOWNLOADERS table
- dynamic per-channel lookup so test monkeypatches on the channel's
  inbound_media module still apply
- SSE message.part.updated events for both FilePart and the rewritten
  text part; placeholder text replaced with 'Attached files: <path>'

Tests:
- wecom: +14 (send_media, inbound_media, content-disposition, mixed file)
- dingtalk: +9 (download_code exchange, oversized guard, send_media
  routing, text-after-file, image URL inline)
- telegram: +15 (file_id resolution, kind inference for image/pdf/gif/ogg,
  endpoint routing, kind override prefix, error path)
- dispatcher: +9 (per-channel routing, placeholder detection,
  end-to-end per-channel pipeline)
- test_e2e_file_roundtrip.py: 7 new tests covering real PNG byte
  round-trips for all five channels with in-process fake servers

All 354 channel tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Documents the bidirectional file/image contract for all built-in
channels, the dispatcher refactor, the per-channel downloader hook
pattern, and the channel-specific outbound quirks reviewers need
to know before approving changes to the media path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…escription)

The channel file/image review notes are better kept in the PR
description itself (where reviewers see them first) than in a
root-level doc that the gitignore policy excludes from the
docs/ folder.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ge points, impact scope, and review focus

Rewrite the "Pull Request Guidelines" section in CONTRIBUTING.md so the
required PR description structure is explicit:

- Key Changes (改动点) — concrete deltas grouped by area.
- Impact Scope (影响范围) — user-visible behavior, compatibility,
  configuration, dependencies, performance, security.
- Business Logic to Focus On During Review (需重点 Review 的业务逻辑) —
  the parts of the change that deserve extra reviewer attention.

The previous section listed five reviewer-facing questions but did not
constrain the order or depth of the description, which led to PRs that
mentioned the impact but omitted the logic that needed a careful read.

Also update the PR description template to match the new structure and
add a Why-This-Approach section plus an explicit Compatibility, Migration
& Rollback section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
PR #380 added `flex flex-col` to the shared content wrapper for the
/devices page height chain, which unintentionally changed the layout of
the home page and other standard pages. Keep /devices on the fullscreen
route and restore the v2026.6.4 wrapper for all other pages.

Add admin-managed user-defined pages with backend build/watch/runtime support and a WebUI host for published pages.

* fix(provider): send enable_thinking for catalog-declared interleaved models

Trace ses_1628dfe6cffe1i5xZY9lv1u20m showed GLM-5 on alibaba returning
finishReason=stop with empty content + zero tool calls on every turn, so
the agent loop handed the turn back to the user after every step.

Root cause: the request-side dispatch in options.py was a hard-coded
substring whitelist (qwen3 / kimi / mimo / qwq / qwen-max). GLM-5,
minimax-m2.* / m3, deepseek-reasoner and step-3.5-flash all declare
interleaved in the catalog but their model names match none of those
substrings, so enable_thinking: true was never sent. Without the flag,
DashScope / Moonshot / Zhipu / MiniMax / Stepfun / DeepSeek stream their
thinking into content instead of reasoning_content and stop mid-tool-call.
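
The miss is easy to reproduce. The token list below is the one quoted in the commit message; the function name and the matching logic are a hypothetical reconstruction of the old substring check.

```python
# Tokens quoted in the commit message.
_LEGACY_TOKENS = ("qwen3", "kimi", "mimo", "qwq", "qwen-max")


def legacy_wants_thinking(model_name: str) -> bool:
    """Old-style gate: substring match against a fixed token list."""
    name = model_name.lower()
    return any(token in name for token in _LEGACY_TOKENS)


affected = ["glm-5", "minimax-m3", "deepseek-reasoner", "step-3.5-flash"]
missed = [m for m in affected if not legacy_wants_thinking(m)]
print(missed)
print(legacy_wants_thinking("qwen3-max"))
```

Every affected name falls through the list, so the request is built without `enable_thinking`.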

Replace the token-list dispatch with _THINKING_REQUEST_SHAPES, a
provider-id-keyed dict of callables gated on the catalog interleaved
capability. The catalog becomes the single source of truth for "this
model wants thinking-aware streaming"; adding a new provider is one line.

Also fix openai_compatible.py chat() and chat_stream(): both were
silently dropping caller-supplied extra_body whenever kwargs.thinking was
None. This made the user-configured openai-compatible provider ignore
default_parameters.enable_thinking in flocks.json. Now mirrors
openai_base.py:905-913.
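
A sketch of the merge behaviour the fix aims for, assuming the caller's `extra_body` should survive whether or not a thinking option is present. The helper name and the way the thinking value is attached are hypothetical; the actual code in `openai_base.py` may differ.

```python
from typing import Any, Dict, Optional


def build_extra_body(
    caller_extra_body: Optional[Dict[str, Any]],
    thinking: Optional[Dict[str, Any]],
) -> Optional[Dict[str, Any]]:
    """Merge caller-supplied extra_body with an optional thinking option."""
    merged: Dict[str, Any] = dict(caller_extra_body or {})
    if thinking is not None:
        # Assumed placement of the thinking option inside extra_body.
        merged["thinking"] = thinking
    return merged or None  # send nothing when there is nothing to send


# The caller's flag survives even though thinking is None.
print(build_extra_body({"enable_thinking": True}, None))
```

The bug described above corresponds to returning `None` whenever `thinking` is `None`, regardless of what the caller passed.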

Adds tests/provider/test_thinking_params.py with 44 tests:
- Property test: every catalog entry with interleaved != null resolves
  to a thinking flag (regression net for this class of bug)
- Specific GLM-5 step-50 trace replay across alibaba / threatbook-cn /
  threatbook-io / zhipu
- All previously-dropped models (minimax-m2.* / m3 × 3 providers,
  deepseek-reasoner, step-3.5-flash)
- Shape-registry structural checks (no legacy token constant, every
  reasoning provider has a shape entry)
- openai_compatible extra_body propagation smoke test

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(provider): drop _deepseek_thinking_shape V3 model-name branching

Follow-up to 2299b33.  The catalog is the single source of truth for
"this model wants thinking-aware streaming" — the V3 string checks
(re)introduced the exact fragility the catalog-driven dispatch was
meant to eliminate.  Catalog already says the right thing:
``deepseek-chat`` (family=deepseek-v3) has no ``interleaved`` capability,
so ``interleaved_enabled`` is False and the shape function is never
called for it.  A future catalog change that adds ``interleaved`` to V3
would have been silently stripped by the V3 branch; deleting the branch
makes the catalog gate honest.

Add TestShapeRegistry.test_deepseek_v3_is_not_a_thinking_model which
pins both directions:
- Sanity: every catalog entry with family starting ``deepseek-v3`` has
  no ``interleaved`` capability.  If someone changes that, the test
  fails and forces a re-evaluation.
- Runtime: with ``_resolve_interleaved_capability`` returning None, the
  dispatcher emits no ``enable_thinking`` flag for ``deepseek-chat``.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* refactor(provider): drop shape registry, use transport-driven dispatch

Follow-up to 699f717.  The provider-keyed shape registry introduced in
2299b33 was over-engineered: every entry produced the same dict
({"enable_thinking": bool(reasoning_enabled)}), and the gate was
already correctly computed upstream by ``resolve_interleaved_capability``
in ``interleaved.py`` (catalog explicit declaration → series-token
inference fallback).  Replacing it with a transport-based branch:

  reasoning_transport == anthropic_messages
    → thinking={type: "enabled", budget_tokens:...}   (unchanged)

  reasoning_transport == generic_chat
    → extra_body.enable_thinking = bool(reasoning_enabled)

This unblocks the natural design goal: read the model series globally,
no per-provider dispatch.  Concrete consequences:

- deepseek-v4-flash and similar models that the catalog forgets to
  declare now get enable_thinking automatically via the series-token
  inference in ``infer_interleaved_capability`` (qwen3 / glm-* / kimi-k2*
  / deepseek-v4* / step-3.5* / minimax-m* tokens).
- A user-configured openai-compatible endpoint pointing at a known
  family Just Works without anyone editing a per-provider registry.
- Adding a new provider or a new model from a known family is now
  zero-touch: no catalog edit, no shape entry, no dispatch change.
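
The transport-based branch and the series-token fallback can be sketched as follows. The transport names and request shapes come from the commit message. The function names, the default budget, the substring tokens derived from the listed patterns, and the choice to omit `thinking` when reasoning is disabled are all assumptions.

```python
from typing import Any, Dict

# Substring tokens derived from the patterns listed in the commit message.
_SERIES_TOKENS = ("qwen3", "glm-", "kimi-k2", "deepseek-v4", "step-3.5", "minimax-m")


def looks_like_thinking_series(model_name: str) -> bool:
    """Fallback inference for models the catalog does not declare."""
    name = model_name.lower()
    return any(token in name for token in _SERIES_TOKENS)


def build_thinking_options(
    reasoning_transport: str,
    reasoning_enabled: bool,
    budget_tokens: int = 1024,  # illustrative default
) -> Dict[str, Any]:
    """Pick the request shape from the transport, not from the provider id."""
    if reasoning_transport == "anthropic_messages":
        if not reasoning_enabled:
            return {}
        return {"thinking": {"type": "enabled", "budget_tokens": budget_tokens}}
    if reasoning_transport == "generic_chat":
        return {"extra_body": {"enable_thinking": bool(reasoning_enabled)}}
    return {}


print(looks_like_thinking_series("deepseek-v4-flash"))
print(build_thinking_options("generic_chat", True))
```

Because nothing here is keyed on the provider id, a user-configured endpoint serving a known family takes the same path as a built-in provider.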

Removed:
- _THINKING_REQUEST_SHAPES dict (9 entries, all identical product)
- _openai_base_thinking_shape helper function
- TestShapeRegistry.test_all_thinking_providers_have_a_shape
  (replaced with transport-based and shape-removal assertions)
- Property test skip-on-missing-shape (no longer needed; transport
  dispatch is universal for interleaved models)

Added:
- TestDispatchShape.test_no_shape_registry — pins the removal
- TestDispatchShape.test_anthropic_transport_still_uses_thinking_field
  — pins the contract that the new generic_chat branch did not
  regress the anthropic_messages path
- TestDispatchShape.test_series_token_fallback_emits_enable_thinking
  — 5 parametrized cases proving series-token inference closes the
  "forgot to add to catalog" gap (qwen3, glm-5, kimi-k2.6, minimax,
  step-3.5-flash)

Tests: 51/51 in test_thinking_params.py pass; full tests/provider/
shows the same 7 pre-existing failures as baseline, none introduced
by this refactor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(provider): expand deepseek thinking coverage to full series

User-driven follow-up to 3335cc7.  Aligns the deepseek handling with
the "read the model series globally" design intent: any model in the
deepseek family should be treated as thinking-capable unless explicitly
overridden.

Changes:

catalog.json (deepseek provider):
- Add ``deepseek-v4-flash`` with explicit ``interleaved`` declaration.
  Previously only present in the threatbook-cn-llm / threatbook-io-llm
  providers.
- Add ``deepseek-v4-pro`` (new model entry) with explicit ``interleaved``.

interleaved.py (_STRICT_REASONING_CONTENT_TOKENS):
- Add ``deepseek`` (broad catch-all) and ``deepseek-v3`` (specific) to
  cover the entire deepseek series.  Previously only V4-subset and R1
  tokens were listed, so a model named ``deepseek-chat`` (V3) or any
  future V5 / V6 would not be auto-inferred.
- Result: every deepseek model now auto-resolves to the strict
  reasoning_content policy via series-token inference, including
  ``deepseek-chat`` (V3).

Test impact:

- test_thinking_params.py: invert
  ``test_deepseek_v3_is_not_a_thinking_model`` →
  ``test_deepseek_v3_is_a_thinking_model`` to assert the new
  behavior using the real resolution chain (no monkeypatch).

- test_chinese_providers.py::test_deepseek_catalog: extend the model
  set assertion to include deepseek-v4-flash / deepseek-v4-pro; add
  per-model interleaved assertions for the new entries.

- test_provider.py::test_resolve_model_does_not_infer_interleaved_for_non_reasoning_model:
  switch the test fixture from ``deepseek-chat`` to ``gpt-4-turbo``
  because ``deepseek-chat`` is no longer a non-reasoning model under
  the new design.  ``gpt-4-turbo`` doesn't match any series token and
  remains a true non-reasoning fixture.

Net result of this + the previous three commits:

  - Original bug (2299b33): GLM-5 / minimax / deepseek / stepfun
    dropped from request-side enable_thinking because of a hard-coded
    substring whitelist — fixed.
  - 699f717: removed the model-name V3 carve-out in
    _deepseek_thinking_shape so catalog becomes the only gate.
  - 3335cc7: replaced provider-keyed shape registry with
    transport-driven dispatch, hooking up the series-token inference
    in interleaved.py as a fallback for any uncatalogued model.
  - This commit: extend the deepseek series coverage to "all of it"
    (V3 / R1 / V4 and forward) and make V4-flash / V4-pro explicit in
    the deepseek provider catalog.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(provider,session): per-model thinking extra_body and queued user cache

Route generic-chat reasoning to provider-specific wire formats (DeepSeek/GLM/Kimi/MiMo thinking blocks, MiniMax reasoning_split). Apply queued-user reminders inside _to_chat_messages with chat context cache invalidation, and persist aborted assistant messages when LLM streams are interrupted.

Add a Flocks Help entry point to AI editing and replace the default canvas controls with lightweight controls that include tooltips. Document PowerShell encoding requirements for agents.

Local Codex CLI cache / config directory; not meant to be tracked.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
chore(gitignore): ignore .codex/ directory
feat(channel): complete bidirectional file support
* refactor: unify subagent delegation under delegate_task

Consolidate task scheduling into delegate_task (task becomes a compat alias),
remove plan mode and standalone background tools, and run independent foreground
subagents in parallel. Update WebUI cards, skill installer, and agent configs.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix/skill remove

* fix/skill install skill.sh

* fix(webui): render parallel delegate tasks as separate cards

Remove buildParallelDelegateGroupParts which merged multiple sibling
delegate_task tool parts into a single 'Parallel Agents' card. Each
parallel subagent now renders as its own DelegateTaskCard with its
real subagent_type name, matching the user-facing expectation of
'one tool card per agent'.

* refactor(delegate_task): remove legacy tasks=[...] batch shape

The unified delegate_task tool only needs the single-subagent shape now
that parallel work is expressed as multiple sibling tool calls in one
assistant turn. Drop the batch path from delegate_task, the task compat
alias, the stream-processor branch that aggregated it into one card, and
the matching batch compat tests; replace the latter with a small
tolerance test for the slimmer schema and adjust model-pinning tests
accordingly. Also remove the now-redundant fallback to the
load_skills-aware description in the webui DelegateTaskCard since the
card already renders the explicit description when provided.

* Fix legacy todo permission migration

xiami762 and others added 27 commits June 29, 2026 19:14

* feat: guide device onboarding through Rex workbench

* fix: keep device onboarding in Rex after save

* feat: improve device onboarding docs and guidance

* feat(device): add Rex-guided device onboarding

* feat(device): refine config assist and template refresh

* feat(device): add SecGate 3600 hub template
* feat: open workspace files in system file manager

* feat: add workspace table sorting

* feat: enhance workspace file previews

* feat: improve workspace preview layout and memory previews
* ci: dispatch autotest windows upgrade

* ci: dispatch autotest from main updates only

* ci: document autotest artifact prefix

* ci: update autotest artifact prefix summary

* ci: wait for autotest artifact result
* fix: refine pro bundle version upgrades

* fix: simplify pro update version display

* fix: report pro install receipt after restart

* fix: improve pro upgrade restart recovery

* fix: harden pro upgrade console sync

* fix: refresh flocks pro license state

* fix: show revoked pro license state

* fix: harden pro update recovery paths
…custom

# Conflicts:
#	webui/src/components/layout/Layout.test.tsx
* feat(device): add device_manage update action

* docs(skills): add device integration guide

* fix(session): stop pre-response chat state

* Fix abort failure chat state
* feat(workflow): optimize execution memory and storage

* Add workflow payload risk observability

* Add opt-in vertex cache workflow dataflow

* Default new workflows to vertex cache dataflow

* Validate strict workflow edge mappings

* refactor(workflow): isolate edge resolution state

* fix(workflow): preserve root edge mapping payload

* fix(workflow): relax join lint for loop flows

* fix(workflow): return raw tool output to agents

* fix(workflow): unify tool execution run id

* perf(workflow): cache trigger execution plans
@duguwanglong duguwanglong self-requested a review July 1, 2026 09:14
@duguwanglong duguwanglong merged commit f665b19 into main Jul 1, 2026
2 checks passed


6 participants