Skip to content

Fix 25 pre-existing root mocha-suite failures#45

Merged
electron-rare merged 13 commits into
masterfrom
fix/preexisting-mocha-failures
Jun 1, 2026
Merged

Fix 25 pre-existing root mocha-suite failures#45
electron-rare merged 13 commits into
masterfrom
fix/preexisting-mocha-failures

Conversation

@electron-rare

Copy link
Copy Markdown

Fix the 25 pre-existing root-suite (test:unit:mocha) failures

The root mocha suite had 25 failing tests on master (unrelated to any feature — stale tests/mocks/snapshots accumulated as the fork diverged from Cline/Dirac). This branch takes the suite to 1354 passing / 0 failing under Node LTS.

Stacked on #44. Branched to include #44's check-types/lint fixes (the two pre-existing type/lint breakages on master), so this branch is fully green. Until #44 merges, this PR shows its commits too; they drop out automatically once #44 lands. Review/merge #44 first.

Root causes & fixes (each an atomic commit)

fix(prompts): always emit additionalPropertiesthe only production change.
A prior commit conditioned additionalProperties: false on strict === true in the OpenAI tool-schema converter (spec.ts). The convention is to set it unconditionally at the schema root (the strict flag governs server-side enforcement, not whether to declare it). Restored + refreshed the 12 system-prompt snapshots, which were also stale after the tool-constraints and async-tools (get_tool_result) features had landed without a snapshot update.

test(handlers): repair stale handler-test mocks — mocks that the fork's code outgrew:

  • RenameSymbolToolHandler: mock diffViewProvider lacked show/hideReview (manual-approval path); saveChanges mock now resolves the relative displayPath (which DiffViewProvider.open resolves internally in prod — verified, not a handler bug).
  • debug_syntax (EditFileToolHandler): mock config lacked messageState (whose getApiConversationHistory is called in syncCache).
  • SubagentToolHandler: removeLastPartialMessageIfExistsWithType now receives a third undefined onlyPartial arg.

test(slash-commands): drop removed askDirac test — the fork deliberately dropped the upstream /askDirac command (index.ts: RAG over Dirac source, irrelevant here); the orphaned test hit the workflow path and threw StateManager has not been initialized.

test(providers): assert totalCost in usage chunks — the shared OpenAI-compatible usage formatter now always emits totalCost; the Fireworks/OpenRouter tests predated that.

Verification

  • check-types, lint (biome), test:unit:mcp (76), test:unit:mocha 1354 passing / 0 failing — all under Node 22 (.nvmrc lts/*; npm rebuild better-sqlite3 required for the native binding).
  • Independent pre-PR critic review: ACCEPT, 0 CRITICAL/MAJOR. Confirmed the spec.ts change is correct per OpenAI spec, the RenameSymbol relative-path is not a masked bug, and no test was weakened to pass. Two LOW notes (a brittle undefined-arg assertion; snapshot key-order churn) deferred.

Not addressed

The 25 fixes are scoped to making the suite green without changing behavior. Snapshot key-order normalization (to reduce future churn) is left as a follow-up.

Bearer credentials from plugin-supplied .mcp.json could leak in two
ways: sent in cleartext to a public http:// server, or exfiltrated to a
cloud-metadata endpoint via a crafted url.

Add src/core/mcp/urlSecurity.ts with two guards:
- shouldSendBearer: attach Authorization only over https, a private/
  loopback/CGNAT host, or an ISAAC_MCP_ALLOW_HOSTS-listed host.
- assertMcpUrlAllowed: refuse known cloud-metadata endpoints unless
  allowlisted.

Wire both into the loader (fail-closed, validated before the server id
is claimed so a rejected entry can't shadow a valid same-id server) and
re-check at connect time as defense-in-depth. Also fixes the test-only
type errors that broke check-types after the http-transport union.
chat() called fetch() with no AbortSignal or timeout, so an unresponsive
worker would hang the agent indefinitely. The streaming path already has
total + idle timers; the non-streaming path produces no chunks, so only a
wall-clock total timeout applies. Mirror chatStream()'s combineAbortSignals
pattern and wrap the AbortError as LocalRouterTimeoutError so callers can
fall back on it.
scrubSecrets was applied to planner/tool fields but not to meta.json
(gateway_url could carry inline creds, worker endpoints) nor to the
errors[] array (an error string may echo a failed token). Scrub at the
single persistMeta() point (returns a deep copy, so in-memory meta stays
intact for later merges) and scrub errors[] in appendTurn.

Note: api_conversation_history.json is deliberately NOT scrubbed here -
it is replayed to resume a task, so redacting it would change behaviour;
that is a separate decision.
The cli tsconfig uses the classic JSX runtime, so React must be in scope
for the JSX in this test (tsc TS2686). biome's noUnusedImports didn't see
the usage; add a scoped biome-ignore instead of dropping the import.
Two hardening fixes in the emulated tool-call parser:
- Bash commands parsed from a model's markdown fence defaulted to
  requires_approval:false, letting e.g. 'rm -rf' auto-run. Default to
  true so they hit the approval gate; auto-approve modes still bypass.
- The plain-function extractor interpolated the tool name into a RegExp
  unescaped; a name with regex metacharacters would corrupt the pattern.
  Escape it via escapeRegExp.
- ResponseCache evicted by LRU without first reclaiming expired entries,
  so a cache full of stale entries could evict a fresh one. Reclaim
  expired entries before the LRU eviction.
- HealthMonitor shared one AbortController + 5s timer across the /health
  and /v1/models attempts; the fallback could start already-aborted and
  mark an up worker as down. Give each ping its own controller/timeout.
api_conversation_history.json is a plaintext copy of the full LLM
exchange and may carry secrets. It is replayed to resume a task, so it
is deliberately not scrubbed; restrict it to the owner instead via a new
optional mode arg on atomicWriteFile.
Rebrand residual user-visible 'dirac' references that do not affect
storage compat: telemetry event dirac_cli -> isaac_cli, exports JSDoc,
CLI help fixtures in index.test, DEVELOPMENT.md command examples, and
two code comments. Storage paths (~/.dirac, DIRAC_DIR, .diracrules) are
intentionally left for backward compat per src/CLAUDE.md.
Pre-PR critic review (verdict ACCEPT-WITH-RESERVATIONS, agent
a9be496e52ccc434a) found that http://[::ffff:169.254.169.254]/ and the
hextet form ::ffff:a9fe:a9fe bypassed the metadata SSRF guard, and that
IPv4-mapped private addresses were wrongly treated as public by the
token-gate. normalizeHost now collapses IPv4-mapped IPv6 to dotted IPv4
and strips a trailing FQDN dot, so both guards see the real address.
Adds tests for the mapped/trailing-dot evasions.
A prior change conditioned additionalProperties:false on strict=true,
but the OpenAI tool-schema convention is to always set it at the schema
root (strict only filters individual leaf params). Restore it
unconditionally and refresh the 12 system-prompt snapshots, which were
also stale after the tool-constraints and async-tools (get_tool_result)
features landed without a snapshot update.
Three handler tests broke when the fork's code outgrew their mocks:
- RenameSymbolToolHandler: mock diffViewProvider lacked show/hideReview
  (called on the manual-approval path); also resolve the relative
  displayPath against tmpDir in the saveChanges mock.
- debug_syntax (EditFileToolHandler): mock config lacked messageState,
  whose getApiConversationHistory the handler calls in syncCache.
- SubagentToolHandler: removeLastPartialMessageIfExistsWithType is now
  invoked with a third (undefined) onlyPartial arg; assert it exactly.
The fork dropped the upstream /askDirac command (index.ts: RAG over
Dirac source, irrelevant here). The orphaned test hit the workflow path
and threw 'StateManager has not been initialized'.
The shared OpenAI-compatible usage formatter now always emits totalCost.
The Fireworks and OpenRouter tests predated that and omitted (or zeroed)
the field; assert the actual computed cost for the default model price.
Copilot AI review requested due to automatic review settings May 31, 2026 23:44

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates root mocha-suite expectations and stale tests while also carrying stacked production changes from #44 around MCP hardening, tracing redaction, local-router reliability, storage permissions, and CLI rebranding.

Changes:

  • Restores OpenAI tool-schema additionalProperties: false behavior and refreshes prompt/tool snapshots.
  • Repairs stale handler/provider/slash-command tests and adds coverage for tracing and MCP URL security.
  • Includes stacked production hardening for MCP HTTP URLs, tracing persistence, local-router timeouts/cache/approval defaults, and CLI branding.

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/services/local-router/ResponseCache.ts Reclaims expired entries before LRU eviction.
src/services/local-router/LocalRouter.ts Adds non-streaming timeout guard, regex escaping, and safer bash-fence approval default.
src/services/local-router/HealthMonitor.ts Gives each health-check attempt its own abort timeout.
src/services/local-router/__tests__/LocalRouter.test.ts Updates bash-fence approval assertion.
src/core/tracing/JsonlTracer.ts Scrubs errors and persisted meta before writing trace artifacts.
src/core/task/tools/handlers/__tests__/SubagentToolHandler.test.ts Updates callback argument expectations.
src/core/task/tools/handlers/__tests__/RenameSymbolToolHandler.test.ts Repairs diff-view mock behavior.
src/core/task/tools/handlers/__tests__/debug_syntax.test.ts Adds missing message-state mock.
src/core/storage/disk.ts Adds optional file mode support and writes API conversation history as 0600.
src/core/slash-commands/__tests__/index.test.ts Removes obsolete /askDirac test.
src/core/prompts/system-prompt/spec.ts Always emits OpenAI root additionalProperties: false.
src/core/prompts/system-prompt/__tests__/__snapshots__/vertex_gemini_3.tools.snap Refreshes Vertex tool snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/vertex_gemini_3-no-browser.snap Refreshes Vertex no-browser prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/vertex_gemini_3-basic.snap Refreshes Vertex basic prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/openai_gpt_5.tools.snap Refreshes OpenAI tool snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/openai_gpt_5-no-browser.snap Refreshes OpenAI no-browser prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/openai_gpt_5-basic.snap Refreshes OpenAI basic prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/gemini_gemini_3.tools.snap Refreshes Gemini tool snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/gemini_gemini_3-no-browser.snap Refreshes Gemini no-browser prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/gemini_gemini_3-basic.snap Refreshes Gemini basic prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/anthropic_claude_4_5_sonnet.tools.snap Refreshes Anthropic tool snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/anthropic_claude_4_5_sonnet-no-browser.snap Refreshes Anthropic no-browser prompt snapshot.
src/core/prompts/system-prompt/__tests__/__snapshots__/anthropic_claude_4_5_sonnet-basic.snap Refreshes Anthropic basic prompt snapshot.
src/core/mcp/urlSecurity.ts Adds MCP URL SSRF and bearer token-gate helpers.
src/core/mcp/McpServerConfigLoader.ts Applies URL security checks while loading plugin MCP configs.
src/core/mcp/McpClientManager.ts Rechecks MCP URL security before connecting.
src/core/mcp/__tests__/urlSecurity.test.ts Adds MCP URL security unit tests.
src/core/mcp/__tests__/McpServerConfigLoader.test.ts Updates type assertions and adds loader security tests.
src/core/api/providers/__tests__/openrouter.test.ts Updates expected OpenRouter usage cost.
src/core/api/providers/__tests__/fireworks.test.ts Updates expected Fireworks usage cost.
cli/tests/tracing/JsonlTracer.test.ts Adds tracing redaction tests.
cli/src/utils/piped.ts Rebrands CLI examples in comments.
cli/src/init.ts Rebrands CLI telemetry host event.
cli/src/index.test.ts Updates CLI branding and removes kanban expectations.
cli/src/exports.ts Rebrands public API documentation.
cli/src/context/StdinContext.tsx Rebrands stdin comment example.
cli/src/context/StdinContext.test.tsx Documents required React import.
cli/DEVELOPMENT.md Rebrands CLI development documentation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +55 to +56
- Operating System: darwin
- Default Shell: /bin/zsh
Comment on lines +204 to +210
const totalTimeoutMs = sanitizeTimeout(req.timeoutMs, DEFAULT_LOCAL_ROUTER_TOTAL_TIMEOUT_MS)
const { controller, dispose: detachSignals } = combineAbortSignals([req.signal])
let timedOut = false
const totalTimer = setTimeout(() => {
timedOut = true
controller.abort(new LocalRouterTimeoutError("total", worker.id, totalTimeoutMs))
}, totalTimeoutMs)
@electron-rare electron-rare merged commit 05bca4d into master Jun 1, 2026
3 checks passed
@electron-rare electron-rare deleted the fix/preexisting-mocha-failures branch June 1, 2026 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants