Fix 25 pre-existing root mocha-suite failures#45
Merged
Conversation
Bearer credentials from plugin-supplied .mcp.json could leak in two ways: sent in cleartext to a public http:// server, or exfiltrated to a cloud-metadata endpoint via a crafted url. Add src/core/mcp/urlSecurity.ts with two guards: - shouldSendBearer: attach Authorization only over https, a private/ loopback/CGNAT host, or an ISAAC_MCP_ALLOW_HOSTS-listed host. - assertMcpUrlAllowed: refuse known cloud-metadata endpoints unless allowlisted. Wire both into the loader (fail-closed, validated before the server id is claimed so a rejected entry can't shadow a valid same-id server) and re-check at connect time as defense-in-depth. Also fixes the test-only type errors that broke check-types after the http-transport union.
chat() called fetch() with no AbortSignal or timeout, so an unresponsive worker would hang the agent indefinitely. The streaming path already has total + idle timers; the non-streaming path produces no chunks, so only a wall-clock total timeout applies. Mirror chatStream()'s combineAbortSignals pattern and wrap the AbortError as LocalRouterTimeoutError so callers can fall back on it.
scrubSecrets was applied to planner/tool fields but not to meta.json (gateway_url could carry inline creds, worker endpoints) nor to the errors[] array (an error string may echo a failed token). Scrub at the single persistMeta() point (returns a deep copy, so in-memory meta stays intact for later merges) and scrub errors[] in appendTurn. Note: api_conversation_history.json is deliberately NOT scrubbed here - it is replayed to resume a task, so redacting it would change behaviour; that is a separate decision.
The cli tsconfig uses the classic JSX runtime, so React must be in scope for the JSX in this test (tsc TS2686). biome's noUnusedImports didn't see the usage; add a scoped biome-ignore instead of dropping the import.
Two hardening fixes in the emulated tool-call parser: - Bash commands parsed from a model's markdown fence defaulted to requires_approval:false, letting e.g. 'rm -rf' auto-run. Default to true so they hit the approval gate; auto-approve modes still bypass. - The plain-function extractor interpolated the tool name into a RegExp unescaped; a name with regex metacharacters would corrupt the pattern. Escape it via escapeRegExp.
- ResponseCache evicted by LRU without first reclaiming expired entries, so a cache full of stale entries could evict a fresh one. Reclaim expired entries before the LRU eviction. - HealthMonitor shared one AbortController + 5s timer across the /health and /v1/models attempts; the fallback could start already-aborted and mark an up worker as down. Give each ping its own controller/timeout.
api_conversation_history.json is a plaintext copy of the full LLM exchange and may carry secrets. It is replayed to resume a task, so it is deliberately not scrubbed; restrict it to the owner instead via a new optional mode arg on atomicWriteFile.
Rebrand residual user-visible 'dirac' references that do not affect storage compat: telemetry event dirac_cli -> isaac_cli, exports JSDoc, CLI help fixtures in index.test, DEVELOPMENT.md command examples, and two code comments. Storage paths (~/.dirac, DIRAC_DIR, .diracrules) are intentionally left for backward compat per src/CLAUDE.md.
Pre-PR critic review (verdict ACCEPT-WITH-RESERVATIONS, agent a9be496e52ccc434a) found that http://[::ffff:169.254.169.254]/ and the hextet form ::ffff:a9fe:a9fe bypassed the metadata SSRF guard, and that IPv4-mapped private addresses were wrongly treated as public by the token-gate. normalizeHost now collapses IPv4-mapped IPv6 to dotted IPv4 and strips a trailing FQDN dot, so both guards see the real address. Adds tests for the mapped/trailing-dot evasions.
A prior change conditioned additionalProperties:false on strict=true, but the OpenAI tool-schema convention is to always set it at the schema root (strict only filters individual leaf params). Restore it unconditionally and refresh the 12 system-prompt snapshots, which were also stale after the tool-constraints and async-tools (get_tool_result) features landed without a snapshot update.
Three handler tests broke when the fork's code outgrew their mocks: - RenameSymbolToolHandler: mock diffViewProvider lacked show/hideReview (called on the manual-approval path); also resolve the relative displayPath against tmpDir in the saveChanges mock. - debug_syntax (EditFileToolHandler): mock config lacked messageState, whose getApiConversationHistory the handler calls in syncCache. - SubagentToolHandler: removeLastPartialMessageIfExistsWithType is now invoked with a third (undefined) onlyPartial arg; assert it exactly.
The fork dropped the upstream /askDirac command (index.ts: RAG over Dirac source, irrelevant here). The orphaned test hit the workflow path and threw 'StateManager has not been initialized'.
The shared OpenAI-compatible usage formatter now always emits totalCost. The Fireworks and OpenRouter tests predated that and omitted (or zeroed) the field; assert the actual computed cost for the default model price.
There was a problem hiding this comment.
Pull request overview
This PR updates root mocha-suite expectations and stale tests while also carrying stacked production changes from #44 around MCP hardening, tracing redaction, local-router reliability, storage permissions, and CLI rebranding.
Changes:
- Restores OpenAI tool-schema
additionalProperties: falsebehavior and refreshes prompt/tool snapshots. - Repairs stale handler/provider/slash-command tests and adds coverage for tracing and MCP URL security.
- Includes stacked production hardening for MCP HTTP URLs, tracing persistence, local-router timeouts/cache/approval defaults, and CLI branding.
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
src/services/local-router/ResponseCache.ts |
Reclaims expired entries before LRU eviction. |
src/services/local-router/LocalRouter.ts |
Adds non-streaming timeout guard, regex escaping, and safer bash-fence approval default. |
src/services/local-router/HealthMonitor.ts |
Gives each health-check attempt its own abort timeout. |
src/services/local-router/__tests__/LocalRouter.test.ts |
Updates bash-fence approval assertion. |
src/core/tracing/JsonlTracer.ts |
Scrubs errors and persisted meta before writing trace artifacts. |
src/core/task/tools/handlers/__tests__/SubagentToolHandler.test.ts |
Updates callback argument expectations. |
src/core/task/tools/handlers/__tests__/RenameSymbolToolHandler.test.ts |
Repairs diff-view mock behavior. |
src/core/task/tools/handlers/__tests__/debug_syntax.test.ts |
Adds missing message-state mock. |
src/core/storage/disk.ts |
Adds optional file mode support and writes API conversation history as 0600. |
src/core/slash-commands/__tests__/index.test.ts |
Removes obsolete /askDirac test. |
src/core/prompts/system-prompt/spec.ts |
Always emits OpenAI root additionalProperties: false. |
src/core/prompts/system-prompt/__tests__/__snapshots__/vertex_gemini_3.tools.snap |
Refreshes Vertex tool snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/vertex_gemini_3-no-browser.snap |
Refreshes Vertex no-browser prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/vertex_gemini_3-basic.snap |
Refreshes Vertex basic prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/openai_gpt_5.tools.snap |
Refreshes OpenAI tool snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/openai_gpt_5-no-browser.snap |
Refreshes OpenAI no-browser prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/openai_gpt_5-basic.snap |
Refreshes OpenAI basic prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/gemini_gemini_3.tools.snap |
Refreshes Gemini tool snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/gemini_gemini_3-no-browser.snap |
Refreshes Gemini no-browser prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/gemini_gemini_3-basic.snap |
Refreshes Gemini basic prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/anthropic_claude_4_5_sonnet.tools.snap |
Refreshes Anthropic tool snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/anthropic_claude_4_5_sonnet-no-browser.snap |
Refreshes Anthropic no-browser prompt snapshot. |
src/core/prompts/system-prompt/__tests__/__snapshots__/anthropic_claude_4_5_sonnet-basic.snap |
Refreshes Anthropic basic prompt snapshot. |
src/core/mcp/urlSecurity.ts |
Adds MCP URL SSRF and bearer token-gate helpers. |
src/core/mcp/McpServerConfigLoader.ts |
Applies URL security checks while loading plugin MCP configs. |
src/core/mcp/McpClientManager.ts |
Rechecks MCP URL security before connecting. |
src/core/mcp/__tests__/urlSecurity.test.ts |
Adds MCP URL security unit tests. |
src/core/mcp/__tests__/McpServerConfigLoader.test.ts |
Updates type assertions and adds loader security tests. |
src/core/api/providers/__tests__/openrouter.test.ts |
Updates expected OpenRouter usage cost. |
src/core/api/providers/__tests__/fireworks.test.ts |
Updates expected Fireworks usage cost. |
cli/tests/tracing/JsonlTracer.test.ts |
Adds tracing redaction tests. |
cli/src/utils/piped.ts |
Rebrands CLI examples in comments. |
cli/src/init.ts |
Rebrands CLI telemetry host event. |
cli/src/index.test.ts |
Updates CLI branding and removes kanban expectations. |
cli/src/exports.ts |
Rebrands public API documentation. |
cli/src/context/StdinContext.tsx |
Rebrands stdin comment example. |
cli/src/context/StdinContext.test.tsx |
Documents required React import. |
cli/DEVELOPMENT.md |
Rebrands CLI development documentation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+55
to
+56
| - Operating System: darwin | ||
| - Default Shell: /bin/zsh |
Comment on lines
+204
to
+210
| const totalTimeoutMs = sanitizeTimeout(req.timeoutMs, DEFAULT_LOCAL_ROUTER_TOTAL_TIMEOUT_MS) | ||
| const { controller, dispose: detachSignals } = combineAbortSignals([req.signal]) | ||
| let timedOut = false | ||
| const totalTimer = setTimeout(() => { | ||
| timedOut = true | ||
| controller.abort(new LocalRouterTimeoutError("total", worker.id, totalTimeoutMs)) | ||
| }, totalTimeoutMs) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix the 25 pre-existing root-suite (
test:unit:mocha) failuresThe root mocha suite had 25 failing tests on
master(unrelated to any feature — stale tests/mocks/snapshots accumulated as the fork diverged from Cline/Dirac). This branch takes the suite to 1354 passing / 0 failing under Node LTS.Root causes & fixes (each an atomic commit)
fix(prompts): always emit additionalProperties— the only production change.A prior commit conditioned
additionalProperties: falseonstrict === truein the OpenAI tool-schema converter (spec.ts). The convention is to set it unconditionally at the schema root (thestrictflag governs server-side enforcement, not whether to declare it). Restored + refreshed the 12 system-prompt snapshots, which were also stale after thetool-constraintsand async-tools (get_tool_result) features had landed without a snapshot update.test(handlers): repair stale handler-test mocks— mocks that the fork's code outgrew:diffViewProviderlackedshow/hideReview(manual-approval path);saveChangesmock now resolves the relativedisplayPath(whichDiffViewProvider.openresolves internally in prod — verified, not a handler bug).configlackedmessageState(whosegetApiConversationHistoryis called insyncCache).removeLastPartialMessageIfExistsWithTypenow receives a thirdundefinedonlyPartialarg.test(slash-commands): drop removed askDirac test— the fork deliberately dropped the upstream/askDiraccommand (index.ts: RAG over Dirac source, irrelevant here); the orphaned test hit the workflow path and threwStateManager has not been initialized.test(providers): assert totalCost in usage chunks— the shared OpenAI-compatible usage formatter now always emitstotalCost; the Fireworks/OpenRouter tests predated that.Verification
check-types,lint(biome),test:unit:mcp(76),test:unit:mocha1354 passing / 0 failing — all under Node 22 (.nvmrc lts/*;npm rebuild better-sqlite3required for the native binding).spec.tschange is correct per OpenAI spec, the RenameSymbol relative-path is not a masked bug, and no test was weakened to pass. Two LOW notes (a brittleundefined-arg assertion; snapshot key-order churn) deferred.Not addressed
The 25 fixes are scoped to making the suite green without changing behavior. Snapshot key-order normalization (to reduce future churn) is left as a follow-up.