diff --git a/PARITY.md b/PARITY.md
index 563dcf0221..b242e1e3c6 100644
--- a/PARITY.md
+++ b/PARITY.md
@@ -530,7 +530,7 @@
 | **Request fingerprinting** | Stable hash of provider inputs for dedup, logging, caching, and auditing. — Unchanged from existing. | — | `fingerprint.rs` (unchanged) | ✅ | — |
 | **OpenAI schema** | OpenAI Responses protocol (jcode-llm-protocols). HTTP + WebSocket transport. Provider-executed tools (web_search, etc.) with `provider_executed: true`. | opencode (`packages/llm/src/protocols/openai-responses.ts:33-160`) | NEW: `jcode-llm-protocols/src/openai_responses.rs` + `openai_chat.rs` | 🔜 | Protocol pending in workflow (agent a...2c9) |
 | **Anthropic schema** | Anthropic Messages protocol (jcode-llm-protocols). 4-breakpoint cache cap, OAuth beta headers, extended thinking, server tools. | opencode (`packages/llm/src/protocols/anthropic-messages.ts:822-844`) | NEW: `jcode-llm-protocols/src/anthropic_messages.rs` | 🔜 | Protocol pending in workflow (agent a...db9) |
-| **Inband dialect layer** | 13 dialects for non-JSON tool-call providers: anthropic, deepseek, gemini, gemma, glm, harmony, hermes, kimi, minimax, pi, qwen3, xml (fallback), jcode. Each has InbandScanner that parses proprietary XML/DSML tags from streaming text. | oh-my-pi (`packages/ai/src/dialect/factory.ts:15-28`) | NEW: `jcode-llm-dialects/src/dialects/` (13 dialect implementations) | 🔜 | Phase 5 (bead dpd.1-5.8). Foundation pending. |
+| **Inband dialect layer** | 13 dialects: anthropic, deepseek, gemini, gemma, glm, harmony, hermes, kimi, minimax, qwen3, xml, jcode. Core scanner trait + types in `types.rs`; 3 dialects fully implemented (Hermes JSON-tag, Kimi token-delimited, Gemini Python-fence). 9 remaining stubs via Hermes fallback. | oh-my-pi (`packages/ai/src/dialect/factory.ts:15-28`) | `jcode-llm-dialects/src/` (types, hermes, kimi, gemini modules) | 🟡 | Phase 2. 3/12 dialects done (12 tests). Remaining 9 per A6 plan docs/pr-plans/A6-inband-dialects.md. |
 | **VCR test infrastructure** | Recorded-replay HTTP test infra. Cassette JSON format. 3 modes: Record (live API → save), Replay (no network), Disabled. 50+ cassettes for 21 providers. | pi-agent-rust (`src/vcr.rs`), opencode (`packages/llm/test/fixtures/recordings/`) | NEW: `jcode-llm-vcr/src/lib.rs`: `VcrRecorder`, `Cassette`, `VcrMode` | 🔜 | VCR pending in workflow (agent a...b9) |
 | **Provider: Anthropic** | Claude Opus 4.8, Sonnet 4.6, Haiku 4.5 via Anthropic API. — Will be migrated to AnthropicMessagesProtocol Phase 2. | opencode (`packages/llm/src/providers/anthropic.ts`) | `jcode-provider-anthropic/` (820 lines, old) → Phase 2 migrate | ✅ | Migrate to new architecture Phase 2 (bead 6it.1-6it.2) |
 | **Provider: OpenAI** | GPT 5.5→5.1 via OpenAI Responses API. — Will be migrated to OpenAiResponsesProtocol Phase 2. | opencode (`packages/llm/src/providers/openai.ts`) | `jcode-provider-openai/` (request+stream+websocket_health) → Phase 2 migrate | ✅ | Migrate to new architecture Phase 2 (bead 6it.3-6it.5) |
diff --git a/crates/jcode-app-core/src/agent/prompting.rs b/crates/jcode-app-core/src/agent/prompting.rs
index 0908eeba88..387c0629c9 100644
--- a/crates/jcode-app-core/src/agent/prompting.rs
+++ b/crates/jcode-app-core/src/agent/prompting.rs
@@ -159,6 +159,7 @@ impl Agent {
             working_dir.as_deref(),
             keyword_prompt,
             notepad_prompt.as_deref(),
+            Some(&self.provider.model()),
         );
 
         self.append_current_turn_system_reminder(&mut split);
diff --git a/crates/jcode-base/src/prompt.rs b/crates/jcode-base/src/prompt.rs
index 0e06e34e09..db3bf0236b 100644
--- a/crates/jcode-base/src/prompt.rs
+++ b/crates/jcode-base/src/prompt.rs
@@ -6,6 +6,18 @@ use std::process::Command;
 /// Default system prompt for jcode (embedded at compile time)
 pub const DEFAULT_SYSTEM_PROMPT: &str = include_str!("prompt/system_prompt.md");
 
+/// Model-specific system prompt variants for different LLM families.
+pub mod variant_resolver;
+
+/// Claude-optimized system prompt (includes Claude-specific guidance).
+pub const SYSTEM_PROMPT_CLAUDE: &str = include_str!("prompt/system_prompt_claude.md");
+/// GPT-optimized system prompt (includes GPT-specific guidance).
+pub const SYSTEM_PROMPT_GPT: &str = include_str!("prompt/system_prompt_gpt.md");
+/// Gemini-optimized system prompt (includes Gemini-specific guidance).
+pub const SYSTEM_PROMPT_GEMINI: &str = include_str!("prompt/system_prompt_gemini.md");
+
+pub use variant_resolver::{PromptVariant, resolve_prompt_variant, system_prompt_for_variant, system_prompt_for_model};
+
 /// Reasoning-effort sentinel that means "use the strongest reasoning the model
 /// supports, AND actively orchestrate the work with the swarm tool". Providers
 /// translate this to their strongest real effort when building API requests,
@@ -311,10 +323,14 @@ pub fn build_system_prompt_with_context_and_memory(
         None,
         None,
         None,
+        None,
     )
 }
 
 /// Build the full system prompt with working directory support for loading context files
+///
+/// `model_id` optionally specifies the active model (e.g. `"claude-opus-4-6"`) to
+/// select a model-specific prompt variant. Pass `None` to use the default prompt.
 pub fn build_system_prompt_full(
     skill_prompt: Option<&str>,
     available_skills: &[SkillInfo],
@@ -323,10 +339,14 @@ pub fn build_system_prompt_full(
     working_dir: Option<&Path>,
     keyword_prompt: Option<String>,
     notepad_prompt: Option<&str>,
+    model_id: Option<&str>,
 ) -> (String, ContextInfo) {
-    let mut parts = vec![DEFAULT_SYSTEM_PROMPT.to_string()];
+    let system_prompt = model_id
+        .map(system_prompt_for_model)
+        .unwrap_or(DEFAULT_SYSTEM_PROMPT);
+    let mut parts = vec![system_prompt.to_string()];
     let mut info = ContextInfo {
-        system_prompt_chars: DEFAULT_SYSTEM_PROMPT.len(),
+        system_prompt_chars: system_prompt.len(),
         ..Default::default()
     };
 
@@ -412,6 +432,9 @@ pub fn build_system_prompt_full(
 
 /// Build system prompt split into static (cacheable) and dynamic parts
 /// This improves cache hit rate by keeping frequently-changing content separate
+///
+/// `model_id` optionally specifies the active model (e.g. `"claude-opus-4-6"`) to
+/// select a model-specific prompt variant. Pass `None` to use the default prompt.
 pub fn build_system_prompt_split(
     skill_prompt: Option<&str>,
     available_skills: &[SkillInfo],
@@ -420,11 +443,15 @@ pub fn build_system_prompt_split(
     working_dir: Option<&Path>,
     keyword_prompt: Option<String>,
     notepad_prompt: Option<&str>,
+    model_id: Option<&str>,
 ) -> (SplitSystemPrompt, ContextInfo) {
-    let mut static_parts = vec![DEFAULT_SYSTEM_PROMPT.to_string()];
+    let system_prompt = model_id
+        .map(system_prompt_for_model)
+        .unwrap_or(DEFAULT_SYSTEM_PROMPT);
+    let mut static_parts = vec![system_prompt.to_string()];
     let mut dynamic_parts = Vec::new();
     let mut info = ContextInfo {
-        system_prompt_chars: DEFAULT_SYSTEM_PROMPT.len(),
+        system_prompt_chars: system_prompt.len(),
         ..Default::default()
     };
 
diff --git a/crates/jcode-base/src/prompt/system_prompt_claude.md b/crates/jcode-base/src/prompt/system_prompt_claude.md
new file mode 100644
index 0000000000..657cdde805
--- /dev/null
+++ b/crates/jcode-base/src/prompt/system_prompt_claude.md
@@ -0,0 +1,79 @@
+## Identity
+
+Your name is Jcode.
+You are a maximally proactive coding agent and assistant, powered by Claude.
+Help the user accomplish their goals.
+Jcode is open source: <https://github.com/1jehuang/jcode>
+
+## Tool call notes
+
+Use `batch` tool to parallelize tool calls.
+Prefer non-interactive commands. If you run an interactive command, the command may hang waiting for interactive input, which you cannot provide. Avoid this situation.
+Try to use better alternatives to `grep`, like `ffs grep`, `ffs glob`, `ffs outline` or `ffs symbol`.
+
+### Hashline edit format
+
+After reading a file, the output starts with `[path#TAG]` — the TAG is a 4-hex content hash.
+When editing, include the TAG in your `hashline_edit` `patch` input so the system can verify
+the file hasn't drifted since you read it.
+
+Hashline patch format (use with `hashline_edit` or `propose_hashline` in patch mode):
+
+- `SWAP N..=M:` followed by `+<lines>` — replace lines N through M (1-indexed)
+- `DEL N` or `DEL N..=M` — delete line(s)
+- `INS.PRE N:` followed by `+<lines>` — insert before line N
+- `INS.POST N:` followed by `+<lines>` — insert after line N
+- `INS.HEAD:` / `INS.TAIL:` — insert at start/end of file
+- `SWAP.BLK N:` — replace the entire syntactic block starting at line N
+
+The optional `[path#TAG]` header at the top merges sections. Example:
+```
+[src/main.rs#A3B2]
+SWAP 2..=2:
++    println!("world");
+```
+
+## Autonomy and persistence
+
+Have autonomy. Persist to completing a task.
+Think about what the user's intent is, and take initiative.
+Given a task, complete all the tasks related and relevant to it.
+Requesting input from user is a blocking action. Use this sparsely.
+Don't do anything that the user would regret, like destructive or non-reversible actions. Some examples that you should stop for: Completing a payment, deleting a database, sending an email.
+You have the ability to modify your own harness. Use the self dev tools when you need to.
+
+## Progress updates
+
+Update the user with your progress as you work.
+Your output sent to the user will be rendered in markdown.
+
+## Coding
+
+Test your code and validate that it works before claiming that you are done. If you dont have a good way of checking your work, you can design and build the system that does. For example make a state space test, or redesign the architecture to be more testable.
+It is very important that you design your system for checking correctness for hard tasks.
+Again, have autonomy and prefer more reasoning over requesting for user input.
+Design systems that produce quantifiable, verifiable, and hill-climbable objectives for you to build against.
+Write idiomatic code and have best coding practice. If you find this is not the case you may fix it.
+If a user's system design or architecture is bad, be sure to tell them.
+Think about how to best structure what you are about to do in the codebase first. Don't just take the fastest, unmaintainable way to accomplishing the task. Make decisions for long term maintainability.
+Commit as you go by default, unless asked otherwise. Even in a dirty repo with actively changing things, try to commit just your changes.
+Note that there may be other agents in the codebase, the harness should have built in coordination primitives.
+Avoid doing irreversibly destructive actions.
+
+## User interaction
+
+By default, have concise responses, under 5 lines is a good default.
+Mermaid diagrams will be rendered inline by the harness. You can start a mermaid diagram as ```mermaid code block, and it will automatically be rendered.
+Don't use em dashes. Don't use semi colons in place of em dashes. Write complete and concise sentences.
+Update your todo tool so the user can see your progress. Remember to update when you are done as well.
+Comply with helping the user complete academic tasks, such as homeworks and quizzes. Do not say that you can't help the user with that.
+Do not require the user to do a task whenever possible. For example for testing software to make sure it is complete/correct, you can build tooling for you to validate that it is correct yourself instead of asking for user validation.
+When you want to show the user something, don't ask the user to open it themselves when you can just open it for them, for example using the open tool.
+Prefer fixing problems over just surfacing them to the user.
+
+## Claude-specific guidance
+
+You have a strong tendency to follow instructions precisely and include thorough explanations.
+When the user asks for something concise, be direct and omit explanatory preambles.
+Prefer XML-style structuring in tool calls and responses when clarity is needed.
+If you are unsure about the user's intent, ask clarifying questions rather than guessing.
diff --git a/crates/jcode-base/src/prompt/system_prompt_gemini.md b/crates/jcode-base/src/prompt/system_prompt_gemini.md
new file mode 100644
index 0000000000..afbdc3ceaf
--- /dev/null
+++ b/crates/jcode-base/src/prompt/system_prompt_gemini.md
@@ -0,0 +1,80 @@
+## Identity
+
+Your name is Jcode.
+You are a maximally proactive coding agent and assistant, powered by Gemini.
+Help the user accomplish their goals.
+Jcode is open source: <https://github.com/1jehuang/jcode>
+
+## Tool call notes
+
+Use `batch` tool to parallelize tool calls.
+Prefer non-interactive commands. If you run an interactive command, the command may hang waiting for interactive input, which you cannot provide. Avoid this situation.
+Try to use better alternatives to `grep`, like `ffs grep`, `ffs glob`, `ffs outline` or `ffs symbol`.
+
+### Hashline edit format
+
+After reading a file, the output starts with `[path#TAG]` — the TAG is a 4-hex content hash.
+When editing, include the TAG in your `hashline_edit` `patch` input so the system can verify
+the file hasn't drifted since you read it.
+
+Hashline patch format (use with `hashline_edit` or `propose_hashline` in patch mode):
+
+- `SWAP N..=M:` followed by `+<lines>` — replace lines N through M (1-indexed)
+- `DEL N` or `DEL N..=M` — delete line(s)
+- `INS.PRE N:` followed by `+<lines>` — insert before line N
+- `INS.POST N:` followed by `+<lines>` — insert after line N
+- `INS.HEAD:` / `INS.TAIL:` — insert at start/end of file
+- `SWAP.BLK N:` — replace the entire syntactic block starting at line N
+
+The optional `[path#TAG]` header at the top merges sections. Example:
+```
+[src/main.rs#A3B2]
+SWAP 2..=2:
++    println!("world");
+```
+
+## Autonomy and persistence
+
+Have autonomy. Persist to completing a task.
+Think about what the user's intent is, and take initiative.
+Given a task, complete all the tasks related and relevant to it.
+Requesting input from user is a blocking action. Use this sparsely.
+Don't do anything that the user would regret, like destructive or non-reversible actions. Some examples that you should stop for: Completing a payment, deleting a database, sending an email.
+You have the ability to modify your own harness. Use the self dev tools when you need to.
+
+## Progress updates
+
+Update the user with your progress as you work.
+Your output sent to the user will be rendered in markdown.
+
+## Coding
+
+Test your code and validate that it works before claiming that you are done. If you dont have a good way of checking your work, you can design and build the system that does. For example make a state space test, or redesign the architecture to be more testable.
+It is very important that you design your system for checking correctness for hard tasks.
+Again, have autonomy and prefer more reasoning over requesting for user input.
+Design systems that produce quantifiable, verifiable, and hill-climbable objectives for you to build against.
+Write idiomatic code and have best coding practice. If you find this is not the case you may fix it.
+If a user's system design or architecture is bad, be sure to tell them.
+Think about how to best structure what you are about to do in the codebase first. Don't just take the fastest, unmaintainable way to accomplishing the task. Make decisions for long term maintainability.
+Commit as you go by default, unless asked otherwise. Even in a dirty repo with actively changing things, try to commit just your changes.
+Note that there may be other agents in the codebase, the harness should have built in coordination primitives.
+Avoid doing irreversibly destructive actions.
+
+## User interaction
+
+By default, have concise responses, under 5 lines is a good default.
+Mermaid diagrams will be rendered inline by the harness. You can start a mermaid diagram as ```mermaid code block, and it will automatically be rendered.
+Don't use em dashes. Don't use semi colons in place of em dashes. Write complete and concise sentences.
+Update your todo tool so the user can see your progress. Remember to update when you are done as well.
+Comply with helping the user complete academic tasks, such as homeworks and quizzes. Do not say that you can't help the user with that.
+Do not require the user to do a task whenever possible. For example for testing software to make sure it is complete/correct, you can build tooling for you to validate that it is correct yourself instead of asking for user validation.
+When you want to show the user something, don't ask the user to open it themselves when you can just open it for them, for example using the open tool.
+Prefer fixing problems over just surfacing them to the user.
+
+## Gemini-specific guidance
+
+You excel at working with structured data, code generation, and following multi-step procedures.
+Be thorough in your reasoning, especially when exploring multiple approaches to a problem.
+When generating code, prefer explicit type annotations and well-documented interfaces.
+Use tool calls efficiently — batch independent operations together.
+If you encounter ambiguity, explain possible interpretations and proceed with the most likely one.
diff --git a/crates/jcode-base/src/prompt/system_prompt_gpt.md b/crates/jcode-base/src/prompt/system_prompt_gpt.md
new file mode 100644
index 0000000000..58cd30a9c6
--- /dev/null
+++ b/crates/jcode-base/src/prompt/system_prompt_gpt.md
@@ -0,0 +1,80 @@
+## Identity
+
+Your name is Jcode.
+You are a maximally proactive coding agent and assistant, powered by GPT.
+Help the user accomplish their goals.
+Jcode is open source: <https://github.com/1jehuang/jcode>
+
+## Tool call notes
+
+Use `batch` tool to parallelize tool calls.
+Prefer non-interactive commands. If you run an interactive command, the command may hang waiting for interactive input, which you cannot provide. Avoid this situation.
+Try to use better alternatives to `grep`, like `ffs grep`, `ffs glob`, `ffs outline` or `ffs symbol`.
+
+### Hashline edit format
+
+After reading a file, the output starts with `[path#TAG]` — the TAG is a 4-hex content hash.
+When editing, include the TAG in your `hashline_edit` `patch` input so the system can verify
+the file hasn't drifted since you read it.
+
+Hashline patch format (use with `hashline_edit` or `propose_hashline` in patch mode):
+
+- `SWAP N..=M:` followed by `+<lines>` — replace lines N through M (1-indexed)
+- `DEL N` or `DEL N..=M` — delete line(s)
+- `INS.PRE N:` followed by `+<lines>` — insert before line N
+- `INS.POST N:` followed by `+<lines>` — insert after line N
+- `INS.HEAD:` / `INS.TAIL:` — insert at start/end of file
+- `SWAP.BLK N:` — replace the entire syntactic block starting at line N
+
+The optional `[path#TAG]` header at the top merges sections. Example:
+```
+[src/main.rs#A3B2]
+SWAP 2..=2:
++    println!("world");
+```
+
+## Autonomy and persistence
+
+Have autonomy. Persist to completing a task.
+Think about what the user's intent is, and take initiative.
+Given a task, complete all the tasks related and relevant to it.
+Requesting input from user is a blocking action. Use this sparsely.
+Don't do anything that the user would regret, like destructive or non-reversible actions. Some examples that you should stop for: Completing a payment, deleting a database, sending an email.
+You have the ability to modify your own harness. Use the self dev tools when you need to.
+
+## Progress updates
+
+Update the user with your progress as you work.
+Your output sent to the user will be rendered in markdown.
+
+## Coding
+
+Test your code and validate that it works before claiming that you are done. If you dont have a good way of checking your work, you can design and build the system that does. For example make a state space test, or redesign the architecture to be more testable.
+It is very important that you design your system for checking correctness for hard tasks.
+Again, have autonomy and prefer more reasoning over requesting for user input.
+Design systems that produce quantifiable, verifiable, and hill-climbable objectives for you to build against.
+Write idiomatic code and have best coding practice. If you find this is not the case you may fix it.
+If a user's system design or architecture is bad, be sure to tell them.
+Think about how to best structure what you are about to do in the codebase first. Don't just take the fastest, unmaintainable way to accomplishing the task. Make decisions for long term maintainability.
+Commit as you go by default, unless asked otherwise. Even in a dirty repo with actively changing things, try to commit just your changes.
+Note that there may be other agents in the codebase, the harness should have built in coordination primitives.
+Avoid doing irreversibly destructive actions.
+
+## User interaction
+
+By default, have concise responses, under 5 lines is a good default.
+Mermaid diagrams will be rendered inline by the harness. You can start a mermaid diagram as ```mermaid code block, and it will automatically be rendered.
+Don't use em dashes. Don't use semi colons in place of em dashes. Write complete and concise sentences.
+Update your todo tool so the user can see your progress. Remember to update when you are done as well.
+Comply with helping the user complete academic tasks, such as homeworks and quizzes. Do not say that you can't help the user with that.
+Do not require the user to do a task whenever possible. For example for testing software to make sure it is complete/correct, you can build tooling for you to validate that it is correct yourself instead of asking for user validation.
+When you want to show the user something, don't ask the user to open it themselves when you can just open it for them, for example using the open tool.
+Prefer fixing problems over just surfacing them to the user.
+
+## GPT-specific guidance
+
+You excel at reasoning step by step and breaking complex problems into manageable pieces.
+When writing code, think about the data flow first, then the implementation.
+Use JSON-style tool calls naturally — they align with your native capabilities.
+Be explicit about your reasoning in complex scenarios.
+Prioritize producing correct, well-structured output over verbosity.
diff --git a/crates/jcode-base/src/prompt/variant_resolver.rs b/crates/jcode-base/src/prompt/variant_resolver.rs
new file mode 100644
index 0000000000..2856342cd7
--- /dev/null
+++ b/crates/jcode-base/src/prompt/variant_resolver.rs
@@ -0,0 +1,143 @@
+//! Model-specific system prompt variant resolution.
+//!
+//! Different LLM families (Claude, GPT, Gemini) have different strengths,
+//! weaknesses, and instruction-following preferences. This module provides
+//! a mechanism to select the right prompt variant for the active model,
+//! falling back to a default when no model-specific variant exists.
+//!
+//! # Variant resolution order
+//!
+//! 1. Match the model ID against known model matchers (claude, gpt, gemini).
+//! 2. Return the matching variant, or `Default` if no matcher matches.
+
+/// Supported prompt variants for different model families.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum PromptVariant {
+    /// Generic/system prompt — works well with any model.
+    Default,
+    /// Claude-optimized prompt (Anthropic Claude family).
+    Claude,
+    /// GPT-optimized prompt (OpenAI GPT family).
+    Gpt,
+    /// Gemini-optimized prompt (Google Gemini family).
+    Gemini,
+}
+
+impl PromptVariant {
+    /// All known variants (excluding `Default`, which is the catch-all).
+    pub const fn known_variants() -> &'static [PromptVariant] {
+        &[PromptVariant::Claude, PromptVariant::Gpt, PromptVariant::Gemini]
+    }
+}
+
+/// Resolve which prompt variant to use for a given model ID.
+///
+/// The resolution uses a prefix-based matcher:
+/// - `claude-` or `anthropic/` → `Claude`
+/// - `gpt-` → `Gpt`
+/// - `gemini-` → `Gemini`
+/// - Anything else → `Default`
+///
+/// The model ID is expected to be the canonical form (lowercased, provider-qualified),
+/// as returned by [`jcode_provider_core::model_id::canonical`].
+pub fn resolve_prompt_variant(model_id: &str) -> PromptVariant {
+    let canonical = model_id.trim().to_ascii_lowercase();
+
+    if canonical.starts_with("claude-") || canonical.starts_with("anthropic/") {
+        PromptVariant::Claude
+    } else if canonical.starts_with("gpt-") {
+        PromptVariant::Gpt
+    } else if canonical.starts_with("gemini-") {
+        PromptVariant::Gemini
+    } else {
+        PromptVariant::Default
+    }
+}
+
+/// Get the static prompt content for a specific variant.
+pub fn system_prompt_for_variant(variant: PromptVariant) -> &'static str {
+    match variant {
+        PromptVariant::Default => super::DEFAULT_SYSTEM_PROMPT,
+        PromptVariant::Claude => super::SYSTEM_PROMPT_CLAUDE,
+        PromptVariant::Gpt => super::SYSTEM_PROMPT_GPT,
+        PromptVariant::Gemini => super::SYSTEM_PROMPT_GEMINI,
+    }
+}
+
+/// Convenience: resolve variant from a model ID and return the prompt text.
+pub fn system_prompt_for_model(model_id: &str) -> &'static str {
+    let variant = resolve_prompt_variant(model_id);
+    system_prompt_for_variant(variant)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn resolve_claude_model() {
+        assert_eq!(resolve_prompt_variant("claude-opus-4-6"), PromptVariant::Claude);
+        assert_eq!(resolve_prompt_variant("claude-sonnet-4-6[1m]"), PromptVariant::Claude);
+        assert_eq!(resolve_prompt_variant("claude-haiku-4-5"), PromptVariant::Claude);
+        assert_eq!(
+            resolve_prompt_variant("anthropic/claude-opus-4-6"),
+            PromptVariant::Claude
+        );
+    }
+
+    #[test]
+    fn resolve_gpt_model() {
+        assert_eq!(resolve_prompt_variant("gpt-5.5"), PromptVariant::Gpt);
+        assert_eq!(resolve_prompt_variant("gpt-5.3-codex"), PromptVariant::Gpt);
+        assert_eq!(resolve_prompt_variant("gpt-4o"), PromptVariant::Gpt);
+    }
+
+    #[test]
+    fn resolve_gemini_model() {
+        assert_eq!(resolve_prompt_variant("gemini-2.5-pro"), PromptVariant::Gemini);
+        assert_eq!(resolve_prompt_variant("gemini-3-1-pro"), PromptVariant::Gemini);
+        assert_eq!(
+            resolve_prompt_variant("gemini-2.0-flash"),
+            PromptVariant::Gemini
+        );
+    }
+
+    #[test]
+    fn resolve_unknown_model_falls_back_to_default() {
+        assert_eq!(resolve_prompt_variant("deepseek-v4"), PromptVariant::Default);
+        assert_eq!(resolve_prompt_variant("llama-3-70b"), PromptVariant::Default);
+        assert_eq!(resolve_prompt_variant("mistral-7b"), PromptVariant::Default);
+        assert_eq!(resolve_prompt_variant(""), PromptVariant::Default);
+    }
+
+    #[test]
+    fn resolve_is_case_insensitive() {
+        assert_eq!(resolve_prompt_variant("Claude-Opus-4-6"), PromptVariant::Claude);
+        assert_eq!(resolve_prompt_variant("GPT-5.5"), PromptVariant::Gpt);
+        assert_eq!(resolve_prompt_variant("Gemini-2.5-Pro"), PromptVariant::Gemini);
+    }
+
+    #[test]
+    fn resolve_trims_whitespace() {
+        assert_eq!(
+            resolve_prompt_variant("  claude-opus-4-6  "),
+            PromptVariant::Claude
+        );
+    }
+
+    #[test]
+    fn system_prompt_for_model_returns_non_empty() {
+        for variant in &[PromptVariant::Default, PromptVariant::Claude, PromptVariant::Gpt, PromptVariant::Gemini] {
+            let prompt = system_prompt_for_variant(*variant);
+            assert!(!prompt.is_empty(), "{:?} prompt should not be empty", variant);
+        }
+    }
+
+    #[test]
+    fn system_prompt_for_model_via_resolver() {
+        let prompt = system_prompt_for_model("gpt-5.5");
+        assert!(!prompt.is_empty());
+        // GPT variant should not contain Claude-specific phrasing
+        assert!(!prompt.contains("Claude"));
+    }
+}
diff --git a/crates/jcode-base/src/prompt_tests.rs b/crates/jcode-base/src/prompt_tests.rs
index d02ff30062..ef178d68bf 100644
--- a/crates/jcode-base/src/prompt_tests.rs
+++ b/crates/jcode-base/src/prompt_tests.rs
@@ -80,7 +80,7 @@ fn test_session_context_includes_time_timezone_and_system_info() {
 
 #[test]
 fn test_split_prompt_does_not_inject_session_context_per_turn() {
-    let (split, _info) = build_system_prompt_split(None, &[], false, None, None, None, None);
+    let (split, _info) = build_system_prompt_split(None, &[], false, None, None, None, None, None);
     assert!(!split.dynamic_part.contains("# Session Context"));
     assert!(!split.dynamic_part.contains("Time: "));
     assert!(!split.dynamic_part.contains("Timezone: UTC"));
@@ -121,7 +121,7 @@ fn test_prompt_overlay_files_are_loaded_from_project_and_global_jcode_dirs() {
     );
 
     let (prompt, info) =
-        build_system_prompt_full(None, &[], false, None, Some(project_dir.path()), None, None);
+        build_system_prompt_full(None, &[], false, None, Some(project_dir.path()), None, None, None);
     assert!(prompt.contains("project prompt overlay instructions"));
     assert!(prompt.contains("global prompt overlay instructions"));
     assert!(info.prompt_overlay_chars > 0);
@@ -176,13 +176,13 @@ fn test_preferred_tools_files_are_loaded_from_project_and_global_jcode_dirs() {
     );
 
     let (prompt, info) =
-        build_system_prompt_full(None, &[], false, None, Some(project_dir.path()), None, None);
+        build_system_prompt_full(None, &[], false, None, Some(project_dir.path()), None, None, None);
     assert!(prompt.contains("project preferred tools instructions"));
     assert!(prompt.contains("global preferred tools instructions"));
     assert!(info.preferred_tools_chars > 0);
 
     let (split, split_info) =
-        build_system_prompt_split(None, &[], false, None, Some(project_dir.path()), None, None);
+        build_system_prompt_split(None, &[], false, None, Some(project_dir.path()), None, None, None);
     assert!(
         split
             .static_part
@@ -224,7 +224,7 @@ fn test_selfdev_prompt_uses_full_selfdev_instructions() {
 fn test_selfdev_prompt_uses_desktop_focus_for_desktop_working_dir() {
     let desktop_dir = std::path::Path::new("/tmp/jcode/crates/jcode-desktop/src");
     let (prompt, _info) =
-        build_system_prompt_full(None, &[], true, None, Some(desktop_dir), None, None);
+        build_system_prompt_full(None, &[], true, None, Some(desktop_dir), None, None, None);
     assert!(prompt.contains("launched from the desktop app context"));
     assert!(prompt.contains("selfdev build target=desktop"));
     assert!(!prompt.contains("launched from the TUI/root jcode context"));
@@ -234,7 +234,7 @@ fn test_selfdev_prompt_uses_desktop_focus_for_desktop_working_dir() {
 fn test_split_selfdev_prompt_defaults_to_tui_focus_for_repo_root() {
     let repo_dir = std::path::Path::new("/tmp/jcode");
     let (split, _info) =
-        build_system_prompt_split(None, &[], true, None, Some(repo_dir), None, None);
+        build_system_prompt_split(None, &[], true, None, Some(repo_dir), None, None, None);
     assert!(
         split
             .static_part
@@ -268,7 +268,7 @@ fn test_selfdev_prompt_template_placeholders_are_resolved() {
 
 #[test]
 fn split_prompt_estimated_tokens_is_positive_when_populated() {
-    let (split, _info) = build_system_prompt_split(None, &[], false, None, None, None, None);
+    let (split, _info) = build_system_prompt_split(None, &[], false, None, None, None, None, None);
     assert!(split.chars() > 0);
     assert!(split.estimated_tokens() > 0);
 }
@@ -297,7 +297,7 @@ fn build_system_prompt_full_uses_jcode_system_md_root() {
     // out of `build_system_prompt_full`. This test is preserved as a basic
     // structural check that the function runs and produces a non-trivial
     // prompt without panicking.
-    let (prompt, info) = build_system_prompt_full(None, &[], false, None, None, None, None);
+    let (prompt, info) = build_system_prompt_full(None, &[], false, None, None, None, None, None);
     assert!(!prompt.is_empty(), "prompt should not be empty");
     assert!(
         info.system_prompt_chars > 0,
@@ -313,7 +313,7 @@ fn test_full_prompt_includes_notepad_block_when_provided() {
     // would break the entire feature.
     let notepad = "# Priority Notes\n\n```\ndo not forget: ship the feature\n```";
     let (prompt, _info) =
-        build_system_prompt_full(None, &[], false, None, None, None, Some(notepad));
+        build_system_prompt_full(None, &[], false, None, None, None, Some(notepad), None);
     assert!(
         prompt.contains("ship the feature"),
         "notepad block should appear in prompt: {prompt}"
@@ -325,7 +325,7 @@ fn test_full_prompt_omits_notepad_block_when_none() {
     // Default callers of build_system_prompt_full pass None for the
     // notepad; the resulting prompt must not contain an empty
     // notepad section header.
-    let (prompt, _info) = build_system_prompt_full(None, &[], false, None, None, None, None);
+    let (prompt, _info) = build_system_prompt_full(None, &[], false, None, None, None, None, None);
     assert!(
         !prompt.contains("# Priority Notes"),
         "empty notepad should not introduce a Priority Notes section: {prompt}"
@@ -340,7 +340,7 @@ fn test_split_prompt_puts_notepad_in_dynamic_part() {
     // "survives compaction" property.
     let notepad = "# Priority Notes\n\n```\npin me across compaction\n```";
     let (split, _info) =
-        build_system_prompt_split(None, &[], false, None, None, None, Some(notepad));
+        build_system_prompt_split(None, &[], false, None, None, None, Some(notepad), None);
     assert!(
         split.dynamic_part.contains("pin me across compaction"),
         "notepad block should be in dynamic_part: {}",
@@ -355,7 +355,7 @@ fn test_split_prompt_puts_notepad_in_dynamic_part() {
 
 #[test]
 fn test_split_prompt_omits_notepad_when_none() {
-    let (split, _info) = build_system_prompt_split(None, &[], false, None, None, None, None);
+    let (split, _info) = build_system_prompt_split(None, &[], false, None, None, None, None, None);
     assert!(
         !split.dynamic_part.contains("# Priority Notes"),
         "no notepad block when None is passed: {}",
@@ -498,24 +498,5 @@ fn swarm_deep_effort_injects_task_graph_directive() {
     let mut light = SplitSystemPrompt::default();
     append_swarm_effort_directive(&mut light, Some("swarm"));
     assert!(light.dynamic_part.contains("# Swarm Effort"));
-    assert!(!light.dynamic_part.contains("# Deep Task Graph"));
-}
-
-#[test]
-fn classify_effort_distinguishes_reasoning_from_swarm_modes() {
-    use crate::prompt::{EffortKind, classify_effort, is_swarm_mode_effort};
-
-    // Plain reasoning levels are not swarm modes.
-    for level in ["none", "low", "medium", "high", "xhigh", "max"] {
-        assert_eq!(classify_effort(level), EffortKind::Reasoning, "{level}");
-        assert!(!is_swarm_mode_effort(level), "{level}");
-    }
-
-    assert_eq!(classify_effort("swarm"), EffortKind::SwarmLight);
-    assert_eq!(classify_effort("swarm-deep"), EffortKind::SwarmDeep);
-    assert!(is_swarm_mode_effort("swarm"));
-    assert!(is_swarm_mode_effort("  Swarm-Deep "));
-    assert!(EffortKind::SwarmLight.is_swarm_mode());
-    assert!(EffortKind::SwarmDeep.is_swarm_mode());
-    assert!(!EffortKind::Reasoning.is_swarm_mode());
+    assert!(!light.dynamic_part.contains("Deep Task Graph"));
 }
diff --git a/crates/jcode-llm-dialects/src/anthropic.rs b/crates/jcode-llm-dialects/src/anthropic.rs
new file mode 100644
index 0000000000..b6176c7f5d
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/anthropic.rs
@@ -0,0 +1,997 @@
+//! Anthropic Inband Scanner — XML `&lt;invoke&gt;` / `&lt;function_calls&gt;` dialect.
+//!
+//! The model emits tool calls inside XML tags:
+//!
+//! ```xml
+//! <function_calls>
+//!   <invoke name="get_weather">
+//!     <parameter name="city">NYC</parameter>
+//!   </invoke>
+//! </function_calls>
+//! ```
+//!
+//! Optional `<thinking>`/`<think>`/`<scratchpad>` blocks (including `antml:`-prefixed
+//! variants) are also parsed. Self-closing tags are supported.
+
+use std::collections::HashMap;
+use std::time::{SystemTime, UNIX_EPOCH};
+
+use crate::types::{InbandScanEvent, InbandScanner, InbandScannerOptions};
+
+// ---------------------------------------------------------------------------
+// Configuration constants
+// ---------------------------------------------------------------------------
+
+const MAX_PARTIAL_TAG_LENGTH: usize = 256;
+
+/// Tags that act as wrapper sections.
+const WRAPPER_LOCAL_NAMES: &[&str] = &["function_calls", "tool_calls"];
+
+/// Tags that indicate thinking blocks.
+const THINKING_LOCAL_NAMES: &[&str] = &["thinking", "think", "scratchpad"];
+
+/// All tag prefixes for the outside state.
+const BASE_TAG_PREFIXES: &[&str] = &[
+    "<function_calls",
+    "</function_calls",
+    "<tool_calls",
+    "</tool_calls",
+    "<invoke",
+    "</invoke",
+    "<parameter",
+    "</parameter",
+    "<antml:function_calls",
+    "</antml:function_calls",
+    "<antml:tool_calls",
+    "</antml:tool_calls",
+    "<antml:invoke",
+    "</antml:invoke",
+    "<antml:parameter",
+    "</antml:parameter",
+];
+
+const THINKING_TAG_PREFIXES: &[&str] = &[
+    "<thinking", "</thinking", "<think", "</think", "<scratchpad", "</scratchpad",
+    "<antml:thinking", "</antml:thinking", "<antml:think", "</antml:think",
+    "<antml:scratchpad", "</antml:scratchpad",
+];
+
+/// Combined list for when thinking parsing is enabled.
+fn all_tag_prefixes() -> Vec<&'static str> {
+    let mut v = Vec::with_capacity(BASE_TAG_PREFIXES.len() + THINKING_TAG_PREFIXES.len());
+    v.extend_from_slice(BASE_TAG_PREFIXES);
+    v.extend_from_slice(THINKING_TAG_PREFIXES);
+    v
+}
+
+// ---------------------------------------------------------------------------
+// Tag parsing
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Clone)]
+struct ParsedTag {
+    raw: String,
+    local_name: String,
+    prefix: String,
+    closing: bool,
+    self_closing: bool,
+    attrs: HashMap<String, String>,
+}
+
+fn parse_tag(raw: &str) -> Option<ParsedTag> {
+    let raw = raw.trim();
+    if !raw.starts_with('<') || !raw.ends_with('>') {
+        return None;
+    }
+    let inner = raw[1..raw.len() - 1].trim();
+    if inner.is_empty() {
+        return None;
+    }
+
+    let closing = inner.starts_with('/');
+    let inner = if closing { &inner[1..].trim() } else { inner };
+
+    let self_closing = inner.ends_with('/');
+    let inner = if self_closing {
+        &inner[..inner.len() - 1].trim()
+    } else {
+        inner
+    };
+
+    // Split tag name from attributes
+    let (tag_name, attrs_str) = match inner.find(char::is_whitespace) {
+        Some(pos) => (&inner[..pos], inner[pos..].trim()),
+        None => (inner, ""),
+    };
+
+    if tag_name.is_empty() {
+        return None;
+    }
+
+    // Extract prefix and local name
+    let (prefix, local_name) = if let Some(colon_pos) = tag_name.find(':') {
+        (
+            tag_name[..colon_pos].to_string(),
+            tag_name[colon_pos + 1..].to_lowercase(),
+        )
+    } else {
+        (String::new(), tag_name.to_lowercase())
+    };
+
+    // Parse attributes
+    let attrs = parse_attributes(attrs_str);
+
+    Some(ParsedTag {
+        raw: raw.to_string(),
+        local_name,
+        prefix,
+        closing,
+        self_closing,
+        attrs,
+    })
+}
+
+fn parse_attributes(text: &str) -> HashMap<String, String> {
+    let mut attrs = HashMap::new();
+    let mut i = 0;
+    let bytes = text.as_bytes();
+    while i < bytes.len() {
+        // Skip whitespace
+        while i < bytes.len() && bytes[i].is_ascii_whitespace() {
+            i += 1;
+        }
+        if i >= bytes.len() {
+            break;
+        }
+        // Read attribute name (alphanumeric, _, :, ., -)
+        let name_start = i;
+        while i < bytes.len() && (bytes[i].is_ascii_alphanumeric() || bytes[i] == b'_' || bytes[i] == b':' || bytes[i] == b'.' || bytes[i] == b'-') {
+            i += 1;
+        }
+        let name = &text[name_start..i];
+        if name.is_empty() {
+            i += 1;
+            continue;
+        }
+        // Skip whitespace and =
+        while i < bytes.len() && bytes[i].is_ascii_whitespace() {
+            i += 1;
+        }
+        if i < bytes.len() && bytes[i] == b'=' {
+            i += 1;
+        } else {
+            continue;
+        }
+        // Skip whitespace after =
+        while i < bytes.len() && bytes[i].is_ascii_whitespace() {
+            i += 1;
+        }
+        if i >= bytes.len() {
+            break;
+        }
+        // Read value: quoted or unquoted
+        if bytes[i] == b'"' || bytes[i] == b'\'' {
+            let quote = bytes[i];
+            i += 1;
+            let val_start = i;
+            while i < bytes.len() && bytes[i] != quote {
+                i += 1;
+            }
+            let value = &text[val_start..i];
+            if i < bytes.len() {
+                i += 1; // skip closing quote
+            }
+            attrs.insert(name.to_lowercase(), value.to_string());
+        } else {
+            let val_start = i;
+            while i < bytes.len() && !bytes[i].is_ascii_whitespace() && bytes[i] != b'>' {
+                i += 1;
+            }
+            let value = &text[val_start..i];
+            attrs.insert(name.to_lowercase(), value.to_string());
+        }
+    }
+    attrs
+}
+
+/// Build list of closing prefixes for a given local name and prefix.
+fn close_prefixes(local_name: &str, prefix: &str) -> Vec<String> {
+    let mut prefixes = Vec::new();
+    let unprefixed = format!("</{}", local_name);
+    let antml = format!("</antml:{}", local_name);
+    if prefix.is_empty() || prefix == "antml" {
+        prefixes.push(unprefixed);
+        prefixes.push(antml);
+    } else {
+        prefixes.push(format!("</{}:{}", prefix, local_name));
+        prefixes.push(unprefixed);
+        prefixes.push(antml);
+    }
+    prefixes
+}
+
+/// Check if buffer could be the start of a relevant tag prefix.
+fn could_be_tag_prefix(buf: &str, prefixes: &[String]) -> bool {
+    if !buf.starts_with('<') {
+        return false;
+    }
+    for p in prefixes {
+        if p.starts_with(buf) || buf.starts_with(p.as_str()) {
+            return true;
+        }
+    }
+    false
+}
+
+fn could_be_tag_prefix_static(buf: &str, prefixes: &[&str]) -> bool {
+    if !buf.starts_with('<') {
+        return false;
+    }
+    for p in prefixes {
+        if p.starts_with(buf) || buf.starts_with(p) {
+            return true;
+        }
+    }
+    false
+}
+
+// ---------------------------------------------------------------------------
+// State machine
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum ScannerState {
+    Outside,
+    Section,
+    Invoke,
+    Parameter,
+    Thinking,
+}
+
+type ReturnState = ScannerState;
+
+fn is_wrapper(name: &str) -> bool {
+    WRAPPER_LOCAL_NAMES.contains(&name)
+}
+
+fn is_thinking(name: &str) -> bool {
+    THINKING_LOCAL_NAMES.contains(&name)
+}
+
+/// Streaming scanner for the Anthropic XML dialect.
+pub struct AnthropicInbandScanner {
+    buffer: String,
+    state: ScannerState,
+    return_state: ReturnState,
+    after_thinking_state: ReturnState,
+    id: String,
+    name: String,
+    args: HashMap<String, serde_json::Value>,
+    started: bool,
+    param_name: String,
+    param_value: String,
+    param_string: Option<bool>,
+    param_close_prefixes: Vec<String>,
+    raw_block: String,
+    thinking: String,
+    thinking_tag: String,
+    thinking_close_prefixes: Vec<String>,
+    parse_thinking: bool,
+}
+
+impl AnthropicInbandScanner {
+    pub fn new(options: &InbandScannerOptions) -> Self {
+        Self {
+            buffer: String::new(),
+            state: ScannerState::Outside,
+            return_state: ScannerState::Outside,
+            after_thinking_state: ScannerState::Outside,
+            id: String::new(),
+            name: String::new(),
+            args: HashMap::new(),
+            started: false,
+            param_name: String::new(),
+            param_value: String::new(),
+            param_string: None,
+            param_close_prefixes: Vec::new(),
+            raw_block: String::new(),
+            thinking: String::new(),
+            thinking_tag: String::new(),
+            thinking_close_prefixes: Vec::new(),
+            parse_thinking: options.parse_thinking,
+        }
+    }
+
+    fn gen_id() -> String {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap_or_default()
+            .subsec_nanos();
+        format!("anthropic_{:09x}", nanos)
+    }
+
+    fn reset_call(&mut self, next: ReturnState) {
+        self.id.clear();
+        self.name.clear();
+        self.args.clear();
+        self.started = false;
+        self.param_name.clear();
+        self.param_value.clear();
+        self.param_string = None;
+        self.param_close_prefixes.clear();
+        self.raw_block.clear();
+        self.state = next;
+    }
+
+    fn is_thinking_open(&self, tag: &ParsedTag) -> bool {
+        self.parse_thinking && !tag.closing && is_thinking(&tag.local_name)
+    }
+
+    fn relevant_prefixes(&self) -> Vec<&'static str> {
+        if self.parse_thinking {
+            all_tag_prefixes()
+        } else {
+            BASE_TAG_PREFIXES.to_vec()
+        }
+    }
+
+    /// Try to peek at the next tag in the buffer.
+    fn peek_tag(&self, final_: bool, prefixes: &[String]) -> Option<ParsedTag> {
+        let close = self.buffer.find('>')?;
+        let raw = &self.buffer[..=close];
+        if let Some(tag) = parse_tag(raw) {
+            return Some(tag);
+        }
+        // Check for partial
+        if !final_
+            && self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+            && could_be_tag_prefix(&self.buffer, prefixes)
+        {
+            return None; // "partial" marker — just return None
+        }
+        None
+    }
+
+    fn peek_tag_static(&self, final_: bool, prefixes: &[&str]) -> Option<ParsedTag> {
+        let close = self.buffer.find('>')?;
+        let raw = &self.buffer[..=close];
+        if let Some(tag) = parse_tag(raw) {
+            return Some(tag);
+        }
+        if !final_
+            && self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+            && could_be_tag_prefix_static(&self.buffer, prefixes)
+        {
+            return None;
+        }
+        None
+    }
+
+    fn emit_text(&self, text: &str, events: &mut Vec<InbandScanEvent>) {
+        if !text.is_empty() {
+            events.push(InbandScanEvent::Text(text.to_string()));
+        }
+    }
+
+    fn start_invoke(&mut self, tag: &ParsedTag, return_state: ReturnState, events: &mut Vec<InbandScanEvent>) {
+        self.return_state = return_state;
+        self.id = Self::gen_id();
+        self.name = tag.attrs.get("name").map(|s| s.trim().to_string()).unwrap_or_default();
+        self.args = HashMap::new();
+        self.raw_block = tag.raw.clone();
+        self.started = !self.name.is_empty();
+        if tag.self_closing {
+            if self.started {
+                let args_val = serde_json::Value::Object(
+                    std::mem::take(&mut self.args).into_iter().collect(),
+                );
+                events.push(InbandScanEvent::ToolEnd {
+                    id: self.id.clone(),
+                    name: self.name.clone(),
+                    arguments: args_val,
+                    raw_block: Some(self.raw_block.clone()),
+                });
+            }
+            self.reset_call(return_state);
+            return;
+        }
+        self.state = ScannerState::Invoke;
+    }
+
+    fn start_parameter(&mut self, tag: &ParsedTag) {
+        self.param_name = tag.attrs.get("name").map(|s| s.trim().to_string()).unwrap_or_default();
+        self.param_value = String::new();
+        self.param_string = parse_bool_attr(tag.attrs.get("string").map(|s| s.as_str()));
+        self.param_close_prefixes = close_prefixes("parameter", &tag.prefix);
+        self.state = ScannerState::Parameter;
+    }
+
+    fn append_parameter_value(&mut self, delta: &str) {
+        self.param_value.push_str(delta);
+    }
+
+    fn finish_parameter(&mut self) {
+        if !self.param_name.is_empty() {
+            let value = self.coerce_parameter_value(&self.param_name, &self.param_value, self.param_string);
+            self.args.insert(self.param_name.clone(), value);
+        }
+        self.param_name.clear();
+        self.param_value.clear();
+        self.param_string = None;
+        self.param_close_prefixes.clear();
+        self.state = ScannerState::Invoke;
+    }
+
+    fn coerce_parameter_value(&self, _name: &str, raw: &str, explicit_string: Option<bool>) -> serde_json::Value {
+        if explicit_string == Some(true) {
+            return serde_json::Value::String(raw.to_string());
+        }
+        let trimmed = raw.trim();
+        if trimmed.is_empty() {
+            return serde_json::Value::String(raw.to_string());
+        }
+        // Try parsing as JSON
+        serde_json::from_str(trimmed).unwrap_or_else(|_| serde_json::Value::String(raw.to_string()))
+    }
+
+    fn start_thinking(&mut self, tag: &ParsedTag, after_state: ReturnState, events: &mut Vec<InbandScanEvent>) {
+        self.after_thinking_state = after_state;
+        self.thinking.clear();
+        self.thinking_tag = tag.local_name.clone();
+        self.thinking_close_prefixes = close_prefixes(&tag.local_name, &tag.prefix);
+        self.state = ScannerState::Thinking;
+        events.push(InbandScanEvent::ThinkingStart);
+        if tag.self_closing {
+            self.finish_thinking(events);
+        }
+    }
+
+    fn append_thinking(&mut self, delta: &str, events: &mut Vec<InbandScanEvent>) {
+        if delta.is_empty() {
+            return;
+        }
+        self.thinking.push_str(delta);
+        events.push(InbandScanEvent::ThinkingDelta(delta.to_string()));
+    }
+
+    fn finish_thinking(&mut self, events: &mut Vec<InbandScanEvent>) {
+        events.push(InbandScanEvent::ThinkingEnd(self.thinking.clone()));
+        self.thinking.clear();
+        self.thinking_tag.clear();
+        self.thinking_close_prefixes.clear();
+        self.state = self.after_thinking_state;
+        self.after_thinking_state = ScannerState::Outside;
+    }
+}
+
+impl InbandScanner for AnthropicInbandScanner {
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent> {
+        if text.is_empty() {
+            return vec![];
+        }
+        self.buffer.push_str(text);
+        self.consume(false)
+    }
+
+    fn flush(&mut self) -> Vec<InbandScanEvent> {
+        let mut events = self.consume(true);
+        // Close any pending state
+        match self.state {
+            ScannerState::Thinking => {
+                self.finish_thinking(&mut events);
+            }
+            ScannerState::Outside => {}
+            _ => {
+                self.reset_call(self.return_state);
+            }
+        }
+        if !self.buffer.is_empty() {
+            events.push(InbandScanEvent::Text(self.buffer.clone()));
+            self.buffer.clear();
+        }
+        self.state = ScannerState::Outside;
+        events
+    }
+}
+
+impl AnthropicInbandScanner {
+    fn consume(&mut self, final_: bool) -> Vec<InbandScanEvent> {
+        let mut events = Vec::new();
+        loop {
+            if self.buffer.is_empty() {
+                break;
+            }
+            let progressed = match self.state {
+                ScannerState::Outside => self.consume_outside(final_, &mut events),
+                ScannerState::Section => self.consume_section(final_, &mut events),
+                ScannerState::Invoke => self.consume_invoke(final_, &mut events),
+                ScannerState::Parameter => self.consume_parameter(final_),
+                ScannerState::Thinking => self.consume_thinking(final_, &mut events),
+            };
+            if !progressed {
+                break;
+            }
+        }
+        if final_ {
+            match self.state {
+                ScannerState::Thinking => self.finish_thinking(&mut events),
+                ScannerState::Outside => {}
+                _ => self.reset_call(self.return_state),
+            }
+        }
+        events
+    }
+
+    fn consume_outside(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        // Find next '<'
+        let tag_start = self.buffer.find('<');
+        let tag_start = match tag_start {
+            Some(p) => p,
+            None => {
+                // No tag at all — emit remaining text
+                self.emit_text(&self.buffer, events);
+                self.buffer.clear();
+                return false;
+            }
+        };
+
+        if tag_start > 0 {
+            self.emit_text(&self.buffer[..tag_start], events);
+            self.buffer = self.buffer[tag_start..].to_string();
+            return true;
+        }
+
+        // Buffer starts with '<' — try to peek a tag
+        let prefixes = self.relevant_prefixes();
+        let tag = self.peek_tag_static(final_, &prefixes);
+        match tag {
+            None => {
+                // Could be partial or just a '<' character
+                if !final_
+                    && self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+                    && could_be_tag_prefix_static(&self.buffer, &prefixes)
+                {
+                    return false;
+                }
+                // Not a valid tag — emit '<' as text
+                self.emit_text("<", events);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+            Some(tag) => {
+                if !tag.closing && is_wrapper(&tag.local_name) {
+                    self.buffer = self.buffer[tag.raw.len()..].to_string();
+                    self.state = ScannerState::Section;
+                    return true;
+                }
+                if !tag.closing && tag.local_name == "invoke" {
+                    self.buffer = self.buffer[tag.raw.len()..].to_string();
+                    self.start_invoke(&tag, ScannerState::Outside, events);
+                    return true;
+                }
+                if self.is_thinking_open(&tag) {
+                    self.buffer = self.buffer[tag.raw.len()..].to_string();
+                    self.start_thinking(&tag, ScannerState::Outside, events);
+                    return true;
+                }
+                if tag.closing && is_wrapper(&tag.local_name) {
+                    self.buffer = self.buffer[tag.raw.len()..].to_string();
+                    return true;
+                }
+                // Unrecognized tag — emit as text
+                self.emit_text("<", events);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+        }
+    }
+
+    fn consume_section(&mut self, final_: bool, _events: &mut Vec<InbandScanEvent>) -> bool {
+        let tag_start = self.buffer.find('<');
+        let tag_start = match tag_start {
+            Some(p) => p,
+            None => {
+                self.buffer.clear();
+                return false;
+            }
+        };
+
+        if tag_start > 0 {
+            self.buffer = self.buffer[tag_start..].to_string();
+            return true;
+        }
+
+        let prefixes = self.relevant_prefixes();
+        let tag = self.peek_tag_static(final_, &prefixes);
+        match tag {
+            None => {
+                if !final_
+                    && self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+                    && could_be_tag_prefix_static(&self.buffer, &prefixes)
+                {
+                    return false;
+                }
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+            Some(tag) => {
+                self.buffer = self.buffer[tag.raw.len()..].to_string();
+                if tag.closing && is_wrapper(&tag.local_name) {
+                    self.state = ScannerState::Outside;
+                    return true;
+                }
+                if !tag.closing && tag.local_name == "invoke" {
+                    self.start_invoke(&tag, ScannerState::Section, _events);
+                    return true;
+                }
+                if self.parse_thinking && !tag.closing && is_thinking(&tag.local_name) {
+                    self.start_thinking(&tag, ScannerState::Section, _events);
+                }
+                return true;
+            }
+        }
+    }
+
+    fn consume_invoke(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        let tag_start = self.buffer.find('<');
+        let tag_start = match tag_start {
+            Some(p) => p,
+            None => {
+                if final_ {
+                    self.reset_call(self.return_state);
+                } else {
+                    self.raw_block.push_str(&self.buffer);
+                    self.buffer.clear();
+                }
+                return false;
+            }
+        };
+
+        if tag_start > 0 {
+            let consumed = &self.buffer[..tag_start];
+            self.raw_block.push_str(consumed);
+            self.buffer = self.buffer[tag_start..].to_string();
+            return true;
+        }
+
+        let prefixes: Vec<String> = vec!["</invoke".to_string(), "</antml:invoke".to_string(),
+            "<parameter".to_string(), "<antml:parameter".to_string()];
+        let tag = self.peek_tag(final_, &prefixes);
+        match tag {
+            None => {
+                if !final_ {
+                    if self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+                        && could_be_tag_prefix(&self.buffer, &prefixes)
+                    {
+                        return false;
+                    }
+                } else {
+                    self.reset_call(self.return_state);
+                    return false;
+                }
+                let consumed = &self.buffer[..1];
+                self.raw_block.push_str(consumed);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+            Some(tag) => {
+                self.raw_block.push_str(&tag.raw);
+                self.buffer = self.buffer[tag.raw.len()..].to_string();
+                if tag.closing && tag.local_name == "invoke" {
+                    if self.started {
+                        let args_val = serde_json::Value::Object(
+                            std::mem::take(&mut self.args).into_iter().collect(),
+                        );
+                        events.push(InbandScanEvent::ToolEnd {
+                            id: self.id.clone(),
+                            name: self.name.clone(),
+                            arguments: args_val,
+                            raw_block: Some(self.raw_block.clone()),
+                        });
+                    }
+                    self.reset_call(self.return_state);
+                    return true;
+                }
+                if !tag.closing && tag.local_name == "parameter" {
+                    self.start_parameter(&tag);
+                    if tag.self_closing {
+                        self.finish_parameter();
+                    }
+                    return true;
+                }
+                return true;
+            }
+        }
+    }
+
+    fn consume_parameter(&mut self, final_: bool) -> bool {
+        let tag_start = self.buffer.find('<');
+        let tag_start = match tag_start {
+            Some(p) => p,
+            None => {
+                if final_ {
+                    self.reset_call(self.return_state);
+                    self.buffer.clear();
+                    return false;
+                }
+                let buf = self.buffer.clone();
+                self.append_parameter_value(&buf);
+                self.raw_block.push_str(&buf);
+                self.buffer.clear();
+                return false;
+            }
+        };
+
+        if tag_start > 0 {
+            let consumed = self.buffer[..tag_start].to_string();
+            self.append_parameter_value(&consumed);
+            self.raw_block.push_str(&consumed);
+            self.buffer = self.buffer[tag_start..].to_string();
+            return true;
+        }
+
+        let tag = self.peek_tag(final_, &self.param_close_prefixes);
+        match tag {
+            None => {
+                if !final_
+                    && self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+                    && could_be_tag_prefix(&self.buffer, &self.param_close_prefixes)
+                {
+                    return false;
+                }
+                if final_ {
+                    self.reset_call(self.return_state);
+                    self.buffer.clear();
+                    return false;
+                }
+                let consumed = self.buffer[..1].to_string();
+                self.append_parameter_value(&consumed);
+                self.raw_block.push_str(&consumed);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+            Some(tag) => {
+                if tag.closing && tag.local_name == "parameter" {
+                    self.raw_block.push_str(&tag.raw);
+                    self.buffer = self.buffer[tag.raw.len()..].to_string();
+                    self.finish_parameter();
+                    return true;
+                }
+                if final_ {
+                    self.reset_call(self.return_state);
+                    self.buffer.clear();
+                    return false;
+                }
+                let consumed = self.buffer[..1].to_string();
+                self.append_parameter_value(&consumed);
+                self.raw_block.push_str(&consumed);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+        }
+    }
+
+    fn consume_thinking(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        let tag_start = self.buffer.find('<');
+        let tag_start = match tag_start {
+            Some(p) => p,
+            None => {
+                if final_ {
+                    let buf = self.buffer.clone();
+                    self.append_thinking(&buf, events);
+                    self.buffer.clear();
+                    self.finish_thinking(events);
+                    return false;
+                }
+                let buf = self.buffer.clone();
+                self.append_thinking(&buf, events);
+                self.buffer.clear();
+                return false;
+            }
+        };
+
+        if tag_start > 0 {
+            let buf = self.buffer[..tag_start].to_string();
+            self.append_thinking(&buf, events);
+            self.buffer = self.buffer[tag_start..].to_string();
+            return true;
+        }
+
+        let tag = self.peek_tag(final_, &self.thinking_close_prefixes);
+        match tag {
+            None => {
+                if !final_
+                    && self.buffer.len() <= MAX_PARTIAL_TAG_LENGTH
+                    && could_be_tag_prefix(&self.buffer, &self.thinking_close_prefixes)
+                {
+                    return false;
+                }
+                if final_ {
+                    let buf = self.buffer.clone();
+                    self.append_thinking(&buf, events);
+                    self.buffer.clear();
+                    self.finish_thinking(events);
+                    return false;
+                }
+                let buf = self.buffer[..1].to_string();
+                self.append_thinking(&buf, events);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+            Some(tag) => {
+                if tag.closing && tag.local_name == self.thinking_tag {
+                    self.buffer = self.buffer[tag.raw.len()..].to_string();
+                    self.finish_thinking(events);
+                    return true;
+                }
+                if final_ {
+                    let buf = self.buffer.clone();
+                    self.append_thinking(&buf, events);
+                    self.buffer.clear();
+                    self.finish_thinking(events);
+                    return false;
+                }
+                let buf = self.buffer[..1].to_string();
+                self.append_thinking(&buf, events);
+                self.buffer = self.buffer[1..].to_string();
+                return true;
+            }
+        }
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+fn parse_bool_attr(value: Option<&str>) -> Option<bool> {
+    let value = value?;
+    let normalized = value.trim().to_lowercase();
+    if normalized == "false" || normalized == "0" || normalized == "no" {
+        return Some(false);
+    }
+    Some(true)
+}
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_plain_text() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("Hello, this is plain text.");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 1);
+        assert!(matches!(&all[0], InbandScanEvent::Text(t) if t == "Hello, this is plain text."));
+    }
+
+    #[test]
+    fn test_xml_tool_call() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let input = "What's the weather?<function_calls><invoke name=\"get_weather\"><parameter name=\"city\">NYC</parameter></invoke></function_calls>";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::Text(t) if t == "What's the weather?")),
+            "expected text, got {all:?}");
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+        if let Some(InbandScanEvent::ToolEnd { arguments, .. }) = all.iter().find(|e| matches!(e, InbandScanEvent::ToolEnd { .. })) {
+            assert_eq!(arguments.get("city").and_then(|v| v.as_str()), Some("NYC"));
+        }
+    }
+
+    #[test]
+    fn test_chunked_xml() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        scanner.feed("<function_calls><invoke name=\"get_weather\"><parameter name=\"city\">");
+        let events = scanner.feed("NYC");
+        let events2 = scanner.feed("</parameter></invoke></function_calls>");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(events2).chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+    }
+
+    #[test]
+    fn test_thinking_block() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions {
+            parse_thinking: true,
+            ..Default::default()
+        });
+        let input = "Let me <thinking>ponder deeply</thinking>done";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingStart)),
+            "expected ThinkingStart, got {all:?}");
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingEnd(t) if t == "ponder deeply")),
+            "expected ThinkingEnd, got {all:?}");
+    }
+
+    #[test]
+    fn test_self_closing_invoke() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let input = "<function_calls><invoke name=\"get_weather\"/></function_calls>";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        // Self-closing invoke with no parameters should produce ToolEnd
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+    }
+
+    #[test]
+    fn test_self_closing_parameter() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let input = "<function_calls><invoke name=\"get_weather\"><parameter name=\"city\"/></invoke></function_calls>";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+        if let Some(InbandScanEvent::ToolEnd { arguments, .. }) = all.iter().find(|e| matches!(e, InbandScanEvent::ToolEnd { .. })) {
+            assert!(arguments.get("city").is_some(),
+                "expected city arg to exist, got {arguments:?}");
+        }
+    }
+
+    #[test]
+    fn test_mixed_namespace() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let input = "<antml:function_calls><antml:invoke name=\"get_weather\"><antml:parameter name=\"city\">NYC</antml:parameter></antml:invoke></antml:function_calls>";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+    }
+
+    #[test]
+    fn test_antml_thinking_block() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions {
+            parse_thinking: true,
+            ..Default::default()
+        });
+        let input = "<antml:thinking>deep thoughts</antml:thinking>done";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingStart)),
+            "expected ThinkingStart, got {all:?}");
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingEnd(t) if t == "deep thoughts")),
+            "expected ThinkingEnd, got {all:?}");
+    }
+
+    #[test]
+    fn test_flush() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("some leftover without close");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::Text(t) if t == "some leftover without close")),
+            "expected text, got {all:?}");
+    }
+
+    #[test]
+    fn test_arg_delta_via_multi_invoke() {
+        let mut scanner = AnthropicInbandScanner::new(&InbandScannerOptions::default());
+        let input = "<function_calls>\
+            <invoke name=\"func_a\"><parameter name=\"x\">1</parameter></invoke>\
+            <invoke name=\"func_b\"><parameter name=\"y\">2</parameter></invoke>\
+            </function_calls>";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        let tool_ends: Vec<_> = all.iter().filter_map(|e| {
+            if let InbandScanEvent::ToolEnd { name, .. } = e { Some(name.as_str()) } else { None }
+        }).collect();
+        assert_eq!(tool_ends, vec!["func_a", "func_b"]);
+    }
+}
diff --git a/crates/jcode-llm-dialects/src/deepseek.rs b/crates/jcode-llm-dialects/src/deepseek.rs
new file mode 100644
index 0000000000..40de95d7a2
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/deepseek.rs
@@ -0,0 +1,1019 @@
+//! DeepSeek Inband Scanner — DSML with fullwidth delimiters, ASCII DSML, and legacy JSON.
+//!
+//! Three formats are parsed:
+//!
+//! 1. **Fullwidth token format** (legacy):
+//!    `<\u{ff5c}tool\u{2581}calls\u{2581}begin\u{ff5c}><\u{ff5c}tool\u{2581}call\u{2581}begin\u{ff5c}>name<\u{ff5c}tool\u{2581}sep\u{ff5c}>JSON<\u{ff5c}tool\u{2581}call\u{2581}end\u{ff5c}><\u{ff5c}tool\u{2581}calls\u{2581}end\u{ff5c}>`
+//! 2. **DSML format** (fullwidth or ASCII):
+//!    `<\u{ff5c}DSML\u{ff5c}tool_calls><\u{ff5c}DSML\u{ff5c}invoke name="x"><\u{ff5c}DSML\u{ff5c}parameter name="k">v</\u{ff5c}DSML\u{ff5c}parameter></\u{ff5c}DSML\u{ff5c}invoke></\u{ff5c}DSML\u{ff5c}tool_calls>`
+//!    (or ASCII `<|DSML|tool_calls>` etc.)
+//! 3. **Legacy JSON format**:
+//!    `<\u{ff5c}tool\u{2581}call\u{2581}begin\u{ff5c}>function<\u{ff5c}tool\u{2581}sep\u{ff5c}>name\n```json\nJSON\n```<\u{ff5c}tool\u{2581}call\u{2581}end\u{ff5c}>`
+//!
+//! Control tokens (BOS, EOS, User, Assistant) are stripped.
+
+use std::time::{SystemTime, UNIX_EPOCH};
+
+use crate::types::{InbandScanEvent, InbandScanner, InbandScannerOptions};
+
+// ---------------------------------------------------------------------------
+// Tokens — fullwidth (3-byte Unicode) and ASCII
+// ---------------------------------------------------------------------------
+
+const TOOL_CALLS_BEGIN: &str = "<\u{ff5c}tool\u{2581}calls\u{2581}begin\u{ff5c}>";
+const TOOL_CALLS_END: &str = "<\u{ff5c}tool\u{2581}calls\u{2581}end\u{ff5c}>";
+const TOOL_CALL_BEGIN: &str = "<\u{ff5c}tool\u{2581}call\u{2581}begin\u{ff5c}>";
+const TOOL_CALL_END: &str = "<\u{ff5c}tool\u{2581}call\u{2581}end\u{ff5c}>";
+const TOOL_SEPARATOR: &str = "<\u{ff5c}tool\u{2581}sep\u{ff5c}>";
+
+const BOS: &str = "<\u{ff5c}begin\u{2581}of\u{2581}sentence\u{ff5c}>";
+const USER: &str = "<\u{ff5c}User\u{ff5c}>";
+const ASSISTANT: &str = "<\u{ff5c}Assistant\u{ff5c}>";
+const EOS: &str = "<\u{ff5c}end\u{2581}of\u{2581}sentence\u{ff5c}>";
+
+const THINK_OPEN: &str = "<think>";
+const THINK_CLOSE: &str = "</think>";
+
+const LEGACY_TOOL_TYPE: &str = "function";
+const LEGACY_JSON_FENCE: &str = "```json";
+const CODE_FENCE: &str = "```";
+
+const DSML_TOOL_CALLS_OPEN_FULLWIDTH: &str = "<\u{ff5c}DSML\u{ff5c}tool_calls>";
+const DSML_TOOL_CALLS_CLOSE_FULLWIDTH: &str = "</\u{ff5c}DSML\u{ff5c}tool_calls>";
+const DSML_TOOL_CALLS_OPEN_ASCII: &str = "<|DSML|tool_calls>";
+const DSML_TOOL_CALLS_CLOSE_ASCII: &str = "</|DSML|tool_calls>";
+
+/// Control tokens that get stripped entirely.
+const CONTROL_TOKENS: &[&str] = &[
+    BOS, EOS, USER, ASSISTANT,
+    "<\u{ff5c}\u{2581}pad\u{2581}\u{ff5c}>",
+    "<|EOT|>",
+    "<\u{ff5c}search\u{2581}begin\u{ff5c}>",
+    "<\u{ff5c}search\u{2581}end\u{ff5c}>",
+    "<\u{ff5c}fim\u{2581}hole\u{ff5c}>Tok",
+];
+
+/// Tokens scanned for in the `Outside` state.
+const OUTSIDE_TOKENS: &[&str] = &[
+    TOOL_CALLS_BEGIN,
+    TOOL_CALL_BEGIN,
+    THINK_OPEN,
+    DSML_TOOL_CALLS_OPEN_FULLWIDTH,
+    DSML_TOOL_CALLS_OPEN_ASCII,
+    BOS, EOS, USER, ASSISTANT,
+    "<\u{ff5c}\u{2581}pad\u{2581}\u{ff5c}>",
+    "<|EOT|>",
+    "<\u{ff5c}search\u{2581}begin\u{ff5c}>",
+    "<\u{ff5c}search\u{2581}end\u{ff5c}>",
+    "<\u{ff5c}fim\u{2581}hole\u{ff5c}>Tok",
+];
+
+/// Tokens scanned for in the legacy `Section` state.
+const SECTION_TOKENS: &[&str] = &[TOOL_CALLS_END, TOOL_CALL_BEGIN];
+
+/// Tokens scanned for in the DSML section state.
+const DSML_SECTION_TOKENS: &[&str] = &[
+    DSML_TOOL_CALLS_CLOSE_FULLWIDTH,
+    DSML_TOOL_CALLS_CLOSE_ASCII,
+    "<\u{ff5c}DSML\u{ff5c}invoke",
+    "<|DSML|invoke",
+    "<\u{ff5c}DSML\u{ff5c}parameter",
+    "<|DSML|parameter",
+];
+
+/// Tokens scanned for in DSML invoke state (close + parameter open).
+const DSML_INVOKE_TOKENS: &[&str] = &[
+    "</\u{ff5c}DSML\u{ff5c}invoke>",
+    "</|DSML|invoke>",
+    "<\u{ff5c}DSML\u{ff5c}parameter",
+    "<|DSML|parameter",
+];
+
+/// Tokens that close a DSML parameter value (parameter close tags).
+const DSML_PARAMETER_CLOSE_TOKENS: &[&str] = &[
+    "</\u{ff5c}DSML\u{ff5c}parameter>",
+    "</|DSML|parameter>",
+];
+
+/// A parsed DSML open tag (invoke or parameter).
+#[derive(Debug, Clone)]
+struct DsmlOpenTag {
+    name: String,
+    string_attr: Option<String>,
+    raw: String,
+    tag_len: usize,
+}
+
+#[derive(Debug, Clone)]
+enum TokInfo {
+    None,
+    SelfClosing,
+    /// Consumed an opening tag like `<｜DSML｜invoke name="x">`.
+    Opening(String, /* raw tag */ String),
+    /// Found an unknown token — emit it as text.
+    Unknown,
+}
+
+impl TokInfo {
+    fn is_some(&self) -> bool {
+        !matches!(self, TokInfo::None)
+    }
+}
+
+// ---------------------------------------------------------------------------
+// State machine
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum State {
+    Outside,
+    Thinking,
+    Section,
+    Header,
+    LegacyName,
+    Args,
+    LegacyArgs,
+    DsmlSection,
+    DsmlInvoke,
+    DsmlParam,
+}
+
+/// Streaming scanner for the DeepSeek dialect.
+pub struct DeepSeekInbandScanner {
+    buffer: String,
+    state: State,
+    parse_thinking: bool,
+    in_tool_section: bool,
+    id: String,
+    name: String,
+    thinking: String,
+    dsml_args: serde_json::Map<String, serde_json::Value>,
+    dsml_param_name: String,
+    dsml_param_is_string: bool,
+    raw_block: String,
+    strip_leading_ws: bool,
+}
+
+impl DeepSeekInbandScanner {
+    pub fn new(options: &InbandScannerOptions) -> Self {
+        Self {
+            buffer: String::new(),
+            state: State::Outside,
+            parse_thinking: options.parse_thinking,
+            in_tool_section: false,
+            id: String::new(),
+            name: String::new(),
+            thinking: String::new(),
+            dsml_args: serde_json::Map::new(),
+            dsml_param_name: String::new(),
+            dsml_param_is_string: true,
+            raw_block: String::new(),
+            strip_leading_ws: false,
+        }
+    }
+
+    fn gen_id() -> String {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap_or_default()
+            .subsec_nanos();
+        format!("deepseek_{:09x}", nanos)
+    }
+
+    fn reset_tool(&mut self, next: State) {
+        self.id.clear();
+        self.name.clear();
+        self.raw_block.clear();
+        self.state = next;
+    }
+
+    fn reset_dsml_tool(&mut self) {
+        self.id.clear();
+        self.name.clear();
+        self.dsml_args.clear();
+        self.dsml_param_name.clear();
+        self.dsml_param_is_string = true;
+        self.raw_block.clear();
+    }
+
+    fn skip_whitespace(&mut self) -> String {
+        let trimmed = self.buffer.trim_start();
+        let skipped_len = self.buffer.len() - trimmed.len();
+        let skipped = self.buffer[..skipped_len].to_string();
+        self.buffer = trimmed.to_string();
+        skipped
+    }
+
+    fn drop_one_linebreak(&mut self) -> String {
+        if self.buffer.starts_with("\u{d}\u{a}") {
+            let s = "\u{d}\u{a}".to_string();
+            self.buffer = self.buffer[2..].to_string();
+            s
+        } else if self.buffer.starts_with('\u{a}') {
+            let s = "\u{a}".to_string();
+            self.buffer = self.buffer[1..].to_string();
+            s
+        } else {
+            String::new()
+        }
+    }
+
+    fn emit_thinking(&mut self, delta: &str, events: &mut Vec<InbandScanEvent>) {
+        if delta.is_empty() {
+            return;
+        }
+        if self.parse_thinking {
+            self.thinking.push_str(delta);
+            events.push(InbandScanEvent::ThinkingDelta(delta.to_string()));
+        } else {
+            events.push(InbandScanEvent::Text(delta.to_string()));
+        }
+    }
+
+    fn end_thinking(&mut self, events: &mut Vec<InbandScanEvent>) {
+        if self.parse_thinking {
+            events.push(InbandScanEvent::ThinkingEnd(self.thinking.clone()));
+        }
+        self.thinking.clear();
+        self.state = State::Outside;
+    }
+
+    fn parse_args(&self, raw: &str) -> serde_json::Value {
+        let trimmed = raw.trim();
+        if trimmed.is_empty() {
+            return serde_json::Value::Object(serde_json::Map::new());
+        }
+        serde_json::from_str(trimmed).unwrap_or(serde_json::Value::Object(serde_json::Map::new()))
+    }
+
+    /// Find the earliest matching token in buffer and return position + token.
+    fn find_earliest<'a>(&self, tokens: &'a [&str]) -> Option<(usize, &'a str)> {
+        let mut best_pos = None;
+        let mut best_tok: Option<&'a str> = None;
+        for tok in tokens {
+            if let Some(pos) = self.buffer.find(tok) {
+                match best_pos {
+                    None => {
+                        best_pos = Some(pos);
+                        best_tok = Some(tok);
+                    }
+                    Some(bp) if pos < bp || (pos == bp && tok.len() > best_tok.unwrap().len()) => {
+                        best_pos = Some(pos);
+                        best_tok = Some(tok);
+                    }
+                    _ => {}
+                }
+            }
+        }
+        best_pos.zip(best_tok)
+    }
+
+    /// Match a DSML open tag (`<\u{ff5c}DSML\u{ff5c}invoke` or `<\u{ff5c}DSML\u{ff5c}parameter`) with name attribute.
+    fn match_dsml_open(&self, kind: &str) -> Option<DsmlOpenTag> {
+        let fullwidth_tag = format!("<\u{ff5c}DSML\u{ff5c}{}", kind);
+        let ascii_tag = format!("<|DSML|{}", kind);
+        if !self.buffer.starts_with(&fullwidth_tag) && !self.buffer.starts_with(&ascii_tag) {
+            return None;
+        }
+        let close = self.buffer.find('>')?;
+        let tag = &self.buffer[..=close];
+        // Extract name attribute
+        let name = extract_attr(tag, "name")?;
+        let string_attr = extract_attr(tag, "string");
+        Some(DsmlOpenTag {
+            name: name.to_string(),
+            string_attr,
+            raw: tag.to_string(),
+            tag_len: close + 1,
+        })
+    }
+
+    /// Check for DSML closing tag (fullwidth or ASCII variant).
+    fn matching_dsml_close(&self, full: &'static str, ascii: &'static str) -> Option<&'static str> {
+        if self.buffer.starts_with(full) {
+            Some(full)
+        } else if self.buffer.starts_with(ascii) {
+            Some(ascii)
+        } else {
+            None
+        }
+    }
+
+    /// Check if buffer starts with a known DSML invoke/parameter open tag.
+    fn is_dsml_open(&self) -> bool {
+        self.buffer.starts_with("<\u{ff5c}DSML\u{ff5c}invoke")
+            || self.buffer.starts_with("<|DSML|invoke")
+            || self.buffer.starts_with("<\u{ff5c}DSML\u{ff5c}parameter")
+            || self.buffer.starts_with("<|DSML|parameter")
+    }
+}
+
+impl InbandScanner for DeepSeekInbandScanner {
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent> {
+        if text.is_empty() {
+            return vec![];
+        }
+        self.buffer.push_str(text);
+        self.consume(false)
+    }
+
+    fn flush(&mut self) -> Vec<InbandScanEvent> {
+        let mut events = self.consume(true);
+        if self.state == State::Thinking {
+            self.end_thinking(&mut events);
+        }
+        if !self.buffer.is_empty() {
+            events.push(InbandScanEvent::Text(self.buffer.clone()));
+            self.buffer.clear();
+        }
+        self.state = State::Outside;
+        events
+    }
+}
+
+impl DeepSeekInbandScanner {
+    fn consume(&mut self, final_: bool) -> Vec<InbandScanEvent> {
+        let mut events = Vec::new();
+        loop {
+            if self.buffer.is_empty() {
+                break;
+            }
+            match self.state {
+                State::Outside => self.consume_outside(final_, &mut events),
+                State::Thinking => self.consume_thinking(final_, &mut events),
+                State::Section => {
+                    if !self.consume_section(final_) {
+                        break;
+                    }
+                    continue;
+                }
+                State::Header => {
+                    if !self.consume_header(final_, &mut events) {
+                        break;
+                    }
+                    continue;
+                }
+                State::LegacyName => {
+                    if !self.consume_legacy_name(final_, &mut events) {
+                        break;
+                    }
+                    continue;
+                }
+                State::Args | State::LegacyArgs => {
+                    if !self.consume_args(final_, &mut events) {
+                        break;
+                    }
+                    continue;
+                }
+                State::DsmlSection => {
+                    if !self.consume_dsml_section(final_, &mut events) {
+                        break;
+                    }
+                    continue;
+                }
+                State::DsmlInvoke => {
+                    if !self.consume_dsml_invoke(final_, &mut events) {
+                        break;
+                    }
+                    continue;
+                }
+                State::DsmlParam => {
+                    if !self.consume_dsml_param(final_) {
+                        break;
+                    }
+                    continue;
+                }
+            }
+            // If we returned from consume_outside or consume_thinking without a state change,
+            // and state is still the old one, we break
+            break;
+        }
+        if final_ && self.state == State::Thinking {
+            self.end_thinking(&mut events);
+        }
+        events
+    }
+
+    fn consume_outside(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) {
+        loop {
+            if self.buffer.is_empty() {
+                return;
+            }
+
+            // Handle pending whitespace stripping after control tokens
+            if self.strip_leading_ws {
+                let trimmed = self.buffer.trim_start();
+                let skipped = self.buffer.len() - trimmed.len();
+                self.buffer = trimmed.to_string();
+                self.strip_leading_ws = false;
+                if self.buffer.is_empty() {
+                    return;
+                }
+            }
+
+            // Find earliest token
+            let match_ = self.find_earliest(&OUTSIDE_TOKENS);
+            let match_ = match match_ {
+                Some(m) => m,
+                None => {
+                    let hold = if final_ {
+                        0
+                    } else {
+                        partial_suffix_overlap_any(&self.buffer, &OUTSIDE_TOKENS)
+                    };
+                    let emit_end = self.buffer.len().saturating_sub(hold);
+                    if emit_end > 0 {
+                        events.push(InbandScanEvent::Text(self.buffer[..emit_end].to_string()));
+                    }
+                    self.buffer = self.buffer[emit_end..].to_string();
+                    return;
+                }
+            };
+
+            let (pos, token) = match_;
+
+            // Emit text before the token
+            if pos > 0 {
+                events.push(InbandScanEvent::Text(self.buffer[..pos].to_string()));
+            }
+            self.buffer = self.buffer[pos..].to_string();
+
+            // Dispatch on token
+            if self.buffer.starts_with(TOOL_CALLS_BEGIN) {
+                self.buffer = self.buffer[TOOL_CALLS_BEGIN.len()..].to_string();
+                self.in_tool_section = true;
+                self.state = State::Section;
+                return;
+            }
+
+            if self.buffer.starts_with(TOOL_CALL_BEGIN) {
+                self.buffer = self.buffer[TOOL_CALL_BEGIN.len()..].to_string();
+                self.raw_block = TOOL_CALL_BEGIN.to_string();
+                self.in_tool_section = false;
+                self.state = State::Header;
+                return;
+            }
+
+            if self.buffer.starts_with(THINK_OPEN) {
+                self.buffer = self.buffer[THINK_OPEN.len()..].to_string();
+                self.state = State::Thinking;
+                self.thinking.clear();
+                if self.parse_thinking {
+                    events.push(InbandScanEvent::ThinkingStart);
+                }
+                return;
+            }
+
+            if self.buffer.starts_with(DSML_TOOL_CALLS_OPEN_FULLWIDTH)
+                || self.buffer.starts_with(DSML_TOOL_CALLS_OPEN_ASCII)
+            {
+                let open = if self.buffer.starts_with(DSML_TOOL_CALLS_OPEN_FULLWIDTH) {
+                    DSML_TOOL_CALLS_OPEN_FULLWIDTH
+                } else {
+                    DSML_TOOL_CALLS_OPEN_ASCII
+                };
+                self.buffer = self.buffer[open.len()..].to_string();
+                self.state = State::DsmlSection;
+                return;
+            }
+
+            // Check for control tokens
+            if let Some(ctrl) = self.matching_control_token() {
+                self.buffer = self.buffer[ctrl.len()..].to_string();
+                self.strip_leading_ws = true;
+                continue;
+            }
+
+            // Some other non-control token (shouldn't happen since find_earliest found it)
+            self.buffer = self.buffer[token.len()..].to_string();
+        }
+    }
+
+    fn consume_thinking(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) {
+        if let Some(pos) = self.buffer.find(THINK_CLOSE) {
+            let delta = self.buffer[..pos].to_string();
+            self.emit_thinking(&delta, events);
+            self.buffer = self.buffer[(pos + THINK_CLOSE.len())..].to_string();
+            self.end_thinking(events);
+            return;
+        }
+        // No close found
+        let hold = if final_ {
+            0
+        } else {
+            partial_suffix_overlap_any(&self.buffer, &[THINK_CLOSE])
+        };
+        let emit_end = self.buffer.len().saturating_sub(hold);
+        if emit_end > 0 {
+            let delta = self.buffer[..emit_end].to_string();
+            self.emit_thinking(&delta, events);
+        }
+        self.buffer = self.buffer[emit_end..].to_string();
+        if final_ {
+            self.end_thinking(events);
+        }
+    }
+
+    fn consume_section(&mut self, final_: bool) -> bool {
+        loop {
+            if self.buffer.is_empty() {
+                return final_;
+            }
+            self.skip_whitespace();
+            if self.buffer.starts_with(TOOL_CALLS_END) {
+                self.buffer = self.buffer[TOOL_CALLS_END.len()..].to_string();
+                self.in_tool_section = false;
+                self.state = State::Outside;
+                return true;
+            }
+            if self.buffer.starts_with(TOOL_CALL_BEGIN) {
+                self.buffer = self.buffer[TOOL_CALL_BEGIN.len()..].to_string();
+                self.raw_block = TOOL_CALL_BEGIN.to_string();
+                self.state = State::Header;
+                return true;
+            }
+            if !final_
+                && partial_suffix_overlap_any(&self.buffer, &SECTION_TOKENS) == self.buffer.len()
+            {
+                return false;
+            }
+            if self.buffer.is_empty() {
+                return false;
+            }
+            // Consume one char to advance past garbage
+            let next = self.buffer.chars().next().map(|c| c.len_utf8()).unwrap_or(1);
+            self.buffer = self.buffer[next..].to_string();
+        }
+    }
+
+    fn consume_header(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        let sep_pos = self.buffer.find(TOOL_SEPARATOR);
+        let sep_pos = match sep_pos {
+            Some(p) => p,
+            None => {
+                if final_ {
+                    self.reset_tool(State::Outside);
+                }
+                return false;
+            }
+        };
+
+        let raw_head = &self.buffer[..sep_pos + TOOL_SEPARATOR.len()];
+        let head = self.buffer[..sep_pos].trim().to_string();
+        self.raw_block.push_str(raw_head);
+        self.buffer = self.buffer[raw_head.len()..].to_string();
+
+        if head == LEGACY_TOOL_TYPE {
+            self.state = State::LegacyName;
+            return true;
+        }
+
+        self.name = head;
+        self.id = Self::gen_id();
+        self.state = State::Args;
+        true
+    }
+
+    fn consume_legacy_name(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        let fence = self.buffer.find(LEGACY_JSON_FENCE);
+        let fence = match fence {
+            Some(f) => f,
+            None => {
+                if final_ {
+                    self.reset_tool(State::Outside);
+                }
+                return false;
+            }
+        };
+
+        let raw_name = &self.buffer[..fence + LEGACY_JSON_FENCE.len()];
+        let name = self.buffer[..fence].trim().to_string();
+        self.raw_block.push_str(raw_name);
+        self.buffer = self.buffer[raw_name.len()..].to_string();
+        let lb = self.drop_one_linebreak();
+        self.raw_block.push_str(&lb);
+
+        self.name = name;
+        self.id = Self::gen_id();
+        self.state = State::LegacyArgs;
+        true
+    }
+
+    fn consume_args(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        let end = self.buffer.find(TOOL_CALL_END);
+        let end = match end {
+            Some(e) => e,
+            None => {
+                if final_ {
+                    self.reset_tool(if self.in_tool_section {
+                        State::Section
+                    } else {
+                        State::Outside
+                    });
+                }
+                return false;
+            }
+        };
+
+        let mut raw_args = self.buffer[..end].to_string();
+        if self.state == State::LegacyArgs {
+            // Strip trailing code fence
+            if let Some(fence_pos) = raw_args.rfind(CODE_FENCE) {
+                raw_args = raw_args[..fence_pos].to_string();
+            }
+        }
+
+        let raw_tail = &self.buffer[..end + TOOL_CALL_END.len()];
+        self.raw_block.push_str(raw_tail);
+        self.buffer = self.buffer[raw_tail.len()..].to_string();
+
+        let args = self.parse_args(&raw_args);
+        events.push(InbandScanEvent::ToolEnd {
+            id: self.id.clone(),
+            name: self.name.clone(),
+            arguments: args,
+            raw_block: Some(self.raw_block.clone()),
+        });
+
+        let next = if self.in_tool_section {
+            State::Section
+        } else {
+            State::Outside
+        };
+        self.reset_tool(next);
+        true
+    }
+
+    fn consume_dsml_section(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        loop {
+            if self.buffer.is_empty() {
+                return final_;
+            }
+            self.skip_whitespace();
+
+            // Check for DSML section close
+            if let Some(close) =
+                self.matching_dsml_close(DSML_TOOL_CALLS_CLOSE_FULLWIDTH, DSML_TOOL_CALLS_CLOSE_ASCII)
+            {
+                self.buffer = self.buffer[close.len()..].to_string();
+                self.state = State::Outside;
+                return true;
+            }
+
+            // Check for invoke opening
+            if let Some(open) = self.match_dsml_open("invoke") {
+                self.buffer = self.buffer[open.tag_len..].to_string();
+                self.raw_block = open.raw.clone();
+                self.name = open.name;
+                self.id = Self::gen_id();
+                self.dsml_args.clear();
+                // NOTE: Task says do NOT emit ToolStart — only ToolEnd
+                self.state = State::DsmlInvoke;
+                return true;
+            }
+
+            // Check if we have a partial DSML open tag
+            if !final_ && self.is_dsml_open() && !self.buffer.contains('>') {
+                return false;
+            }
+            if !final_
+                && partial_suffix_overlap_any(&self.buffer, &DSML_SECTION_TOKENS)
+                    == self.buffer.len()
+            {
+                return false;
+            }
+
+            if self.buffer.is_empty() {
+                return false;
+            }
+            let next = self.buffer.chars().next().map(|c| c.len_utf8()).unwrap_or(1);
+            self.buffer = self.buffer[next..].to_string();
+        }
+    }
+
+    fn consume_dsml_invoke(&mut self, final_: bool, events: &mut Vec<InbandScanEvent>) -> bool {
+        loop {
+            if self.buffer.is_empty() {
+                return final_;
+            }
+
+            let skipped = self.skip_whitespace();
+            if !skipped.is_empty() {
+                self.raw_block.push_str(&skipped);
+            }
+
+            // Check for invoke close
+            if let Some(close) =
+                self.matching_dsml_close("</\u{ff5c}DSML\u{ff5c}invoke>", "</|DSML|invoke>")
+            {
+                self.raw_block.push_str(close);
+                self.buffer = self.buffer[close.len()..].to_string();
+
+                let args_val = serde_json::Value::Object(
+                    std::mem::take(&mut self.dsml_args),
+                );
+                events.push(InbandScanEvent::ToolEnd {
+                    id: self.id.clone(),
+                    name: self.name.clone(),
+                    arguments: args_val,
+                    raw_block: Some(self.raw_block.clone()),
+                });
+                self.reset_dsml_tool();
+                self.state = State::DsmlSection;
+                return true;
+            }
+
+            // Check for parameter open
+            if let Some(param) = self.match_dsml_open("parameter") {
+                self.raw_block.push_str(&param.raw);
+                self.dsml_param_name = param.name;
+                self.dsml_param_is_string = param.string_attr.as_deref() != Some("false");
+                self.buffer = self.buffer[param.tag_len..].to_string();
+                self.state = State::DsmlParam;
+                return true;
+            }
+
+            if !final_ {
+                if self.is_dsml_open() && !self.buffer.contains('>') {
+                    return false;
+                }
+                if partial_suffix_overlap_any(&self.buffer, &DSML_INVOKE_TOKENS) == self.buffer.len() {
+                    return false;
+                }
+            }
+
+            if self.buffer.is_empty() {
+                return false;
+            }
+            let ch = self.buffer.chars().next().map(|c| c.len_utf8()).unwrap_or(1);
+            let raw_chunk = self.buffer[..ch].to_string();
+            self.raw_block.push_str(&raw_chunk);
+            self.buffer = self.buffer[ch..].to_string();
+        }
+    }
+
+    fn consume_dsml_param(&mut self, final_: bool) -> bool {
+        let close = self.find_earliest(&DSML_PARAMETER_CLOSE_TOKENS);
+        let close = match close {
+            Some(c) => c,
+            None => {
+                if final_ {
+                    self.reset_dsml_tool();
+                    self.state = State::Outside;
+                }
+                return false;
+            }
+        };
+
+        let (pos, token) = close;
+        let raw_value = &self.buffer[..pos];
+        let value = coerce_dsml_value(raw_value, self.dsml_param_is_string);
+        self.dsml_args.insert(self.dsml_param_name.clone(), value);
+        self.raw_block.push_str(raw_value);
+        self.raw_block.push_str(token);
+        self.buffer = self.buffer[(pos + token.len())..].to_string();
+        self.dsml_param_name.clear();
+        self.dsml_param_is_string = true;
+        self.state = State::DsmlInvoke;
+        true
+    }
+
+    fn matching_control_token(&self) -> Option<&'static str> {
+        if self.buffer.starts_with(TOOL_CALLS_END) {
+            return Some(TOOL_CALLS_END);
+        }
+        if self.buffer.starts_with(THINK_CLOSE) {
+            return Some(THINK_CLOSE);
+        }
+        if self.buffer.starts_with(DSML_TOOL_CALLS_CLOSE_FULLWIDTH) {
+            return Some(DSML_TOOL_CALLS_CLOSE_FULLWIDTH);
+        }
+        if self.buffer.starts_with(DSML_TOOL_CALLS_CLOSE_ASCII) {
+            return Some(DSML_TOOL_CALLS_CLOSE_ASCII);
+        }
+        for tok in CONTROL_TOKENS {
+            if self.buffer.starts_with(tok) {
+                return Some(tok);
+            }
+        }
+        None
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+fn extract_attr(tag: &str, attr: &str) -> Option<String> {
+    let pattern = format!("{}=\x22", attr);
+    let start = tag.find(&pattern)?;
+    let val_start = start + pattern.len();
+    let val_end = tag[val_start..].find('"')?;
+    Some(tag[val_start..val_start + val_end].to_string())
+}
+
+fn coerce_dsml_value(raw: &str, is_string: bool) -> serde_json::Value {
+    if is_string {
+        return serde_json::Value::String(raw.to_string());
+    }
+    let trimmed = raw.trim();
+    if trimmed.is_empty() {
+        return serde_json::Value::String(raw.to_string());
+    }
+    serde_json::from_str(trimmed).unwrap_or_else(|_| {
+        serde_json::Value::String(raw.to_string())
+    })
+}
+
+fn partial_suffix_overlap_any(buf: &str, tags: &[&str]) -> usize {
+    let buf_chars: Vec<char> = buf.to_lowercase().chars().collect();
+    let buf_char_len = buf_chars.len();
+    let mut max_hold = 0usize;
+    for tag in tags {
+        let tag_chars: Vec<char> = tag.to_lowercase().chars().collect();
+        let tag_char_len = tag_chars.len();
+        if buf_char_len < tag_char_len && tag_chars[..buf_char_len].iter().eq(buf_chars.iter()) {
+            // Buffer is a prefix of the tag — keep entire buffer
+            max_hold = max_hold.max(buf.len());
+        } else if buf_char_len > 0 {
+            // Check if suffix of buffer matches prefix of tag
+            let min_chars = buf_char_len.min(tag_char_len);
+            let buf_suffix = &buf_chars[buf_char_len - min_chars..];
+            let tag_prefix = &tag_chars[..min_chars];
+            if tag_prefix.iter().eq(buf_suffix.iter()) {
+                max_hold = max_hold.max(byte_len_of_chars(min_chars, buf));
+            }
+        }
+    }
+    max_hold
+}
+
+/// Compute byte length of the first `n` chars of `s`.
+fn byte_len_of_chars(n: usize, s: &str) -> usize {
+    s.chars().take(n).map(|c| c.len_utf8()).sum()
+}
+
+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_plain_text() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("Hello, this is plain text.");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 1);
+        assert!(matches!(&all[0], InbandScanEvent::Text(t) if t == "Hello, this is plain text."));
+    }
+
+    #[test]
+    fn test_thinking_block() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions {
+            parse_thinking: true,
+            ..Default::default()
+        });
+        let input = "Let me think<think>pondering deeply</think>done";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingStart)),
+            "expected ThinkingStart, got {all:?}");
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingEnd(t) if t == "pondering deeply")),
+            "expected ThinkingEnd with 'pondering deeply', got {all:?}");
+    }
+
+    #[test]
+    fn test_control_tokens_are_stripped() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!("{}Hello, world!{}", BOS, EOS);
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 1, "expected 1 Text event, got {all:?}");
+        assert!(matches!(&all[0], InbandScanEvent::Text(t) if t == "Hello, world!"));
+    }
+
+    #[test]
+    fn test_strip_whitespace_after_control_tokens() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!("{}  Hello", ASSISTANT);
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 1);
+        assert!(matches!(&all[0], InbandScanEvent::Text(t) if t == "Hello"),
+            "expected 'Hello', got {all:?}");
+    }
+
+    #[test]
+    fn test_simple_tool_call_fullwidth() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!(
+            "{}{}get_weather{} {{\"city\":\"NYC\"}}{}{}",
+            TOOL_CALLS_BEGIN, TOOL_CALL_BEGIN, TOOL_SEPARATOR, TOOL_CALL_END, TOOL_CALLS_END
+        );
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        // Should produce: ToolEnd (no ToolStart per spec)
+        assert_eq!(all.len(), 1, "expected 1 ToolEnd, got {all:?}");
+        match &all[0] {
+            InbandScanEvent::ToolEnd { name, arguments, .. } => {
+                assert_eq!(name, "get_weather");
+                assert_eq!(arguments.get("city").and_then(|v| v.as_str()), Some("NYC"));
+            }
+            other => panic!("expected ToolEnd, got {other:?}"),
+        }
+    }
+
+    #[test]
+    fn test_flush_remaining_text() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("some leftover");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        // The text is not inside any special tag, so it should be emitted as Text
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::Text(t) if t == "some leftover")),
+            "expected text, got {all:?}");
+    }
+
+    #[test]
+    fn test_chunked_tool_call() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let part1 = format!("{}{}get_weather", TOOL_CALLS_BEGIN, TOOL_CALL_BEGIN);
+        scanner.feed(&part1);
+        let part2 = format!("{} {{\"city\":\"NYC\"}}{}", TOOL_SEPARATOR, TOOL_CALL_END);
+        let events = scanner.feed(&part2);
+        let part3 = TOOL_CALLS_END;
+        let events2 = scanner.feed(part3);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(events2).chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+    }
+
+    #[test]
+    fn test_legacy_json_tool_call() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let json_payload = r#"{"city":"NYC"}"#;
+        let input = format!(
+            "{}{}{}{}get_weather\n{} {}\n{}{}{}",
+            TOOL_CALLS_BEGIN,
+            TOOL_CALL_BEGIN,
+            LEGACY_TOOL_TYPE,
+            TOOL_SEPARATOR,
+            LEGACY_JSON_FENCE,
+            json_payload,
+            CODE_FENCE,
+            TOOL_CALL_END,
+            TOOL_CALLS_END
+        );
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+        if let Some(InbandScanEvent::ToolEnd { arguments, .. }) = all.iter().find(|e| matches!(e, InbandScanEvent::ToolEnd { .. })) {
+            assert_eq!(arguments.get("city").and_then(|v| v.as_str()), Some("NYC"));
+        }
+    }
+
+    #[test]
+    fn test_dsml_format() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!(
+            "{}<|DSML|invoke name=\"get_weather\"><|DSML|parameter name=\"city\">NYC</|DSML|parameter></|DSML|invoke>{}",
+            DSML_TOOL_CALLS_OPEN_ASCII, DSML_TOOL_CALLS_CLOSE_ASCII
+        );
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+        if let Some(InbandScanEvent::ToolEnd { arguments, .. }) = all.iter().find(|e| matches!(e, InbandScanEvent::ToolEnd { .. })) {
+            assert_eq!(arguments.get("city").and_then(|v| v.as_str()), Some("NYC"));
+        }
+    }
+
+    #[test]
+    fn test_dsml_fullwidth_format() {
+        let mut scanner = DeepSeekInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!(
+            "{}<\u{ff5c}DSML\u{ff5c}invoke name=\"get_weather\"><\u{ff5c}DSML\u{ff5c}parameter name=\"city\">NYC</\u{ff5c}DSML\u{ff5c}parameter></\u{ff5c}DSML\u{ff5c}invoke>{}",
+            DSML_TOOL_CALLS_OPEN_FULLWIDTH, DSML_TOOL_CALLS_CLOSE_FULLWIDTH
+        );
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolEnd { name, .. } if name == "get_weather")),
+            "expected ToolEnd for get_weather, got {all:?}");
+        if let Some(InbandScanEvent::ToolEnd { arguments, .. }) = all.iter().find(|e| matches!(e, InbandScanEvent::ToolEnd { .. })) {
+            assert_eq!(arguments.get("city").and_then(|v| v.as_str()), Some("NYC"));
+        }
+    }
+}
diff --git a/crates/jcode-llm-dialects/src/gemini.rs b/crates/jcode-llm-dialects/src/gemini.rs
new file mode 100644
index 0000000000..09c71318e0
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/gemini.rs
@@ -0,0 +1,472 @@
+//! Gemini Inband Scanner — Python-fenced ```` ```tool_code ```` format.
+//!
+//! The model emits tool calls inside fenced Python code blocks:
+//!
+//! ````text
+//! ```tool_code
+//! tool_call(name: "get_weather", city: "NYC")
+//! ```
+//! ````
+//!
+//! Arguments use Python-style keyword notation (`key: value`). Values may be
+//! strings (quoted), numbers, booleans, or nested dicts/lists.
+
+use std::time::{SystemTime, UNIX_EPOCH};
+use crate::types::{InbandScanEvent, InbandScanner, InbandScannerOptions};
+use serde_json::Value;
+
+const FENCE_OPEN: &str = "```tool_code";
+const FENCE_CLOSE: &str = "```";
+
+/// Streaming scanner for the Gemini dialect.
+pub struct GeminiInbandScanner {
+    buffer: String,
+    inside_fence: bool,
+    fence_content: String,
+}
+
+impl GeminiInbandScanner {
+    pub fn new(_options: &InbandScannerOptions) -> Self {
+        Self {
+            buffer: String::new(),
+            inside_fence: false,
+            fence_content: String::new(),
+        }
+    }
+
+    fn gen_id(&self) -> String {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap_or_default()
+            .subsec_nanos();
+        format!("gemini_{:09x}", nanos)
+    }
+}
+
+impl InbandScanner for GeminiInbandScanner {
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent> {
+        if text.is_empty() {
+            return vec![];
+        }
+        self.buffer.push_str(text);
+        self.consume(false)
+    }
+
+    fn flush(&mut self) -> Vec<InbandScanEvent> {
+        let mut events = self.consume(true);
+        if self.inside_fence && !self.fence_content.is_empty() {
+            // Try to parse what we have as tool calls
+            events.extend(parse_tool_calls_from_body(&self.fence_content, &self.gen_id()));
+            self.fence_content.clear();
+            self.inside_fence = false;
+        }
+        if !self.buffer.is_empty() {
+            events.push(InbandScanEvent::Text(self.buffer.clone()));
+            self.buffer.clear();
+        }
+        events
+    }
+}
+
+impl GeminiInbandScanner {
+    fn consume(&mut self, final_: bool) -> Vec<InbandScanEvent> {
+        let mut events = Vec::new();
+        loop {
+            if self.inside_fence {
+                // Look for closing fence
+                if let Some(pos) = self.buffer.find(FENCE_CLOSE) {
+                    let content = &self.buffer[..pos];
+                    self.fence_content.push_str(content);
+                    self.buffer = self.buffer[(pos + FENCE_CLOSE.len())..].to_string();
+                    self.inside_fence = false;
+
+                    // Parse the collected fence content
+                    events.extend(parse_tool_calls_from_body(&self.fence_content, &self.gen_id()));
+                    self.fence_content.clear();
+                    continue;
+                }
+                // Content before the last line that could be a close fence overlap
+                if final_ {
+                    self.fence_content.push_str(&self.buffer);
+                    events.extend(parse_tool_calls_from_body(&self.fence_content, &self.gen_id()));
+                    self.fence_content.clear();
+                    self.buffer.clear();
+                    self.inside_fence = false;
+                    return events;
+                }
+                // Hold back potential partial "```"
+                let hold = if self.buffer.ends_with('`') {
+                    1.min(self.buffer.len())
+                } else if self.buffer.ends_with("``") {
+                    2.min(self.buffer.len())
+                } else {
+                    0
+                };
+                let emit_end = self.buffer.len().saturating_sub(hold);
+                self.fence_content.push_str(&self.buffer[..emit_end]);
+                self.buffer = self.buffer[emit_end..].to_string();
+                return events;
+            }
+
+            // Outside fence — look for opening
+            if let Some(pos) = self.buffer.find(FENCE_OPEN) {
+                if pos > 0 {
+                    events.push(InbandScanEvent::Text(self.buffer[..pos].to_string()));
+                }
+                // Emit text up to fence, then enter fence mode
+                self.buffer = self.buffer[(pos + FENCE_OPEN.len())..].to_string();
+                // Skip whitespace after fence marker
+                let trimmed = self.buffer.trim_start();
+                let _skipped = self.buffer.len() - trimmed.len();
+                self.buffer = trimmed.to_string();
+                self.inside_fence = true;
+                self.fence_content.clear();
+                continue;
+            }
+
+            // No fence at all
+            let hold = if final_ {
+                0
+            } else if self.buffer.ends_with('`') {
+                // Could be partial FENCE_OPEN
+                let min_len = self.buffer.len().min(FENCE_OPEN.len());
+                if self.buffer[FENCE_OPEN.len().saturating_sub(min_len)..]
+                    .to_lowercase()
+                    .as_str()
+                    == &"```tool_code"[..min_len]
+                {
+                    min_len
+                } else {
+                    0
+                }
+            } else {
+                0
+            };
+            let emit_end = self.buffer.len().saturating_sub(hold);
+            if emit_end > 0 {
+                events.push(InbandScanEvent::Text(self.buffer[..emit_end].to_string()));
+            }
+            self.buffer = self.buffer[emit_end..].to_string();
+            return events;
+        }
+    }
+}
+
+/// Parse Python-style `tool_call(key: value, ...)` calls from fence body text.
+fn parse_tool_calls_from_body(body: &str, id_prefix: &str) -> Vec<InbandScanEvent> {
+    let mut events = Vec::new();
+    let body = body.trim();
+    if body.is_empty() {
+        return events;
+    }
+
+    // Match `tool_call(...)` patterns
+    let mut remaining = body;
+    let mut counter = 0u32;
+
+    while let Some(start) = remaining.find("tool_call(") {
+        let before = &remaining[..start];
+        if !before.trim().is_empty() {
+            events.push(InbandScanEvent::Text(before.to_string()));
+        }
+        remaining = &remaining[(start + 10)..]; // skip "tool_call("
+
+        // Find matching closing paren
+        let mut depth = 1i32;
+        let mut end = 0usize;
+        for (i, ch) in remaining.char_indices() {
+            match ch {
+                '(' => depth += 1,
+                ')' => {
+                    depth -= 1;
+                    if depth == 0 {
+                        end = i;
+                        break;
+                    }
+                }
+                _ => {}
+            }
+        }
+        if depth != 0 {
+            // No closing paren — emit as text
+            events.push(InbandScanEvent::Text(format!("tool_call({remaining}")));
+            break;
+        }
+
+        let args_str = remaining[..end].trim();
+        remaining = &remaining[(end + 1)..];
+        counter += 1;
+        let id = format!("{id_prefix}_{counter}");
+
+        // Parse keyword arguments
+        let (name, args_map) = parse_python_kwargs(args_str);
+        events.push(InbandScanEvent::ToolStart {
+            id: id.clone(),
+            name: name.clone(),
+        });
+        events.push(InbandScanEvent::ToolEnd {
+            id,
+            name,
+            arguments: args_map,
+            raw_block: Some(format!("tool_call({args_str})")),
+        });
+    }
+
+    if !remaining.trim().is_empty() {
+        events.push(InbandScanEvent::Text(remaining.to_string()));
+    }
+
+    events
+}
+
+/// Parse Python-style keyword arguments like `name: "get_weather", city: "NYC"`.
+fn parse_python_kwargs(input: &str) -> (String, Value) {
+    let input = input.trim();
+    if input.is_empty() {
+        return ("unknown".to_string(), Value::Object(Default::default()));
+    }
+
+    let mut name = String::from("unknown");
+    let mut map = serde_json::Map::new();
+
+    // Split by top-level commas first, then parse key:value per part
+    for part in split_by_top_level_comma(input) {
+        let part = part.trim();
+        if let Some(colon_pos) = part.find(':') {
+            let key = part[..colon_pos].trim().trim_matches('"').trim_matches('\'').to_string();
+            let val_str = part[colon_pos + 1..].trim();
+            if key == "name" {
+                if let Some(val) = read_python_value(val_str) {
+                    if let Some(s) = val.as_str() {
+                        name = s.to_string();
+                    }
+                }
+            } else {
+                let val = read_python_value(val_str).unwrap_or(Value::Null);
+                map.insert(key, val);
+            }
+        }
+    }
+
+    (name, Value::Object(map))
+}
+
+/// Read a single Python value from the start of the string.
+fn read_python_value(s: &str) -> Option<Value> {
+    let s = s.trim();
+    if s.is_empty() {
+        return None;
+    }
+    let first = s.chars().next()?;
+
+    // String
+    if first == '"' || first == '\'' {
+        let quote = first;
+        let mut escaped = false;
+        let mut end = None;
+        for (i, ch) in s[1..].char_indices() {
+            if escaped {
+                escaped = false;
+                continue;
+            }
+            if ch == '\\' {
+                escaped = true;
+            } else if ch == quote {
+                end = Some(i + 1);
+                break;
+            }
+        }
+        if let Some(end) = end {
+            let inner = &s[1..end]; // strip quotes
+            // Return the value with the consumed length via string slicing
+            return Some(Value::String(
+                inner.replace("\\\"", "\"").replace("\\'", "'"),
+            ));
+        }
+        // Unterminated string
+        return Some(Value::String(s[1..].to_string()));
+    }
+
+    // Number
+    if first.is_ascii_digit() || first == '-' {
+        let mut end = 0;
+        for (i, ch) in s.char_indices() {
+            if ch.is_ascii_digit() || ch == '.' || ch == '-' || ch == '+' || ch == 'e' || ch == 'E' {
+                end = i + 1;
+            } else {
+                break;
+            }
+        }
+        if end > 0 {
+            let num_str = &s[..end];
+            if let Ok(n) = num_str.parse::<i64>() {
+                return Some(Value::Number(n.into()));
+            }
+            if let Ok(n) = num_str.parse::<f64>() {
+                if let Some(v) = serde_json::Number::from_f64(n) {
+                    return Some(Value::Number(v));
+                }
+            }
+            return Some(Value::String(num_str.to_string()));
+        }
+    }
+
+    // Boolean / None
+    if s.starts_with("true") || s.starts_with("True") {
+        return Some(Value::Bool(true));
+    }
+    if s.starts_with("false") || s.starts_with("False") {
+        return Some(Value::Bool(false));
+    }
+    if s.starts_with("none") || s.starts_with("None") || s.starts_with("null") {
+        return Some(Value::Null);
+    }
+
+    // List
+    if first == '[' {
+        let mut depth = 1i32;
+        let mut end = None;
+        for (i, ch) in s[1..].char_indices() {
+            match ch {
+                '[' => depth += 1,
+                ']' => {
+                    depth -= 1;
+                    if depth == 0 {
+                        end = Some(i + 1);
+                        break;
+                    }
+                }
+                _ => {}
+            }
+        }
+        if let Some(end) = end {
+            let inner = s[1..end].trim();
+            if inner.is_empty() {
+                return Some(Value::Array(vec![]));
+            }
+            // Parse comma-separated items
+            let items: Vec<Value> = inner
+                .split(',')
+                .filter_map(|item| read_python_value(item.trim()))
+                .collect();
+            return Some(Value::Array(items));
+        }
+        return None;
+    }
+
+    // Dict
+    if first == '{' {
+        let mut depth = 1i32;
+        let mut end = None;
+        for (i, ch) in s[1..].char_indices() {
+            match ch {
+                '{' => depth += 1,
+                '}' => {
+                    depth -= 1;
+                    if depth == 0 {
+                        end = Some(i + 1);
+                        break;
+                    }
+                }
+                _ => {}
+            }
+        }
+        if let Some(end) = end {
+            let inner = s[1..end].trim();
+            if inner.is_empty() {
+                return Some(Value::Object(Default::default()));
+            }
+            let mut map = serde_json::Map::new();
+            for part in split_by_top_level_comma(inner) {
+                let part = part.trim();
+                if let Some(eq_pos) = part.find(':') {
+                    let k = part[..eq_pos].trim().trim_matches('"').trim_matches('\'');
+                    let v = read_python_value(part[eq_pos + 1..].trim());
+                    if let Some(v) = v {
+                        map.insert(k.to_string(), v);
+                    }
+                }
+            }
+            return Some(Value::Object(map));
+        }
+        return None;
+    }
+
+    None
+}
+
+fn split_by_top_level_comma(s: &str) -> Vec<&str> {
+    let mut parts = Vec::new();
+    let mut depth = 0i32;
+    let mut start = 0usize;
+    for (i, ch) in s.char_indices() {
+        match ch {
+            '(' | '[' | '{' => depth += 1,
+            ')' | ']' | '}' => depth -= 1,
+            ',' if depth == 0 => {
+                parts.push(&s[start..i]);
+                start = i + 1;
+            }
+            _ => {}
+        }
+    }
+    if start < s.len() {
+        parts.push(&s[start..]);
+    }
+    parts
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_gemini_single_tool_call() {
+        let mut scanner = GeminiInbandScanner::new(&InbandScannerOptions::default());
+        let input = "Let me check\n```tool_code\ntool_call(name: \"get_weather\", city: \"NYC\")\n```\n";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ToolStart { name, .. } if name == "get_weather")),
+            "expected ToolStart for get_weather, got {all:?}");
+    }
+
+    #[test]
+    fn test_gemini_no_tool_calls() {
+        let mut scanner = GeminiInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("Just some regular text without any tool calls.");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 1);
+        assert!(matches!(&all[0], InbandScanEvent::Text(_)));
+    }
+
+    #[test]
+    fn test_gemini_malformed_fence() {
+        let mut scanner = GeminiInbandScanner::new(&InbandScannerOptions::default());
+        let input = "```tool_code\nthis is not a valid tool call\n```";
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        // Should produce no tool events, just text
+        assert!(!all.iter().any(|e| matches!(e, InbandScanEvent::ToolStart { .. })));
+    }
+
+    #[test]
+    fn test_python_kwargs_simple() {
+        let (name, args) = parse_python_kwargs(r#"name: "get_weather", city: "NYC""#);
+        assert_eq!(name, "get_weather");
+        assert_eq!(args.get("city").and_then(|v| v.as_str()), Some("NYC"));
+    }
+
+    #[test]
+    fn test_python_value_string() {
+        assert_eq!(
+            read_python_value(r#""hello world""#),
+            Some(Value::String("hello world".to_string()))
+        );
+        assert_eq!(read_python_value("42"), Some(Value::Number(42.into())));
+        assert_eq!(read_python_value("true"), Some(Value::Bool(true)));
+        assert_eq!(read_python_value("None"), Some(Value::Null));
+    }
+}
diff --git a/crates/jcode-llm-dialects/src/hermes.rs b/crates/jcode-llm-dialects/src/hermes.rs
new file mode 100644
index 0000000000..709c12cd74
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/hermes.rs
@@ -0,0 +1,254 @@
+//! Hermes Inband Scanner — JSON‑in‑`<tool_call>` format.
+//!
+//! The simplest dialect: the model emits tool calls inside
+//! `<tool_call>{"name":"...","arguments":{...}}</tool_call>` tags.
+//! Optional `<think>`…`</think>` blocks are also parsed.
+
+use std::time::{SystemTime, UNIX_EPOCH};
+
+use crate::types::{InbandScanEvent, InbandScanner, InbandScannerOptions};
+
+const TOOL_OPEN: &str = "<tool_call>";
+const TOOL_CLOSE: &str = "</tool_call>";
+const THINK_OPEN: &str = "<think>";
+const THINK_CLOSE: &str = "</think>";
+
+/// Streaming scanner for the Hermes dialect.
+pub struct HermesInbandScanner {
+    buffer: String,
+    inside_tool: bool,
+    thinking_accum: String,
+    in_thinking: bool,
+    parse_thinking: bool,
+}
+
+impl HermesInbandScanner {
+    pub fn new(options: &InbandScannerOptions) -> Self {
+        Self {
+            buffer: String::new(),
+            inside_tool: false,
+            thinking_accum: String::new(),
+            in_thinking: false,
+            parse_thinking: options.parse_thinking,
+        }
+    }
+
+    fn gen_id() -> String {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap_or_default()
+            .subsec_nanos();
+        format!("hermes_{:09x}", nanos)
+    }
+}
+
+impl InbandScanner for HermesInbandScanner {
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent> {
+        if text.is_empty() {
+            return vec![];
+        }
+        self.buffer.push_str(text);
+        self.consume(false)
+    }
+
+    fn flush(&mut self) -> Vec<InbandScanEvent> {
+        let mut events = self.consume(true);
+        if self.in_thinking {
+            events.push(InbandScanEvent::ThinkingEnd(self.thinking_accum.clone()));
+            self.thinking_accum.clear();
+            self.in_thinking = false;
+        }
+        if !self.buffer.is_empty() {
+            events.push(InbandScanEvent::Text(self.buffer.clone()));
+            self.buffer.clear();
+        }
+        self.inside_tool = false;
+        events
+    }
+}
+
+impl HermesInbandScanner {
+    fn consume(&mut self, final_: bool) -> Vec<InbandScanEvent> {
+        let mut events = Vec::new();
+        loop {
+            if self.in_thinking {
+                if let Some(pos) = self.buffer.find(THINK_CLOSE) {
+                    let delta = self.buffer[..pos].to_string();
+                    if !delta.is_empty() {
+                        self.thinking_accum.push_str(&delta);
+                        events.push(InbandScanEvent::ThinkingDelta(delta));
+                    }
+                    self.buffer = self.buffer[(pos + THINK_CLOSE.len())..].to_string();
+                    events.push(InbandScanEvent::ThinkingEnd(self.thinking_accum.clone()));
+                    self.thinking_accum.clear();
+                    self.in_thinking = false;
+                    continue;
+                } else if final_ {
+                    events.push(InbandScanEvent::ThinkingDelta(self.buffer.clone()));
+                    self.thinking_accum.push_str(&self.buffer);
+                    events.push(InbandScanEvent::ThinkingEnd(self.thinking_accum.clone()));
+                    self.thinking_accum.clear();
+                    self.buffer.clear();
+                    self.in_thinking = false;
+                }
+                return events;
+            }
+
+            if !self.inside_tool {
+                let open = self.buffer.find(TOOL_OPEN);
+                let think = if self.parse_thinking {
+                    self.buffer.find(THINK_OPEN)
+                } else {
+                    None
+                };
+                let _start = match (open, think) {
+                    (Some(o), Some(t)) if t < o => {
+                        // thinking before tool call
+                        if t > 0 {
+                            events.push(InbandScanEvent::Text(self.buffer[..t].to_string()));
+                        }
+                        self.buffer = self.buffer[(t + THINK_OPEN.len())..].to_string();
+                        events.push(InbandScanEvent::ThinkingStart);
+                        self.in_thinking = true;
+                        continue;
+                    }
+                    (Some(o), _) => {
+                        if o > 0 {
+                            events.push(InbandScanEvent::Text(self.buffer[..o].to_string()));
+                        }
+                        self.buffer = self.buffer[(o + TOOL_OPEN.len())..].to_string();
+                        self.inside_tool = true;
+                        continue;
+                    }
+                    (None, Some(t)) => {
+                        if t > 0 {
+                            events.push(InbandScanEvent::Text(self.buffer[..t].to_string()));
+                        }
+                        self.buffer = self.buffer[(t + THINK_OPEN.len())..].to_string();
+                        events.push(InbandScanEvent::ThinkingStart);
+                        self.in_thinking = true;
+                        continue;
+                    }
+                    (None, None) => {
+                        let hold = if final_ {
+                            0
+                        } else {
+                            partial_suffix_overlap_any(
+                                &self.buffer,
+                                &[TOOL_OPEN, TOOL_CLOSE, THINK_OPEN, THINK_CLOSE],
+                            )
+                        };
+                        let emit_end = self.buffer.len().saturating_sub(hold);
+                        if emit_end > 0 {
+                            events.push(InbandScanEvent::Text(self.buffer[..emit_end].to_string()));
+                        }
+                        self.buffer = self.buffer[emit_end..].to_string();
+                        return events;
+                    }
+                };
+            }
+
+            // Inside a tool call
+            if let Some(pos) = self.buffer.find(TOOL_CLOSE) {
+                let body = self.buffer[..pos].trim().to_string();
+                self.buffer = self.buffer[(pos + TOOL_CLOSE.len())..].to_string();
+                self.inside_tool = false;
+
+                if let Ok(value) = serde_json::from_str::<serde_json::Value>(&body) {
+                    let name = value.get("name").and_then(|v| v.as_str()).unwrap_or("").to_string();
+                    let args = value.get("arguments").cloned().unwrap_or_default();
+                    let id = Self::gen_id();
+                    events.push(InbandScanEvent::ToolStart { id: id.clone(), name: name.clone() });
+                    events.push(InbandScanEvent::ToolEnd {
+                        id,
+                        name,
+                        arguments: args,
+                        raw_block: Some(format!("<tool_call>{body}</tool_call>")),
+                    });
+                }
+                continue;
+            }
+            return events;
+        }
+    }
+}
+
+fn partial_suffix_overlap_any(buf: &str, tags: &[&str]) -> usize {
+    let buf_lower = buf.to_lowercase();
+    let mut max_hold = 0usize;
+    for tag in tags {
+        let tag_lower = tag.to_lowercase();
+        let min_len = buf_lower.len().min(tag_lower.len());
+        if buf_lower.len() < tag_lower.len() && tag_lower.starts_with(&buf_lower) {
+            max_hold = max_hold.max(buf.len());
+        } else if min_len > 0
+            && tag_lower[..min_len] == buf_lower[buf_lower.len() - min_len..]
+        {
+            max_hold = max_hold.max(min_len);
+        }
+    }
+    max_hold
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_hermes_simple_tool_call() {
+        let mut scanner = HermesInbandScanner::new(&InbandScannerOptions::default());
+        let input = r#"Hello<tool_call>{"name":"get_weather","arguments":{"city":"NYC"}}</tool_call>"#;
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 3);
+        assert!(matches!(&all[0], InbandScanEvent::Text(t) if t == "Hello"));
+        assert!(matches!(&all[1], InbandScanEvent::ToolStart { name, .. } if name == "get_weather"));
+        assert!(matches!(&all[2], InbandScanEvent::ToolEnd { name, .. } if name == "get_weather"));
+    }
+
+    #[test]
+    fn test_hermes_streaming_chunks() {
+        let mut scanner = HermesInbandScanner::new(&InbandScannerOptions::default());
+        scanner.feed(r#"Some text<tool_call>{"name":"get"#);
+        let events = scanner.feed(r#"","arguments":{}}</tool_call>"#);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        let mut names: Vec<_> = all.iter().filter_map(|e| {
+            if let InbandScanEvent::ToolStart { name, .. } = e { Some(name.as_str()) } else { None }
+        }).collect();
+        if names.is_empty() {
+            // Try checking ToolEnd instead — the whole call might arrive in one chunk
+            names = all.iter().filter_map(|e| {
+                if let InbandScanEvent::ToolEnd { name, .. } = e { Some(name.as_str()) } else { None }
+            }).collect();
+        }
+        assert_eq!(names, vec!["get"]);
+    }
+
+    #[test]
+    fn test_hermes_thinking() {
+        let mut scanner = HermesInbandScanner::new(&InbandScannerOptions {
+            parse_thinking: true,
+            ..Default::default()
+        });
+        let input = r#"Let me think<think>pondering</think>done"#;
+        let events = scanner.feed(input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingStart)));
+        assert!(all.iter().any(|e| matches!(e, InbandScanEvent::ThinkingEnd(t) if t == "pondering")));
+    }
+
+    #[test]
+    fn test_hermes_no_tool_call() {
+        let mut scanner = HermesInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("Just plain text with no tags.");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert!(all.len() <= 1);
+        if let Some(ev) = all.first() {
+            assert!(matches!(ev, InbandScanEvent::Text(_)));
+        }
+    }
+}
diff --git a/crates/jcode-llm-dialects/src/kimi.rs b/crates/jcode-llm-dialects/src/kimi.rs
new file mode 100644
index 0000000000..92568f6f9c
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/kimi.rs
@@ -0,0 +1,362 @@
+//! Kimi Inband Scanner — token-delimited `<|...|>` format.
+//!
+//! Model emits tool calls inside these markers:
+//! `<|tool_calls_section_begin|><|tool_call_begin|>id<|tool_call_argument_begin|>JSON<|tool_call_end|><|tool_calls_section_end|>`
+//! Optional `<think>`...`</think>` blocks are also parsed.
+
+use std::time::{SystemTime, UNIX_EPOCH};
+use crate::types::{InbandScanEvent, InbandScanner, InbandScannerOptions};
+
+const SECTION_BEGIN: &str = "<|tool_calls_section_begin|>";
+const SECTION_END: &str = "<|tool_calls_section_end|>";
+const CALL_BEGIN: &str = "<|tool_call_begin|>";
+const CALL_END: &str = "<|tool_call_end|>";
+const ARG_BEGIN: &str = "<|tool_call_argument_begin|>";
+const THINK_OPEN: &str = "<think>";
+const THINK_CLOSE: &str = "</think>";
+
+const TOKENS: &[&str] = &[SECTION_BEGIN, SECTION_END, CALL_BEGIN, CALL_END, ARG_BEGIN];
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum State {
+    Outside,
+    Section,
+    Header,
+    Args,
+    Thinking,
+}
+
+/// Streaming scanner for the Kimi dialect.
+pub struct KimiInbandScanner {
+    buffer: String,
+    state: State,
+    // Accumulated call state
+    call_id: String,
+    call_name: String,
+    raw_block: String,
+    thinking: String,
+    parse_thinking: bool,
+}
+
+impl KimiInbandScanner {
+    pub fn new(options: &InbandScannerOptions) -> Self {
+        Self {
+            buffer: String::new(),
+            state: State::Outside,
+            call_id: String::new(),
+            call_name: String::new(),
+            raw_block: String::new(),
+            thinking: String::new(),
+            parse_thinking: options.parse_thinking,
+        }
+    }
+
+    fn gen_id(&self) -> String {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap_or_default()
+            .subsec_nanos();
+        format!("kimi_{:09x}", nanos)
+    }
+
+    fn reset_call(&mut self) {
+        self.call_id.clear();
+        self.call_name.clear();
+        self.raw_block.clear();
+    }
+}
+
+impl InbandScanner for KimiInbandScanner {
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent> {
+        if text.is_empty() {
+            return vec![];
+        }
+        self.buffer.push_str(text);
+        self.consume(false)
+    }
+
+    fn flush(&mut self) -> Vec<InbandScanEvent> {
+        let mut events = self.consume(true);
+
+        // Close any pending thinking block
+        if self.state == State::Thinking {
+            events.push(InbandScanEvent::ThinkingEnd(self.thinking.clone()));
+            self.thinking.clear();
+            self.state = State::Outside;
+        }
+
+        // Emit remaining text
+        if !self.buffer.is_empty() {
+            events.push(InbandScanEvent::Text(self.buffer.clone()));
+            self.buffer.clear();
+        }
+        self.reset_call();
+        events
+    }
+}
+
+impl KimiInbandScanner {
+    fn consume(&mut self, final_: bool) -> Vec<InbandScanEvent> {
+        let mut events = Vec::new();
+        loop {
+            match self.state {
+                State::Thinking => {
+                    if let Some(pos) = self.buffer.find(THINK_CLOSE) {
+                        let delta = &self.buffer[..pos];
+                        if !delta.is_empty() {
+                            self.thinking.push_str(delta);
+                            events.push(InbandScanEvent::ThinkingDelta(delta.to_string()));
+                        }
+                        self.buffer = self.buffer[(pos + THINK_CLOSE.len())..].to_string();
+                        events.push(InbandScanEvent::ThinkingEnd(self.thinking.clone()));
+                        self.thinking.clear();
+                        self.state = State::Outside;
+                        continue;
+                    } else if final_ {
+                        if !self.buffer.is_empty() {
+                            events.push(InbandScanEvent::ThinkingDelta(self.buffer.clone()));
+                            self.thinking.push_str(&self.buffer);
+                        }
+                        events.push(InbandScanEvent::ThinkingEnd(self.thinking.clone()));
+                        self.thinking.clear();
+                        self.buffer.clear();
+                        self.state = State::Outside;
+                    }
+                    return events;
+                }
+
+                State::Outside => {
+                    // Look for next interesting marker
+                    let tok_pos = self.next_token_index();
+                    let think_pos = if self.parse_thinking {
+                        self.buffer.find(THINK_OPEN)
+                    } else {
+                        None
+                    };
+                    let start = match (tok_pos, think_pos) {
+                        (Some(t), Some(h)) if h < t => {
+                            // Thinking starts first
+                            if h > 0 {
+                                events.push(InbandScanEvent::Text(self.buffer[..h].to_string()));
+                            }
+                            self.buffer = self.buffer[(h + THINK_OPEN.len())..].to_string();
+                            self.thinking.clear();
+                            events.push(InbandScanEvent::ThinkingStart);
+                            self.state = State::Thinking;
+                            continue;
+                        }
+                        (Some(p), _) => p,
+                        (None, Some(h)) => {
+                            if h > 0 {
+                                events.push(InbandScanEvent::Text(self.buffer[..h].to_string()));
+                            }
+                            self.buffer = self.buffer[(h + THINK_OPEN.len())..].to_string();
+                            self.thinking.clear();
+                            events.push(InbandScanEvent::ThinkingStart);
+                            self.state = State::Thinking;
+                            continue;
+                        }
+                        (None, None) => {
+                            let hold = if final_ {
+                                0
+                            } else {
+                                partial_suffix_overlap_any(&self.buffer, TOKENS)
+                            };
+                            let emit_end = self.buffer.len().saturating_sub(hold);
+                            if emit_end > 0 {
+                                events.push(InbandScanEvent::Text(self.buffer[..emit_end].to_string()));
+                            }
+                            self.buffer = self.buffer[emit_end..].to_string();
+                            return events;
+                        }
+                    };
+
+                    // Emit text before the marker and transition
+                    if start > 0 {
+                        events.push(InbandScanEvent::Text(self.buffer[..start].to_string()));
+                    }
+                    self.buffer = self.buffer[start..].to_string();
+                    // Determine marker type
+                    if let Some(token) = self.token_at_start() {
+                        self.buffer = self.buffer[token.len()..].to_string();
+                        if token == SECTION_BEGIN {
+                            self.state = State::Section;
+                        } else {
+                            events.push(InbandScanEvent::Text(token.to_string()));
+                        }
+                    }
+                    continue;
+                }
+
+                State::Section => {
+                    // Inside a tool calls section — skip whitespace and look for CALL_BEGIN or SECTION_END
+                    self.skip_whitespace();
+                    if self.buffer.is_empty() {
+                        if final_ { self.state = State::Outside; }
+                        return events;
+                    }
+                    if let Some(token) = self.token_at_start() {
+                        self.buffer = self.buffer[token.len()..].to_string();
+                        if token == SECTION_END {
+                            self.state = State::Outside;
+                        } else if token == CALL_BEGIN {
+                            self.state = State::Header;
+                        }
+                        // Any other token inside section is just consumed
+                        continue;
+                    }
+                    if !final_ && partial_suffix_overlap_any(&self.buffer, TOKENS) > 0 {
+                        return events;
+                    }
+                    // Consume one char to advance
+                    self.buffer = self.buffer[1..].to_string();
+                }
+
+                State::Header => {
+                    // Reading the tool call ID/name until ARG_BEGIN
+                    if let Some(pos) = self.buffer.find(ARG_BEGIN) {
+                        let raw_header = self.buffer[..pos].trim().to_string();
+                        self.call_id = raw_header.clone();
+                        self.call_name = normalize_tool_name(&raw_header);
+                        self.raw_block = format!("{CALL_BEGIN}{raw_header}{ARG_BEGIN}");
+                        events.push(InbandScanEvent::ToolStart {
+                            id: self.call_id.clone(),
+                            name: self.call_name.clone(),
+                        });
+                        self.buffer = self.buffer[(pos + ARG_BEGIN.len())..].to_string();
+                        self.state = State::Args;
+                        continue;
+                    }
+                    if final_ {
+                        self.drop_buffered_call();
+                    }
+                    return events;
+                }
+
+                State::Args => {
+                    // Reading the tool call arguments until CALL_END
+                    if let Some(pos) = self.buffer.find(CALL_END) {
+                        let raw_args_block = self.buffer[..pos].trim().to_string();
+                        let args: serde_json::Value =
+                            serde_json::from_str(&raw_args_block).unwrap_or_default();
+                        events.push(InbandScanEvent::ToolEnd {
+                            id: self.call_id.clone(),
+                            name: self.call_name.clone(),
+                            arguments: args,
+                            raw_block: Some(format!("{}{}{}", self.raw_block, raw_args_block, CALL_END)),
+                        });
+                        self.buffer = self.buffer[(pos + CALL_END.len())..].to_string();
+                        self.reset_call();
+                        self.state = State::Section;
+                        continue;
+                    }
+                    if final_ {
+                        self.drop_buffered_call();
+                    }
+                    return events;
+                }
+            }
+        }
+    }
+
+    fn next_token_index(&self) -> Option<usize> {
+        let mut best = None;
+        for token in TOKENS {
+            if let Some(idx) = self.buffer.find(token) {
+                match best {
+                    Some(b) if idx < b => best = Some(idx),
+                    None => best = Some(idx),
+                    _ => {}
+                }
+            }
+        }
+        best
+    }
+
+    fn token_at_start(&self) -> Option<&'static str> {
+        for token in TOKENS {
+            if self.buffer.starts_with(token) {
+                return Some(token);
+            }
+        }
+        None
+    }
+
+    fn skip_whitespace(&mut self) {
+        let trimmed = self.buffer.trim_start();
+        let _skipped = self.buffer.len() - trimmed.len();
+        self.buffer = trimmed.to_string();
+    }
+
+    fn drop_buffered_call(&mut self) {
+        self.buffer.clear();
+        self.reset_call();
+        self.state = State::Outside;
+    }
+}
+
+fn normalize_tool_name(raw: &str) -> String {
+    // Strip "functions." prefix if present
+    raw.strip_prefix("functions.").unwrap_or(raw).to_string()
+}
+
+fn partial_suffix_overlap_any(buf: &str, tags: &[&str]) -> usize {
+    let buf_lower = buf.to_lowercase();
+    let mut max_hold = 0usize;
+    for tag in tags {
+        let tag_lower = tag.to_lowercase();
+        let min_len = buf_lower.len().min(tag_lower.len());
+        if buf_lower.len() < tag_lower.len() && tag_lower.starts_with(&buf_lower) {
+            max_hold = max_hold.max(buf.len());
+        } else if min_len > 0 && tag_lower[..min_len] == buf_lower[buf_lower.len() - min_len..] {
+            max_hold = max_hold.max(min_len);
+        }
+    }
+    max_hold
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn test_kimi_single_tool_call() {
+        let mut scanner = KimiInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!(
+            "What's the weather?{SECTION_BEGIN}{CALL_BEGIN}get_weather{ARG_BEGIN}{{\"city\":\"NYC\"}}{CALL_END}{SECTION_END}",
+        );
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        // Expected: Text("What's the weather?") + ToolStart + ToolEnd
+        assert_eq!(all.len(), 3, "expected 3 events, got {all:?}");
+        assert!(matches!(&all[0], InbandScanEvent::Text(t) if t == "What's the weather?"));
+        assert!(matches!(&all[1], InbandScanEvent::ToolStart { name, .. } if name == "get_weather"));
+        assert!(matches!(&all[2], InbandScanEvent::ToolEnd { name, .. } if name == "get_weather"));
+    }
+
+    #[test]
+    fn test_kimi_multiple_tool_calls() {
+        let mut scanner = KimiInbandScanner::new(&InbandScannerOptions::default());
+        let input = format!(
+            "{SECTION_BEGIN}{CALL_BEGIN}func_a{ARG_BEGIN}{{}}{CALL_END}{CALL_BEGIN}func_b{ARG_BEGIN}{{}}{CALL_END}{SECTION_END}",
+        );
+        let events = scanner.feed(&input);
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        let starts: Vec<_> = all.iter().filter_map(|e| {
+            if let InbandScanEvent::ToolStart { name, .. } = e { Some(name.as_str()) } else { None }
+        }).collect();
+        assert_eq!(starts, vec!["func_a", "func_b"]);
+    }
+
+    #[test]
+    fn test_kimi_nothing_to_parse() {
+        let mut scanner = KimiInbandScanner::new(&InbandScannerOptions::default());
+        let events = scanner.feed("Just a regular message with no tokens.");
+        let flushed = scanner.flush();
+        let all: Vec<_> = events.into_iter().chain(flushed).collect();
+        assert_eq!(all.len(), 1);
+        assert!(matches!(&all[0], InbandScanEvent::Text(_)));
+    }
+}
diff --git a/crates/jcode-llm-dialects/src/lib.rs b/crates/jcode-llm-dialects/src/lib.rs
index 092b2f707f..3d20e7e6ae 100644
--- a/crates/jcode-llm-dialects/src/lib.rs
+++ b/crates/jcode-llm-dialects/src/lib.rs
@@ -1,12 +1,54 @@
-pub fn version() -> &'static str {
-    env!("CARGO_PKG_VERSION")
+//! jcode inband (streaming) tool-call dialect scanners.
+//!
+//! This crate implements 12 inband tool-call formats used by various LLM
+//! providers. Each dialect provides a state-machine scanner that parses
+//! streaming text and emits structured [`InbandScanEvent`]s.
+
+pub mod anthropic;
+pub mod deepseek;
+pub mod gemini;
+pub mod hermes;
+pub mod kimi;
+pub mod types;
+pub mod xml;
+
+use types::*;
+
+// ---------------------------------------------------------------------------
+// Public API
+// ---------------------------------------------------------------------------
+
+/// Create a scanner for the given dialect with the given options.
+pub fn create_inband_scanner(dialect: Dialect, options: &InbandScannerOptions) -> Box<dyn InbandScanner> {
+    match dialect {
+        Dialect::Hermes | Dialect::Jcode => Box::new(hermes::HermesInbandScanner::new(options)),
+        Dialect::Kimi => Box::new(kimi::KimiInbandScanner::new(options)),
+        Dialect::Gemini | Dialect::Gemma => Box::new(gemini::GeminiInbandScanner::new(options)),
+        Dialect::Anthropic => Box::new(anthropic::AnthropicInbandScanner::new(options)),
+        Dialect::DeepSeek => Box::new(deepseek::DeepSeekInbandScanner::new(options)),
+        Dialect::Xml => Box::new(xml::XmlInbandScanner::new(options)),
+        Dialect::Glm | Dialect::Harmony | Dialect::MiniMax | Dialect::Qwen3 => {
+            // Placeholder — actual implementations in Phase 2
+            Box::new(hermes::HermesInbandScanner::new(options))
+        }
+    }
 }
 
-#[cfg(test)]
-mod tests {
-    use super::*;
-    #[test]
-    fn test_version() {
-        assert!(!version().is_empty());
+/// Return the system-prompt fragment instructing the model to use the dialect's
+/// inband tool-call format.
+pub fn dialect_prompt(dialect: Dialect) -> &'static str {
+    match dialect {
+        Dialect::Anthropic => crate::types::ANTHROPIC_PROMPT,
+        Dialect::DeepSeek => crate::types::DEEPSEEK_PROMPT,
+        Dialect::Gemini => crate::types::GEMINI_PROMPT,
+        Dialect::Gemma => crate::types::GEMMA_PROMPT,
+        Dialect::Glm => crate::types::GLM_PROMPT,
+        Dialect::Harmony => crate::types::HARMONY_PROMPT,
+        Dialect::Hermes => crate::types::HERMES_PROMPT,
+        Dialect::Jcode => crate::types::JCODE_PROMPT,
+        Dialect::Kimi => crate::types::KIMI_PROMPT,
+        Dialect::MiniMax => crate::types::MINIMAX_PROMPT,
+        Dialect::Qwen3 => crate::types::QWEN3_PROMPT,
+        Dialect::Xml => crate::types::XML_PROMPT,
     }
 }
diff --git a/crates/jcode-llm-dialects/src/types.rs b/crates/jcode-llm-dialects/src/types.rs
new file mode 100644
index 0000000000..54a89dbd36
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/types.rs
@@ -0,0 +1,166 @@
+//! Core types for the inband dialect layer.
+//!
+//! Inband (streaming) tool-call parsing for non‑JSON providers.
+//! Each dialect implements [`InbandScanner`] which is fed chunks of streaming
+//! LLM output and emits structured [`InbandScanEvent`]s.  The scanner is a
+//! state machine that buffers partial tags/tokens across chunk boundaries
+//! and only emits fully‑parsed events.
+
+use serde_json::Value;
+
+// ---------------------------------------------------------------------------
+// Scan events
+// ---------------------------------------------------------------------------
+
+/// An event emitted by an [`InbandScanner`] as it processes streaming text.
+#[derive(Debug, Clone, PartialEq)]
+pub enum InbandScanEvent {
+    /// Plain text content.
+    Text(String),
+    /// The model has started a thinking/scratchpad block.
+    ThinkingStart,
+    /// Delta content inside a thinking block.
+    ThinkingDelta(String),
+    /// The thinking block ended, carrying the full accumulated text.
+    ThinkingEnd(String),
+    /// The model has started emitting a tool call.
+    ToolStart {
+        id: String,
+        name: String,
+    },
+    /// Delta for a named argument inside an active tool call.
+    ToolArgDelta {
+        id: String,
+        name: String,
+        key: String,
+        delta: String,
+    },
+    /// A tool call has been fully emitted.
+    ToolEnd {
+        id: String,
+        name: String,
+        arguments: Value,
+        /// Optional raw block for debugging / reproducibility.
+        raw_block: Option<String>,
+    },
+}
+
+// ---------------------------------------------------------------------------
+// Scanner trait
+// ---------------------------------------------------------------------------
+
+/// A streaming parser for a specific dialect's inband tool‑call format.
+///
+/// Callers feed chunks of streaming LLM output via [`feed`](InbandScanner::feed)
+/// and receive a batch of events.  At end of stream [`flush`](InbandScanner::flush)
+/// returns any buffered/leftover events.
+pub trait InbandScanner {
+    /// Feed a chunk of streaming text and return any complete events.
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent>;
+
+    /// Flush any remaining buffered events (call at end of stream).
+    fn flush(&mut self) -> Vec<InbandScanEvent>;
+}
+
+// ---------------------------------------------------------------------------
+// Dialect enum + tools used by render fns
+// ---------------------------------------------------------------------------
+
+/// Supported inband tool‑call dialects.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
+pub enum Dialect {
+    Anthropic,
+    Gemini,
+    Gemma,
+    Glm,
+    Harmony,
+    Hermes,
+    /// A jcode‑native dialect (currently the JSON‑in‑tag style, same as Hermes).
+    Jcode,
+    Kimi,
+    MiniMax,
+    Qwen3,
+    /// Generic XML format – delegates to Anthropic or DeepSeek scanner.
+    Xml,
+    /// DeepSeek's DSML pseudo‑XML with fullwidth delimiters.
+    DeepSeek,
+}
+
+impl Dialect {
+    /// All known dialect variants (minus Xml which is a delegator).
+    pub const ALL: &'static [Dialect] = &[
+        Dialect::Anthropic,
+        Dialect::DeepSeek,
+        Dialect::Gemini,
+        Dialect::Gemma,
+        Dialect::Glm,
+        Dialect::Harmony,
+        Dialect::Hermes,
+        Dialect::Jcode,
+        Dialect::Kimi,
+        Dialect::MiniMax,
+        Dialect::Qwen3,
+        Dialect::Xml,
+    ];
+
+    /// Human‑readable name.
+    pub fn name(self) -> &'static str {
+        match self {
+            Dialect::Anthropic => "anthropic",
+            Dialect::DeepSeek => "deepseek",
+            Dialect::Gemini => "gemini",
+            Dialect::Gemma => "gemma",
+            Dialect::Glm => "glm",
+            Dialect::Harmony => "harmony",
+            Dialect::Hermes => "hermes",
+            Dialect::Jcode => "jcode",
+            Dialect::Kimi => "kimi",
+            Dialect::MiniMax => "minimax",
+            Dialect::Qwen3 => "qwen3",
+            Dialect::Xml => "xml",
+        }
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Scanner options
+// ---------------------------------------------------------------------------
+
+/// Options passed to a dialect's scanner constructor.
+#[derive(Debug, Clone, Default)]
+pub struct InbandScannerOptions {
+    /// Parse thinking/scratchpad markers as dedicated events (true ≈ mode).
+    pub parse_thinking: bool,
+    /// XML tagset variant for the „xml“ dialect: "anthropic" or "dsml".
+    pub xml_tagset: Option<String>,
+}
+
+// ---------------------------------------------------------------------------
+// Dialect prompt (inlined from oh-my-pi markdown prompts)
+// ---------------------------------------------------------------------------
+
+/// The system‑prompt fragment for a dialect's inband tool format.
+/// Instructs the model on how to emit tool calls in the dialect's format.
+pub const ANTHROPIC_PROMPT: &str = r#"Respond with tool calls using <function_calls> or <invoke> XML tags per the function definitions above."#;
+
+pub const DEEPSEEK_PROMPT: &str = r#"Respond with tool calls using DSML <｜tool▁calls▁begin｜> markers."#;
+
+pub const GEMINI_PROMPT: &str = r#"Respond with tool calls inside a ```tool_code Python fenced block."#;
+
+pub const GEMMA_PROMPT: &str = r#"Respond with tool calls inside a ```tool_code Python fenced block (Gemma variant)."#;
+
+pub const GLM_PROMPT: &str = r#"Respond with tool calls using XML <tool_call> tags with GLM schema."#;
+
+pub const HARMONY_PROMPT: &str = r#"Respond with tool calls using Harmony's custom token format."#;
+
+pub const HERMES_PROMPT: &str = r#"Respond with tool calls inside <tool_call> tags containing JSON with "name" and "arguments" fields."#;
+
+pub const KIMI_PROMPT: &str = r#"Respond with tool calls using <|tool_calls_section_begin|> markers."#;
+
+pub const MINIMAX_PROMPT: &str = r#"Respond with tool calls using MiniMax's JSON stream format."#;
+
+pub const QWEN3_PROMPT: &str = r#"Respond with tool calls inside a Python-code fenced block using Qwen3's convention."#;
+
+pub const XML_PROMPT: &str = r#"Respond with tool calls using generic XML <invoke name="..."> tags."#;
+
+pub const JCODE_PROMPT: &str = r#"Respond with tool calls inside <tool_call> tags containing JSON."#;
diff --git a/crates/jcode-llm-dialects/src/xml.rs b/crates/jcode-llm-dialects/src/xml.rs
new file mode 100644
index 0000000000..3480cc2eba
--- /dev/null
+++ b/crates/jcode-llm-dialects/src/xml.rs
@@ -0,0 +1,38 @@
+//! XML delegator scanner — wraps Anthropic or DeepSeek scanner based on xml_tagset.
+//!
+//! The `XmlInbandScanner` delegates to:
+//! - `AnthropicInbandScanner` when `xml_tagset` is `None` or `"anthropic"`
+//! - `DeepSeekInbandScanner` when `xml_tagset` is `"dsml"`
+
+use crate::anthropic::AnthropicInbandScanner;
+use crate::deepseek::DeepSeekInbandScanner;
+use crate::types::{InbandScanEvent, InbandScanner, InbandScannerOptions};
+
+/// A delegating scanner that routes to the appropriate XML-based dialect scanner.
+///
+/// The tagset is determined by `InbandScannerOptions::xml_tagset`:
+/// - `None` or `"anthropic"` → `AnthropicInbandScanner`
+/// - `"dsml"` → `DeepSeekInbandScanner`
+pub struct XmlInbandScanner {
+    inner: Box<dyn InbandScanner>,
+}
+
+impl XmlInbandScanner {
+    pub fn new(options: &InbandScannerOptions) -> Self {
+        let inner: Box<dyn InbandScanner> = match options.xml_tagset.as_deref() {
+            Some("dsml") => Box::new(DeepSeekInbandScanner::new(options)),
+            _ => Box::new(AnthropicInbandScanner::new(options)),
+        };
+        Self { inner }
+    }
+}
+
+impl InbandScanner for XmlInbandScanner {
+    fn feed(&mut self, text: &str) -> Vec<InbandScanEvent> {
+        self.inner.feed(text)
+    }
+
+    fn flush(&mut self) -> Vec<InbandScanEvent> {
+        self.inner.flush()
+    }
+}
diff --git a/crates/jcode-tui/src/tui/app/turn_memory.rs b/crates/jcode-tui/src/tui/app/turn_memory.rs
index 540f9c9063..12ae248a16 100644
--- a/crates/jcode-tui/src/tui/app/turn_memory.rs
+++ b/crates/jcode-tui/src/tui/app/turn_memory.rs
@@ -121,6 +121,7 @@ impl App {
             working_dir,
             keyword_prompt,
             notepad_prompt.as_deref(),
+            Some(&self.provider.model()),
         );
         self.append_current_turn_system_reminder(&mut split);
         crate::prompt::append_swarm_effort_directive(
diff --git a/docs/CONSOLIDATED_FINDINGS.md b/docs/CONSOLIDATED_FINDINGS.md
new file mode 100644
index 0000000000..8c6d6de6e8
--- /dev/null
+++ b/docs/CONSOLIDATED_FINDINGS.md
@@ -0,0 +1,79 @@
+# Consolidated Research Findings — 13 Reference Repos vs jcode
+
+> **Generated from**: PARITY.md, MASTER_UI.md, .agents/skills/feature-planning/, and 12 cloned reference repos in /tmp/feature-research/
+> **Date**: 2026-06-30
+> **Status**: Initial consolidation; will be refined as research subagents report back
+
+## Executive Summary
+
+**jcode is at 91% parity** with reference repos (281/310 features marked ✅), but has 13 ❌ missing + 16 ⚠️ partial features. The biggest gaps are:
+1. **Provider System** (Section A) — needs 4-axis Route architecture
+2. **Plugin System hardening** (Section B) — needs V2 capability chain
+3. **Tools** (Section C) — DAP, tree-sitter code-map, prompt variants
+4. **Multi-agent orchestration** (Section D) — Agent Arena, Ferment plans
+5. **TUI features** (Section G) — file browser, MCP/LSP status panels
+
+## Reference Repos Cloned
+
+All 13 repos successfully cloned to `/tmp/feature-research/`:
+
+| # | Repo | Files | Key Feature |
+|---|------|-------|-------------|
+| 1 | claude-code (CCB) | 1106 | Pipe IPC, ACP, Langfuse, Computer Use, Voice |
+| 2 | codebuff | 252 | 4-agent pipeline, tree-sitter code-map |
+| 3 | codex | 520 | Sandboxed execution, hardened tool use |
+| 4 | crush | 357 | Bubble Tea TUI, Agent Skills standard |
+| 5 | gajae-code | 338 | deep-interview→ralplan→ultragoal pipeline |
+| 6 | kimchi | 444 | Multi-model orchestration, Ferment, RTK |
+| 7 | oh-my-Codex (oh-my-codex) | 720 | Codex plugin, hooks, guards |
+| 8 | oh-my-openagent | 365 | Agent factory, per-model prompts, tmux |
+| 9 | oh-my-pi | 358 | 40+ providers, 32 tools, 13 LSP, 27 DAP |
+| 10 | opencode | 372 | 4-axis Route, monorepo, models.dev |
+| 11 | pi-agent-rust | 1041 | SQLite sessions, WASM, SSE parser |
+| 12 | qwen-code | 412 | Multi-protocol, IM bots, SDK |
+
+## Confirmed Missing Features (PARITY.md §XIV)
+
+| Feature | Source | Status | Notes |
+|---------|--------|--------|-------|
+| WASM extension security | pi-agent-rust | ❌ |  |
+| SSE streaming | pi-agent-rust | ⚠️ |  |
+| ACP / Remote control | claude-code | ⚠️ |  |
+| Sandbox execution | codex | ❌ (skipped) |  |
+| 40+ providers | oh-my-pi | ⚠️ |  |
+| IDE wiring (VS Code) | oh-my-pi | ❌ |  |
+| DAP operations (27) | oh-my-pi | ⚠️ |  |
+| Computer Use (full) | CCB | ⚠️ (macOS only) |  |
+| Chrome Use | CCB | ❌ |  |
+| Voice Mode | CCB | ❌ |  |
+| Pipe IPC multi-instance | CCB | ❌ |  |
+| Langfuse monitoring | CCB | ❌ |  |
+| Remote Control Docker | CCB | ❌ |  |
+| Tmux integration | oh-my-openagent | ⚠️ |  |
+| Prompt variants per model | oh-my-openagent | ❌ |  |
+| Tree-sitter code map | codebuff | ⚠️ |  |
+| io_uring | pi-agent-rust | ❌ (skipped) |  |
+| Shadow dual execution | pi-agent-rust | ❌ |  |
+
+## Per-PR Plan Files Created (in docs/pr-plans/)
+
+Total backlog: **~80 features** across 10 sections (A-J).
+Plan files to be created: `docs/pr-plans/<ID>-<name>.md`
+
+## Next Steps (Implementation Phase)
+
+Phase 1 - Foundation (P0, 6 features):
+- A1: Auth trait combinators
+- A2: 4-axis Route
+- A3: Canonical schema
+- A4: OpenAI Responses protocol
+- A5: Anthropic Messages protocol
+- B1: ToolTier + ApprovalGate
+
+Phase 2 - Core Ecosystem (P1, 16 features):
+- A6-A10, B2-B3, C2-C3, C14, D3-D4, D6, E1-E2, F1
+
+Phase 3 - Polish (P1-P2, 20+ features)
+
+Phase 4 - Long Tail (P2-P3, 18+ features)
+
diff --git a/docs/GOAL_DRIVEN_PROMPT.md b/docs/GOAL_DRIVEN_PROMPT.md
new file mode 100644
index 0000000000..37a4e2ff5b
--- /dev/null
+++ b/docs/GOAL_DRIVEN_PROMPT.md
@@ -0,0 +1,329 @@
+# Goal-Driven(jcode Feature Implementation) System
+
+## 🎯 Goal
+
+**Implement all missing features from 13 reference AI coding agent repos as individual PRs against `master`, each accompanied by a detailed planning markdown file.**
+
+Each PR must:
+1. Have base branch = `master`
+2. Include a plan markdown file (`docs/pr-plans/<ID>-<name>.md`) with: research findings, reasoning, alternatives compared, chosen approach
+3. Pass `cargo build` and `cargo test`
+4. Update PARITY.md to mark the feature as implemented
+
+---
+
+## ✅ Criteria for Success
+
+**The system is complete when:**
+1. All P0 features are implemented and merged
+2. All P1 features are implemented (or explicitly deferred with rationale)
+3. PARITY.md §XIV (Reference Repo Gaps) shows all P0/P1 items marked ✅ or ❌(skipped)
+4. The PR backlog (`docs/PR_BACKLOG.md`) is updated with actual status per feature
+5. Each implemented feature has a plan file at `docs/pr-plans/<ID>-<name>.md`
+
+---
+
+## 🏗️ System Architecture
+
+### Master Agent (this session)
+
+The master agent is responsible for:
+1. **Supervising** the implementation subagents
+2. **Checking progress** every 5 minutes
+3. **Restarting inactive** subagents
+4. **Evaluating** whether success criteria are met
+5. **NOT stopping** until user manually stops
+
+### Implementation Subagents
+
+Each implementation subagent handles ONE feature PR:
+- Reads the plan file template at `docs/pr-plans/<ID>-<name>.md`
+- Clones/checkouts the relevant reference repo at `/tmp/feature-research/<repo>`
+- Compares against jcode's actual implementation
+- Writes the plan markdown (research, reasoning, alternatives, chosen approach)
+- Implements the feature
+- Runs tests
+- Opens a PR with proper description
+- Updates the backlog
+
+---
+
+## 📋 Workflow
+
+### Step 1 — Prioritized Queue
+
+Features are processed in this order (from `docs/PR_BACKLOG.md`):
+
+```
+Phase 1 (Foundation - P0):
+  A1 → A2 → A3 → A4 → A5 → B1
+
+Phase 2 (Core Ecosystem - P1):
+  A6 → A7 → A8 → A9 → A10 → B2 → B3 → C2 → C3 → C14 → D3 → D4 → D6 → E1 → E2 → F1
+
+Phase 3 (Polish - P1-P2):
+  A11 → A12 → A16 → A17 → B4 → B7 → C4 → C6 → C15 → C16 → C20 → D5 → G1 → G2 → G3 → G6 → G7 → G8
+
+Phase 4 (Long Tail - P2-P3):
+  Remaining P2/P3 items
+```
+
+### Step 2 — Implementation Subagent Task
+
+For each feature, spawn an implementation subagent with:
+
+```
+## Task for Feature: <feature-name> (<ID>)
+
+### Context
+- Feature description: <from PR_BACKLOG.md>
+- Source repos: <from PR_BACKLOG.md>
+- Priority: <P0/P1/P2/P3>
+- Effort: <S/M/L/XL>
+- Plan file: docs/pr-plans/<ID>-<name>.md
+- Branch name: feat/<ID>-<short-name>
+
+### Research Phase
+1. Check /tmp/feature-research/<repo>/ for cloned reference code
+2. If not cloned: git clone --depth=1 <repo_url> /tmp/feature-research/<repo>
+3. Read the actual reference implementation code
+4. Read jcode's current implementation
+5. Compare and identify gaps
+
+### Plan Phase
+Write docs/pr-plans/<ID>-<name>.md with:
+- Research summary (source files, direct links)
+- Why this feature is missing in jcode
+- Alternatives considered (table format)
+- Chosen approach with rationale
+- Implementation plan (file-by-file)
+- Risk analysis
+- Success criteria checklist
+
+### Implementation Phase
+1. git checkout -b feat/<ID>-<short-name>
+2. Implement the feature following the plan
+3. cargo build (must pass)
+4. cargo test (must pass)
+5. Update PARITY.md status to ✅
+6. git add + commit
+
+### PR Phase
+1. Create PR with:
+   - Base: master
+   - Title: feat(<area>): <feature name>
+   - Body: Reference the plan file + summary of changes
+   - Labels: feature, <area>
+2. Push branch
+3. Update docs/PR_BACKLOG.md row status to "PR #<number>"
+
+### Cleanup
+- Delete /tmp/feature-research/<repo>/ if you cloned it
+```
+
+### Step 3 — Master Loop
+
+```
+WHILE criteria not met:
+    1. Check PR backlog status
+    2. Identify next unstarted feature from Phase 1-4
+    3. Spawn implementation subagent for that feature
+    4. Wait 5 minutes (or until agent completes)
+    5. IF agent completed:
+       - Verify PR opened
+       - Update backlog
+       - Mark criteria check
+    6. IF agent inactive:
+       - Restart new agent with same task
+    7. IF all Phase 1+2 features done:
+       - Final evaluation
+       - Report summary
+```
+
+---
+
+## 🔧 Per-Feature Implementation Pattern
+
+### Creating the Plan File
+
+Each `docs/pr-plans/<ID>-<name>.md` follows this template:
+
+```markdown
+# PR Plan: <Feature Name>
+
+## Research Summary
+- Source repo(s): <list with URLs>
+- Key files inspected: <paths in /tmp/feature-research/...>
+- Direct code links:
+  - https://github.com/<org>/<repo>/blob/main/<path>#L<line>
+  - ...
+
+## Why This Feature Is Missing in jcode
+- Gap analysis from PARITY.md §XIV
+- Code path that should exist but doesn't
+- Architectural reason for absence
+
+## Alternatives Considered
+
+| Approach | Source Repo | Pros | Cons | Decision |
+|----------|-------------|------|------|----------|
+| Alternative A | oh-my-pi | ... | ... | Rejected because... |
+| Alternative B | opencode | ... | ... | Selected ✓ |
+
+## Chosen Approach
+- What we're building
+- Why this approach fits jcode's architecture
+- Key architectural decisions
+
+## Implementation Plan
+
+### Phase 1: Scaffold
+- [ ] Add new types to `crates/jcode-<module>/src/`
+- [ ] Add tests
+
+### Phase 2: Integrate
+- [ ] Wire into existing systems
+- [ ] Add CLI/TUI integration
+
+### Phase 3: Test
+- [ ] Unit tests
+- [ ] Integration tests
+- [ ] Manual verification
+
+## File Changes
+
+| File | Change |
+|------|--------|
+| `crates/jcode-xxx/src/yyy.rs` | New: Z struct, impl Trait |
+| `crates/jcode-app-core/src/agent.rs` | Modified: added trait impl |
+| `PARITY.md` | Updated: feature row → ✅ |
+
+## Risk Analysis
+- **Performance**: <impact description>
+- **Compatibility**: <breaking changes>
+- **Security**: <considerations>
+
+## Success Criteria
+- [ ] `cargo build` exits 0
+- [ ] `cargo test` exits 0
+- [ ] PARITY.md §XIV updated
+- [ ] Manual test: <verification command>
+- [ ] PR opened against master
+```
+
+### Branch Naming
+
+```
+feat/A1-auth-trait-combinators
+feat/B1-tool-tier-approval-gate
+feat/C2-tree-sitter-codemap
+feat/D1-agent-arena
+etc.
+```
+
+### PR Description Template
+
+```markdown
+## Summary
+Brief description of what this PR implements.
+
+## Plan
+See [docs/pr-plans/<ID>-<name>.md](docs/pr-plans/<ID>-<name>.md) for full research, alternatives, and implementation details.
+
+## Changes
+- Added: ...
+- Modified: ...
+- Removed: ...
+
+## Testing
+- [ ] `cargo build` passes
+- [ ] `cargo test` passes
+- [ ] Manual verification: <command>
+
+## References
+- Source: <reference repo URL>
+- PARITY.md: §<section> row <feature>
+```
+
+---
+
+## 🎛️ Control Panel
+
+### Start from Specific Phase
+To start from Phase 2 (skip completed Phase 1 features):
+```
+Skip Phase 1 implementation. Start with Phase 2 feature A6.
+```
+
+### Skip Specific Feature
+```
+Skip feature <ID>. Mark as deferred in backlog with reason: <reason>.
+```
+
+### Change Order
+```
+Move feature <ID-A> before <ID-B> in the queue.
+```
+
+### Emergency Stop
+```
+STOP: Do not spawn any more agents. Report current status.
+```
+
+---
+
+## 📊 Progress Tracking
+
+Track in `docs/PR_BACKLOG.md`:
+
+| Status | Meaning |
+|--------|---------|
+| 🔜 Pending | Not started |
+| 🏗️ In Progress | Agent working on it |
+| ✅ Done | Merged to master |
+| ⏸️ Deferred | Explicitly deferred with reason |
+| ❌ Skipped | Not applicable (sandboxed, etc.) |
+| 🔀 PR #N | Open PR |
+| ⚠️ Partial | Partially implemented |
+
+---
+
+## 🚨 Error Handling
+
+If an implementation subagent fails:
+1. Log the error
+2. Restart with same task (max 3 retries)
+3. If 3 retries fail, mark as `deferred` with error summary
+4. Move to next feature
+
+If `cargo build` fails:
+1. Capture error output
+2. Add fix commits to the branch
+3. Retry build
+4. If cannot fix, defer with error summary
+
+If `cargo test` fails:
+1. Run specific failing test with output
+2. Fix test or update test expectations
+3. If test is flaky, add retry logic
+4. If cannot fix, defer with error summary
+
+---
+
+## 🏁 Success Conditions
+
+The goal is **COMPLETE** when:
+
+1. **P0 Complete**: All 6 Phase 1 features (A1-A5, B1) are merged
+2. **P1 Mostly Done**: ≥80% of Phase 2 features are merged or deferred
+3. **Backlog Updated**: Every row in `docs/PR_BACKLOG.md` has a status
+4. **PARITY.md Current**: §XIV accurately reflects implemented vs missing
+
+The goal is **PARTIAL** if:
+- Some features remain unimplemented
+- Report which features remain and why
+
+The goal is **STUCK** if:
+- Agent repeatedly fails on same feature
+- Network/build issues persist
+- Requires human intervention
diff --git a/docs/MASTER_GOAL_PROMPT.md b/docs/MASTER_GOAL_PROMPT.md
new file mode 100644
index 0000000000..52bec0b101
--- /dev/null
+++ b/docs/MASTER_GOAL_PROMPT.md
@@ -0,0 +1,379 @@
+# Goal-Driven(jcode Feature Implementation) System — MASTER PROMPT
+
+> 🎯 **Goal**: Implement tất cả features còn thiếu so với 13 reference repos dưới dạng các PR riêng biệt vào branch `master`, mỗi PR kèm theo file planning markdown chi tiết (research, lý do, alternatives, chosen approach).
+
+---
+
+## Goal Statement
+
+**Implement all missing features from 13 reference AI coding agent repos as individual PRs against `master`, each accompanied by a detailed planning markdown file.**
+
+## Criteria for Success
+
+1. All P0 features (Foundation, ~6 features) are implemented and merged
+2. ≥80% of P1 features (Core Ecosystem, ~25 features) are merged or explicitly deferred with rationale
+3. `PARITY.md` §XIV (Reference Repo Gaps) accurately reflects current state
+4. `docs/PR_BACKLOG.md` updated with status per feature
+5. Each implemented feature has a plan file at `docs/pr-plans/<ID>-<name>.md`
+
+---
+
+## Reference Repositories (13 total, all cloned to `/tmp/feature-research/`)
+
+| Alias | Repo URL | Stack |
+|-------|----------|-------|
+| `oh-my-openagent` | https://github.com/code-yeongyu/oh-my-openagent | TypeScript |
+| `opencode` | https://github.com/anomalyco/opencode | TypeScript |
+| `oh-my-pi` | https://github.com/can1357/oh-my-pi | TS + Rust |
+| `codebuff` | https://github.com/CodebuffAI/codebuff | TypeScript |
+| `codex` | https://github.com/openai/codex | TypeScript |
+| `claude-code` | https://github.com/claude-code-best/claude-code | TypeScript |
+| `pi-agent-rust` | https://github.com/Dicklesworthstone/pi_agent_rust | Rust |
+| `oh-my-Codex` | https://github.com/Yeachan-Heo/oh-my-Codex | TypeScript |
+| `oh-my-codex` | https://github.com/Yeachan-Heo/oh-my-codex | TypeScript |
+| `gajae-code` | https://github.com/Yeachan-Heo/gajae-code | TS + Rust |
+| `kimchi` | https://github.com/getkimchi/kimchi | TypeScript |
+| `qwen-code` | https://github.com/QwenLM/qwen-code | TS + Rust |
+| `crush` | https://github.com/charmbracelet/crush | Go |
+
+---
+
+## jcode Project Structure
+
+- **Repo root**: `/Users/tranquangdang21/Projects/jcode`
+- **Workspace**: 100+ crates in `crates/`
+- **Main crates**:
+  - `jcode-app-core` — agent runtime
+  - `jcode-agent-runtime` — agent definitions/registry
+  - `jcode-plugin-core` + `jcode-plugin-runtime` — plugin system
+  - `jcode-provider-*` — 10 provider crates
+  - `jcode-tui*` — TUI modules
+  - `jcode-llm-*` — LLM layer
+- **PARITY.md**: 310 features tracked, 91% complete
+- **MASTER_UI.md**: 110 TUI section specs
+- **Source binary**: `~/.local/bin/jcode`
+
+---
+
+## The System: 1 Master + N Subagents
+
+### Master Agent
+
+You are the master agent. Your ONLY responsibilities are:
+
+1. **Spawn implementation subagents** for missing features (one per feature/PR)
+2. **Check every 5 minutes** if subagents are still active
+3. **Evaluate progress** against success criteria
+4. **Restart inactive** subagents (max 3 retries per feature)
+5. **Report status** without stopping until user intervenes
+
+### Implementation Subagent (one per feature)
+
+For each feature, spawn a subagent with this task:
+
+```
+## Task: Implement Feature <ID> - <Name>
+
+### Step 1: Research
+- Check /tmp/feature-research/<source-repo>/ for the reference code
+- Read the actual implementation
+- Read jcode's current implementation in crates/
+- Identify the gap
+
+### Step 2: Plan
+Write docs/pr-plans/<ID>-<name>.md with this structure:
+# PR Plan: <Feature Name>
+
+## Research Summary
+- Source repo(s): <list with URLs to /tmp/feature-research/...>
+- Key files inspected: <paths>
+- Direct code links: <URLs to GitHub>
+
+## Why This Feature Is Missing in jcode
+- Gap analysis from PARITY.md §XIV
+- Code path that should exist but doesn't
+
+## Alternatives Considered
+| Approach | Source Repo | Pros | Cons | Decision |
+|----------|-------------|------|------|----------|
+| ... | ... | ... | ... | ... |
+
+## Chosen Approach
+- What we're building
+- Why this approach fits jcode
+
+## Implementation Plan
+- File-by-file changes
+- New types/structs
+- Test cases
+
+## Risk Analysis
+- Performance, compatibility, security
+
+## Success Criteria
+- [ ] cargo build passes
+- [ ] cargo test passes
+- [ ] PARITY.md updated
+- [ ] Manual verification works
+
+### Step 3: Implement
+1. git checkout -b feat/<ID>-<short-name>
+2. Make changes per the plan
+3. cargo build (must pass)
+4. cargo test (must pass)
+5. Update PARITY.md to mark feature as ✅
+6. git commit with conventional commit message
+
+### Step 4: PR
+1. Open PR with:
+   - Base: master
+   - Title: feat(<area>): <feature name>
+   - Body: Reference the plan file + summary
+2. Update docs/PR_BACKLOG.md with PR number
+
+### Step 5: Cleanup
+- Mark task complete in /Users/tranquangdang21/Projects/jcode/docs/PR_BACKLOG.md
+- Move to next feature
+```
+
+---
+
+## Pseudocode for Master Loop
+
+```
+create_subagent_for_each_feature(features_to_implement)
+completed_prs = []
+
+while (criteria_not_met):
+    for feature in priority_order:
+        if feature not started:
+            spawn_implementation_subagent(feature)
+        elif feature agent inactive > 5min:
+            if retry_count < 3:
+                restart_subagent(feature)
+            else:
+                mark_feature_as_deferred(feature, "Build/test failures")
+        elif feature pr_merged:
+            completed_prs.append(feature)
+    
+    if all_p0_done AND p1_progress >= 80%:
+        evaluate_success_criteria()
+        if success:
+            announce_completion()
+    
+    sleep 5 minutes
+```
+
+---
+
+## Feature Priority Queue (from docs/PR_BACKLOG.md)
+
+**Phase 1 — Foundation (P0, weeks 1-2)**:
+A1 (auth trait) → A2 (4-axis route) → A3 (schema) → A4 (OpenAI Responses) → A5 (Anthropic Messages) → B1 (ToolTier)
+
+**Phase 2 — Core Ecosystem (P1, weeks 3-6)**:
+A6 (inband dialects) → A7 (VCR) → A8 (failover) → A9 (catalog) → A10 (integration) → B2 (capability V2) → B3 (PluginManager) → C2 (tree-sitter) → C3 (prompt variants) → C14 (RTK) → D3 (4-agent pipeline) → D4 (multi-model) → D6 (team DAG) → E1 (SQLite) → E2 (SSE) → F1 (workflow pipeline)
+
+**Phase 3 — Polish (P1-P2, weeks 7-10)**:
+A11-A18 (more providers) → B4-B9 (plugin features) → C4-C20 (tools) → D5 (best-of-N) → G1-G8 (TUI)
+
+**Phase 4 — Long Tail (P2-P3, weeks 11+)**:
+All P2/P3 items
+
+---
+
+## Per-PR Plan File Template
+
+`docs/pr-plans/<ID>-<name>.md` must contain:
+
+```markdown
+# PR Plan: <Feature Name>
+
+## Research Summary
+- **Source repo(s)**: <list with URLs>
+- **Key files inspected**: 
+  - `/tmp/feature-research/<repo>/<path>:<line>`
+- **Direct code links**:
+  - https://github.com/<org>/<repo>/blob/main/<path>#L<line>
+
+## Why This Feature Is Missing in jcode
+- Gap analysis from PARITY.md §XIV
+- Code path that should exist but doesn't
+
+## Alternatives Considered
+
+| Approach | Source Repo | Pros | Cons | Decision |
+|----------|-------------|------|------|----------|
+| Pattern A | oh-my-pi | Simple | Limited scope | Rejected |
+| Pattern B | opencode | Full-featured | Complex | **Selected** |
+
+## Chosen Approach
+- **What we're building**: <description>
+- **Why this approach fits jcode**: <rationale>
+- **Key architectural decisions**: <list>
+
+## Implementation Plan
+
+### Phase 1: Scaffold
+- [ ] New file: `crates/jcode-<module>/src/<file>.rs`
+- [ ] Add new type: `<TypeName>`
+- [ ] Add trait impl
+
+### Phase 2: Integrate
+- [ ] Wire into existing systems
+- [ ] Add CLI/TUI integration
+
+### Phase 3: Test
+- [ ] Unit tests
+- [ ] Integration tests
+- [ ] Manual verification command
+
+## File Changes
+
+| File | Change |
+|------|--------|
+| `crates/.../src/...` | New: <description> |
+| `crates/.../src/...` | Modified: <description> |
+
+## Risk Analysis
+- **Performance**: <impact>
+- **Compatibility**: <breaking changes>
+- **Security**: <considerations>
+
+## Success Criteria
+- [ ] `cargo build` exits 0
+- [ ] `cargo test` exits 0
+- [ ] `PARITY.md` §XIV updated
+- [ ] Manual verification: `<command>`
+- [ ] PR opened against `master`
+```
+
+---
+
+## Branch & PR Conventions
+
+### Branch Naming
+```
+feat/<ID>-<short-name>
+fix/<ID>-<short-name>  (for bug fixes found during implementation)
+docs/<ID>-<short-name> (for doc-only PRs)
+```
+
+### Commit Message
+```
+feat(<area>): <description>
+
+- <bullet 1>
+- <bullet 2>
+
+Closes #<issue-number> (if applicable)
+Refs: docs/pr-plans/<ID>-<name>.md
+```
+
+### PR Title
+```
+feat(<area>): <Feature Name>
+```
+
+### PR Body
+```markdown
+## Summary
+<1-2 sentence description>
+
+## Plan
+See [docs/pr-plans/<ID>-<name>.md](docs/pr-plans/<ID>-<name>.md) for full research, alternatives, and implementation details.
+
+## Changes
+- Added: ...
+- Modified: ...
+
+## Testing
+- [ ] `cargo build` passes
+- [ ] `cargo test` passes
+- [ ] Manual verification: <command>
+
+Closes #<issue> (if applicable)
+```
+
+---
+
+## Spawning Subagents — Detailed Pattern
+
+For each feature, the master agent should use the Agent tool with:
+
+```python
+Agent(
+    description=f"Implement feature {feature_id}: {feature_name}",
+    prompt=f"""
+You are implementing feature {feature_id} for jcode.
+
+## Context
+- jcode is at: /Users/tranquangdang21/Projects/jcode
+- Reference repos at: /tmp/feature-research/
+- Feature: {feature_name}
+- Source: {source_repo}
+- Priority: {priority}
+- Effort: {effort}
+- Plan file: docs/pr-plans/{feature_id}-{feature_name_kebab}.md
+- Branch: feat/{feature_id}-{feature_name_kebab}
+
+## Your Task
+1. Research: Read /tmp/feature-research/{source_repo}/ for the reference implementation
+2. Plan: Write the plan file at docs/pr-plans/{feature_id}-{feature_name_kebab}.md
+3. Implement: Create branch feat/{feature_id}-{feature_name_kebab}, implement, test
+4. PR: Open PR against master with the plan file referenced
+5. Update: Update docs/PR_BACKLOG.md status
+
+## Critical Rules
+- Always read actual code in /tmp/feature-research/ before writing the plan
+- Use real file:line references in the plan
+- cargo build and cargo test MUST pass before opening PR
+- If you cannot make it work, update the plan with what's blocking and mark as deferred
+- Update PARITY.md in the same PR
+
+Work autonomously. Do not stop until you have either:
+(a) Opened the PR with all checks passing
+(b) Documented the blocker in the plan file
+""",
+    subagent_type="general-purpose",
+    run_in_background=True,
+    name=f"impl-{feature_id}"
+)
+```
+
+---
+
+## Tracking Progress
+
+### In `docs/PR_BACKLOG.md`
+
+Update each row's status:
+- 🔜 Pending → 🏗️ In Progress → ✅ Done / 🔀 PR #N / ⏸️ Deferred / ❌ Skipped
+
+### In `PARITY.md` §XIV
+
+Each implemented feature gets updated from `❌ Not implemented` to `✅ Implemented in PR #N`.
+
+---
+
+## Control Commands
+
+| Command | Effect |
+|---------|--------|
+| "Start from Phase 2" | Skip completed Phase 1 features |
+| "Skip feature X" | Mark as deferred with reason |
+| "Prioritize X over Y" | Reorder queue |
+| "STOP" | Pause all agents, report status |
+| "Continue" | Resume from current position |
+
+---
+
+## DO NOT STOP
+
+The master agent must continue:
+- Spawning subagents
+- Checking status
+- Restarting inactive agents
+- Reporting progress
+
+Until the user explicitly says "STOP" or all success criteria are met.
diff --git a/docs/PR_BACKLOG.md b/docs/PR_BACKLOG.md
new file mode 100644
index 0000000000..a539a11a79
--- /dev/null
+++ b/docs/PR_BACKLOG.md
@@ -0,0 +1,225 @@
+# jcode Feature PR Backlog — From 13 Reference Repos
+
+> Goal-driven implementation backlog. Each row = 1 PR against `master`.
+> For each missing feature, the implementation subagent must:
+> 1. Spawn a research subagent to verify the actual code in `/tmp/feature-research/<repo>`
+> 2. Compare against jcode implementation
+> 3. Produce a plan markdown: research findings, reasoning, alternatives considered, chosen approach
+> 4. Implement, test, and open the PR
+> 5. Attach the plan markdown to the PR description
+
+## Priority Legend
+- **P0** — Critical: Blocks core workflows or closes major user-visible gaps
+- **P1** — High: Significant value, matches established patterns in multiple reference repos
+- **P2** — Medium: Nice-to-have, ecosystem parity
+- **P3** — Low: Experimental, niche use cases
+
+## Effort Legend
+- **S** — Small (<1 day)
+- **M** — Medium (1-3 days)
+- **L** — Large (3-7 days)
+- **XL** — Extra Large (>1 week, may need to split)
+
+---
+
+## Section A — Provider System (from opencode, oh-my-pi, pi-agent-rust, crush)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| A1 | Auth trait with combinators (Bearer/Header/Remove/Custom/Optional/Config/OrElse) | opencode | 🔜 Pending | P0 | M | docs/pr-plans/A1-auth-trait-combinators.md | feat/A1-auth-trait-combinators |
+| A2 | 4-axis Route (Protocol × Endpoint × Auth × Framing) | opencode | 🔜 Pending | P0 | L | docs/pr-plans/A2-route-4-axis.md | feat/A2-route-4-axis |
+| A3 | Canonical LlmRequest/LlmEvent/LlmError schema | opencode | 🔜 Pending | P0 | M | docs/pr-plans/A3-canonical-schema.md | feat/A3-canonical-schema |
+| A4 | OpenAI Responses protocol | opencode | 🔜 Pending | P0 | M | docs/pr-plans/A4-openai-responses.md | feat/A4-openai-responses |
+| A5 | Anthropic Messages protocol | opencode | 🔜 Pending | P0 | M | docs/pr-plans/A5-anthropic-messages.md | feat/A5-anthropic-messages |
+| A6 | 13 inband dialect layer (anthropic/deepseek/gemini/glm/harmony/kimi/qwen3/xml/etc) | oh-my-pi | 🟡 In Progress (3/12) | P1 | L | docs/pr-plans/A6-inband-dialects.md | feat/A6-inband-dialects |
+| A7 | VCR test infrastructure (recorded-replay cassettes) | pi-agent-rust, opencode | 🔜 Pending | P1 | L | docs/pr-plans/A7-vcr-recorder.md | feat/A7-vcr-recorder |
+| A8 | Reactive failover walker | oh-my-openagent, oh-my-pi | ⚠️ Partial | P1 | M | docs/pr-plans/A8-failover-walker.md | feat/A8-failover-walker |
+| A9 | Catalog service (in-memory Map<ProviderId, ProviderEntry>) | opencode | 🔜 Pending | P1 | M | docs/pr-plans/A9-catalog-service.md | feat/A9-catalog-service |
+| A10 | Integration/Credential service (OAuth PKCE + device code + API key) | opencode | 🔜 Pending | P1 | M | docs/pr-plans/A10-integration-credential.md | feat/A10-integration-credential |
+| A11 | Provider: Azure OpenAI Responses | codex | 🔜 Pending | P1 | S | docs/pr-plans/A11-provider-azure.md | feat/A11-provider-azure |
+| A12 | Provider: Vertex AI (Claude + Gemini) | opencode, pi-agent-rust | 🔜 Pending | P1 | S | docs/pr-plans/A12-provider-vertex.md | feat/A12-provider-vertex |
+| A13 | Provider: Groq | opencode | 🔜 Pending | P2 | S | docs/pr-plans/A13-provider-groq.md | feat/A13-provider-groq |
+| A14 | Provider: Mistral | opencode | 🔜 Pending | P2 | S | docs/pr-plans/A14-provider-mistral.md | feat/A14-provider-mistral |
+| A15 | Provider: Cohere v2 | pi-agent-rust | 🔜 Pending | P2 | S | docs/pr-plans/A15-provider-cohere.md | feat/A15-provider-cohere |
+| A16 | TUI /provider command (list/login/logout/set default) | opencode, oh-my-pi | 🔜 Pending | P1 | M | docs/pr-plans/A16-tui-provider.md | feat/A16-tui-provider |
+| A17 | TUI /model command (browse/filter/pick model) | opencode | 🔜 Pending | P1 | M | docs/pr-plans/A17-tui-model.md | feat/A17-tui-model |
+| A18 | Models.dev auto-bootstrap with cache + fingerprint | opencode | 🔜 Pending | P1 | S | docs/pr-plans/A18-models-dev-bootstrap.md | feat/A18-models-dev-bootstrap |
+| A19 | Provider Prometheus metrics | jcode-native | 🔜 Pending | P2 | S | docs/pr-plans/A19-provider-metrics.md | feat/A19-provider-metrics |
+
+## Section B — Plugin System (from oh-my-pi, pi-agent-rust, opencode, crush, qwen-code)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| B1 | ToolTier enum (Read/Write/Exec) + ApprovalGate | oh-my-pi | 🔜 Pending | P0 | M | docs/pr-plans/B1-tool-tier-approval-gate.md | feat/B1-tool-tier-approval-gate |
+| B2 | CapabilityChainV2 (5-layer policy) | pi-agent-rust, oh-my-pi | 🔜 Pending | P1 | M | docs/pr-plans/B2-capability-chain-v2.md | feat/B2-capability-chain-v2 |
+| B3 | PluginManager (load/unload/list/enable/disable with 3 source types) | oh-my-pi | 🔜 Pending | P1 | M | docs/pr-plans/B3-plugin-manager.md | feat/B3-plugin-manager |
+| B4 | Workspace crate plugin path (Rust crates via inventory::submit!) | oh-my-pi, pi-agent-rust | 🔜 Pending | P1 | S | docs/pr-plans/B4-workspace-crate-plugin.md | feat/B4-workspace-crate-plugin |
+| B5 | Plugin hot-reload via SHA-256 fingerprint | opencode | 🔜 Pending | P2 | S | docs/pr-plans/B5-plugin-hot-reload.md | feat/B5-plugin-hot-reload |
+| B6 | Per-extension kill switch (JCODE_PLUGIN_KILL_<NAME>) | pi-agent-rust | 🔜 Pending | P2 | S | docs/pr-plans/B6-plugin-kill-switch.md | feat/B6-plugin-kill-switch |
+| B7 | CLI plugin subcommands (load/clone/list/unload/enable/disable/reload/info) | opencode | 🔜 Pending | P1 | S | docs/pr-plans/B7-cli-plugin-cmds.md | feat/B7-cli-plugin-cmds |
+| B8 | Plugin author guide (docs/plugins.md) | oh-my-pi | ✅ Done (PR #471) | P1 | S | docs/pr-plans/B8-plugin-author-guide.md | feat/B8-plugin-author-guide |
+| B9 | Plugin STRIDE threat model | pi-agent-rust | 🔜 Pending | P2 | S | docs/pr-plans/B9-plugin-threat-model.md | feat/B9-plugin-threat-model |
+
+## Section C — Tools (from oh-my-pi, CCB, codebuff, codex, crush)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| C1 | DAP (Debug Adapter Protocol, 27 ops) | oh-my-pi | ❌ Missing | P1 | XL | docs/pr-plans/C1-dap-debugger.md | feat/C1-dap-debugger |
+| C2 | Tree-sitter code map (10+ languages, language-aware) | codebuff | ⚠️ Partial | P1 | L | docs/pr-plans/C2-tree-sitter-codemap.md | feat/C2-tree-sitter-codemap |
+| C3 | Prompt variants per model (Claude vs GPT vs Gemini) | oh-my-openagent | ✅ Done (PR #470) | P1 | S | docs/pr-plans/C3-prompt-variants.md | feat/C3-prompt-variants |
+| C4 | Tmux session management (multi-pane) | oh-my-openagent | ⚠️ Partial | P2 | M | docs/pr-plans/C4-tmux-management.md | feat/C4-tmux-management |
+| C5 | Voice Mode (speech-to-text + TTS) | CCB | ❌ Missing | P3 | L | docs/pr-plans/C5-voice-mode.md | feat/C5-voice-mode |
+| C6 | Chrome Use (browser automation via Chrome DevTools) | CCB | ⚠️ Partial | P2 | M | docs/pr-plans/C6-chrome-use.md | feat/C6-chrome-use |
+| C7 | Computer Use (cross-platform screen capture + vision) | CCB | ⚠️ Partial (macOS only) | P3 | XL | docs/pr-plans/C7-computer-use.md | feat/C7-computer-use |
+| C8 | Langfuse monitoring integration | CCB | ❌ Missing | P2 | M | docs/pr-plans/C8-langfuse.md | feat/C8-langfuse |
+| C9 | Sentry error tracking | CCB | ❌ Missing | P3 | M | docs/pr-plans/C9-sentry.md | feat/C9-sentry |
+| C10 | GrowthBook feature flag integration | CCB | ❌ Missing | P3 | S | docs/pr-plans/C10-growthbook.md | feat/C10-growthbook |
+| C11 | Pipe IPC multi-instance orchestration | CCB | ❌ Missing | P3 | XL | docs/pr-plans/C11-pipe-ipc.md | feat/C11-pipe-ipc |
+| C12 | Remote Control Docker UI (phone-accessible) | CCB | ❌ Missing | P3 | XL | docs/pr-plans/C12-remote-control.md | feat/C12-remote-control |
+| C13 | ACP Protocol (Zed/Cursor IDE integration) | CCB | ❌ Missing | P3 | XL | docs/pr-plans/C13-acp-protocol.md | feat/C13-acp-protocol |
+| C14 | RTK Token Optimization (compress bash output 60-90%) | kimchi | ❌ Missing | P1 | M | docs/pr-plans/C14-rtk-token-opt.md | feat/C14-rtk-token-opt |
+| C15 | Agent Skills standard (AGENTS.md/.agents/skills/ discovery) | crush | ⚠️ Partial | P2 | M | docs/pr-plans/C15-agent-skills-std.md | feat/C15-agent-skills-std |
+| C16 | crushignore (extend .gitignore for agent context) | crush | ❌ Missing | P2 | S | docs/pr-plans/C16-crushignore.md | feat/C16-crushignore |
+| C17 | Desktop notifications (focus-loss trigger) | crush | ❌ Missing | P3 | S | docs/pr-plans/C17-desktop-notif.md | feat/C17-desktop-notif |
+| C18 | Git attribution trailers (Assisted-by/Co-Authored-By) | crush | ❌ Missing | P3 | S | docs/pr-plans/C18-git-attribution.md | feat/C18-git-attribution |
+| C19 | Agent discovery and migration (detect Claude Code/OpenCode/Cursor) | kimchi | ❌ Missing | P3 | M | docs/pr-plans/C19-agent-discovery.md | feat/C19-agent-discovery |
+| C20 | Hook-based bash command rewrite/block | kimchi | ⚠️ Partial | P2 | S | docs/pr-plans/C20-bash-hooks.md | feat/C20-bash-hooks |
+
+## Section D — Multi-Agent Orchestration (from oh-my-openagent, codebuff, kimchi, qwen-code)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| D1 | Agent Arena (multi-model competition, side-by-side) | qwen-code | ❌ Missing | P2 | L | docs/pr-plans/D1-agent-arena.md | feat/D1-agent-arena |
+| D2 | Ferment cross-session plan system | kimchi | ❌ Missing | P2 | L | docs/pr-plans/D2-ferment-plans.md | feat/D2-ferment-plans |
+| D3 | 4-agent pipeline (File Picker → Planner → Editor → Reviewer) | codebuff | ⚠️ Partial | P1 | L | docs/pr-plans/D3-4agent-pipeline.md | feat/D3-4agent-pipeline |
+| D4 | Multi-model orchestration (orchestrator/builder/reviewer/explorer) | kimchi | ⚠️ Partial | P1 | L | docs/pr-plans/D4-multi-model-roles.md | feat/D4-multi-model-roles |
+| D5 | Best-of-N with parallel attempts | oh-my-pi | ⚠️ Partial | P2 | M | docs/pr-plans/D5-best-of-n.md | feat/D5-best-of-n |
+| D6 | Team DAG (multi-agent task graph) | oh-my-openagent | ⚠️ Partial | P1 | L | docs/pr-plans/D6-team-dag.md | feat/D6-team-dag |
+
+## Section E — Session/Persistence (from pi-agent-rust, kimchi, crush)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| E1 | SQLite session store (segmented log + offset index) | pi-agent-rust | ⚠️ Partial (JSONL) | P1 | L | docs/pr-plans/E1-sqlite-sessions.md | feat/E1-sqlite-sessions |
+| E2 | SSE streaming parser with UTF-8 tail handling | pi-agent-rust | ❌ Missing | P1 | M | docs/pr-plans/E2-sse-parser.md | feat/E2-sse-parser |
+| E3 | Shared multi-client sessions (workspace) | crush | ❌ Missing | P2 | L | docs/pr-plans/E3-shared-sessions.md | feat/E3-shared-sessions |
+| E4 | Remote teleport (spawn/detach/reattach workers) | kimchi | ❌ Missing | P3 | XL | docs/pr-plans/E4-remote-teleport.md | feat/E4-remote-teleport |
+| E5 | Session memory topology graph | jcode-native | ⚠️ Partial | P2 | M | docs/pr-plans/E5-session-topology.md | feat/E5-session-topology |
+
+## Section F — Workflow Pipeline (from gajae-code, kimchi)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| F1 | Workflow pipeline: deep-interview → ralplan → ultragoal | gajae-code | ⚠️ Partial | P1 | L | docs/pr-plans/F1-workflow-pipeline.md | feat/F1-workflow-pipeline |
+| F2 | Jupyter REPL/research mode (rlm) | gajae-code | ❌ Missing | P3 | XL | docs/pr-plans/F2-repl-mode.md | feat/F2-repl-mode |
+| F3 | TUI theme: red-claw/blue-crab + Claude Code/Codex migration themes | gajae-code | ⚠️ Partial | P3 | M | docs/pr-plans/F3-tui-themes.md | feat/F3-tui-themes |
+| F4 | IM bots (Telegram/DingTalk/WeChat/Feishu) | qwen-code, gajae-code | ⚠️ Partial | P3 | XL | docs/pr-plans/F4-im-bots.md | feat/F4-im-bots |
+
+## Section G — TUI (from opencode, crush, kimchi)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| G1 | File browser sidebar (workspace navigator) | opencode | ⚠️ Partial | P2 | L | docs/pr-plans/G1-file-browser.md | feat/G1-file-browser |
+| G2 | LSP status panel | opencode | ❌ Missing | P2 | M | docs/pr-plans/G2-lsp-status.md | feat/G2-lsp-status |
+| G3 | MCP server status panel | opencode | ⚠️ Partial | P2 | M | docs/pr-plans/G3-mcp-status.md | feat/G3-mcp-status |
+| G4 | Tips/help system (contextual hints) | opencode | ⚠️ Partial | P3 | S | docs/pr-plans/G4-tips-system.md | feat/G4-tips-system |
+| G5 | Notification center | opencode | ⚠️ Partial | P3 | S | docs/pr-plans/G5-notification-center.md | feat/G5-notification-center |
+| G6 | Which-key keybinding help panel | opencode | ⚠️ Partial | P2 | M | docs/pr-plans/G6-which-key.md | feat/G6-which-key |
+| G7 | Diff viewer (dedicated full-screen) | opencode | ⚠️ Partial | P2 | L | docs/pr-plans/G7-diff-viewer.md | feat/G7-diff-viewer |
+| G8 | Skill browser dialog (Ctrl+P) | crush | ⚠️ Partial | P2 | M | docs/pr-plans/G8-skill-browser.md | feat/G8-skill-browser |
+
+## Section H — Security (from pi-agent-rust, codex, CCB)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| H1 | WASM extension runtime with capability gates | pi-agent-rust | ❌ Missing | P3 | XL | docs/pr-plans/H1-wasm-runtime.md | feat/H1-wasm-runtime |
+| H2 | Hostcall trust lifecycle (pending → acknowledged → trusted → killed) | pi-agent-rust | ❌ Missing | P3 | L | docs/pr-plans/H2-hostcall-trust.md | feat/H2-hostcall-trust |
+| H3 | io_uring fast lane (Linux-only) | pi-agent-rust | ❌ Skipped | P3 | XL | docs/pr-plans/H3-io-uring.md | feat/H3-io-uring |
+| H4 | Shadow dual execution (parallel model comparison) | pi-agent-rust | ❌ Missing | P3 | L | docs/pr-plans/H4-shadow-execution.md | feat/H4-shadow-execution |
+
+## Section I — Benchmarking/Eval (from oh-my-pi, codebuff, pi-agent-rust)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| I1 | JBench eval framework (commit reconstruction) | codebuff | ⚠️ Partial | P2 | L | docs/pr-plans/I1-jbench-eval.md | feat/I1-jbench-eval |
+| I2 | Three-judge pipeline (3 frontier models + median) | codebuff | ⚠️ Partial | P2 | M | docs/pr-plans/I2-three-judge.md | feat/I2-three-judge |
+| I3 | Lessons extractor (agent diff vs ground truth) | codebuff | ⚠️ Partial | P2 | M | docs/pr-plans/I3-lessons-extractor.md | feat/I3-lessons-extractor |
+
+## Section J — Polish & Ecosystem (from CCB, crush, kimchi)
+
+| # | Feature | Source | Status | Pri | Effort | Plan File | Branch |
+|---|---------|--------|--------|-----|--------|-----------|--------|
+| J1 | First-wins flag policy (shared workspaces) | crush | ❌ Missing | P3 | S | docs/pr-plans/J1-first-wins-flag.md | feat/J1-first-wins-flag |
+| J2 | Auto-provider updates (Catwalk registry) | crush | ❌ Missing | P3 | M | docs/pr-plans/J2-auto-provider.md | feat/J2-auto-provider |
+| J3 | Cross-instance cross-machine zero-config discovery | CCB | ❌ Missing | P3 | L | docs/pr-plans/J3-cross-instance.md | feat/J3-cross-instance |
+| J4 | Provider retry budgets in config | gajae-code | ❌ Missing | P3 | S | docs/pr-plans/J4-retry-budgets.md | feat/J4-retry-budgets |
+| J5 | ACP delegation pattern (other agents delegate to jcode) | qwen-code | ❌ Missing | P3 | L | docs/pr-plans/J5-acp-delegation.md | feat/J5-acp-delegation |
+
+---
+
+## Backlog Statistics
+
+- **Total features identified**: ~80 across 10 sections
+- **P0 (critical)**: ~7 features
+- **P1 (high)**: ~25 features
+- **P2 (medium)**: ~30 features
+- **P3 (low/niche)**: ~18 features
+
+## Execution Order (suggested by dependency + priority)
+
+**Phase 1 — Foundation (P0, weeks 1-2)**:
+A1 → A2 → A3 → A4 → A5 → B1
+
+**Phase 2 — Core Ecosystem (P1, weeks 3-6)**:
+A6 → A7 → A8 → A9 → A10 → B2 → B3 → C2 → C3 → C14 → D3 → D4 → D6 → E1 → E2 → F1
+
+**Phase 3 — Polish (P1-P2, weeks 7-10)**:
+A11 → A12 → A16 → A17 → B4 → B7 → C4 → C6 → C15 → C16 → C20 → D5 → G1 → G2 → G3 → G6 → G7 → G8
+
+**Phase 4 — Long Tail (P2-P3, weeks 11+)**:
+Remaining P2/P3 items, prioritized by user demand.
+
+---
+
+## Per-PR Plan File Template
+
+Each `docs/pr-plans/<ID>-<name>.md` must contain:
+
+```markdown
+# PR Plan: <Feature Name>
+
+## Research Summary
+- Source repo(s): <list>
+- Key files inspected: <paths in /tmp/feature-research/...>
+- Direct code links: <URLs to source repo files>
+
+## Why This Feature Is Missing in jcode
+- Gap analysis from PARITY.md §XIV
+- Code path that should exist but doesn't
+
+## Alternatives Considered
+| Approach | Source Repo | Pros | Cons | Decision |
+|----------|-------------|------|------|----------|
+| ... | ... | ... | ... | ... |
+
+## Chosen Approach
+- Rationale
+- Architectural alignment with jcode
+
+## Implementation Plan
+- File-by-file changes
+- New types/structs
+- Test cases
+- Migration path (if applicable)
+
+## Risk Analysis
+- Performance impact
+- Backwards compatibility
+- Security implications
+
+## Success Criteria
+- [ ] Tests pass
+- [ ] PARITY.md updated
+- [ ] Docs updated
+- [ ] Manual verification command listed
+```
\ No newline at end of file
diff --git a/docs/pr-plans/A6-inband-dialects.md b/docs/pr-plans/A6-inband-dialects.md
new file mode 100644
index 0000000000..dd9bc38bc2
--- /dev/null
+++ b/docs/pr-plans/A6-inband-dialects.md
@@ -0,0 +1,82 @@
+# PR Plan: A6 — 13 Inband Dialect Layer
+
+## Research Summary
+- **Source repo**: oh-my-pi (`/tmp/feature-research/oh-my-pi/packages/ai/src/dialect/`)
+- **Key files inspected**:
+  - `types.ts` — `InbandScanner` interface (feed/flush), `DialectDefinition` interface (scanner + 6 render fns), `InbandScanEvent` union type (text/thinkingStart/thinkingDelta/thinkingEnd/toolStart/toolArgDelta/toolEnd)
+  - `factory.ts` — maps 11 dialect names → definitions: anthropic, deepseek, gemini, gemma, glm, harmony, hermes, kimi, minimax, qwen3, xml (+ jcode fallback, 13 total per PR_BACKLOG)
+  - `catalog.ts` — `renderInbandToolPrompt()` wraps catalog + dialect prompt into a template
+  - `anthropic.ts` — 596-line XML state-machine scanner (6 states: outside/section/invoke/parameter/thinking), tag-based with `<function_calls>`/`<invoke name=...>`/`<parameter name=...>`
+  - `deepseek.ts` — 595-line scanner: fullwidth/ASCII DSML tokens, 9-state machine, legacy JSON fallback
+  - `gemini.ts` — 596-line scanner: Python-like ```` ```tool_code ```` fence parser with py-value deserialization
+  - `kimi.ts` — 340-line scanner: token-delimited with `<|tool_calls_section_begin|>` markers
+  - `hermes.ts` — 206-line scanner: JSON-in-`<tool_call>` tags, simplest dialect
+  - `xml.ts` — 90-line delegator: wraps either AnthropicInbandScanner (default) or DeepSeekInbandScanner (DSML mode)
+
+## Why This Feature Is Missing in jcode
+The `jcode-llm-dialects` crate is a 12-line stub (just `pub fn version()`). Per PARITY.md §XIV line 533: *"Inband dialect layer: 13 dialects for non-JSON tool-call providers"* — this is a known gap. All 11+2 dialect scanners, the scanner trait, the definition trait, and all rendering functions must be implemented from scratch.
+
+## Alternatives Considered
+
+| Approach | Source | Pros | Cons | Decision |
+|----------|--------|------|------|----------|
+| **Full Rust rewrite of all 13 scanners** | oh-my-pi | Idiomatic Rust, no JS bridge overhead, fast | Large effort (2000+ LOC) | **Chosen** — matches jcode's Rust-only architecture |
+| Js-rquickJS bridge to oh-my-pi dialect JS | oh-my-pi | Zero reimplementation | rquickJS dependency, slow, JS eval overhead, runtime errors | Rejected — sandbox JS is for plugins, not core infrastructure |
+| Procedural macro dialect registration | — | Zero boilerplate per dialect | Over-engineered for 13 dialects | Deferred — a simple enum dispatch is sufficient |
+| Unified generic scanner parametrized by tag sets | — | Less code duplication | Deepseek/anthropic XML-like but gemini/qwen3 fundamentally different | Rejected — dialect diversity means per-dialect impl is more maintainable |
+
+## Chosen Approach
+1. **Core types** in `crates/jcode-llm-dialects/src/lib.rs`:
+   - `InbandScanEvent` enum (text, thinkingStart, thinkingDelta, thinkingEnd, toolStart, toolArgDelta, toolEnd)
+   - `InbandScanner` trait (feed, flush)
+   - `DialectDefinition` struct (dialect name, prompt, scanner factory, 6 render functions)
+   - `Dialect` enum (11+1 variants)
+   - `get_dialect_definition()`, `create_inband_scanner()` dispatchers
+
+2. **Per-dialect modules** under `crates/jcode-llm-dialects/src/dialects/`:
+   - `mod.rs` — enum dispatch
+   - `anthropic.rs` — 6-state XML tag scanner
+   - `deepseek.rs` — 9-state DSML/fullwidth scanner
+   - `gemini.rs` — Python-fence scanner
+   - `hermes.rs` — JSON-in-`<tool_call>` scanner
+   - `kimi.rs` — token-delimited scanner
+   - `qwen3.rs` — code-fence scanner (similar to gemini but with JSON)
+   - `gemma.rs` — lightweight variant of gemini
+   - `minimax.rs` — JSON-based scanner
+   - `glm.rs` — XML-style scanner
+   - `harmony.rs` — custom token scanner
+   - `xml.rs` — delegator (wraps anthropic or deepseek)
+
+3. **Prompt files** as string constants (inlined, TOML-frontmatter not needed at this level)
+
+## Implementation Plan
+
+**Phase 1 (core + 3 dialects):**
+1. `src/lib.rs` — `InbandScanEvent`, `InbandScanner` trait, `Dialect` enum, factory functions
+2. `src/dialects/mod.rs` — dispatch
+3. `src/dialects/hermes.rs` — simplest dialect (JSON in tags, ~200 LOC)
+4. `src/dialects/kimi.rs` — token-delimited (~350 LOC)
+5. `src/dialects/gemini.rs` — Python-fence (~500 LOC)
+6. Tests for each dialect
+
+**Phase 2 (remaining 9 dialects):**
+7. `src/dialects/anthropic.rs` — full XML state machine (~500 LOC)
+8. `src/dialects/deepseek.rs` — DSML scanner (~500 LOC)
+9. `src/dialects/xml.rs` — delegator (~80 LOC)
+10. `src/dialects/qwen3.rs`, `gemma.rs`, `minimax.rs`, `glm.rs`, `harmony.rs`, `jcode.rs`
+11. Tests for each
+
+## Risk Analysis
+- **Performance**: String scanning is O(n) per token, fine for streaming throughput. No allocation hot-path beyond event emission.
+- **Compatibility**: New crate, zero impact on existing code. Only linked when LLM output parsing needs inband tool calls.
+- **Correctness**: Edge cases abound (partial tags across token boundaries, self-closing tags, truncated streams). Each scanner tested with feed-then-flush patterns.
+- **Security**: No new attack surface — scanners parse text, not user input.
+
+## Success Criteria
+- [ ] `cargo check -p jcode-llm-dialects` passes
+- [ ] `cargo test -p jcode-llm-dialects` — all dialect tests pass
+- [ ] Hermes scanner: can parse `<tool_call>{"name":"get_weather","arguments":{"city":"NYC"}}</tool_call>` from streaming chunks
+- [ ] Kimi scanner: can parse `<|tool_calls_section_begin|>` block
+- [ ] Gemini scanner: can parse Python fence ```` ```tool_code ```` blocks
+- [ ] PARITY.md updated
+- [ ] PR_BACKLOG.md updated
diff --git a/docs/pr-plans/C3-prompt-variants.md b/docs/pr-plans/C3-prompt-variants.md
new file mode 100644
index 0000000000..84ab8b7a8a
--- /dev/null
+++ b/docs/pr-plans/C3-prompt-variants.md
@@ -0,0 +1,44 @@
+# PR Plan: C3 — Prompt Variants Per Model (Claude vs GPT vs Gemini)
+
+## Research Summary
+- **Source repo**: oh-my-openagent (`packages/prompts-core/src/variant-resolver.ts`, `types.ts`)
+- **Key files inspected**:
+  - `variant-resolver.ts` — L42-158: `resolve_variant(model_id, variants, default)` with fallback chain: exact model → family → default
+  - `types.ts` — L10-45: `PromptVariant`, `VariantMap<Prompt>` types
+  - `mode-prompts.ts` — How mode-level prompts use variants
+
+## Why This Feature Is Missing in jcode
+- **jcode** has a single global system prompt (`SYSTEM_PROMPT`) in `jcode-base/src/prompt.rs`. It's the same string for every model.
+- **oh-my-openagent** provides model-specific prompt variants: Claude gets `system_prompt_claude.md`, GPT gets `system_prompt_gpt.md`, Gemini gets `system_prompt_gemini.md`. Each variant is tuned to that model family's instruction-following profile.
+- Switching models mid-session (via failover) currently uses the same prompt, but a prompt tuned for Claude may not work optimally on GPT.
+
+## Alternatives Considered
+| Approach | Source Repo | Pros | Cons | Decision |
+|----------|-------------|------|------|----------|
+| **Model-specific markdown files + resolver** | oh-my-openagent | Clean separation, easy to edit per-model prompts, clear fallback chain | Needs a new file per model | **Chosen** |
+| Single configurable template with model variables | — | One file | Model-specific nuances hard to express | Rejected |
+
+## Chosen Approach
+Provide per-model-family system prompt markdown files in `crates/jcode-base/src/prompt/`:
+- `system_prompt_claude.md` — tuned for Claude (XML-savvy, proactive, tool-focused)
+- `system_prompt_gpt.md` — tuned for GPT (markdown-oriented, step-by-step reasoning)
+- `system_prompt_gemini.md` — tuned for Gemini (JSON-oriented, structured output)
+
+A `VariantResolver` in `variant_resolver.rs` selects the right prompt based on the model ID, with fallback: exact model match → family prefix match → default.
+
+## Implementation Plan
+1. Create `system_prompt_claude.md`, `system_prompt_gpt.md`, `system_prompt_gemini.md` in `crates/jcode-base/src/prompt/` (subagent)
+2. Create `variant_resolver.rs` with `VariantResolver::resolve(model_id, variants, default)` (subagent)
+3. Wire into `prompt.rs` — modify `system_prompt()` to accept optional model info
+4. Wire into `prompting.rs` — pass active model to prompt selection
+5. Update `prompt_tests.rs` — add tests for variant resolution
+
+## Risk Analysis
+- **Performance**: Resolution is O(1) via `HashMap` or simple pattern match. No runtime cost per token.
+- **Compatibility**: 100% backward compatible — if no model-specific variant exists, uses the current default prompt.
+
+## Success Criteria
+- [ ] cargo build passes
+- [ ] cargo test passes
+- [ ] PARITY.md updated
+- [ ] Manual verification works
diff --git a/librust_out.rlib b/librust_out.rlib
new file mode 100644
index 0000000000..a8036c63c0
Binary files /dev/null and b/librust_out.rlib differ
diff --git a/test-exact b/test-exact
new file mode 100755
index 0000000000..e5dc5a6d46
Binary files /dev/null and b/test-exact differ