diff --git a/CHANGES b/CHANGES index a088e56..ec19d49 100644 --- a/CHANGES +++ b/CHANGES @@ -6,6 +6,18 @@ _Notes on upcoming releases will be added here_ +### What's new + +**Ordered raw input with {tooliconl}`send-keys-batch`** + +{tooliconl}`send-keys-batch` sends several raw-input operations in order and returns per-operation success or failure metadata. Choose stop-at-first-failure or continue-and-report handling, and set an optional timeout to bound a long batch and each underlying send. It is intentionally scoped to keystrokes and text input for TUIs or persistent shells; authored command completion stays with {tooliconl}`run-command`, and repeated observation stays with {tooliconl}`capture-since`. (#49, #61) + +### Fixes + +**Argument-validation failures no longer echo rejected input** + +When a tool call fails argument-schema validation, the error result and the server's invalid-argument log record now omit the rejected input values, so a secret-bearing argument can no longer surface in logs or error text. (#78) + ## libtmux-mcp 0.1.0a12 (2026-06-13) libtmux-mcp 0.1.0a12 hardens the MCP server's read-only and safety surface and adds a one-call `run_command` tool. Read-only tools can no longer trigger tmux format-job shell evaluation, an invalid safety tier fails closed instead of exposing write tools, and large successful results keep their structured payload. Panes and windows also gain liveness and active-pane metadata, and the package ships a `py.typed` marker. The fastmcp floor rises to 3.4.2 to pick up its explicit `starlette>=1.0.1` floor (CVE-2026-48710). 
diff --git a/README.md b/README.md index ab7611d..27435cc 100644 --- a/README.md +++ b/README.md @@ -18,7 +18,7 @@ Give your AI agent hands inside the terminal — create sessions, run commands, | **Server** | `list_servers`, `list_sessions`, `create_session`, `kill_server`, `get_server_info` | | **Session** | `list_windows`, `get_session_info`, `create_window`, `rename_session`, `select_window`, `kill_session` | | **Window** | `list_panes`, `get_window_info`, `split_window`, `rename_window`, `select_layout`, `resize_window`, `move_window`, `kill_window` | -| **Pane** | `run_command`, `send_keys`, `paste_text`, `capture_pane`, `capture_since`, `snapshot_pane`, `search_panes`, `find_pane_by_position`, `get_pane_info`, `wait_for_text`, `wait_for_content_change`, `wait_for_channel`, `signal_channel`, `display_message`, `select_pane`, `swap_pane`, `resize_pane`, `set_pane_title`, `clear_pane`, `pipe_pane`, `enter_copy_mode`, `exit_copy_mode`, `respawn_pane`, `kill_pane` | +| **Pane** | `run_command`, `send_keys`, `send_keys_batch`, `paste_text`, `capture_pane`, `capture_since`, `snapshot_pane`, `search_panes`, `find_pane_by_position`, `get_pane_info`, `wait_for_text`, `wait_for_content_change`, `wait_for_channel`, `signal_channel`, `display_message`, `select_pane`, `swap_pane`, `resize_pane`, `set_pane_title`, `clear_pane`, `pipe_pane`, `enter_copy_mode`, `exit_copy_mode`, `respawn_pane`, `kill_pane` | | **Options** | `show_option`, `set_option` | | **Environment** | `show_environment`, `set_environment` | | **Buffers** | `load_buffer`, `paste_buffer`, `show_buffer`, `delete_buffer` | @@ -88,13 +88,26 @@ terminal it is using — pytest finishing, a dev server printing its port, a deploy log settling. The difference then is not more access to tmux, but a better place to put the control loop. 
-The server-side moves are three: +The server-side moves are: + +**Running.** [`run_command`](https://libtmux-mcp.git-pull.com/tools/pane/run-command/) +sends an authored shell command, waits for deterministic completion, +and returns exit status plus tail-preserved output as one typed value. +The alternative is teaching every agent to compose `send-keys`, +`wait-for`, and a pane capture correctly. + +**Driving.** [`send_keys_batch`](https://libtmux-mcp.git-pull.com/tools/pane/send-keys-batch/) +sends several ordered raw-input operations for TUIs and persistent +shell interaction. It is deliberately not a workflow DSL; command +completion stays in `run_command`, and repeated observation stays in +`capture_since`. **Waiting.** [`wait_for_text`](https://libtmux-mcp.git-pull.com/tools/pane/wait-for-text/) and [`wait_for_content_change`](https://libtmux-mcp.git-pull.com/tools/pane/wait-for-content-change/) -block inside the server until the condition fires. The alternative is -the model polling `capture-pane` in a loop, paying both context tokens -and round-trip latency for every turn. +block inside the server until a condition fires for output the agent +does not author. The alternative is the model polling `capture-pane` +in a loop, paying both context tokens and round-trip latency for every +turn. 
**Reading.** [`snapshot_pane`](https://libtmux-mcp.git-pull.com/tools/pane/snapshot-pane/) returns content, cursor, copy-mode state, and scroll offset as one diff --git a/docs/conf.py b/docs/conf.py index 9069957..e0fd053 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -155,6 +155,10 @@ def _patched_tool_collector_tool(self: ToolCollector, **kwargs: t.Any) -> t.Any: "EnvironmentSetResult", "WaitForTextResult", "ContentChangeResult", + "RunCommandResult", + "SendKeysOperation", + "SendKeysOperationResult", + "SendKeysBatchResult", "HookEntry", "HookListResult", "BufferRef", diff --git a/docs/demo.md b/docs/demo.md index 3fd4239..6d4eeb6 100644 --- a/docs/demo.md +++ b/docs/demo.md @@ -66,11 +66,11 @@ These are the actual tool headings as they render on tool pages: ### In prose -Use {tooliconl}`search-panes` to find text across all panes. If you know which pane, use {tooliconl}`capture-pane` for one read or {tooliconl}`capture-since` for repeated observation. After running a command with {tooliconl}`send-keys`, compose `tmux wait-for -S` and call {tooliconl}`wait-for-channel` before capturing. +Use {tooliconl}`search-panes` to find text across all panes. If you know which pane, use {tooliconl}`capture-pane` for one read or {tooliconl}`capture-since` for repeated observation. For authored shell commands, use {tooliconl}`run-command` instead of manually sending, waiting, and capturing. ### Dense inline (toolref, no badges) -The fundamental pattern: {toolref}`send-keys` → {toolref}`wait-for-channel` → {toolref}`capture-pane`. For discovery: {toolref}`list-sessions` → {toolref}`list-panes` → {toolref}`get-pane-info`. +The fundamental command pattern: {toolref}`run-command` → inspect `exit_status` and `output`. For discovery: {toolref}`list-sessions` → {toolref}`list-panes` → {toolref}`get-pane-info`. 
## Environment variable references @@ -87,7 +87,7 @@ Use {tooliconl}`search-panes` before {tooliconl}`capture-pane` when you don't kn ``` ```{warning} -Do not call {toolref}`capture-pane` immediately after {toolref}`send-keys` — there is a race condition. Compose `tmux wait-for -S` into the command and use {toolref}`wait-for-channel` between them. +Do not call {toolref}`capture-pane` immediately after {toolref}`send-keys` — there is a race condition. Use {toolref}`run-command` for authored commands, or {toolref}`capture-since` when input and later observation are intentionally separate. ``` ```{note} diff --git a/docs/prompts.md b/docs/prompts.md index f8c1658..f6c4747 100644 --- a/docs/prompts.md +++ b/docs/prompts.md @@ -16,8 +16,8 @@ counterpart to the longer narrative recipes in {doc}`/recipes`. :::{grid-item-card} `run_and_wait` :link: fastmcp-prompt-run-and-wait :link-type: ref -Execute a shell command and block until it finishes. Use -{tooliconl}`run-command` when exit status matters. +Execute a shell command through {tooliconl}`run-command` and inspect +its typed result. ::: :::{grid-item-card} `diagnose_failing_pane` @@ -53,15 +53,14 @@ tools instead. ```{fastmcp-prompt} run_and_wait ``` -**Use when** the agent needs to execute a single shell command and -wait for completion through an explicit tmux signal. +**Use when** the agent needs to execute a single shell command, wait +for completion, and inspect exit status plus output. **Why use this instead of `send_keys` + `capture_pane` polling?** -Each rendered call embeds a UUID-scoped ``tmux wait-for`` channel, -so concurrent agents (or parallel prompt calls from one agent) can -never cross-signal each other. The server side blocks until the -channel is signalled — strictly cheaper in agent turns than a -``capture_pane`` retry loop. +{tooliconl}`run-command` sends the command, waits through a private +tmux signal, captures tail-preserved output, and returns exit status +in one typed result. 
That removes the manual channel plumbing from +the common authored-command workflow. ```{fastmcp-prompt-input} run_and_wait ``` @@ -69,31 +68,32 @@ channel is signalled — strictly cheaper in agent turns than a **Sample render** (``command="pytest"``, ``pane_id="%1"``): ````markdown -Run this shell command in tmux pane %1 and block -until it finishes: +Run this shell command in tmux pane %1, wait until it +finishes, and inspect the typed result: ```python -send_keys( +result = run_command( pane_id='%1', - keys='pytest; tmux wait-for -S libtmux_mcp_wait_', + command='pytest', + timeout=60.0, + max_lines=100, ) -wait_for_channel(channel='libtmux_mcp_wait_', timeout=60.0) -capture_pane(pane_id='%1', max_lines=100) ``` -After the channel signals, read the last ~100 lines to verify the -command's behaviour. Do NOT use a `capture_pane` retry loop — -`wait_for_channel` is strictly cheaper in agent turns. +Use `result.exit_status`, `result.timed_out`, and `result.output` +to decide what happened. Do NOT use a `send_keys` + `capture_pane` +retry loop for authored commands — `run_command` already performs +deterministic completion and returns tail-preserved output. -The payload does not preserve the command's exit status. Use -{tooliconl}`run-command` instead when exit status must be returned as -structured data. +If the task needs persistent shell state or TUI keystrokes instead of +a one-shot shell command, use `send_keys` or `send_keys_batch`, then +observe later output with `capture_since`. ```` -Shell ``;`` semantics fire the ``wait-for -S`` whether ``pytest`` -succeeded or failed, so the edge-triggered signal never deadlocks the -agent on a crashed command. Status preservation is intentionally -omitted from this prompt recipe. +For custom shell composition that falls outside {tooliconl}`run-command`, +compose ``tmux wait-for -S `` yourself and call +{tooliconl}`wait-for-channel`. Keep that as the low-level escape hatch, +not the default command-running recipe. 
--- diff --git a/docs/quickstart.md b/docs/quickstart.md index d555b5a..fa46150 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -49,13 +49,18 @@ Search all my panes for the word "error". ## How it works -When you say "run `make test` and show me the output", the agent executes a three-step pattern: - -1. {tool}`send-keys` — send the command (composed with `tmux wait-for -S `) to a tmux pane -2. {tool}`wait-for-channel` — block deterministically until the command signals completion -3. {tool}`capture-pane` — read the terminal output - -This **send → wait → capture** sequence is the fundamental workflow. For commands the agent authors, the channel pattern is deterministic; for output the agent does not author (third-party log lines, daemon prompts, interactive supervisors), substitute {tool}`wait-for-text` for step 2. +When you say "run `make test` and show me the output", the agent follows a typed command pattern: + +1. {tool}`run-command` — send the authored shell command, wait for completion, and return exit status plus output +2. Inspect the typed result's `exit_status`, `timed_out`, and `output` fields + +This **run → inspect** sequence is the default workflow for commands +the agent authors. For custom shell composition outside +{tool}`run-command`, the lower-level escape hatch is +{tool}`send-keys` with `tmux wait-for -S ` composed into the +payload, followed by {tool}`wait-for-channel`. For output the agent +does not author (third-party log lines, daemon prompts, interactive +supervisors), use {tool}`wait-for-text` or {tool}`wait-for-content-change`. 
When you need to keep checking the same pane after that first read, switch to {tool}`capture-since`: the first call returns a cursor, and follow-up calls diff --git a/docs/recipes.md b/docs/recipes.md index 987b13a..e0d861d 100644 --- a/docs/recipes.md +++ b/docs/recipes.md @@ -196,19 +196,20 @@ create a new pane, then calls {tooliconl}`send-keys` in that pane: The agent calls {tooliconl}`wait-for-text` on the server pane with `pattern: "Listening on"` and `timeout: 30`. Once the wait resolves, the -agent calls {tooliconl}`send-keys` in the original pane: -`npm test -- --integration`, then {tooliconl}`wait-for-text` with -`pattern: "passed|failed|error"` and `regex: true`, then -{tooliconl}`capture-pane` to read the test results. +agent calls {tooliconl}`run-command` in the original pane with +`command: "npm test -- --integration"` and a test-appropriate +timeout, then reads `exit_status`, `timed_out`, and `output`. ```{warning} Calling {toolref}`capture-pane` immediately after {toolref}`send-keys` is a race condition. {toolref}`send-keys` returns the moment tmux accepts the keystrokes, not when the command finishes. For commands the agent authors, -compose `tmux wait-for -S ` into the command and call -{toolref}`wait-for-channel` — deterministic, race-free. For output the -agent does not author (server-startup banners, test-result lines like -the ones above), use {toolref}`wait-for-text` instead. +use {toolref}`run-command` — deterministic, typed, and race-free. For +custom shell composition outside that shape, compose +`tmux wait-for -S ` into the command and call +{toolref}`wait-for-channel`. For output the agent does not author +(server-startup banners, daemon prompts), use {toolref}`wait-for-text` +instead. ``` ### The non-obvious part @@ -393,21 +394,18 @@ long-lived process, I would not hijack it -- I would use a different pane. 
### Act -The agent calls {tooliconl}`clear-pane`, then {tooliconl}`send-keys` with -`keys: "pytest; tmux wait-for -S pytest_done"`, then -{tooliconl}`wait-for-channel` with `channel: "pytest_done"`, then -{tooliconl}`capture-pane` to read the fresh output. Composing the -`tmux wait-for -S` signal directly into the shell command is the -deterministic path for authored commands. +The agent calls {tooliconl}`clear-pane`, then {tooliconl}`run-command` +with `command: "pytest"` and a test-appropriate timeout. The result +contains exit status and fresh tail-preserved output without a manual +send-wait-capture sequence. ### The non-obvious part {toolref}`clear-pane` runs two tmux commands internally (`send-keys -R` then `clear-history`) with a brief gap between them. Calling {toolref}`capture-pane` immediately after {toolref}`clear-pane` may catch -partial state. The {toolref}`wait-for-text` call after {toolref}`send-keys` -naturally provides the needed delay, so the sequence clear-send-wait-capture -is safe. +partial state. The {toolref}`run-command` call naturally provides the +needed command-completion boundary, so the sequence clear-run-inspect is safe. --- diff --git a/docs/tools/index.md b/docs/tools/index.md index 07e7b02..141d4d8 100644 --- a/docs/tools/index.md +++ b/docs/tools/index.md @@ -22,8 +22,9 @@ All tools accept an optional `socket_name` parameter for multi-server support. 
I **Running a command?** - {tool}`run-command` — one call to run a shell command, wait for completion, capture output, and return exit status -- {tool}`send-keys` (with `tmux wait-for -S ` composed into the keys) → {tool}`wait-for-channel` → {tool}`capture-pane` — the deterministic path for commands the agent authors -- For output the agent does not author (third-party logs, daemon prompts), use {tool}`wait-for-text` or {tool}`wait-for-content-change` between `send-keys` and `capture-pane` +- {tool}`send-keys` / {tool}`send-keys-batch` — raw interactive input for TUIs, control keys, and persistent shell state +- {tool}`wait-for-channel` — low-level custom completion when `run-command` does not fit the shell composition +- For output the agent does not author (third-party logs, daemon prompts), use {tool}`wait-for-text`, {tool}`wait-for-content-change`, or {tool}`capture-since` - Pasting multi-line text? → {tool}`paste-text` **Creating workspace structure?** @@ -221,7 +222,13 @@ Split a window into panes. :::{grid-item-card} send_keys :link: send-keys :link-type: ref -Send commands or keystrokes to a pane. +Send raw keystrokes to a pane. +::: + +:::{grid-item-card} send_keys_batch +:link: send-keys-batch +:link-type: ref +Send several ordered raw-input operations. ::: :::{grid-item-card} run_command diff --git a/docs/tools/pane/capture-since.md b/docs/tools/pane/capture-since.md index 018b43a..0530b5c 100644 --- a/docs/tools/pane/capture-since.md +++ b/docs/tools/pane/capture-since.md @@ -9,11 +9,11 @@ without paying to re-read the same scrollback every turn. The first call returns the current visible screen plus a cursor; later calls pass that cursor back and receive only rows written or rewritten after it. -**Avoid when** you control the command and only need completion — compose -`tmux wait-for -S ` into the command and call -{tooliconl}`wait-for-channel`. 
If you need a one-shot content + metadata view, -use {tooliconl}`snapshot-pane`; if you do not know which pane contains text, -use {tooliconl}`search-panes`. +**Avoid when** you control the command and only need completion — use +{tooliconl}`run-command`, which waits and returns exit status plus +output in one typed result. If you need a one-shot content + metadata +view, use {tooliconl}`snapshot-pane`; if you do not know which pane +contains text, use {tooliconl}`search-panes`. **Side effects:** None. Readonly. diff --git a/docs/tools/pane/index.md b/docs/tools/pane/index.md index eb4a929..c1f6c25 100644 --- a/docs/tools/pane/index.md +++ b/docs/tools/pane/index.md @@ -37,6 +37,10 @@ Evaluate a tmux format string against a target. Send keystrokes or commands to a pane. ::: +:::{grid-item-card} {tooliconl}`send-keys-batch` +Send several ordered raw-input operations. +::: + :::{grid-item-card} {tooliconl}`run-command` Run a shell command, wait, and capture output. ::: @@ -115,6 +119,7 @@ get-pane-info find-pane-by-position display-message send-keys +send-keys-batch run-command paste-text pipe-pane diff --git a/docs/tools/pane/run-command.md b/docs/tools/pane/run-command.md index 721dd6f..7cb4ffe 100644 --- a/docs/tools/pane/run-command.md +++ b/docs/tools/pane/run-command.md @@ -7,7 +7,8 @@ result with exit status, timeout state, and captured pane output. **Avoid when** you need raw interactive key driving — use -{tooliconl}`send-keys` for TUIs, key names, and partial commands. +{tooliconl}`send-keys` or {tooliconl}`send-keys-batch` for TUIs, key +names, and partial commands. **Side effects:** Sends a command to the pane's interactive shell. 
The command may read or write files, start processes, or access the network diff --git a/docs/tools/pane/send-keys-batch.md b/docs/tools/pane/send-keys-batch.md new file mode 100644 index 0000000..94ec66d --- /dev/null +++ b/docs/tools/pane/send-keys-batch.md @@ -0,0 +1,63 @@ +# Send keys batch + +```{fastmcp-tool} pane_tools.send_keys_batch +``` + +**Use when** you need to send several ordered raw-input operations to +one or more panes: TUI keystrokes, partial shell input, or persistent +shell interaction that should remain below the command-completion +layer. + +**Avoid when** you need to run shell commands and capture results — +use {tooliconl}`run-command` for authored commands, or combine +{tooliconl}`send-keys` with {tooliconl}`capture-since` when later +observation is intentionally separate from input. + +**Side effects:** Sends keystrokes to target panes in order. With +`on_error="stop"` the batch stops at the first failed operation and +returns that failure in the result. With `on_error="continue"` later +operations are still attempted. + +**Example:** + +```json +{ + "tool": "send_keys_batch", + "arguments": { + "operations": [ + {"pane_id": "%2", "keys": "C-c", "enter": false}, + {"pane_id": "%2", "keys": "npm run dev"} + ], + "on_error": "stop" + } +} +``` + +Response: + +```json +{ + "results": [ + { + "index": 0, + "pane_id": "%2", + "success": true, + "error": null, + "elapsed_seconds": 0.01 + }, + { + "index": 1, + "pane_id": "%2", + "success": true, + "error": null, + "elapsed_seconds": 0.01 + } + ], + "succeeded": 2, + "failed": 0, + "stopped_at": null +} +``` + +```{fastmcp-tool-input} pane_tools.send_keys_batch +``` diff --git a/docs/tools/pane/send-keys.md b/docs/tools/pane/send-keys.md index 19518a0..5f27bfe 100644 --- a/docs/tools/pane/send-keys.md +++ b/docs/tools/pane/send-keys.md @@ -3,14 +3,16 @@ ```{fastmcp-tool} pane_tools.send_keys ``` -**Use when** you need to type commands, press keys, or interact with a -terminal. 
This is the primary way to execute commands in tmux panes. - -**Avoid when** you need to run something and immediately capture the result — -compose `tmux wait-for -S ` into the keys and call -{tooliconl}`wait-for-channel` for deterministic completion, or fall back to -{tooliconl}`wait-for-text` / {tooliconl}`wait-for-content-change` when you -must observe output the agent does not author. +**Use when** you need to type raw input, press keys, or interact with +a terminal program. For several ordered raw-input operations, use +{tooliconl}`send-keys-batch`. + +**Avoid when** you need to run one authored shell command and +immediately capture its result — use {tooliconl}`run-command` so exit +status, timeout state, and output come back as one typed result. For +output the agent does not author, use {tooliconl}`wait-for-text` / +{tooliconl}`wait-for-content-change` or observe with +{tooliconl}`capture-since`. **Side effects:** Sends keystrokes to the pane. If `enter` is true (default), the command executes. diff --git a/docs/tools/pane/wait-for-channel.md b/docs/tools/pane/wait-for-channel.md index b0574d8..82cc8a2 100644 --- a/docs/tools/pane/wait-for-channel.md +++ b/docs/tools/pane/wait-for-channel.md @@ -1,8 +1,13 @@ # Wait for channel -tmux's `wait-for` command exposes named, server-global channels that clients can signal and block on. These give agents an explicit synchronization primitive — strictly cheaper in agent turns than polling pane content via {tooliconl}`capture-pane` or {tooliconl}`wait-for-text`. +tmux's `wait-for` command exposes named, server-global channels that +clients can signal and block on. These give agents an explicit +synchronization primitive for custom shell composition. For the common +"run this shell command and report the result" workflow, prefer +{tooliconl}`run-command`, which wraps this pattern and returns exit +status plus output. 
-The composition pattern: {tooliconl}`send-keys` a command followed by `; tmux wait-for -S NAME`, then call {tooliconl}`wait-for-channel`. Shell `;` semantics fire the second statement whether the first succeeds or fails, so the edge-triggered signal never deadlocks the agent on a crashed command. ```python send_keys( @@ -14,14 +19,18 @@ wait_for_channel("tests_done", timeout=60) The `; tmux wait-for -S NAME` suffix is the load-bearing safety contract — `wait-for` is edge-triggered, so a crash before the signal would deadlock until the wait's `timeout`. The shell separator `;` runs the next statement unconditionally, so the signal fires on both success and failure paths. -The payload deliberately does not append `exit $?` — in an interactive shell that exits the shell itself, taking single-pane sessions down with it. If exit-status preservation matters, capture the status out-of-band (e.g. write it to a file the agent reads later, or use a dedicated scratch pane). +The payload deliberately does not append `exit $?` — in an interactive +shell that exits the shell itself, taking single-pane sessions down +with it. If exit-status preservation matters and the command fits the +standard one-shot shape, use {tooliconl}`run-command`. ```{fastmcp-tool} wait_for_tools.wait_for_channel ``` -**Use when** the shell command can reliably emit the signal (single -test runs, build scripts, dev-server boot, anything composable with -`; tmux wait-for -S name`). +**Use when** the shell command can reliably emit the signal (single +test runs, build scripts, dev-server boot, anything composable with +`; tmux wait-for -S name`) and {tooliconl}`run-command` does not fit +the desired shell composition. 
**Avoid when** the signal cannot be guaranteed — for example, when the command might be killed externally. Use {tooliconl}`wait-for-text` diff --git a/docs/topics/architecture.md b/docs/topics/architecture.md index aa87a0f..2ac8f7e 100644 --- a/docs/topics/architecture.md +++ b/docs/topics/architecture.md @@ -18,7 +18,7 @@ src/libtmux_mcp/ server_tools.py # list_servers, list_sessions, create_session, kill_server, get_server_info session_tools.py # list_windows, create_window, rename_session, kill_session window_tools.py # list_panes, split_window, rename_window, kill_window, select_layout, resize_window - pane_tools.py # run_command, send_keys, capture_pane, capture_since, snapshot_pane, search_panes, wait_for_text + pane_tools.py # run_command, send_keys, send_keys_batch, capture_pane, capture_since, snapshot_pane, search_panes, wait_for_text buffer_tools.py # load_buffer, paste_buffer, show_buffer, delete_buffer hook_tools.py # show_hooks, show_hook option_tools.py # show_option, set_option diff --git a/docs/topics/concepts.md b/docs/topics/concepts.md index 156f7bc..4e05043 100644 --- a/docs/topics/concepts.md +++ b/docs/topics/concepts.md @@ -46,7 +46,7 @@ For pane tools, you can combine parameters to narrow the search: `session_name` Tools fall into three categories: - **Discovery** — Read-only operations: `list_sessions`, `list_windows`, `list_panes`, `capture_pane`, `capture_since`, `get_pane_info`, `find_pane_by_position`, `search_panes`, `wait_for_text`, `show_option`, `show_environment` - **Mutation** — Create, modify, or send input: `create_session`, `create_window`, `split_window`, `send_keys`, `send_keys_batch`, `rename_*`, `resize_*`, `set_pane_title`, `clear_pane`, `select_layout`, `set_option`, `set_environment` - 
**Destruction** — Remove tmux objects: `kill_server`, `kill_session`, `kill_window`, `kill_pane` These map to {ref}`safety tiers `. diff --git a/docs/topics/gotchas.md b/docs/topics/gotchas.md index edab3b9..a009c50 100644 --- a/docs/topics/gotchas.md +++ b/docs/topics/gotchas.md @@ -31,15 +31,20 @@ The `enter` parameter defaults to `true`, which is correct for commands (`make t {"tool": "capture_pane", "arguments": {"pane_id": "%0"}} ``` -The capture above may return the terminal state **before** pytest runs. Compose `tmux wait-for -S ` into the command and block on {tooliconl}`wait-for-channel` — deterministic, race-free: +The capture above may return the terminal state **before** pytest +runs. For an authored shell command, use {tooliconl}`run-command` +instead: ```json -{"tool": "send_keys", "arguments": {"keys": "pytest; tmux wait-for -S pytest_done", "pane_id": "%0"}} -{"tool": "wait_for_channel", "arguments": {"channel": "pytest_done", "timeout": 60}} -{"tool": "capture_pane", "arguments": {"pane_id": "%0"}} +{"tool": "run_command", "arguments": {"command": "pytest", "pane_id": "%0", "timeout": 60}} ``` -For output the agent does not author (third-party logs, daemon prompts, interactive supervisors), substitute {tooliconl}`wait-for-text` for `wait_for_channel`. See {ref}`recipes` for the complete pattern. +For custom shell composition outside {tooliconl}`run-command`, compose +`tmux wait-for -S ` into the command and block on +{tooliconl}`wait-for-channel`. For output the agent does not author +(third-party logs, daemon prompts, interactive supervisors), use +{tooliconl}`wait-for-text` or observe with {tooliconl}`capture-since`. +See {ref}`recipes` for complete patterns. 
## Repeated `capture_pane` calls resend old output diff --git a/docs/topics/prompting.md b/docs/topics/prompting.md index bcd81eb..196b828 100644 --- a/docs/topics/prompting.md +++ b/docs/topics/prompting.md @@ -14,10 +14,11 @@ Every MCP client receives these instructions when connecting to the libtmux-mcp libtmux MCP server for programmatic tmux control. tmux hierarchy: Server > Session > Window > Pane. Use pane_id (e.g. '%1') as the preferred targeting method - it is globally unique within a tmux server. -Use send_keys to execute commands, capture_pane for one-shot reads, and -capture_since for repeated observation. All tools accept an optional -socket_name parameter for multi-server support (defaults to -LIBTMUX_SOCKET env var). +Use run_command for authored shell commands, send_keys or +send_keys_batch for raw TUI / persistent-shell input, capture_pane for +one-shot reads, and capture_since for repeated observation. All tools +accept an optional socket_name parameter for multi-server support +(defaults to LIBTMUX_SOCKET env var). IMPORTANT — metadata vs content: list_windows, list_panes, and list_sessions only search metadata (names, IDs, current command). 
To @@ -63,7 +64,7 @@ These natural-language prompts reliably trigger the right tool sequences: | Prompt | Agent interprets as | |--------|-------------------| -| [Run `pytest` in my build pane and show results]{.prompt} | {toolref}`send-keys` (with `tmux wait-for -S` composed in) → {toolref}`wait-for-channel` → {toolref}`capture-pane` | +| [Run `pytest` in my build pane and show results]{.prompt} | {toolref}`run-command` | | [Start the dev server and wait until it's ready]{.prompt} | {toolref}`send-keys` → {toolref}`wait-for-text` (for "listening on" — third-party output the agent doesn't author) | | [Spin up the dev server in the bottom-right pane]{.prompt} | {toolref}`find-pane-by-position` (corner=bottom-right) → {toolref}`send-keys` → {toolref}`wait-for-text` (for the server's readiness banner) | | [Check if any pane has errors]{.prompt} | {toolref}`search-panes` with pattern "error" | @@ -95,12 +96,14 @@ use tmux via the libtmux MCP server rather than running them directly. This keeps output accessible for later inspection. For authored shell commands that need status, use run_command. For -custom command completion, compose `tmux wait-for -S ` into -the shell command and call wait_for_channel — deterministic, no -polling. Use wait_for_text or wait_for_content_change for observation -flows (third-party logs, daemon prompts), and use capture_since when -you need to read the same pane repeatedly. Never capture_pane -immediately after send_keys — the command may still be running. +raw TUI input or persistent shell state, use send_keys or +send_keys_batch. For custom command completion outside run_command, +compose `tmux wait-for -S ` into the shell command and call +wait_for_channel — deterministic, no polling. Use wait_for_text or +wait_for_content_change for observation flows (third-party logs, +daemon prompts), and use capture_since when you need to read the same +pane repeatedly. 
Never capture_pane immediately after send_keys — the +command may still be running. ``` ### For safe agent behavior @@ -143,6 +146,6 @@ When an agent is unsure which tool to use, these rules help: 1. **Discovery first**: Call {toolref}`list-sessions` or {toolref}`list-panes` before acting on specific targets 2. **Prefer IDs**: Once you have a `pane_id`, use it for all subsequent calls — it never changes during the pane's lifetime -3. **Wait, don't poll**: For commands the agent authors, prefer {toolref}`wait-for-channel` with `tmux wait-for -S ` composed into the command — deterministic and race-free. Use {toolref}`capture-since` for repeated observation, and fall back to {toolref}`wait-for-text` or {toolref}`wait-for-content-change` for output the agent doesn't author. Never call {toolref}`capture-pane` in a retry loop. +3. **Run, wait, or observe deliberately**: For commands the agent authors, prefer {toolref}`run-command`. Use {toolref}`wait-for-channel` only for custom shell composition outside that shape. Use {toolref}`capture-since` for repeated observation, and fall back to {toolref}`wait-for-text` or {toolref}`wait-for-content-change` for output the agent doesn't author. Never call {toolref}`capture-pane` in a retry loop. 4. **Content vs. metadata**: If looking for text *in* a terminal, use {toolref}`search-panes`. If looking for pane *properties* (name, PID, path), use {toolref}`list-panes` or {toolref}`get-pane-info` 5. 
**Destructive tools are opt-in**: Never kill sessions, windows, or panes unless the user explicitly asks diff --git a/docs/topics/safety.md b/docs/topics/safety.md index 2590bf0..8c19137 100644 --- a/docs/topics/safety.md +++ b/docs/topics/safety.md @@ -9,7 +9,7 @@ libtmux-mcp uses a three-tier safety system to control which tools are available | Tier | Label | Access | Use case | |------|-------|--------|----------| | `readonly` | {badge}`readonly` | List, capture, search, info | Monitoring, browsing | -| `mutating` (default) | {badge}`mutating` | + create, send_keys, rename, resize | Normal agent workflow | +| `mutating` (default) | {badge}`mutating` | + create, send_keys, send_keys_batch, rename, resize | Normal agent workflow | | `destructive` | {badge}`destructive` | + kill_server, kill_session, kill_window, kill_pane | Full control | ## Configuration @@ -107,9 +107,9 @@ Mitigations: - The optional `environment` argument (`dict[str, str]`) maps to one tmux `-e KEY=VALUE` flag per item. The audit log redacts each *value* via a `{len, sha256_prefix}` digest while keeping the *keys* visible — env var names like `DATABASE_URL` are usually operator-debug-useful, but their values are the secret. The same OS-process-table caveat as `shell` applies: `respawn-pane -e DB_PASSWORD=...` may briefly appear in `ps` output before the spawned process inherits the env. - The same self-pane guard that protects the destructive kill commands also refuses to respawn the pane running the MCP server. -### `send_keys` / `paste_text` +### `send_keys` / `send_keys_batch` / `paste_text` -These can execute anything the pane's shell accepts. There is no payload validation. The audit log stores a digest of the content, not the content itself, so a secret typed via `send_keys` does not land in logs. +These can execute anything the pane's shell accepts. There is no payload validation. 
The audit log stores a digest of the content, not the content itself, so a secret typed via {tooliconl}`send-keys` or {tooliconl}`send-keys-batch` does not land in logs. ## Audit log diff --git a/docs/topics/troubleshooting.md b/docs/topics/troubleshooting.md index 93e5dde..0ba0783 100644 --- a/docs/topics/troubleshooting.md +++ b/docs/topics/troubleshooting.md @@ -75,7 +75,7 @@ Symptom-based guide. Find your problem, follow the steps. 2. **Special characters**: tmux interprets some key names (e.g. `C-c`, `Enter`). If sending literal text, use `literal=true`. -3. **Timing**: After {toolref}`send-keys`, prefer composing `tmux wait-for -S ` into the shell command and calling {toolref}`wait-for-channel` for deterministic completion. Use {toolref}`capture-since` for repeated observation, and use {toolref}`wait-for-text` or {toolref}`wait-for-content-change` only when waiting on output you do not author. Don't call {toolref}`capture-pane` immediately — the command may still be running. +3. **Timing**: For authored shell commands, prefer {toolref}`run-command`; it waits for completion and returns exit status plus output. Use {toolref}`send-keys` or {toolref}`send-keys-batch` for raw interactive input, {toolref}`capture-since` for repeated observation, and {toolref}`wait-for-text` or {toolref}`wait-for-content-change` only when waiting on output you do not author. Don't call {toolref}`capture-pane` immediately after raw input — the command may still be running. 
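The stop-versus-continue handling that `send_keys_batch` applies to an ordered list of raw-input operations can be sketched roughly as below. This is a minimal illustration, not the server's implementation: `OpResult`, `run_batch`, and `fake_send` are hypothetical stand-ins, and the sketch assumes only that each operation either sends or raises. Real calls go through the MCP tool and return a `SendKeysBatchResult`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Literal


@dataclass
class OpResult:
    """Illustrative stand-in for a per-operation result (not the real model)."""

    index: int
    success: bool
    error: str | None = None


def run_batch(
    operations: list[dict[str, object]],
    send: Callable[[dict[str, object]], None],
    on_error: Literal["stop", "continue"] = "stop",
) -> tuple[list[OpResult], int | None]:
    """Attempt operations in order and return (results, stopped_at)."""
    results: list[OpResult] = []
    stopped_at: int | None = None
    for index, operation in enumerate(operations):
        try:
            send(operation)
        except Exception as exc:
            # Record the failure instead of raising, so the caller can see
            # which operation failed and why.
            results.append(OpResult(index, False, str(exc)))
            if on_error == "stop":
                stopped_at = index
                break
            continue
        results.append(OpResult(index, True))
    return results, stopped_at


def fake_send(operation: dict[str, object]) -> None:
    """Hypothetical sender that fails when no pane target is given."""
    if operation.get("pane_id") is None:
        raise ValueError("no pane target")


# Three raw keystrokes for a TUI; the second one has no target and fails.
ops: list[dict[str, object]] = [
    {"keys": "j", "pane_id": "%1", "enter": False},
    {"keys": "k", "pane_id": None, "enter": False},
    {"keys": "q", "pane_id": "%1", "enter": False},
]

stop_results, stop_at = run_batch(ops, fake_send, on_error="stop")
cont_results, cont_at = run_batch(ops, fake_send, on_error="continue")
```

With `on_error="stop"` the third operation is never attempted and the stop index points at the failed entry. With `on_error="continue"` every operation is attempted and the failure is only reported, which suits independent keystrokes but not sequences where later input depends on earlier input having landed.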
## Silent startup failure diff --git a/src/libtmux_mcp/middleware.py b/src/libtmux_mcp/middleware.py index c749050..c5635ab 100644 --- a/src/libtmux_mcp/middleware.py +++ b/src/libtmux_mcp/middleware.py @@ -115,6 +115,18 @@ async def on_call_tool( # --------------------------------------------------------------------------- +def _schema_validation_error( + error: BaseException, +) -> PydanticValidationError | None: + """Return the Pydantic validation error behind a schema failure.""" + if isinstance(error, PydanticValidationError): + return error + cause = error.__cause__ + if isinstance(cause, PydanticValidationError): + return cause + return None + + def _is_schema_validation_error(error: BaseException) -> bool: """Return True for fastmcp argument-schema validation failures. @@ -129,11 +141,76 @@ def _is_schema_validation_error(error: BaseException) -> bool: layer converts output-shape failures into error results itself, so they never reach the middleware as exceptions. """ - return isinstance(error, PydanticValidationError) or isinstance( - error.__cause__, PydanticValidationError + return _schema_validation_error(error) is not None + + +def _validation_errors_without_inputs( + error: PydanticValidationError, +) -> list[dict[str, t.Any]]: + """Return validation errors without rejected input values.""" + return t.cast( + "list[dict[str, t.Any]]", + error.errors( + include_url=False, + include_context=False, + include_input=False, + ), ) +def _format_schema_validation_error(error: BaseException) -> str: + """Format a Pydantic validation error without raw input values.""" + err = _schema_validation_error(error) + if err is None: + return str(error) + count = err.error_count() + noun = "validation error" if count == 1 else "validation errors" + lines = [f"{count} {noun} for {err.title}"] + for item in _validation_errors_without_inputs(err): + loc = ".".join(str(part) for part in item.get("loc", ())) or "__root__" + msg = str(item.get("msg", "Input validation 
failed")) + error_type = str(item.get("type", "unknown")) + lines.extend((loc, f" {msg} [type={error_type}]")) + return "\n".join(lines) + + +def _strip_validation_error_inputs(value: t.Any) -> t.Any: + """Remove raw input payloads from structured validation errors.""" + if isinstance(value, dict): + return { + key: _strip_validation_error_inputs(item) + for key, item in value.items() + if key not in {"ctx", "input"} + } + if isinstance(value, list): + return [_strip_validation_error_inputs(item) for item in value] + if isinstance(value, tuple): + return tuple(_strip_validation_error_inputs(item) for item in value) + return value + + +class _FastMCPValidationLogFilter(logging.Filter): + """Redact FastMCP invalid-argument warning payloads.""" + + def filter(self, record: logging.LogRecord) -> bool: + if record.msg != "Invalid arguments for tool %r: %s": + return True + if not isinstance(record.args, tuple) or len(record.args) != 2: + return True + tool_name, errors = record.args + record.args = (tool_name, _strip_validation_error_inputs(errors)) + return True + + +def install_fastmcp_validation_log_filter() -> None: + """Install the FastMCP validation log redaction filter once.""" + logger = logging.getLogger("fastmcp.server.server") + if not any( + isinstance(item, _FastMCPValidationLogFilter) for item in logger.filters + ): + logger.addFilter(_FastMCPValidationLogFilter()) + + #: Scheduling flag some MCP clients (notably Gemini CLI when batching #: several tool calls in one turn) merge into the tool's arguments. #: Recognized only to *word the rejection helpfully* — the argument is @@ -153,8 +230,8 @@ def _unexpected_kwargs(error: BaseException) -> list[str]: and returns the names flagged ``unexpected_keyword_argument``. Empty list for every other failure shape. 
""" - err = error if isinstance(error, PydanticValidationError) else error.__cause__ - if not isinstance(err, PydanticValidationError): + err = _schema_validation_error(error) + if err is None: return [] return [ str(item["loc"][-1]) @@ -224,7 +301,11 @@ def _error_tool_result( "expected": isinstance(error, ExpectedToolError) or _is_schema_validation_error(error), } - text = str(error) + text = ( + _format_schema_validation_error(error) + if _is_schema_validation_error(error) + else str(error) + ) suggestion = getattr(error, "suggestion", None) if suggestion is None: unknown = _unexpected_kwargs(error) @@ -315,12 +396,17 @@ def _log_error(self, error: Exception, context: MiddlewareContext) -> None: # Lazy %-formatting (project logging standard) — also collapses # the stock implementation's include_traceback branch, since # ``exc_info`` accepts a bool. + error_text = ( + _format_schema_validation_error(error) + if _is_schema_validation_error(error) + else str(error) + ) self.logger.log( level, "Error in %s: %s: %s", method, error_type, - error, + error_text, exc_info=self.include_traceback, ) @@ -369,6 +455,25 @@ async def on_call_tool( {"keys", "text", "command", "value", "content", "shell", "environment"} ) +#: Nested argument containers that may contain sensitive argument names. +#: ``operations`` is used by ``send_keys_batch``; preserving pane ids and +#: booleans is useful for audit trails, but each nested ``keys`` payload +#: must be digested the same way top-level ``send_keys(keys=...)`` is. 
+_NESTED_ARG_LIST_NAMES: frozenset[str] = frozenset({"operations"}) + +_NONE_TYPE = type(None) + +_SEND_KEYS_OPERATION_ARG_TYPES: dict[str, tuple[type[t.Any], ...]] = { + "keys": (str,), + "pane_id": (str, _NONE_TYPE), + "session_name": (str, _NONE_TYPE), + "session_id": (str, _NONE_TYPE), + "window_id": (str, _NONE_TYPE), + "enter": (bool,), + "literal": (bool,), + "suppress_history": (bool,), +} + #: String arguments longer than this get truncated in the log summary to #: keep records bounded. Non-sensitive strings only — sensitive ones are #: replaced entirely by their digest. @@ -395,6 +500,23 @@ def _redact_digest(value: str) -> dict[str, t.Any]: } +def _redacted_value_shape(value: t.Any) -> dict[str, t.Any]: + """Return non-payload metadata for a value that cannot be logged.""" + return {"type": type(value).__name__, "redacted": True} + + +def _summarize_send_keys_operation_args(args: dict[str, t.Any]) -> dict[str, t.Any]: + """Summarize one ``send_keys_batch`` operation for audit logging.""" + summary: dict[str, t.Any] = {} + for key, value in args.items(): + expected_types = _SEND_KEYS_OPERATION_ARG_TYPES.get(key) + if expected_types is None or not isinstance(value, expected_types): + summary[key] = _redacted_value_shape(value) + else: + summary[key] = _summarize_args({key: value})[key] + return summary + + def _summarize_args(args: dict[str, t.Any]) -> dict[str, t.Any]: """Summarize tool arguments for audit logging. @@ -404,6 +526,8 @@ def _summarize_args(args: dict[str, t.Any]) -> dict[str, t.Any]: ``respawn_pane``) have each *value* digested while keys remain visible — env-var-name-like keys are operator-debug-useful and rarely sensitive, while their values usually are. + Known nested operation lists are summarized recursively so batched + tool calls keep target metadata while redacting inner payloads. 
Examples -------- @@ -431,6 +555,16 @@ def _summarize_args(args: dict[str, t.Any]) -> dict[str, t.Any]: summary[key] = _redact_digest(value) elif key in _SENSITIVE_ARG_NAMES and isinstance(value, dict): summary[key] = {k: _redact_digest(str(v)) for k, v in value.items()} + elif key in _NESTED_ARG_LIST_NAMES: + if isinstance(value, list): + summary[key] = [ + _summarize_send_keys_operation_args(item) + if isinstance(item, dict) + else _redacted_value_shape(item) + for item in value + ] + else: + summary[key] = _redacted_value_shape(value) elif isinstance(value, str) and len(value) > _MAX_LOGGED_STR_LEN: summary[key] = value[:_MAX_LOGGED_STR_LEN] + "..." else: diff --git a/src/libtmux_mcp/models.py b/src/libtmux_mcp/models.py index cd0853d..a568ad9 100644 --- a/src/libtmux_mcp/models.py +++ b/src/libtmux_mcp/models.py @@ -4,7 +4,7 @@ import typing as t -from pydantic import BaseModel, Field +from pydantic import BaseModel, ConfigDict, Field class SessionInfo(BaseModel): @@ -302,6 +302,79 @@ class RunCommandResult(BaseModel): ) +class SendKeysOperation(BaseModel): + """One raw-input operation for :func:`send_keys_batch`.""" + + model_config = ConfigDict(extra="forbid") + + keys: str = Field(description="Keys or text to send.") + pane_id: str | None = Field( + default=None, + description="Pane ID (e.g. '%1').", + ) + session_name: str | None = Field( + default=None, + description="Session name for pane resolution.", + ) + session_id: str | None = Field( + default=None, + description="Session ID (e.g. 
'$1') for pane resolution.", + ) + window_id: str | None = Field( + default=None, + description="Window ID for pane resolution.", + ) + enter: bool = Field( + default=True, + description="Whether to press Enter after sending keys.", + ) + literal: bool = Field( + default=False, + description="Whether to send keys literally with no tmux key interpretation.", + ) + suppress_history: bool = Field( + default=False, + description=( + "Suppress shell history by prepending a space where the shell " + "ignores space-prefixed commands." + ), + ) + + +class SendKeysOperationResult(BaseModel): + """Per-operation result from :func:`send_keys_batch`.""" + + index: int = Field(description="Zero-based index in the submitted operation list.") + pane_id: str | None = Field( + default=None, + description="Resolved pane ID, or None if target resolution failed.", + ) + success: bool = Field(description="True when this operation sent successfully.") + error: str | None = Field( + default=None, + description="Error message for this operation, if it failed.", + ) + elapsed_seconds: float = Field(description="Time spent on this operation.") + + +class SendKeysBatchResult(BaseModel): + """Structured result for a batch of raw-input send operations.""" + + results: list[SendKeysOperationResult] = Field( + default_factory=list, + description="Per-operation results in attempted order.", + ) + succeeded: int = Field(description="Number of operations sent successfully.") + failed: int = Field(description="Number of operations that failed.") + stopped_at: int | None = Field( + default=None, + description=( + "Index where processing stopped because on_error='stop', or None " + "when all operations were attempted." 
+ ), + ) + + class PaneSnapshot(BaseModel): """Rich screen capture with metadata: content, cursor, mode, and scroll state.""" diff --git a/src/libtmux_mcp/prompts/recipes.py b/src/libtmux_mcp/prompts/recipes.py index e3aa1a4..b45e06a 100644 --- a/src/libtmux_mcp/prompts/recipes.py +++ b/src/libtmux_mcp/prompts/recipes.py @@ -10,8 +10,6 @@ from __future__ import annotations -import uuid - def run_and_wait( command: str, @@ -20,16 +18,12 @@ def run_and_wait( ) -> str: """Run a shell command in a tmux pane and wait for completion. - The returned template teaches the model the safe composition - pattern: shell ``;`` semantics fire ``tmux wait-for -S`` whether - the command succeeds or fails, so the edge-triggered signal - never deadlocks an agent waiting on a crashed command. See - ``docs/topics/prompting.md``. - - Each invocation embeds a fresh UUID-scoped channel name so - concurrent agents (or parallel prompt calls from a single agent) - cannot cross-signal each other on tmux's server-global channel - namespace — the channel is unique to this one prompt rendering. + The returned template teaches the high-level authored-command + primitive: ``run_command`` sends the command, waits through a + private tmux signal, captures output, and reports exit status in + one typed result. Use lower-level ``send_keys`` + + ``wait_for_channel`` only when the caller needs custom shell + composition outside this common command-completion shape. Parameters ---------- @@ -38,29 +32,28 @@ def run_and_wait( pane_id : str Target pane (e.g. ``%1``). timeout : float - Maximum seconds to wait for the signal. Default 60. + Maximum seconds to wait for command completion. Default 60. 
""" - channel = f"libtmux_mcp_wait_{uuid.uuid4().hex}" - shell_payload = f"{command}; tmux wait-for -S {channel}" - return f"""Run this shell command in tmux pane {pane_id} and block -until it finishes: + return f"""Run this shell command in tmux pane {pane_id}, wait until it +finishes, and inspect the typed result: ```python -send_keys( +result = run_command( pane_id={pane_id!r}, - keys={shell_payload!r}, + command={command!r}, + timeout={timeout}, + max_lines=100, ) -wait_for_channel(channel={channel!r}, timeout={timeout}) -capture_pane(pane_id={pane_id!r}, max_lines=100) ``` -After the channel signals, read the last ~100 lines to verify the -command's behaviour. Do NOT use a `capture_pane` retry loop — -`wait_for_channel` is strictly cheaper in agent turns. +Use `result.exit_status`, `result.timed_out`, and `result.output` +to decide what happened. Do NOT use a `send_keys` + `capture_pane` +retry loop for authored commands — `run_command` already performs +deterministic completion and returns tail-preserved output. -The payload does not preserve the command's exit status. Use -`run_command` instead when exit status must be returned as structured -data. +If the task needs persistent shell state or TUI keystrokes instead of +a one-shot shell command, use `send_keys` or `send_keys_batch`, then +observe later output with `capture_since`. """ diff --git a/src/libtmux_mcp/server.py b/src/libtmux_mcp/server.py index 4898dbc..849bb26 100644 --- a/src/libtmux_mcp/server.py +++ b/src/libtmux_mcp/server.py @@ -32,10 +32,12 @@ SafetyMiddleware, TailPreservingResponseLimitingMiddleware, ToolErrorResultMiddleware, + install_fastmcp_validation_log_filter, ) from libtmux_mcp.tools.buffer_tools import _MCP_BUFFER_PREFIX logger = logging.getLogger(__name__) +install_fastmcp_validation_log_filter() #: Cache-key shape used by :data:`_server_cache` and the GC helper. 
#: ``(socket_name, socket_path, tmux_bin)`` — see @@ -94,10 +96,10 @@ ) _INSTR_WAIT_NOT_POLL = ( - "WAIT, DON'T POLL: run_command for authored shell commands needing " + "WAIT, DON'T POLL: run_command for authored commands needing " "status; wait_for_channel for custom tmux wait-for; capture_since " "for tailing; wait_for_text/wait_for_content_change for output you " - "don't author." + "don't author; send_keys_batch for raw input." ) #: Gap-explainer: write-hook tools are intentionally absent. See module diff --git a/src/libtmux_mcp/tools/pane_tools/__init__.py b/src/libtmux_mcp/tools/pane_tools/__init__.py index 9c890fb..b29a276 100644 --- a/src/libtmux_mcp/tools/pane_tools/__init__.py +++ b/src/libtmux_mcp/tools/pane_tools/__init__.py @@ -31,6 +31,7 @@ paste_text, run_command, send_keys, + send_keys_batch, ) from libtmux_mcp.tools.pane_tools.layout import ( resize_pane, @@ -74,6 +75,7 @@ "search_panes", "select_pane", "send_keys", + "send_keys_batch", "set_pane_title", "snapshot_pane", "swap_pane", @@ -87,6 +89,11 @@ def register(mcp: FastMCP) -> None: mcp.tool(title="Send Keys", annotations=ANNOTATIONS_SHELL, tags={TAG_MUTATING})( send_keys ) + mcp.tool( + title="Send Keys Batch", + annotations=ANNOTATIONS_SHELL, + tags={TAG_MUTATING}, + )(send_keys_batch) mcp.tool(title="Run Command", annotations=ANNOTATIONS_SHELL, tags={TAG_MUTATING})( run_command ) diff --git a/src/libtmux_mcp/tools/pane_tools/io.py b/src/libtmux_mcp/tools/pane_tools/io.py index 6cb3b4f..ebaf20f 100644 --- a/src/libtmux_mcp/tools/pane_tools/io.py +++ b/src/libtmux_mcp/tools/pane_tools/io.py @@ -10,17 +10,90 @@ import subprocess import tempfile import time +import typing as t import uuid +from fastmcp.exceptions import ToolError + from libtmux_mcp._utils import ( ExpectedToolError, _get_server, + _map_exception_to_tool_error, _resolve_pane, _tmux_argv, handle_tool_errors, handle_tool_errors_async, ) -from libtmux_mcp.models import RunCommandResult +from libtmux_mcp.models import ( + 
RunCommandResult, + SendKeysBatchResult, + SendKeysOperation, + SendKeysOperationResult, +) + +if t.TYPE_CHECKING: + from libtmux.pane import Pane + + +def _batch_timeout_error(timeout: float) -> str: + """Return the standard send_keys_batch timeout error.""" + return f"batch execution exceeded timeout of {timeout}s" + + +def _remaining_timeout(deadline: float, timeout: float) -> float: + """Return the remaining operation budget or raise timeout.""" + remaining = deadline - time.monotonic() + if remaining <= 0: + raise ExpectedToolError(_batch_timeout_error(timeout)) + return remaining + + +def _run_timed_send_keys_argv( + argv: list[str], + *, + deadline: float, + timeout: float, +) -> None: + """Run one ``tmux send-keys`` argv within the batch deadline.""" + try: + subprocess.run( + argv, + check=True, + capture_output=True, + timeout=_remaining_timeout(deadline, timeout), + ) + except subprocess.TimeoutExpired as e: + raise ExpectedToolError(_batch_timeout_error(timeout)) from e + except subprocess.CalledProcessError as e: + stderr = e.stderr.decode(errors="replace").strip() if e.stderr else "" + msg = f"send-keys failed: {stderr or e}" + raise ExpectedToolError(msg) from e + + +def _run_timed_send_keys( + pane: Pane, + operation: SendKeysOperation, + *, + deadline: float, + timeout: float, +) -> None: + """Run ``tmux send-keys`` for one operation within the batch deadline.""" + pane_id = pane.pane_id + if pane_id is None: + msg = "resolved pane has no pane_id" + raise ExpectedToolError(msg) + + tmux_args = ["send-keys", "-t", pane_id] + if operation.literal: + tmux_args.append("-l") + tmux_args.append((" " if operation.suppress_history else "") + operation.keys) + + send_argvs = [_tmux_argv(pane.server, *tmux_args)] + if operation.enter: + send_argvs.append(_tmux_argv(pane.server, "send-keys", "-t", pane_id, "Enter")) + + for argv in send_argvs: + _run_timed_send_keys_argv(argv, deadline=deadline, timeout=timeout) @handle_tool_errors @@ -37,20 +110,16 @@ def 
send_keys( ) -> str: """Send keys (commands or text) to a tmux pane. - After sending, choose your synchronization primitive based on what you - control: - - - **Deterministic (preferred):** compose ``tmux wait-for -S `` - into the shell command and call ``wait_for_channel``. See the - ``run_and_wait`` prompt for the canonical safe-completion pattern. - Cheaper in agent turns and immune to baseline races. - - **Pattern-match:** call ``wait_for_text`` when the output you await - is yours to author and won't appear before the wait locks its - baseline (e.g. a sentinel ``echo`` after a long command). Fast - ``echo`` statements can race the baseline read; reserve this for - output the agent does not control. - - **Any change:** call ``wait_for_content_change`` when you don't know - the output shape. + Use this for raw interactive input: TUI keys, control sequences, + partial shell input, or persistent shell state. Use ``send_keys_batch`` + when you need several ordered raw-input operations. + + For authored shell commands that need completion, exit status, or + captured output, use ``run_command`` instead. For custom completion + outside that shape, compose ``tmux wait-for -S `` into the + shell command and call ``wait_for_channel``. For repeated observation + after input, prefer ``capture_since``; reserve ``wait_for_text`` and + ``wait_for_content_change`` for output the agent does not author. Do NOT call ``capture_pane`` immediately — both the read and the pattern-match paths race the pane's PTY draw. @@ -99,6 +168,157 @@ def send_keys( return f"Keys sent to pane {pane.pane_id}" +@handle_tool_errors +def send_keys_batch( + operations: list[SendKeysOperation], + on_error: t.Literal["stop", "continue"] = "stop", + timeout: float | None = None, + socket_name: str | None = None, +) -> SendKeysBatchResult: + """Send an ordered batch of raw key/text operations to tmux panes. 
+ + Use this for bulk TUI or persistent-shell input where each item is the + same kind of low-level terminal interaction as :func:`send_keys`. For + authored shell commands that need exit status and captured output, use + :func:`run_command` instead. For repeated observation after sending input, + use :func:`capture_since` with its returned cursor. + + This tool intentionally does not compose heterogeneous operations such + as send → wait → capture. Keeping the batch homogeneous preserves clear + per-operation error attribution and avoids embedding a workflow DSL in + the MCP tool surface. + + Parameters + ---------- + operations : list of SendKeysOperation + Ordered raw-input operations to send. + on_error : {"stop", "continue"} + Whether to stop at the first failed operation or keep attempting + later operations. Default "stop". + timeout : float, optional + Maximum time in seconds to allow the batch to run before aborting. + socket_name : str, optional + tmux socket name. + + Returns + ------- + SendKeysBatchResult + Per-operation results with success/error counts and stop index. 
+ """ + if not operations: + msg = "operations must not be empty" + raise ExpectedToolError(msg) + if on_error not in {"stop", "continue"}: + msg = "on_error must be 'stop' or 'continue'" + raise ExpectedToolError(msg) + + server = _get_server(socket_name=socket_name) + results: list[SendKeysOperationResult] = [] + stopped_at: int | None = None + batch_started = time.monotonic() + deadline = batch_started + timeout if timeout is not None else None + + for index, operation in enumerate(operations): + if deadline is not None and time.monotonic() > deadline: + assert timeout is not None + results.append( + SendKeysOperationResult( + index=index, + pane_id=operation.pane_id, + success=False, + error=_batch_timeout_error(timeout), + elapsed_seconds=0.0, + ) + ) + if on_error == "stop": + stopped_at = index + break + continue + + started = time.monotonic() + pane_id: str | None = None + try: + pane = _resolve_pane( + server, + pane_id=operation.pane_id, + session_name=operation.session_name, + session_id=operation.session_id, + window_id=operation.window_id, + ) + pane_id = pane.pane_id + if pane_id is None: + results.append( + SendKeysOperationResult( + index=index, + pane_id=None, + success=False, + error="resolved pane has no pane_id", + elapsed_seconds=time.monotonic() - started, + ) + ) + if on_error == "stop": + stopped_at = index + break + continue + if deadline is None: + pane.send_keys( + operation.keys, + enter=operation.enter, + suppress_history=operation.suppress_history, + literal=operation.literal, + ) + else: + assert timeout is not None + _run_timed_send_keys( + pane, + operation, + deadline=deadline, + timeout=timeout, + ) + except Exception as e: + elapsed = time.monotonic() - started + tool_err = ( + e + if isinstance(e, ToolError) + else _map_exception_to_tool_error("send_keys_batch", e) + ) + error = str(tool_err) + suggestion = getattr(tool_err, "suggestion", None) + if suggestion: + error = f"{error}\n{suggestion}" + results.append( + 
SendKeysOperationResult( + index=index, + pane_id=pane_id, + success=False, + error=error, + elapsed_seconds=elapsed, + ) + ) + if on_error == "stop": + stopped_at = index + break + continue + + results.append( + SendKeysOperationResult( + index=index, + pane_id=pane_id, + success=True, + elapsed_seconds=time.monotonic() - started, + ) + ) + + succeeded = sum(result.success for result in results) + failed = len(results) - succeeded + return SendKeysBatchResult( + results=results, + succeeded=succeeded, + failed=failed, + stopped_at=stopped_at, + ) + + @handle_tool_errors_async async def run_command( command: str, @@ -431,7 +651,8 @@ def clear_pane( ) -> str: """Clear the contents of a tmux pane. - Use before send_keys + capture_pane to get a clean capture without prior output. + Use before a fresh run_command call or raw-input observation workflow + when prior scrollback would make the result harder to inspect. Parameters ---------- diff --git a/src/libtmux_mcp/tools/pane_tools/wait.py b/src/libtmux_mcp/tools/pane_tools/wait.py index 0397a69..9df3504 100644 --- a/src/libtmux_mcp/tools/pane_tools/wait.py +++ b/src/libtmux_mcp/tools/pane_tools/wait.py @@ -160,11 +160,12 @@ async def wait_for_text( (``echo``, prompt-return after ``^C``) can land *before* this tool snapshots the baseline, and the match is then invisible to the wait. The race is small but real on CI and over remote sockets. - For commands you author, prefer the channel pattern: append + For commands you author, prefer ``run_command`` so completion, + exit status, and output arrive as one typed result. For custom + shell composition outside that shape, append ``; tmux wait-for -S `` to your ``send_keys`` payload and - call ``wait_for_channel`` instead. The ``run_and_wait`` prompt at - ``libtmux_mcp.prompts.recipes`` shows the safe composition. - Reserve ``wait_for_text`` for output you do not control + call ``wait_for_channel`` instead. 
Reserve ``wait_for_text`` for + output you do not control (third-party process logs, daemon prompts, interactive supervisors). diff --git a/tests/test_middleware.py b/tests/test_middleware.py index 6f398ff..f033851 100644 --- a/tests/test_middleware.py +++ b/tests/test_middleware.py @@ -176,6 +176,33 @@ class CommandRedactionFixture(t.NamedTuple): command: str +class MalformedOperationAuditFixture(t.NamedTuple): + """Test fixture for malformed send_keys_batch audit entries.""" + + test_id: str + operation: object + forbidden_text: str + expected_type: str + + +class MalformedOperationsPayloadAuditFixture(t.NamedTuple): + """Test fixture for malformed send_keys_batch audit payloads.""" + + test_id: str + args: dict[str, object] + forbidden_text: str + expected_shape: t.Literal["redacted_container", "redacted_unknown_field"] + + +class BatchSchemaValidationRedactionFixture(t.NamedTuple): + """Test fixture for schema-validation redaction of batch payloads.""" + + test_id: str + arguments: dict[str, t.Any] + secret: str + expected_fragments: tuple[str, ...] 
+ + COMMAND_REDACTION_FIXTURES: list[CommandRedactionFixture] = [ CommandRedactionFixture( test_id="credential_bearing", @@ -188,6 +215,62 @@ class CommandRedactionFixture(t.NamedTuple): ] +MALFORMED_OPERATION_AUDIT_FIXTURES: list[MalformedOperationAuditFixture] = [ + MalformedOperationAuditFixture( + test_id="string_operation", + operation="keys=printf SECRET_COMMAND", + forbidden_text="SECRET_COMMAND", + expected_type="str", + ), + MalformedOperationAuditFixture( + test_id="list_operation", + operation=["keys=printf SECRET_COMMAND"], + forbidden_text="SECRET_COMMAND", + expected_type="list", + ), +] + + +MALFORMED_OPERATIONS_PAYLOAD_AUDIT_FIXTURES: list[ + MalformedOperationsPayloadAuditFixture +] = [ + MalformedOperationsPayloadAuditFixture( + test_id="operations_object", + args={"operations": {"keys": "printf SECRET_OBJECT", "pane_id": "%1"}}, + forbidden_text="SECRET_OBJECT", + expected_shape="redacted_container", + ), + MalformedOperationsPayloadAuditFixture( + test_id="unknown_operation_field", + args={"operations": [{"key": "printf SECRET_KEY", "pane_id": "%1"}]}, + forbidden_text="SECRET_KEY", + expected_shape="redacted_unknown_field", + ), +] + + +BATCH_SCHEMA_VALIDATION_REDACTION_FIXTURES: list[ + BatchSchemaValidationRedactionFixture +] = [ + BatchSchemaValidationRedactionFixture( + test_id="unknown_operation_field", + arguments={ + "operations": [{"key": "printf SECRET_SCHEMA_KEY", "pane_id": "%1"}] + }, + secret="SECRET_SCHEMA_KEY", + expected_fragments=("operations.0.key", "extra_forbidden"), + ), + BatchSchemaValidationRedactionFixture( + test_id="operations_object", + arguments={ + "operations": {"keys": "printf SECRET_SCHEMA_OBJECT", "pane_id": "%1"} + }, + secret="SECRET_SCHEMA_OBJECT", + expected_fragments=("operations", "list_type"), + ), +] + + @pytest.mark.parametrize( CommandRedactionFixture._fields, COMMAND_REDACTION_FIXTURES, @@ -234,6 +317,83 @@ def test_summarize_args_redacts_sensitive_dict_values() -> None: assert summary["pane_id"] == 
"%1" +def test_summarize_args_redacts_send_keys_batch_operations() -> None: + """send_keys_batch operation payloads are digested inside the list.""" + args: dict[str, t.Any] = { + "operations": [ + {"keys": "psql -U admin -W supersecret mydb", "pane_id": "%1"}, + {"keys": "printf public", "pane_id": "%2", "enter": False}, + ], + "on_error": "continue", + } + + summary = _summarize_args(args) + rendered = str(summary) + + assert "supersecret" not in rendered + assert "printf public" not in rendered + first = summary["operations"][0] + second = summary["operations"][1] + assert first["pane_id"] == "%1" + assert second["pane_id"] == "%2" + assert second["enter"] is False + for operation in (first, second): + assert isinstance(operation["keys"], dict) + assert "len" in operation["keys"] + assert "sha256_prefix" in operation["keys"] + + +@pytest.mark.parametrize( + MalformedOperationAuditFixture._fields, + MALFORMED_OPERATION_AUDIT_FIXTURES, + ids=[fixture.test_id for fixture in MALFORMED_OPERATION_AUDIT_FIXTURES], +) +def test_summarize_args_redacts_malformed_send_keys_batch_operation_entries( + test_id: str, + operation: object, + forbidden_text: str, + expected_type: str, +) -> None: + """Malformed send_keys_batch operation entries do not leak raw payloads.""" + assert test_id + summary = _summarize_args({"operations": [operation]}) + rendered = str(summary) + + assert forbidden_text not in rendered + item = summary["operations"][0] + assert isinstance(item, dict) + assert item["type"] == expected_type + assert item["redacted"] is True + + +@pytest.mark.parametrize( + MalformedOperationsPayloadAuditFixture._fields, + MALFORMED_OPERATIONS_PAYLOAD_AUDIT_FIXTURES, + ids=[fixture.test_id for fixture in MALFORMED_OPERATIONS_PAYLOAD_AUDIT_FIXTURES], +) +def test_summarize_args_redacts_malformed_send_keys_batch_operation_payloads( + test_id: str, + args: dict[str, object], + forbidden_text: str, + expected_shape: t.Literal["redacted_container", "redacted_unknown_field"], +) 
-> None: + """Malformed send_keys_batch operation payloads do not leak raw values.""" + assert test_id + summary = _summarize_args(args) + rendered = str(summary) + + assert forbidden_text not in rendered + operations = summary["operations"] + if expected_shape == "redacted_container": + assert operations == {"type": "dict", "redacted": True} + else: + assert isinstance(operations, list) + item = operations[0] + assert isinstance(item, dict) + assert item["pane_id"] == "%1" + assert item["key"] == {"type": "str", "redacted": True} + + def test_summarize_args_truncates_long_non_sensitive_strings() -> None: """Non-sensitive strings over the cap get truncated with a marker.""" args = {"output_path": "x" * 500} @@ -242,6 +402,58 @@ def test_summarize_args_truncates_long_non_sensitive_strings() -> None: assert len(summary["output_path"]) < 500 +@pytest.mark.parametrize( + BatchSchemaValidationRedactionFixture._fields, + BATCH_SCHEMA_VALIDATION_REDACTION_FIXTURES, + ids=[fixture.test_id for fixture in BATCH_SCHEMA_VALIDATION_REDACTION_FIXTURES], +) +def test_send_keys_batch_schema_validation_redacts_inputs( + test_id: str, + arguments: dict[str, t.Any], + secret: str, + expected_fragments: tuple[str, ...], + caplog: pytest.LogCaptureFixture, + monkeypatch: pytest.MonkeyPatch, +) -> None: + """Malformed batch schema errors do not echo raw key payloads.""" + from fastmcp import Client + + from libtmux_mcp.server import build_mcp_server + + assert test_id + for logger_name in ("fastmcp", "fastmcp.server.server", "fastmcp.errors"): + monkeypatch.setattr(logging.getLogger(logger_name), "propagate", True) + + async def _call() -> t.Any: + async with Client(build_mcp_server()) as client: + return await client.call_tool( + "send_keys_batch", + arguments, + raise_on_error=False, + ) + + with ( + caplog.at_level(logging.WARNING, logger="fastmcp.server.server"), + caplog.at_level(logging.WARNING, logger="fastmcp.errors"), + ): + result = asyncio.run(_call()) + + assert 
result.is_error is True + result_text = result.content[0].text + logs_text = "\n".join( + record.getMessage() + for record in caplog.records + if record.name in {"fastmcp.server.server", "fastmcp.errors"} + ) + + assert secret not in result_text + assert secret not in logs_text + assert "input_value" not in result_text + assert "'input':" not in logs_text + for fragment in expected_fragments: + assert fragment in result_text or fragment in logs_text + + class _RecordingCallNext: """Minimal async callable that records invocation and returns a value.""" diff --git a/tests/test_pane_tools.py b/tests/test_pane_tools.py index 8d340e3..8a565c3 100644 --- a/tests/test_pane_tools.py +++ b/tests/test_pane_tools.py @@ -5,9 +5,11 @@ import contextlib import pathlib import shlex +import subprocess import time import typing as t +import pydantic import pytest from fastmcp.exceptions import ToolError from libtmux import exc as libtmux_exc @@ -19,6 +21,7 @@ PaneContentMatch, PaneSnapshot, SearchPanesResult, + SendKeysOperation, WaitForTextResult, ) from libtmux_mcp.tools.pane_tools import ( @@ -79,6 +82,14 @@ class RunCommandPaneTargetFixture(t.NamedTuple): expected_output: str +class SendKeysOperationValidationFixture(t.NamedTuple): + """Test fixture for send_keys_batch operation validation.""" + + test_id: str + payload: dict[str, object] + expected_field: str + + class RunCommandHistoryFixture(t.NamedTuple): """Test fixture for run_command shell history suppression.""" @@ -123,6 +134,88 @@ class RunCommandHistoryFixture(t.NamedTuple): ] +SEND_KEYS_OPERATION_VALIDATION_FIXTURES: list[SendKeysOperationValidationFixture] = [ + SendKeysOperationValidationFixture( + test_id="unknown_pane_alias", + payload={"keys": "printf SECRET", "pane": "%2"}, + expected_field="pane", + ), + SendKeysOperationValidationFixture( + test_id="misspelled_pane_id", + payload={"keys": "printf SECRET", "pan_id": "%2"}, + expected_field="pan_id", + ), +] + + +class 
SendKeysBatchSuggestionFixture(t.NamedTuple): + """Test fixture for send_keys_batch suggestion preservation.""" + + test_id: str + operations: list[SendKeysOperation] + expected_error_snippet: str + + +SEND_KEYS_BATCH_SUGGESTION_FIXTURES: list[SendKeysBatchSuggestionFixture] = [ + SendKeysBatchSuggestionFixture( + test_id="missing_pane_id", + operations=[SendKeysOperation(keys="echo", pane_id="%invalid_pane")], + expected_error_snippet="Call list_panes to discover valid pane ids.", + ), +] + + +class SendKeysBatchTimeoutFixture(t.NamedTuple): + """Test fixture for send_keys_batch timeout.""" + + test_id: str + operations: list[dict[str, t.Any]] + timeout: float + expected_succeeded: int + expected_failed: int + expected_error_snippet: str + + +SEND_KEYS_BATCH_TIMEOUT_FIXTURES: list[SendKeysBatchTimeoutFixture] = [ + SendKeysBatchTimeoutFixture( + test_id="timeout_second_operation", + operations=[ + {"keys": "echo 1"}, + {"keys": "echo 2"}, + ], + timeout=0.05, + expected_succeeded=1, + expected_failed=1, + expected_error_snippet="timeout", + ), +] + + +class SendKeysBatchInProgressTimeoutFixture(t.NamedTuple): + """Test fixture for send_keys_batch in-progress send timeout.""" + + test_id: str + timeout: float + blocked_seconds: float + expected_succeeded: int + expected_failed: int + expected_error_snippet: str + + +SEND_KEYS_BATCH_IN_PROGRESS_TIMEOUT_FIXTURES: list[ + SendKeysBatchInProgressTimeoutFixture +] = [ + SendKeysBatchInProgressTimeoutFixture( + test_id="single_operation_stalls", + timeout=0.05, + blocked_seconds=0.1, + expected_succeeded=0, + expected_failed=1, + expected_error_snippet="timeout", + ), +] + + def test_send_keys(mcp_server: Server, mcp_pane: Pane) -> None: """send_keys sends keys to a pane.""" result = send_keys( @@ -133,20 +226,274 @@ def test_send_keys(mcp_server: Server, mcp_pane: Pane) -> None: assert "sent" in result.lower() -def test_send_keys_docstring_cross_links_wait_for_channel() -> None: - """``send_keys`` docstring steers 
agents at ``wait_for_channel`` first. +def test_send_keys_batch_sends_operations_in_order( + mcp_server: Server, mcp_pane: Pane +) -> None: + """send_keys_batch sends ordered raw-input operations and reports each one.""" + import asyncio + + from libtmux_mcp.models import SendKeysBatchResult, SendKeysOperation + from libtmux_mcp.tools.pane_tools import send_keys_batch + from libtmux_mcp.tools.wait_for_tools import wait_for_channel + + channel = "mcp_test_send_keys_batch_order" + result = send_keys_batch( + operations=[ + SendKeysOperation( + keys="printf 'BATCH_FIRST\\n'", + pane_id=mcp_pane.pane_id, + ), + SendKeysOperation( + keys=f"printf 'BATCH_SECOND\\n'; tmux wait-for -S {channel}", + pane_id=mcp_pane.pane_id, + ), + ], + socket_name=mcp_server.socket_name, + ) + + assert isinstance(result, SendKeysBatchResult) + assert result.succeeded == 2 + assert result.failed == 0 + assert result.stopped_at is None + assert [item.index for item in result.results] == [0, 1] + assert all(item.success for item in result.results) + assert all(item.pane_id == mcp_pane.pane_id for item in result.results) + + asyncio.run( + wait_for_channel(channel, timeout=5.0, socket_name=mcp_server.socket_name) + ) + capture = "\n".join(mcp_pane.capture_pane()) + assert capture.index("BATCH_FIRST") < capture.index("BATCH_SECOND") - Agents read tool descriptions when picking a synchronization primitive. - After the baseline-anchor design landed, ``send_keys`` → - ``wait_for_text`` can race for fast commands (the baseline locks after - the keys are buffered), and the channel pattern is strictly cheaper - for command completion. The docstring must therefore mention both - ``wait_for_channel`` and ``run_and_wait`` so the agent can find the - safe pattern without a separate docs lookup. 
- """ + +def test_send_keys_batch_continues_after_operation_error( + mcp_server: Server, mcp_pane: Pane +) -> None: + """send_keys_batch can keep later operations after a target failure.""" + import asyncio + + from libtmux_mcp.models import SendKeysOperation + from libtmux_mcp.tools.pane_tools import send_keys_batch + from libtmux_mcp.tools.wait_for_tools import wait_for_channel + + channel = "mcp_test_send_keys_batch_continue" + result = send_keys_batch( + operations=[ + SendKeysOperation( + keys="printf 'BATCH_BEFORE\\n'", + pane_id=mcp_pane.pane_id, + ), + SendKeysOperation(keys="printf 'BATCH_MISSING\\n'", pane_id="%999999"), + SendKeysOperation( + keys=f"printf 'BATCH_AFTER\\n'; tmux wait-for -S {channel}", + pane_id=mcp_pane.pane_id, + ), + ], + on_error="continue", + socket_name=mcp_server.socket_name, + ) + + assert result.succeeded == 2 + assert result.failed == 1 + assert result.stopped_at is None + assert [item.success for item in result.results] == [True, False, True] + assert result.results[1].pane_id is None + assert "Pane not found" in (result.results[1].error or "") + + asyncio.run( + wait_for_channel(channel, timeout=5.0, socket_name=mcp_server.socket_name) + ) + capture = "\n".join(mcp_pane.capture_pane()) + assert "BATCH_BEFORE" in capture + assert "BATCH_AFTER" in capture + assert "BATCH_MISSING" not in capture + + +def test_send_keys_batch_stops_after_operation_error( + mcp_server: Server, mcp_pane: Pane +) -> None: + """send_keys_batch defaults to stop-on-error without raising.""" + from libtmux_mcp.models import SendKeysOperation + from libtmux_mcp.tools.pane_tools import send_keys_batch + + result = send_keys_batch( + operations=[ + SendKeysOperation( + keys="printf 'BATCH_STOP_BEFORE\\n'", + pane_id=mcp_pane.pane_id, + ), + SendKeysOperation(keys="printf 'BATCH_STOP_MISSING\\n'", pane_id="%999999"), + SendKeysOperation( + keys="printf 'BATCH_STOP_AFTER\\n'", + pane_id=mcp_pane.pane_id, + ), + ], + socket_name=mcp_server.socket_name, + ) + 
+    assert result.succeeded == 1
+    assert result.failed == 1
+    assert result.stopped_at == 1
+    assert len(result.results) == 2
+    assert [item.success for item in result.results] == [True, False]
+    capture = "\n".join(mcp_pane.capture_pane())
+    assert "BATCH_STOP_AFTER" not in capture
+
+
+def test_send_keys_batch_rejects_empty_operations(mcp_server: Server) -> None:
+    """send_keys_batch requires at least one operation."""
+    from libtmux_mcp.tools.pane_tools import send_keys_batch
+
+    with pytest.raises(ToolError, match="operations must not be empty"):
+        send_keys_batch(operations=[], socket_name=mcp_server.socket_name)
+
+
+@pytest.mark.parametrize(
+    SendKeysBatchSuggestionFixture._fields,
+    SEND_KEYS_BATCH_SUGGESTION_FIXTURES,
+    ids=[fixture.test_id for fixture in SEND_KEYS_BATCH_SUGGESTION_FIXTURES],
+)
+def test_send_keys_batch_preserves_error_suggestions(
+    test_id: str,
+    operations: list[SendKeysOperation],
+    expected_error_snippet: str,
+    mcp_server: Server,
+) -> None:
+    """send_keys_batch preserves exception suggestions in the error string."""
+    assert test_id
+    from libtmux_mcp.tools.pane_tools import send_keys_batch
+
+    result = send_keys_batch(
+        operations=operations,
+        socket_name=mcp_server.socket_name,
+    )
+    assert len(result.results) == 1
+    error_msg = result.results[0].error
+    assert error_msg is not None
+    assert expected_error_snippet in error_msg
+
+
+@pytest.mark.parametrize(
+    SendKeysBatchTimeoutFixture._fields,
+    SEND_KEYS_BATCH_TIMEOUT_FIXTURES,
+    ids=[fixture.test_id for fixture in SEND_KEYS_BATCH_TIMEOUT_FIXTURES],
+)
+def test_send_keys_batch_timeout(
+    test_id: str,
+    operations: list[dict[str, t.Any]],
+    timeout: float,
+    expected_succeeded: int,
+    expected_failed: int,
+    expected_error_snippet: str,
+    mcp_server: Server,
+    mcp_pane: Pane,
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
+    """send_keys_batch aborts if execution exceeds timeout."""
+    assert test_id
+    from libtmux_mcp.models import SendKeysOperation
+    from libtmux_mcp.tools.pane_tools import send_keys_batch
+
+    call_count = 0
+
+    def timed_send_keys(
+        *args: t.Any, **kwargs: t.Any
+    ) -> subprocess.CompletedProcess[str]:
+        nonlocal call_count
+        call_count += 1
+        if call_count == 3:
+            raise subprocess.TimeoutExpired(cmd="tmux", timeout=timeout)
+        return subprocess.CompletedProcess(args=["tmux"], returncode=0)
+
+    monkeypatch.setattr(
+        "libtmux_mcp.tools.pane_tools.io.subprocess.run",
+        timed_send_keys,
+    )
+
+    op_models = []
+    for op in operations:
+        op["pane_id"] = mcp_pane.pane_id
+        op_models.append(SendKeysOperation(**op))
+
+    result = send_keys_batch(
+        operations=op_models,
+        timeout=timeout,
+        socket_name=mcp_server.socket_name,
+    )
+    assert result.succeeded == expected_succeeded
+    assert result.failed == expected_failed
+    assert expected_error_snippet in (result.results[-1].error or "").lower()
+
+
+@pytest.mark.parametrize(
+    SendKeysBatchInProgressTimeoutFixture._fields,
+    SEND_KEYS_BATCH_IN_PROGRESS_TIMEOUT_FIXTURES,
+    ids=[fixture.test_id for fixture in SEND_KEYS_BATCH_IN_PROGRESS_TIMEOUT_FIXTURES],
+)
+def test_send_keys_batch_timeout_bounds_in_progress_send(
+    test_id: str,
+    timeout: float,
+    blocked_seconds: float,
+    expected_succeeded: int,
+    expected_failed: int,
+    expected_error_snippet: str,
+    mcp_server: Server,
+    mcp_pane: Pane,
+    monkeypatch: pytest.MonkeyPatch,
+) -> None:
+    """send_keys_batch fails a send that blocks past the batch timeout."""
+    assert test_id
+    from libtmux import Pane
+
+    from libtmux_mcp.models import SendKeysOperation
+    from libtmux_mcp.tools.pane_tools import send_keys_batch
+
+    def stalled_send_keys(*args: t.Any, **kwargs: t.Any) -> None:
+        time.sleep(blocked_seconds)
+
+    def timed_out_run(*args: t.Any, **kwargs: t.Any) -> t.NoReturn:
+        raise subprocess.TimeoutExpired(cmd="tmux", timeout=timeout)
+
+    monkeypatch.setattr(Pane, "send_keys", stalled_send_keys)
+    monkeypatch.setattr("libtmux_mcp.tools.pane_tools.io.subprocess.run", timed_out_run)
+
+    result = send_keys_batch(
+        operations=[
+            SendKeysOperation(keys="echo stalled", pane_id=mcp_pane.pane_id),
+        ],
+        timeout=timeout,
+        socket_name=mcp_server.socket_name,
+    )
+    assert result.succeeded == expected_succeeded
+    assert result.failed == expected_failed
+    assert expected_error_snippet in (result.results[0].error or "").lower()
+
+
+@pytest.mark.parametrize(
+    SendKeysOperationValidationFixture._fields,
+    SEND_KEYS_OPERATION_VALIDATION_FIXTURES,
+    ids=[fixture.test_id for fixture in SEND_KEYS_OPERATION_VALIDATION_FIXTURES],
+)
+def test_send_keys_operation_rejects_unknown_fields(
+    test_id: str,
+    payload: dict[str, object],
+    expected_field: str,
+) -> None:
+    """send_keys_batch operation validation rejects unsupported fields."""
+    assert test_id
+    with pytest.raises(pydantic.ValidationError) as excinfo:
+        SendKeysOperation.model_validate(payload)
+
+    assert expected_field in str(excinfo.value)
+
+
+def test_send_keys_docstring_routes_authored_commands_to_run_command() -> None:
+    """``send_keys`` docstring keeps raw input below command completion."""
     assert send_keys.__doc__ is not None
+    assert "run_command" in send_keys.__doc__
+    assert "send_keys_batch" in send_keys.__doc__
+    assert "capture_since" in send_keys.__doc__
     assert "wait_for_channel" in send_keys.__doc__
-    assert "run_and_wait" in send_keys.__doc__
 
 
 @pytest.mark.parametrize(
@@ -4203,6 +4550,7 @@ def test_paste_text_does_not_leak_named_buffer(
         # Shell-driving tools: the command the caller sends can reach
         # arbitrary external state, so the interaction is open-world.
("send_keys", True), + ("send_keys_batch", True), ("run_command", True), ("paste_text", True), ("pipe_pane", True), diff --git a/tests/test_prompts.py b/tests/test_prompts.py index 389d37c..04ee5c7 100644 --- a/tests/test_prompts.py +++ b/tests/test_prompts.py @@ -50,83 +50,40 @@ def test_prompts_as_tools_enabled_by_env( def test_run_and_wait_returns_string_template() -> None: - """``run_and_wait`` prompt produces a string with the safe idiom. - - The rendered payload must NOT contain ``exit`` in the shell command - portion: an interactive-shell ``exit`` after the signal kills the - parent shell, which destroys single-pane tmux sessions. The signal - fires unconditionally via shell ``;`` semantics whether the command - succeeds or fails — the wait-for primitive doesn't need an exit to - preserve safety. - """ + """``run_and_wait`` prompt teaches the typed command primitive.""" from libtmux_mcp.prompts.recipes import run_and_wait text = run_and_wait(command="pytest", pane_id="%1", timeout=30.0) - assert "tmux wait-for -S libtmux_mcp_wait_" in text - assert "wait_for_channel" in text - # The shell payload (between `keys=` and the closing quote) must - # not append ``exit`` after the wait-for — that would kill the - # parent shell in an interactive pane. Check the rendered keys= - # line for the absence of the exit suffix. - keys_line = next( - line for line in text.splitlines() if line.strip().startswith("keys=") - ) - assert "; exit" not in keys_line - assert "exit $" not in keys_line - - -def test_run_and_wait_channel_is_uuid_scoped() -> None: - """Each ``run_and_wait`` call embeds a unique wait-for channel. - - Regression guard for the critical bug where every call hardcoded - ``mcp_done``, so concurrent agents racing on tmux's server-global - channel namespace would cross-signal each other. 
Now the channel - is ``libtmux_mcp_wait_`` (full 128-bit UUID, fresh per - invocation) and consistent within one invocation — the name that - appears in the ``send_keys`` payload must match the - ``wait_for_channel`` call. - """ - import re - + assert "run_command" in text + assert "exit_status" in text + assert "timed_out" in text + assert "output" in text + assert "wait_for_channel" not in text + assert "tmux wait-for -S" not in text + assert "send_keys(" not in text + assert "capture_pane(" not in text + + +def test_run_and_wait_does_not_render_manual_channel_recipe() -> None: + """``run_and_wait`` leaves channel plumbing to ``run_command``.""" from libtmux_mcp.prompts.recipes import run_and_wait - first = run_and_wait(command="pytest", pane_id="%1") - second = run_and_wait(command="pytest", pane_id="%1") - - pattern = re.compile(r"libtmux_mcp_wait_[0-9a-f]{32}") - first_matches = pattern.findall(first) - second_matches = pattern.findall(second) - - # Two occurrences per rendering: one inside send_keys, one in - # wait_for_channel. Both must be the SAME channel name within a - # single rendering (consistency). - assert len(first_matches) == 2 - assert first_matches[0] == first_matches[1] - assert len(second_matches) == 2 - assert second_matches[0] == second_matches[1] - - # And the two renderings must differ from each other (uniqueness). - assert first_matches[0] != second_matches[0] + text = run_and_wait(command="pytest", pane_id="%1") + assert "libtmux_mcp_wait_" not in text + assert "result = run_command(" in text def test_run_and_wait_handles_quoted_commands() -> None: - """Single quotes in the command don't corrupt the rendered keys=... - - Regression guard for the fragile ``keys='{command}; ...'`` wrap — - a command like ``python -c 'print(1)'`` closed the surrounding - single-quote prematurely, producing a syntactically invalid - ``send_keys`` call in the prompt output. The fix uses ``repr()`` - so Python picks a quote style that round-trips safely. 
- """ + """Single quotes in the command don't corrupt the rendered call.""" import ast from libtmux_mcp.prompts.recipes import run_and_wait text = run_and_wait(command="python -c 'print(1)'", pane_id="%1") - # Extract the ``keys=`` argument as a Python literal and confirm + # Extract the ``command=`` argument as a Python literal and confirm # it parses back to a string containing the original command. - keys_line = next(line for line in text.splitlines() if "keys=" in line) - _, _, payload = keys_line.partition("keys=") + command_line = next(line for line in text.splitlines() if "command=" in line) + _, _, payload = command_line.partition("command=") payload = payload.rstrip(",").strip() parsed = ast.literal_eval(payload) assert isinstance(parsed, str) diff --git a/tests/test_server.py b/tests/test_server.py index 55d89d0..023e406 100644 --- a/tests/test_server.py +++ b/tests/test_server.py @@ -183,6 +183,7 @@ async def main(): print(json.dumps({ "list_sessions": "list_sessions" in names, "send_keys": "send_keys" in names, + "send_keys_batch": "send_keys_batch" in names, "kill_pane": "kill_pane" in names, })) @@ -202,6 +203,7 @@ async def main(): assert result == { "list_sessions": True, "send_keys": False, + "send_keys_batch": False, "kill_pane": False, } @@ -247,21 +249,17 @@ def test_base_instructions_surface_flagship_read_tools() -> None: assert "capture_since" in _BASE_INSTRUCTIONS -def test_base_instructions_prefer_wait_over_poll() -> None: - """_BASE_INSTRUCTIONS names the wait family with the right primacy. - - ``wait_for_channel`` is the deterministic primitive (composes - ``tmux wait-for -S``) and should appear first; ``wait_for_text`` - and ``wait_for_content_change`` are the fallbacks for output the - agent doesn't author. Making the channel primitive discoverable - from the instructions steers agents off the polling-scraper path - for command-completion synchronization. 
- """ +def test_base_instructions_prefer_typed_completion_over_polling() -> None: + """_BASE_INSTRUCTIONS names typed completion and observation primitives.""" assert "run_command" in _BASE_INSTRUCTIONS assert "wait_for_channel" in _BASE_INSTRUCTIONS assert "capture_since" in _BASE_INSTRUCTIONS assert "wait_for_text" in _BASE_INSTRUCTIONS assert "wait_for_content_change" in _BASE_INSTRUCTIONS + assert "send_keys_batch" in _BASE_INSTRUCTIONS + assert _BASE_INSTRUCTIONS.index("run_command") < _BASE_INSTRUCTIONS.index( + "wait_for_channel" + ) # The channel primitive should be named before the fallbacks so an # agent that scans top-to-bottom encounters the cheaper option first. assert _BASE_INSTRUCTIONS.index("wait_for_channel") < _BASE_INSTRUCTIONS.index( @@ -618,6 +616,7 @@ def test_readonly_hint_visible_only_on_readonly_tier( _VERBS_OF_ART = frozenset( [ "send_keys", + "send_keys_batch", "capture_pane", "capture_since", "snapshot_pane",