diff --git a/CHANGES b/CHANGES index 587a7aba..d292e116 100644 --- a/CHANGES +++ b/CHANGES @@ -6,6 +6,38 @@ _Notes on upcoming releases will be added here_ +### What's new + +**One-call command completion with {tooliconl}`run-command`** + +{tooliconl}`run-command` runs a shell command in a pane, waits for it to finish, and returns the exit status, timeout state, and tail-preserved output in a single call — no manual {tooliconl}`send-keys` + {tooliconl}`wait-for-channel` + {tooliconl}`capture-pane` sequence. The command runs in a subshell so its state changes don't leak into later calls, and `suppress_history` keeps secret-bearing commands out of shell history where the shell ignores space-prefixed input. (#73) + +**Richer, typed pane and window metadata** + +{tooliconl}`snapshot-pane` now reports `pane_pid`, `pane_dead`, and `alternate_on` for liveness and alternate-screen decisions, and window results carry `active_pane_id` for reliable follow-up targeting. The package also ships a `py.typed` marker so downstream type checkers see its inline annotations. (#75) + +### Fixes + +**Read-only tools no longer evaluate tmux `#()` format jobs** + +{tooliconl}`search-panes` and {tooliconl}`display-message` are advertised as read-only, but tmux `#(...)` formats schedule shell jobs during expansion. Both now reject or route around `#()` so a read-only call can never spawn a shell. (#68, #69) + +**Invalid `LIBTMUX_SAFETY` fails closed** + +An unrecognized `LIBTMUX_SAFETY` value now falls back to `readonly` instead of `mutating`, so a typo in the safety tier can no longer expose write tools the operator meant to hide. (#71) + +**Large structured results keep their structured payload** + +The global response backstop was truncating big successful results into text-only responses before tool-level caps ran, dropping the structured metadata schema-bearing tools depend on. It now matches FastMCP's 1 MB default, leaving per-tool line caps to handle terminal truncation. 
(#70) + +**{tooliconl}`clear-pane` clears scrollback reliably** + +{tooliconl}`clear-pane` now uses libtmux's single-call reset path; the previous two-call sequence could leave scrollback intact. Its annotations also disclose that it is destructive and non-idempotent. (#74) + +**Stdio transport pinned at startup** + +The server runs with an explicit stdio transport so an inherited FastMCP transport environment can't change its startup surface and break stdio clients, and `--help` / `--version` resolve locally without starting the server. (#72) + ## libtmux-mcp 0.1.0a11 (2026-06-06) libtmux-mcp 0.1.0a11 redesigns how tool failures reach agents. Error messages now arrive exactly as raised — no more `Internal error:` mangling — with structured detail and recovery hints that tell agents what to do next, from stale pane ids to stray arguments leaked by client schedulers. Expected, agent-correctable failures log at WARNING so ERROR records always mean an operator should look. The fastmcp floor rises to 3.4.0 to build on its error-result and log-level support. 
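The fail-closed handling of `LIBTMUX_SAFETY` described in the Fixes list above can be sketched as a small standalone function. This is a minimal illustration, not the project's module: the names `resolve_safety_level`, `DEFAULT_LEVEL`, and `FALLBACK_LEVEL` are hypothetical, and the tier names are assumed from the README's list of accepted values (`readonly`, `mutating`, `destructive`).

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Assumed tier names, taken from the README's list of LIBTMUX_SAFETY values.
VALID_SAFETY_LEVELS = frozenset({"readonly", "mutating", "destructive"})
DEFAULT_LEVEL = "mutating"   # hypothetical name: tier used when the variable is unset
FALLBACK_LEVEL = "readonly"  # hypothetical name: tier used when the value is unrecognized


def resolve_safety_level(value: Optional[str]) -> str:
    """Return the effective tier for a LIBTMUX_SAFETY value (illustrative sketch)."""
    if value is None:
        # Unset variable: keep the normal default tier.
        return DEFAULT_LEVEL
    if value in VALID_SAFETY_LEVELS:
        return value
    # Unrecognized value (for example a typo): fail closed to the most
    # restrictive tier instead of silently exposing write tools.
    logger.warning(
        "invalid LIBTMUX_SAFETY=%r, falling back to %s", value, FALLBACK_LEVEL
    )
    return FALLBACK_LEVEL
```

The design point is that an unset variable and an invalid variable are treated differently: only the unset case keeps the default `mutating` tier, while a misspelled value such as `mutatign` resolves to `readonly`.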
diff --git a/README.md b/README.md index 077477ab..ab7611d6 100644 --- a/README.md +++ b/README.md @@ -15,12 +15,14 @@ Give your AI agent hands inside the terminal — create sessions, run commands, | Module | Tools | |--------|-------| -| **Server** | `list_sessions`, `create_session`, `kill_server`, `get_server_info` | +| **Server** | `list_servers`, `list_sessions`, `create_session`, `kill_server`, `get_server_info` | | **Session** | `list_windows`, `get_session_info`, `create_window`, `rename_session`, `select_window`, `kill_session` | | **Window** | `list_panes`, `get_window_info`, `split_window`, `rename_window`, `select_layout`, `resize_window`, `move_window`, `kill_window` | -| **Pane** | `send_keys`, `paste_text`, `capture_pane`, `capture_since`, `snapshot_pane`, `search_panes`, `get_pane_info`, `wait_for_text`, `wait_for_content_change`, `display_message`, `select_pane`, `swap_pane`, `resize_pane`, `set_pane_title`, `clear_pane`, `pipe_pane`, `enter_copy_mode`, `exit_copy_mode`, `respawn_pane`, `kill_pane` | +| **Pane** | `run_command`, `send_keys`, `paste_text`, `capture_pane`, `capture_since`, `snapshot_pane`, `search_panes`, `find_pane_by_position`, `get_pane_info`, `wait_for_text`, `wait_for_content_change`, `wait_for_channel`, `signal_channel`, `display_message`, `select_pane`, `swap_pane`, `resize_pane`, `set_pane_title`, `clear_pane`, `pipe_pane`, `enter_copy_mode`, `exit_copy_mode`, `respawn_pane`, `kill_pane` | | **Options** | `show_option`, `set_option` | | **Environment** | `show_environment`, `set_environment` | +| **Buffers** | `load_buffer`, `paste_buffer`, `show_buffer`, `delete_buffer` | +| **Hooks** | `show_hooks`, `show_hook` | ## Quickstart @@ -108,7 +110,7 @@ re-sending the same scrollback to the model on every check. and declines self-destructive operations — [`kill_session`](https://libtmux-mcp.git-pull.com/tools/session/kill-session/) on itself fails loudly instead of silently terminating the host environment the agent is running in. 
[`LIBTMUX_SAFETY`](https://libtmux-mcp.git-pull.com/configuration/#envvar-LIBTMUX_SAFETY) -(`read`, `read+send`, `read+send+kill`) hides whole tiers from the +(`readonly`, `mutating`, `destructive`) hides whole tiers from the client's tool list before any prompt is built. ## Documentation diff --git a/docs/prompts.md b/docs/prompts.md index 40453e43..f8c16584 100644 --- a/docs/prompts.md +++ b/docs/prompts.md @@ -16,7 +16,8 @@ counterpart to the longer narrative recipes in {doc}`/recipes`. :::{grid-item-card} `run_and_wait` :link: fastmcp-prompt-run-and-wait :link-type: ref -Execute a shell command and block until it finishes, preserving exit status. +Execute a shell command and block until it finishes. Use +{tooliconl}`run-command` when exit status matters. ::: :::{grid-item-card} `diagnose_failing_pane` @@ -53,7 +54,7 @@ tools instead. ``` **Use when** the agent needs to execute a single shell command and -must know whether it succeeded before deciding the next step. +wait for completion through an explicit tmux signal. **Why use this instead of `send_keys` + `capture_pane` polling?** Each rendered call embeds a UUID-scoped ``tmux wait-for`` channel, @@ -84,18 +85,15 @@ After the channel signals, read the last ~100 lines to verify the command's behaviour. Do NOT use a `capture_pane` retry loop — `wait_for_channel` is strictly cheaper in agent turns. -The payload does not preserve the command's exit status: doing so -in an interactive shell would require exiting the shell (which kills -the pane) or routing through an out-of-band file or tmux variable. -If you need the status, inspect the captured output for -command-specific success markers. +The payload does not preserve the command's exit status. Use +{tooliconl}`run-command` instead when exit status must be returned as +structured data. ```` Shell ``;`` semantics fire the ``wait-for -S`` whether ``pytest`` succeeded or failed, so the edge-triggered signal never deadlocks the agent on a crashed command. 
Status preservation is intentionally -omitted: chaining ``exit $status`` after the signal would exit the -interactive shell itself, destroying single-pane sessions. +omitted from this prompt recipe. --- diff --git a/docs/tools/index.md b/docs/tools/index.md index 007fec29..07e7b02d 100644 --- a/docs/tools/index.md +++ b/docs/tools/index.md @@ -21,6 +21,7 @@ All tools accept an optional `socket_name` parameter for multi-server support. I - Already know the `pane_id` → use it directly **Running a command?** +- {tool}`run-command` — one call to run a shell command, wait for completion, capture output, and return exit status - {tool}`send-keys` (with `tmux wait-for -S ` composed into the keys) → {tool}`wait-for-channel` → {tool}`capture-pane` — the deterministic path for commands the agent authors - For output the agent does not author (third-party logs, daemon prompts), use {tool}`wait-for-text` or {tool}`wait-for-content-change` between `send-keys` and `capture-pane` - Pasting multi-line text? → {tool}`paste-text` @@ -223,6 +224,12 @@ Split a window into panes. Send commands or keystrokes to a pane. ::: +:::{grid-item-card} run_command +:link: run-command +:link-type: ref +Run a shell command and report exit status. +::: + :::{grid-item-card} rename_session :link: rename-session :link-type: ref diff --git a/docs/tools/pane/clear-pane.md b/docs/tools/pane/clear-pane.md index fc8df08e..fe8c080d 100644 --- a/docs/tools/pane/clear-pane.md +++ b/docs/tools/pane/clear-pane.md @@ -5,7 +5,7 @@ **Use when** you want a clean terminal before capturing output. -**Side effects:** Clears the pane's visible content. +**Side effects:** Clears the pane's visible content and scrollback. **Example:** diff --git a/docs/tools/pane/index.md b/docs/tools/pane/index.md index 03a06c00..eb4a929d 100644 --- a/docs/tools/pane/index.md +++ b/docs/tools/pane/index.md @@ -37,6 +37,10 @@ Evaluate a tmux format string against a target. Send keystrokes or commands to a pane. 
::: +:::{grid-item-card} {tooliconl}`run-command` +Run a shell command, wait, and capture output. +::: + :::{grid-item-card} {tooliconl}`paste-text` Paste multi-line text via tmux buffer. ::: @@ -111,6 +115,7 @@ get-pane-info find-pane-by-position display-message send-keys +run-command paste-text pipe-pane select-pane diff --git a/docs/tools/pane/run-command.md b/docs/tools/pane/run-command.md new file mode 100644 index 00000000..721dd6f3 --- /dev/null +++ b/docs/tools/pane/run-command.md @@ -0,0 +1,47 @@ +# Run command + +```{fastmcp-tool} pane_tools.run_command +``` + +**Use when** you need to run a shell command in a pane and get a typed +result with exit status, timeout state, and captured pane output. + +**Avoid when** you need raw interactive key driving — use +{tooliconl}`send-keys` for TUIs, key names, and partial commands. + +**Side effects:** Sends a command to the pane's interactive shell. The +command may read or write files, start processes, or access the network +depending on what the shell command does. Each command runs in a subshell, +so directory or environment changes do not persist across calls. +Set `suppress_history=true` for secret-bearing commands on shells that +honor leading-space history suppression. 
+ +**Example:** + +```json +{ + "tool": "run_command", + "arguments": { + "command": "pytest -q", + "pane_id": "%2", + "timeout": 60 + } +} +``` + +Response: + +```json +{ + "pane_id": "%2", + "exit_status": 0, + "timed_out": false, + "elapsed_seconds": 4.2, + "output": ["..."], + "output_truncated": false, + "output_truncated_lines": 0 +} +``` + +```{fastmcp-tool-input} pane_tools.run_command +``` diff --git a/docs/tools/pane/snapshot-pane.md b/docs/tools/pane/snapshot-pane.md index 3659e820..84a383a0 100644 --- a/docs/tools/pane/snapshot-pane.md +++ b/docs/tools/pane/snapshot-pane.md @@ -43,6 +43,9 @@ Response: "pane_at_top": true, "pane_at_bottom": true, "pane_tty": "/dev/pts/5", + "pane_pid": "12345", + "pane_dead": false, + "alternate_on": false, "pane_in_mode": false, "pane_mode": null, "scroll_position": null, diff --git a/docs/tools/session/create-window.md b/docs/tools/session/create-window.md index 3ae3eb09..b3954532 100644 --- a/docs/tools/session/create-window.md +++ b/docs/tools/session/create-window.md @@ -32,7 +32,8 @@ Response: "window_layout": "b25f,80x24,0,0,5", "window_active": "1", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%5" } ``` diff --git a/docs/tools/session/list-windows.md b/docs/tools/session/list-windows.md index 0b3fb3ee..c4b99e31 100644 --- a/docs/tools/session/list-windows.md +++ b/docs/tools/session/list-windows.md @@ -35,7 +35,8 @@ Response: "window_layout": "c195,80x24,0,0[80x12,0,0,0,80x11,0,13,1]", "window_active": "1", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%0" }, { "window_id": "@1", @@ -47,7 +48,8 @@ Response: "window_layout": "b25f,80x24,0,0,2", "window_active": "0", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%2" } ] ``` diff --git a/docs/tools/session/select-window.md b/docs/tools/session/select-window.md index a6b6119c..da635a20 100644 --- 
a/docs/tools/session/select-window.md +++ b/docs/tools/session/select-window.md @@ -33,7 +33,8 @@ Response: "window_layout": "b25f,80x24,0,0,2", "window_active": "1", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%2" } ``` diff --git a/docs/tools/window/get-window-info.md b/docs/tools/window/get-window-info.md index 1357a391..c6ce085f 100644 --- a/docs/tools/window/get-window-info.md +++ b/docs/tools/window/get-window-info.md @@ -36,7 +36,8 @@ Response: "window_layout": "7f9f,80x24,0,0[80x15,0,0,0,80x8,0,16,1]", "window_active": "1", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%1" } ``` diff --git a/docs/tools/window/move-window.md b/docs/tools/window/move-window.md index 699a7567..58397e02 100644 --- a/docs/tools/window/move-window.md +++ b/docs/tools/window/move-window.md @@ -33,7 +33,8 @@ Response: "window_layout": "b25f,80x24,0,0,2", "window_active": "0", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%2" } ``` diff --git a/docs/tools/window/rename-window.md b/docs/tools/window/rename-window.md index 7b9bd4bd..9ad6ec0d 100644 --- a/docs/tools/window/rename-window.md +++ b/docs/tools/window/rename-window.md @@ -32,7 +32,8 @@ Response: "window_layout": "7f9f,80x24,0,0[80x15,0,0,0,80x8,0,16,1]", "window_active": "1", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%0" } ``` diff --git a/docs/tools/window/resize-window.md b/docs/tools/window/resize-window.md index 94c23bab..a1942759 100644 --- a/docs/tools/window/resize-window.md +++ b/docs/tools/window/resize-window.md @@ -33,7 +33,8 @@ Response: "window_layout": "baaa,120x40,0,0[120x20,0,0,0,120x19,0,21,1]", "window_active": "1", "window_width": "120", - "window_height": "40" + "window_height": "40", + "active_pane_id": "%0" } ``` diff --git a/docs/tools/window/select-layout.md b/docs/tools/window/select-layout.md index 
c866db07..22844c75 100644 --- a/docs/tools/window/select-layout.md +++ b/docs/tools/window/select-layout.md @@ -33,7 +33,8 @@ Response: "window_layout": "even-vertical,80x24,0,0[80x12,0,0,0,80x11,0,13,1]", "window_active": "1", "window_width": "80", - "window_height": "24" + "window_height": "24", + "active_pane_id": "%0" } ``` diff --git a/docs/topics/architecture.md b/docs/topics/architecture.md index fa1d5c6c..aa87a0f2 100644 --- a/docs/topics/architecture.md +++ b/docs/topics/architecture.md @@ -15,10 +15,12 @@ src/libtmux_mcp/ models.py # Pydantic output models middleware.py # Safety, audit, retry, and error-result middleware tools/ - server_tools.py # list_sessions, create_session, kill_server, get_server_info + server_tools.py # list_servers, list_sessions, create_session, kill_server, get_server_info session_tools.py # list_windows, create_window, rename_session, kill_session window_tools.py # list_panes, split_window, rename_window, kill_window, select_layout, resize_window - pane_tools.py # send_keys, capture_pane, capture_since, resize_pane, kill_pane, set_pane_title, get_pane_info, clear_pane, search_panes, wait_for_text + pane_tools.py # run_command, send_keys, capture_pane, capture_since, snapshot_pane, search_panes, wait_for_text + buffer_tools.py # load_buffer, paste_buffer, show_buffer, delete_buffer + hook_tools.py # show_hooks, show_hook option_tools.py # show_option, set_option env_tools.py # show_environment, set_environment resources/ diff --git a/docs/topics/completion.md b/docs/topics/completion.md index 44ce631f..c4b44bf2 100644 --- a/docs/topics/completion.md +++ b/docs/topics/completion.md @@ -2,11 +2,10 @@ # Completion -libtmux-mcp inherits FastMCP's built-in +The [MCP completion](https://modelcontextprotocol.io/specification/2025-11-25/server/utilities/completion) -behaviour. We don't hand-author completion providers — the argument -shapes on our prompts and resource templates are what the client -sees. 
+protocol lets clients ask a server for argument suggestions. libtmux-mcp +does not currently register custom completion handlers. ## What the spec does @@ -19,22 +18,16 @@ session picker popup when filling ``session_name=`` on ## What libtmux-mcp currently exposes - **Prompt arguments** — the four recipes ({doc}`/prompts`) - advertise their argument names and types. FastMCP derives a default - completion shape from the Python signatures: - ``str`` arguments accept free text, ``float`` arguments accept - numeric strings, no enum / list suggestions. + advertise their argument names and types through their schemas. - **Resource template parameters** — {doc}`/resources` URIs carry ``{session_name}``, ``{window_index}``, ``{pane_id}``, and ``{?socket_name}`` - placeholders. Completion suggestions are again derived from the - function signatures' types, not from live tmux state. + placeholders. ```{warning} -libtmux-mcp does **not** currently wire completion back to live -tmux enumeration — i.e. the completion for ``session_name`` will not -return the names of sessions that exist on the server right now. -Adding that requires a dedicated FastMCP completion handler; -tracked as a potential enhancement. +Clients should not rely on ``completion/complete`` returning live tmux +suggestions, schema-derived examples, or enum-like values today. +Adding live suggestions requires dedicated completion handlers. ``` ## Workarounds for clients that need live enumeration diff --git a/docs/topics/gotchas.md b/docs/topics/gotchas.md index 99158f52..edab3b9f 100644 --- a/docs/topics/gotchas.md +++ b/docs/topics/gotchas.md @@ -64,13 +64,11 @@ However, they reset when the tmux **server** restarts. Do not cache pane IDs acr ## `suppress_history` requires shell support -The `suppress_history` parameter on `send_keys` prepends a space before the command, which prevents it from being saved in shell history. 
This only works if the shell's `HISTCONTROL` variable includes `ignorespace` (the default for bash, but not universal across all shells). - -## `clear_pane` is not fully atomic - -`clear_pane` runs two tmux commands in sequence: `send-keys -R` (reset terminal) then `clear-history` (clear scrollback). There is a brief gap between them where partial content may be visible. - -For most use cases this is not a problem. If you need guaranteed clean state, add a small delay before the next `capture_pane`. +The `suppress_history` parameter on {tooliconl}`send-keys` and +{tooliconl}`run-command` prepends a space before the command, which prevents it +from being saved in shell history. This only works if the shell's `HISTCONTROL` +variable includes `ignorespace` (the default for bash, but not universal across +all shells). ## Gemini CLI injects `wait_for_previous` into tool arguments diff --git a/docs/topics/pagination.md b/docs/topics/pagination.md index 10ad5e4b..31520479 100644 --- a/docs/topics/pagination.md +++ b/docs/topics/pagination.md @@ -2,23 +2,23 @@ # Pagination -libtmux-mcp follows the -[MCP pagination spec](https://modelcontextprotocol.io/specification/2025-11-25/server/utilities/pagination): -``tools/list``, ``prompts/list``, ``resources/list``, and -``resources/templates/list`` all return an opaque ``nextCursor`` when -a page is truncated, and accept ``cursor`` on the next call to -resume. +The +[MCP pagination spec](https://modelcontextprotocol.io/specification/2025-11-25/server/utilities/pagination) +defines opaque cursors for list-style protocol calls. FastMCP supports +that protocol pagination when a server is configured with +``list_page_size``. libtmux-mcp does not currently configure +protocol-level list pagination, so its registry lists normally return +as one page under FastMCP's defaults. 
## Where cursors and pages show up ### Protocol-level list calls -FastMCP handles ``tools/list`` / ``prompts/list`` / ``resources/list`` -/ ``resources/templates/list`` pagination automatically. Neither -libtmux-mcp nor the agent needs to do anything: the server chooses -a sensible page size, encodes the cursor in an opaque base64 blob, -and replays state from it. Callers only need to thread through -``nextCursor`` if they consume the raw MCP protocol. +``tools/list`` / ``prompts/list`` / ``resources/list`` / +``resources/templates/list`` are registry-list calls. In this server's +current configuration, clients should expect those lists to arrive in +one response unless libtmux-mcp later enables FastMCP's +``list_page_size`` setting. ### Tool-level result paging on ``search_panes`` @@ -61,8 +61,7 @@ matches. ## Why separate paths Protocol-level cursors are for **collections the server owns -end-to-end**: the tool / prompt / resource registries. The server -knows what it has, so an opaque cursor is cheap. +end-to-end**: the tool / prompt / resource registries. Tool-level paging and observation cursors are for **state derived from live tmux panes**. Capturing every pane's contents and running a diff --git a/docs/topics/prompting.md b/docs/topics/prompting.md index c7450a8f..bcd81eb6 100644 --- a/docs/topics/prompting.md +++ b/docs/topics/prompting.md @@ -94,12 +94,13 @@ When executing long-running commands (servers, builds, test suites), use tmux via the libtmux MCP server rather than running them directly. This keeps output accessible for later inspection. -For command completion, compose `tmux wait-for -S ` into the -shell command and call wait_for_channel — deterministic, no polling. -Use wait_for_text or wait_for_content_change for observation flows -(third-party logs, daemon prompts), and use capture_since when you -need to read the same pane repeatedly. Never capture_pane immediately -after send_keys — the command may still be running. 
+For authored shell commands that need status, use run_command. For +custom command completion, compose `tmux wait-for -S ` into +the shell command and call wait_for_channel — deterministic, no +polling. Use wait_for_text or wait_for_content_change for observation +flows (third-party logs, daemon prompts), and use capture_since when +you need to read the same pane repeatedly. Never capture_pane +immediately after send_keys — the command may still be running. ``` ### For safe agent behavior diff --git a/docs/topics/safety.md b/docs/topics/safety.md index df003702..2590bf04 100644 --- a/docs/topics/safety.md +++ b/docs/topics/safety.md @@ -149,7 +149,7 @@ Each tool carries MCP tool annotations that hint at its behavior: | {ref}`resize-pane` | {badge}`mutating` | false | false | true | | {ref}`resize-window` | {badge}`mutating` | false | false | true | | {ref}`set-pane-title` | {badge}`mutating` | false | false | true | -| {ref}`clear-pane` | {badge}`mutating` | false | false | true | +| {ref}`clear-pane` | {badge}`mutating` | false | true | false | | {ref}`select-layout` | {badge}`mutating` | false | false | true | | {ref}`set-option` | {badge}`mutating` | false | false | true | | {ref}`set-environment` | {badge}`mutating` | false | false | true | diff --git a/src/libtmux_mcp/__init__.py b/src/libtmux_mcp/__init__.py index effd7104..2ef7622e 100644 --- a/src/libtmux_mcp/__init__.py +++ b/src/libtmux_mcp/__init__.py @@ -2,18 +2,36 @@ from __future__ import annotations +import argparse +import sys +import typing as t + from .__about__ import __version__ __all__ = ["__version__"] -def main() -> None: +def _build_parser() -> argparse.ArgumentParser: + """Build the local command-line parser.""" + parser = argparse.ArgumentParser( + prog="libtmux-mcp", + description="Run the libtmux MCP server over stdio.", + ) + parser.add_argument( + "--version", + action="version", + version=f"libtmux-mcp {__version__}", + ) + return parser + + +def main(argv: t.Sequence[str] | None = 
None) -> None: """Entry point for the libtmux MCP server.""" + _build_parser().parse_args(argv) + try: from libtmux_mcp.server import run_server except ImportError: - import sys - print( "libtmux-mcp requires fastmcp. Install with: pip install libtmux-mcp", file=sys.stderr, diff --git a/src/libtmux_mcp/_utils.py b/src/libtmux_mcp/_utils.py index 8c1bfea9..f6a239b5 100644 --- a/src/libtmux_mcp/_utils.py +++ b/src/libtmux_mcp/_utils.py @@ -364,11 +364,12 @@ def _caller_is_strictly_on_server( "openWorldHint": False, } #: Annotations for tools that move user-supplied payloads into a shell -#: context. Five consumers today: +#: context. Six consumers today: #: -#: * ``send_keys``, ``paste_text``, ``pipe_pane`` — the canonical -#: shell-driving tools; caller's keys/text/stream reaches the shell -#: prompt or pipes into an external command respectively. +#: * ``send_keys``, ``run_command``, ``paste_text``, ``pipe_pane`` — the +#: canonical shell-driving tools; caller's keys/command/text/stream +#: reaches the shell prompt or pipes into an external command +#: respectively. #: * ``load_buffer``, ``paste_buffer`` — ``load_buffer`` stages content #: into a tmux paste buffer; ``paste_buffer`` pushes that content #: into a target pane where the shell receives it as input. The two @@ -410,13 +411,10 @@ def _caller_is_strictly_on_server( #: visible to default-profile agents) but whose default behaviour can #: terminate processes or otherwise lose state. #: -#: ``respawn_pane`` is the canonical user: tier=mutating because shell -#: recovery is part of the normal agent workflow; ``destructiveHint=True`` -#: because ``kill=True`` (the default) sends ``SPAWN_KILL`` to the existing -#: process (`cmd-respawn-pane.c:78-79`); ``idempotentHint=False`` because -#: repeated calls kill repeated processes — the MCP spec defines idempotent -#: as "calling repeatedly with the same arguments will have no additional -#: effect" (`mcp/types.py:1276-1282`). 
+#: Canonical users include ``respawn_pane`` and ``clear_pane``: +#: tier=mutating because shell recovery and scrollback cleanup are part +#: of normal agent workflows, while the hints still disclose process +#: termination or state loss. #: #: Distinct from :data:`ANNOTATIONS_DESTRUCTIVE` (same hint values) because #: the tier tag differs: ``ANNOTATIONS_DESTRUCTIVE`` is paired with @@ -894,6 +892,9 @@ def _serialize_window(window: Window) -> WindowInfo: from libtmux_mcp.models import WindowInfo assert window.window_id is not None + active_pane = getattr(window, "active_pane", None) + active_pane_id = active_pane.pane_id if active_pane is not None else None + return WindowInfo( window_id=window.window_id, window_name=window.window_name, @@ -905,6 +906,7 @@ def _serialize_window(window: Window) -> WindowInfo: window_active=getattr(window, "window_active", None), window_width=getattr(window, "window_width", None), window_height=getattr(window, "window_height", None), + active_pane_id=active_pane_id, ) diff --git a/src/libtmux_mcp/middleware.py b/src/libtmux_mcp/middleware.py index dfc751e7..c7490502 100644 --- a/src/libtmux_mcp/middleware.py +++ b/src/libtmux_mcp/middleware.py @@ -68,7 +68,7 @@ class SafetyMiddleware(Middleware): """ def __init__(self, max_tier: str = TAG_MUTATING) -> None: - self.max_level = _TIER_LEVELS.get(max_tier, 1) + self.max_level = _TIER_LEVELS.get(max_tier, 0) def _is_allowed(self, tags: set[str]) -> bool: """Return True if the tool's tags fall within the allowed tier. @@ -348,9 +348,10 @@ async def on_call_tool( # --------------------------------------------------------------------------- #: Argument names that carry user-supplied payloads we never want in logs. -#: ``keys`` (send_keys), ``text`` (paste_text), ``value`` (set_environment), -#: ``content`` (load_buffer), ``shell`` (respawn_pane), and ``environment`` -#: (respawn_pane) can contain commands, secrets, or arbitrary large strings. 
+#: ``keys`` (send_keys), ``text`` (paste_text), ``command`` (run_command), +#: ``value`` (set_environment), ``content`` (load_buffer), ``shell`` +#: (respawn_pane), and ``environment`` (respawn_pane) can contain commands, +#: secrets, or arbitrary large strings. #: Matched by exact name, case-sensitive, to mirror the tool signatures. #: #: ``environment`` is dict-shaped (``dict[str, str]``); the redaction logic @@ -365,7 +366,7 @@ async def on_call_tool( #: via the OS process table and tmux's ``pane_current_command`` metadata #: until the spawned shell takes over — see ``docs/topics/safety.md``. _SENSITIVE_ARG_NAMES: frozenset[str] = frozenset( - {"keys", "text", "value", "content", "shell", "environment"} + {"keys", "text", "command", "value", "content", "shell", "environment"} ) #: String arguments longer than this get truncated in the log summary to @@ -506,11 +507,10 @@ async def on_call_tool( # --------------------------------------------------------------------------- #: Default byte ceiling for :class:`TailPreservingResponseLimitingMiddleware`. -#: Chosen strictly above the per-tool ``max_lines`` caps (500 lines x -#: ~100 bytes/line) so normal operation does not trip the middleware — -#: it only fires when a tool forgot to declare its own cap or the user -#: opted out via ``max_lines=None``. -DEFAULT_RESPONSE_LIMIT_BYTES = 50_000 +#: Matches FastMCP's stock 1 MB default so normal schema-bearing tool +#: responses stay below this global backstop. Tool-level caps remain +#: responsible for terminal-specific truncation metadata. 
+DEFAULT_RESPONSE_LIMIT_BYTES = 1_000_000 class ReadonlyRetryMiddleware(Middleware): diff --git a/src/libtmux_mcp/models.py b/src/libtmux_mcp/models.py index 2dd3846e..cd0853db 100644 --- a/src/libtmux_mcp/models.py +++ b/src/libtmux_mcp/models.py @@ -44,6 +44,10 @@ class WindowInfo(BaseModel): ) window_width: str | None = Field(default=None, description="Width in columns") window_height: str | None = Field(default=None, description="Height in rows") + active_pane_id: str | None = Field( + default=None, + description="Pane id (``%N``) of the window's active pane.", + ) class PaneInfo(BaseModel): @@ -274,6 +278,30 @@ class CaptureSinceResult(BaseModel): ) +class RunCommandResult(BaseModel): + """Result of running a shell command in a pane.""" + + pane_id: str = Field(description="Pane ID that received the command") + exit_status: int | None = Field( + default=None, + description="Shell exit status, or None when the command timed out", + ) + timed_out: bool = Field(description="True when the wait timed out") + elapsed_seconds: float = Field(description="Time spent waiting in seconds") + output: list[str] = Field( + default_factory=list, + description="Tail-preserved pane output after the wait completes", + ) + output_truncated: bool = Field( + default=False, + description="True when output was tail-preserved to stay within max_lines", + ) + output_truncated_lines: int = Field( + default=0, + description="Number of pane lines dropped from the head when truncating", + ) + + class PaneSnapshot(BaseModel): """Rich screen capture with metadata: content, cursor, mode, and scroll state.""" @@ -322,6 +350,15 @@ class PaneSnapshot(BaseModel): default=None, description="TTY device path of the pane (e.g. 
'/dev/pts/5').", ) + pane_pid: str | None = Field(default=None, description="Process ID") + pane_dead: bool | None = Field( + default=None, + description="True when tmux reports the pane process has exited.", + ) + alternate_on: bool | None = Field( + default=None, + description="True when the pane is using the alternate screen.", + ) pane_in_mode: bool = Field(description="True if pane is in copy-mode or view-mode") pane_mode: str | None = Field( default=None, description="Mode name (e.g. 'copy-mode') or None if normal" diff --git a/src/libtmux_mcp/prompts/recipes.py b/src/libtmux_mcp/prompts/recipes.py index c0a651a7..e3aa1a44 100644 --- a/src/libtmux_mcp/prompts/recipes.py +++ b/src/libtmux_mcp/prompts/recipes.py @@ -58,11 +58,9 @@ def run_and_wait( command's behaviour. Do NOT use a `capture_pane` retry loop — `wait_for_channel` is strictly cheaper in agent turns. -The payload does not preserve the command's exit status: doing so -in an interactive shell would require exiting the shell (which kills -the pane) or routing through an out-of-band file or tmux variable. -If you need the status, inspect the captured output for -command-specific success markers. +The payload does not preserve the command's exit status. Use +`run_command` instead when exit status must be returned as structured +data. """ diff --git a/src/libtmux_mcp/py.typed b/src/libtmux_mcp/py.typed new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/src/libtmux_mcp/py.typed @@ -0,0 +1 @@ + diff --git a/src/libtmux_mcp/server.py b/src/libtmux_mcp/server.py index 6c4a81e6..4898dbca 100644 --- a/src/libtmux_mcp/server.py +++ b/src/libtmux_mcp/server.py @@ -94,9 +94,10 @@ ) _INSTR_WAIT_NOT_POLL = ( - "WAIT, DON'T POLL: prefer wait_for_channel (compose `tmux wait-for -S`) " - "for command completion; capture_since for repeated observation. " - "Else wait_for_text/wait_for_content_change for output you don't author." 
+ "WAIT, DON'T POLL: run_command for authored shell commands needing " + "status; wait_for_channel for custom tmux wait-for; capture_since " + "for tailing; wait_for_text/wait_for_content_change for output you " + "don't author." ) #: Gap-explainer: write-hook tools are intentionally absent. See module @@ -151,7 +152,7 @@ def _build_instructions(safety_level: str = TAG_MUTATING) -> str: # Safety tier context parts.append( f"\n\nSafety level: {safety_level} " - "(readonly: read; mutating: read+send; destructive: read+send+kill). " + "(values: readonly, mutating, destructive). " "Set LIBTMUX_SAFETY; off-tier tools are hidden." ) @@ -187,14 +188,21 @@ def _build_instructions(safety_level: str = TAG_MUTATING) -> str: return "".join(parts) -_safety_level = os.environ.get("LIBTMUX_SAFETY", TAG_MUTATING) -if _safety_level not in VALID_SAFETY_LEVELS: +def _resolve_safety_level(value: str | None) -> str: + """Return the effective safety level for a ``LIBTMUX_SAFETY`` value.""" + if value is None: + return TAG_MUTATING + if value in VALID_SAFETY_LEVELS: + return value logger.warning( "invalid LIBTMUX_SAFETY=%r, falling back to %s", - _safety_level, - TAG_MUTATING, + value, + TAG_READONLY, ) - _safety_level = TAG_MUTATING + return TAG_READONLY + + +_safety_level = _resolve_safety_level(os.environ.get("LIBTMUX_SAFETY")) #: Tools covered by the tail-preserving response limiter. 
Only tools #: whose output is terminal scrollback benefit from this backstop; @@ -368,4 +376,4 @@ def build_mcp_server() -> FastMCP: def run_server() -> None: """Run the MCP server.""" server = build_mcp_server() - server.run() + server.run(transport="stdio") diff --git a/src/libtmux_mcp/tools/pane_tools/__init__.py b/src/libtmux_mcp/tools/pane_tools/__init__.py index 65269b92..9c890fb4 100644 --- a/src/libtmux_mcp/tools/pane_tools/__init__.py +++ b/src/libtmux_mcp/tools/pane_tools/__init__.py @@ -29,6 +29,7 @@ capture_pane, clear_pane, paste_text, + run_command, send_keys, ) from libtmux_mcp.tools.pane_tools.layout import ( @@ -69,6 +70,7 @@ "register", "resize_pane", "respawn_pane", + "run_command", "search_panes", "select_pane", "send_keys", @@ -85,6 +87,9 @@ def register(mcp: FastMCP) -> None: mcp.tool(title="Send Keys", annotations=ANNOTATIONS_SHELL, tags={TAG_MUTATING})( send_keys ) + mcp.tool(title="Run Command", annotations=ANNOTATIONS_SHELL, tags={TAG_MUTATING})( + run_command + ) mcp.tool(title="Capture Pane", annotations=ANNOTATIONS_RO, tags={TAG_READONLY})( capture_pane ) @@ -115,9 +120,11 @@ def register(mcp: FastMCP) -> None: annotations=ANNOTATIONS_RO, tags={TAG_READONLY}, )(find_pane_by_position) - mcp.tool(title="Clear Pane", annotations=ANNOTATIONS_MUTATING, tags={TAG_MUTATING})( - clear_pane - ) + mcp.tool( + title="Clear Pane", + annotations=ANNOTATIONS_MUTATING_DESTRUCTIVE, + tags={TAG_MUTATING}, + )(clear_pane) mcp.tool(title="Search Panes", annotations=ANNOTATIONS_RO, tags={TAG_READONLY})( search_panes ) diff --git a/src/libtmux_mcp/tools/pane_tools/io.py b/src/libtmux_mcp/tools/pane_tools/io.py index 65eccd60..6cb3b4f4 100644 --- a/src/libtmux_mcp/tools/pane_tools/io.py +++ b/src/libtmux_mcp/tools/pane_tools/io.py @@ -2,10 +2,14 @@ from __future__ import annotations +import asyncio import contextlib import pathlib +import re +import shlex import subprocess import tempfile +import time import uuid from libtmux_mcp._utils import ( @@ -14,7 
+18,9 @@ _resolve_pane, _tmux_argv, handle_tool_errors, + handle_tool_errors_async, ) +from libtmux_mcp.models import RunCommandResult @handle_tool_errors @@ -66,8 +72,8 @@ def send_keys( literal : bool Whether to send keys literally (no tmux interpretation). Default False. suppress_history : bool - Whether to suppress shell history by prepending a space. - Only works in shells that support HISTCONTROL. Default False. + Suppress shell history by prepending a space; only effective where + the shell ignores space-prefixed commands. Default False. socket_name : str, optional tmux socket name. @@ -93,6 +99,146 @@ def send_keys( return f"Keys sent to pane {pane.pane_id}" +@handle_tool_errors_async +async def run_command( + command: str, + pane_id: str | None = None, + session_name: str | None = None, + session_id: str | None = None, + window_id: str | None = None, + timeout: float = 30.0, + max_lines: int | None = None, + suppress_history: bool = False, + socket_name: str | None = None, +) -> RunCommandResult: + """Run a shell command in a pane, wait for completion, and capture output. + + Use for the common terminal workflow: run this command, wait until it + completes, then report whether it succeeded. The command is sent to + the pane's interactive shell, followed by a private ``tmux wait-for`` + signal and a private pane option carrying the shell exit status. + + The command runs in a subshell, so ``cd``, ``export`` and other shell + state changes do not persist to later calls. + + Parameters + ---------- + command : str + Shell command to run in the target pane. + pane_id : str, optional + Pane ID (e.g. '%1'). + session_name : str, optional + Session name for pane resolution. + session_id : str, optional + Session ID (e.g. '$1') for pane resolution. + window_id : str, optional + Window ID for pane resolution. + timeout : float + Maximum seconds to wait for command completion. + max_lines : int or None + Maximum pane output lines to return. 
Defaults to all captured + visible output; pass a small value for a tail-only summary. + suppress_history : bool + Suppress shell history by prepending a space; only effective where + the shell ignores space-prefixed commands. Default False. + socket_name : str, optional + tmux socket name. + + Returns + ------- + RunCommandResult + Typed command result with exit status, timeout state, and + tail-preserved pane output. + """ + if not command.strip(): + msg = "command must not be empty" + raise ExpectedToolError(msg) + if timeout <= 0: + msg = "timeout must be positive" + raise ExpectedToolError(msg) + + server = _get_server(socket_name=socket_name) + pane = _resolve_pane( + server, + pane_id=pane_id, + session_name=session_name, + session_id=session_id, + window_id=window_id, + ) + command_id = uuid.uuid4().hex[:10] + channel = f"r_{command_id}" + status_option = f"@s_{command_id}" + target_pane_id = pane.pane_id + if target_pane_id is None: + msg = "resolved pane has no pane_id" + raise ExpectedToolError(msg) + status_cmd = shlex.join( + _tmux_argv(server, "set-option", "-p", "-t", target_pane_id, status_option) + ) + signal_cmd = shlex.join(_tmux_argv(server, "wait-for", "-S", channel)) + history_prefix = " " if suppress_history else "" + payload = "\n".join( + ( + f"{history_prefix}(", + command.rstrip(), + (f'); s=$?; {status_cmd} "$s"; {signal_cmd}'), + ) + ) + + started = time.monotonic() + await asyncio.to_thread(pane.send_keys, payload, enter=True, literal=True) + + timed_out = False + wait_argv = _tmux_argv(server, "wait-for", channel) + try: + await asyncio.to_thread( + subprocess.run, + wait_argv, + check=True, + capture_output=True, + timeout=timeout, + ) + except subprocess.TimeoutExpired: + timed_out = True + except subprocess.CalledProcessError as e: + stderr = e.stderr.decode(errors="replace").strip() if e.stderr else "" + msg = f"wait-for failed for run_command channel {channel!r}: {stderr or e}" + raise ExpectedToolError(msg) from e + + elapsed = 
time.monotonic() - started + exit_status: int | None = None + if not timed_out: + status = pane.cmd("show-option", "-p", "-v", status_option).stdout + status_text = status[0].strip() if status else "" + try: + exit_status = int(status_text) + except ValueError as e: + msg = f"run_command could not read exit status from {status_option!r}" + raise ExpectedToolError(msg) from e + with contextlib.suppress(Exception): + pane.cmd("set-option", "-p", "-u", status_option) + + # join_wrapped keeps the per-call markers on one logical row so the + # filter's exact-marker match survives a wide prompt; it also strips + # sync fragments that still wrap across rows. + raw_lines = await asyncio.to_thread(pane.capture_pane, join_wrapped=True) + visible_lines = _filter_run_command_internal_lines( + raw_lines, + channel=channel, + status_option=status_option, + ) + kept_lines, truncated, dropped = _truncate_lines_tail(visible_lines, max_lines) + return RunCommandResult( + pane_id=target_pane_id, + exit_status=exit_status, + timed_out=timed_out, + elapsed_seconds=elapsed, + output=kept_lines, + output_truncated=truncated, + output_truncated_lines=dropped, + ) + + #: Default line cap applied to :func:`capture_pane` and similar scrollback #: readers. Large enough to cover typical prompt + a few screens of output, #: small enough that a pathological pane (e.g. 50K lines of ``tail -f``) @@ -139,6 +285,74 @@ def _truncate_lines_tail( return lines[-max_lines:], True, dropped +def _filter_run_command_internal_lines( + lines: list[str], channel: str, status_option: str +) -> list[str]: + """Drop private run_command synchronization rows from captured output. + + The current call is matched by exact channel/status markers. Older + wrapped fragments are matched by private wrapper shape so prior + scrollback does not leak into output. + """ + shell_arg = r"(?:'[^']*'|\S+)" + tmux_prefix = rf"(?:\S*/)?tmux(?:\s+-[LS]\s+{shell_arg})*\s+" + target_pane_arg = rf"(?:\s+-t\s+{shell_arg})?" 
+ status_line_re = re.compile( + r"(?:__libtmux_mcp_status|s)=\$\?;\s*" + + tmux_prefix + + r"set-option -p" + + target_pane_arg + + r"\s+" + + r"(?P<prefix>@libtmux_mcp_status_|@s_)" + + r"(?P<id>[0-9a-fA-F]+)(?![0-9A-Za-z_])" + ) + wait_line_re = re.compile( + r'[0-9a-fA-F]*\s*"\$(?:__libtmux_mcp_status|s)";\s*' + + tmux_prefix + + r"wait-for -S " + + r"(?P<prefix>libtmux_mcp_run_|r_)" + + r"(?P<id>[0-9a-fA-F]*)(?![0-9A-Za-z_])" + ) + internal_markers = (channel, status_option) + hex_chars = frozenset("0123456789abcdefABCDEF") + kept: list[str] = [] + drop_hex_continuation = False + + def expected_private_id_length(prefix: str) -> int: + return 32 if "libtmux_mcp" in prefix else 10 + + for line in lines: + stripped = line.strip() + if ( + drop_hex_continuation + and 8 <= len(stripped) <= 32 + and all(char in hex_chars for char in stripped) + ): + drop_hex_continuation = False + continue + + if any(marker in line for marker in internal_markers): + drop_hex_continuation = False + continue + + status_match = status_line_re.search(line) + wait_match = wait_line_re.search(line) + if status_match or wait_match: + drop_hex_continuation = False + for match in (status_match, wait_match): + if match is None: + continue + private_id = match.group("id") + expected_len = expected_private_id_length(match.group("prefix")) + if len(private_id) < expected_len: + drop_hex_continuation = True + continue + + drop_hex_continuation = False + kept.append(line) + return kept + + @handle_tool_errors def capture_pane( pane_id: str | None = None, @@ -218,7 +432,6 @@ def clear_pane( """Clear the contents of a tmux pane. Use before send_keys + capture_pane to get a clean capture without prior output. - Note: this is two tmux commands with a brief gap — not fully atomic. 
Parameters ---------- @@ -246,15 +459,7 @@ def clear_pane( session_id=session_id, window_id=window_id, ) - # Two separate calls — pane.reset() in libtmux 0.56.0 still sends - # `send-keys -R \; clear-history` as one call but subprocess doesn't - # interpret \; as a tmux command separator, so clear-history never - # runs. The bare `-R` send is left as a raw cmd() because - # Pane.send_keys requires a cmd string and would emit an extra - # empty key alongside the reset flag. - # See: https://github.com/tmux-python/libtmux/issues/650 - pane.cmd("send-keys", "-R") - pane.clear_history() + pane.reset() return f"Pane cleared: {pane.pane_id}" diff --git a/src/libtmux_mcp/tools/pane_tools/meta.py b/src/libtmux_mcp/tools/pane_tools/meta.py index 1ef8467d..fe147f28 100644 --- a/src/libtmux_mcp/tools/pane_tools/meta.py +++ b/src/libtmux_mcp/tools/pane_tools/meta.py @@ -3,6 +3,7 @@ from __future__ import annotations from libtmux_mcp._utils import ( + ExpectedToolError, _coerce_bool, _coerce_int, _compute_is_caller, @@ -55,6 +56,10 @@ def display_message( str Expanded format string result. 
""" + if "#(" in format_string: + msg = "tmux format jobs (#(...)) are not allowed in display_message" + raise ExpectedToolError(msg) + server = _get_server(socket_name=socket_name) pane = _resolve_pane( server, @@ -157,6 +162,9 @@ def snapshot_pane( "#{pane_at_top}", "#{pane_at_bottom}", "#{pane_tty}", + "#{pane_pid}", + "#{pane_dead}", + "#{alternate_on}", ] fmt = _SEP.join(_FMT_VARS) stdout = pane.display_message(fmt, get_text=True) @@ -196,6 +204,9 @@ def snapshot_pane( pane_at_top=_coerce_bool(parts[17]), pane_at_bottom=_coerce_bool(parts[18]), pane_tty=parts[19] if parts[19] else None, + pane_pid=parts[20] if parts[20] else None, + pane_dead=_coerce_bool(parts[21]), + alternate_on=_coerce_bool(parts[22]), is_caller=_compute_is_caller(pane), content_truncated=truncated, content_truncated_lines=dropped, diff --git a/src/libtmux_mcp/tools/pane_tools/search.py b/src/libtmux_mcp/tools/pane_tools/search.py index cab6f12d..65b0e015 100644 --- a/src/libtmux_mcp/tools/pane_tools/search.py +++ b/src/libtmux_mcp/tools/pane_tools/search.py @@ -170,15 +170,15 @@ def search_panes( # 2. tmux format-string injection — ``#{C:pattern}`` is a tmux # format block. ``}`` in the pattern closes the block early # (evaluated as truthy, matching every pane as a false - # positive); ``#{`` inside the pattern starts a nested format - # variable. tmux provides no escape mechanism for these bytes - # inside the format block, so the only safe option is to route - # around: when the raw pattern contains either sequence, fall - # through to the slow Python-regex path. This applies whether - # or not ``regex`` is True — the injection risk is tmux-side, - # not regex-side. + # positive); ``#{`` starts a nested format variable; ``#(`` + # runs a format job (shell command). 
tmux provides no escape + # mechanism for these bytes inside the format block, so the + # only safe option is to route around: when the raw pattern + # contains any of these sequences, fall through to the slow + # Python-regex path. This applies whether or not ``regex`` is + # True — the injection risk is tmux-side, not regex-side. _REGEX_META = re.compile(r"[\\.*+?{}()\[\]|^$]") - _TMUX_FORMAT_INJECTION = re.compile(r"\}|#\{") + _TMUX_FORMAT_INJECTION = re.compile(r"\}|#\{|#\(") if _TMUX_FORMAT_INJECTION.search(pattern): is_plain_text = False elif regex: diff --git a/tests/docs/test_topic_contracts.py b/tests/docs/test_topic_contracts.py new file mode 100644 index 00000000..2a7cee6d --- /dev/null +++ b/tests/docs/test_topic_contracts.py @@ -0,0 +1,48 @@ +"""Tests for docs topic claims that describe runtime contracts.""" + +from __future__ import annotations + +import pathlib +import typing as t + +import pytest + + +class TopicContractFixture(t.NamedTuple): + """Fixture for forbidden stale docs claims.""" + + test_id: str + relative_path: str + forbidden_text: str + + +TOPIC_CONTRACT_FIXTURES: list[TopicContractFixture] = [ + TopicContractFixture( + "completion_fastmcp_builtin", + "topics/completion.md", + "inherits FastMCP's built-in", + ), + TopicContractFixture( + "pagination_automatic", + "topics/pagination.md", + "pagination automatically", + ), +] + + +@pytest.mark.parametrize( + TopicContractFixture._fields, + TOPIC_CONTRACT_FIXTURES, + ids=[f.test_id for f in TOPIC_CONTRACT_FIXTURES], +) +def test_topic_docs_do_not_overclaim_runtime_features( + docs_dir: pathlib.Path, + test_id: str, + relative_path: str, + forbidden_text: str, +) -> None: + """Topic docs do not describe unsupported FastMCP runtime behavior.""" + assert test_id + text = (docs_dir / relative_path).read_text(encoding="utf-8") + + assert forbidden_text not in text diff --git a/tests/test_cli.py b/tests/test_cli.py new file mode 100644 index 00000000..4b5b0b8a --- /dev/null +++ 
b/tests/test_cli.py @@ -0,0 +1,44 @@ +"""Tests for the libtmux-mcp console entry point.""" + +from __future__ import annotations + +import typing as t + +import pytest + +from libtmux_mcp import __version__, main + + +class CliFlagFixture(t.NamedTuple): + """Test fixture for local CLI options.""" + + test_id: str + argv: list[str] + expected_stdout: str + + +CLI_FLAG_FIXTURES: list[CliFlagFixture] = [ + CliFlagFixture("help", ["--help"], "usage:"), + CliFlagFixture("version", ["--version"], __version__), +] + + +@pytest.mark.parametrize( + CliFlagFixture._fields, + CLI_FLAG_FIXTURES, + ids=[f.test_id for f in CLI_FLAG_FIXTURES], +) +def test_main_local_flags_exit_without_starting_server( + test_id: str, + argv: list[str], + expected_stdout: str, + capsys: pytest.CaptureFixture[str], +) -> None: + """Local CLI flags exit before starting the MCP server.""" + assert test_id + + with pytest.raises(SystemExit) as exc_info: + main(argv) + + assert exc_info.value.code == 0 + assert expected_stdout in capsys.readouterr().out diff --git a/tests/test_middleware.py b/tests/test_middleware.py index 9384c388..6f398ff8 100644 --- a/tests/test_middleware.py +++ b/tests/test_middleware.py @@ -124,10 +124,10 @@ def test_safety_middleware_default_tier() -> None: def test_safety_middleware_invalid_tier_falls_back() -> None: - """SafetyMiddleware falls back to mutating for unknown tiers.""" + """SafetyMiddleware falls back to readonly for unknown tiers.""" mw = SafetyMiddleware(max_tier="nonexistent") assert mw._is_allowed({TAG_READONLY}) is True - assert mw._is_allowed({TAG_MUTATING}) is True + assert mw._is_allowed({TAG_MUTATING}) is False assert mw._is_allowed({TAG_DESTRUCTIVE}) is False @@ -149,6 +149,7 @@ def test_summarize_args_redacts_sensitive_keys() -> None: args: dict[str, t.Any] = { "keys": "rm -rf /", "text": "hello world", + "command": "psql -U user -W secret123 mydb", "value": "supersecret", "content": "buffer payload", "shell": "psql -U user -W secret123 mydb", @@ 
-156,7 +157,7 @@ def test_summarize_args_redacts_sensitive_keys() -> None: "bracket": True, } summary = _summarize_args(args) - for sensitive in ("keys", "text", "value", "content", "shell"): + for sensitive in ("keys", "text", "command", "value", "content", "shell"): assert isinstance(summary[sensitive], dict) assert "len" in summary[sensitive] assert "sha256_prefix" in summary[sensitive] @@ -168,6 +169,39 @@ def test_summarize_args_redacts_sensitive_keys() -> None: assert summary["bracket"] is True +class CommandRedactionFixture(t.NamedTuple): + """Test fixture for _summarize_args redaction of run_command's command.""" + + test_id: str + command: str + + +COMMAND_REDACTION_FIXTURES: list[CommandRedactionFixture] = [ + CommandRedactionFixture( + test_id="credential_bearing", + command="psql -U admin -W supersecret mydb", + ), + CommandRedactionFixture( + test_id="plain", + command="ls -la /tmp", + ), +] + + +@pytest.mark.parametrize( + CommandRedactionFixture._fields, + COMMAND_REDACTION_FIXTURES, + ids=[f.test_id for f in COMMAND_REDACTION_FIXTURES], +) +def test_summarize_args_redacts_command(test_id: str, command: str) -> None: + """run_command's command payload is digested, not logged in cleartext.""" + summary = _summarize_args({"command": command}) + assert isinstance(summary["command"], dict) + assert "len" in summary["command"] + assert "sha256_prefix" in summary["command"] + assert command not in str(summary["command"]) + + def test_summarize_args_redacts_sensitive_dict_values() -> None: """Dict-shaped sensitive args keep keys but digest values per-entry. 
@@ -936,6 +970,21 @@ class LimiterErrorFixture(t.NamedTuple): ] +class LimiterSuccessFixture(t.NamedTuple): + """Test fixture for schema-bearing successful limiter responses.""" + + test_id: str + payload_size: int + + +LIMITER_SUCCESS_FIXTURES: list[LimiterSuccessFixture] = [ + LimiterSuccessFixture( + test_id="schema_success_above_old_backstop", + payload_size=30_000, + ), +] + + class _LimiterOut(pydantic.BaseModel): """Output model giving the ``limited_fail`` probe an output schema. @@ -947,7 +996,9 @@ class _LimiterOut(pydantic.BaseModel): value: str -def _limiter_probe_server() -> t.Any: +def _limiter_probe_server( + *, max_size: int = 300, schema_success_payload_size: int = 0 +) -> t.Any: """Build a FastMCP instance with the limiter wrapping error conversion. Mirrors the production ordering (limiter outside @@ -967,8 +1018,8 @@ def _limiter_probe_server() -> t.Any: name="limiter-probe", middleware=[ TailPreservingResponseLimitingMiddleware( - max_size=300, - tools=["limited_fail", "limited_ok"], + max_size=max_size, + tools=["limited_fail", "limited_ok", "limited_model_ok"], ), ToolErrorResultMiddleware(transform_errors=True), ], @@ -992,6 +1043,10 @@ def limited_fail(payload: str) -> _LimiterOut: def limited_ok() -> str: return "y" * 5000 + @probe.tool + def limited_model_ok() -> _LimiterOut: + return _LimiterOut(value="y" * schema_success_payload_size) + return probe @@ -1039,6 +1094,38 @@ async def _call() -> t.Any: assert meta["error_type"] == "ExpectedToolError" +@pytest.mark.parametrize( + LimiterSuccessFixture._fields, + LIMITER_SUCCESS_FIXTURES, + ids=[f.test_id for f in LIMITER_SUCCESS_FIXTURES], +) +def test_response_limiter_preserves_schema_success_below_default_backstop( + test_id: str, + payload_size: int, +) -> None: + """Schema-bearing successes below the global backstop stay structured.""" + from fastmcp import Client + + from libtmux_mcp.middleware import DEFAULT_RESPONSE_LIMIT_BYTES + + probe = _limiter_probe_server( + 
max_size=DEFAULT_RESPONSE_LIMIT_BYTES, + schema_success_payload_size=payload_size, + ) + + async def _call() -> t.Any: + async with Client(probe) as client: + return await client.call_tool( + "limited_model_ok", + raise_on_error=False, + ) + + result = asyncio.run(_call()) + + assert test_id + assert result.structured_content == {"value": "y" * payload_size} + + class UnknownArgSuggestionFixture(t.NamedTuple): """Test fixture for synthesized unexpected-argument suggestions.""" diff --git a/tests/test_package_metadata.py b/tests/test_package_metadata.py new file mode 100644 index 00000000..19fd13f1 --- /dev/null +++ b/tests/test_package_metadata.py @@ -0,0 +1,14 @@ +"""Tests for package metadata files.""" + +from __future__ import annotations + +import pathlib + +import libtmux_mcp + + +def test_package_contains_py_typed_marker() -> None: + """The installed package advertises inline typing via ``py.typed``.""" + package_dir = pathlib.Path(libtmux_mcp.__file__).parent + + assert (package_dir / "py.typed").is_file() diff --git a/tests/test_pane_tools.py b/tests/test_pane_tools.py index 2b925ed8..8d340e36 100644 --- a/tests/test_pane_tools.py +++ b/tests/test_pane_tools.py @@ -2,6 +2,10 @@ from __future__ import annotations +import contextlib +import pathlib +import shlex +import time import typing as t import pytest @@ -45,6 +49,78 @@ from libtmux.pane import Pane from libtmux.server import Server from libtmux.session import Session + from libtmux.window import Window + + +class RunCommandFixture(t.NamedTuple): + """Test fixture for run_command exit-status cases.""" + + test_id: str + command: str + expected_status: int + expected_output: str + + +class RunCommandStatusIsolationFixture(t.NamedTuple): + """Test fixture for shell-state changes before run_command's trailer.""" + + test_id: str + command: str + expected_status: int + expected_output: str | None + + +class RunCommandPaneTargetFixture(t.NamedTuple): + """Test fixture for run_command pane-targeted status 
handoff.""" + + test_id: str + command: str + expected_status: int + expected_output: str + + +class RunCommandHistoryFixture(t.NamedTuple): + """Test fixture for run_command shell history suppression.""" + + test_id: str + secret: str + + +RUN_COMMAND_FIXTURES: list[RunCommandFixture] = [ + RunCommandFixture("success", "printf 'RUN_COMMAND_OK\\n'", 0, "RUN_COMMAND_OK"), + RunCommandFixture( + "failure", + "printf 'RUN_COMMAND_FAIL\\n'; false", + 1, + "RUN_COMMAND_FAIL", + ), +] + + +RUN_COMMAND_STATUS_ISOLATION_FIXTURES: list[RunCommandStatusIsolationFixture] = [ + RunCommandStatusIsolationFixture( + "path_mutation", + "PATH=/tmp; printf 'RUN_COMMAND_PATH_OK\\n'", + 0, + "RUN_COMMAND_PATH_OK", + ), + RunCommandStatusIsolationFixture("errexit_false", "set -e; false", 1, None), +] + + +RUN_COMMAND_PANE_TARGET_FIXTURES: list[RunCommandPaneTargetFixture] = [ + RunCommandPaneTargetFixture( + "missing_tmux_pane_env_in_inactive_target", + "printf 'RUN_COMMAND_TARGET_OK\\n'", + 0, + "RUN_COMMAND_TARGET_OK", + ), +] + + +RUN_COMMAND_HISTORY_FIXTURES: list[RunCommandHistoryFixture] = [ + RunCommandHistoryFixture("bash_ignorespace", "RUN_COMMAND_HISTORY_SECRET"), +] def test_send_keys(mcp_server: Server, mcp_pane: Pane) -> None: @@ -73,6 +149,506 @@ def test_send_keys_docstring_cross_links_wait_for_channel() -> None: assert "run_and_wait" in send_keys.__doc__ +@pytest.mark.parametrize( + RunCommandFixture._fields, + RUN_COMMAND_FIXTURES, + ids=[f.test_id for f in RUN_COMMAND_FIXTURES], +) +def test_run_command_reports_exit_status( + mcp_server: Server, + mcp_pane: Pane, + test_id: str, + command: str, + expected_status: int, + expected_output: str, +) -> None: + """run_command waits for completion and reports shell exit status.""" + import asyncio + + from libtmux_mcp.models import RunCommandResult + from libtmux_mcp.tools.pane_tools import run_command + + assert test_id + + result = asyncio.run( + run_command( + command=command, + pane_id=mcp_pane.pane_id, + timeout=5.0, + 
socket_name=mcp_server.socket_name, + ) + ) + + assert isinstance(result, RunCommandResult) + assert result.pane_id == mcp_pane.pane_id + assert result.exit_status == expected_status + assert result.timed_out is False + assert any(expected_output in line for line in result.output) + + +def test_run_command_timeout_reports_without_killing_shell( + mcp_server: Server, mcp_pane: Pane +) -> None: + """run_command timeout returns while the interactive shell remains usable.""" + import asyncio + + from libtmux_mcp.tools.pane_tools import run_command + + marker = "RUN_COMMAND_TIMEOUT_FINISHED" + result = asyncio.run( + run_command( + command=f"sleep 0.5; printf '{marker}\\n'", + pane_id=mcp_pane.pane_id, + timeout=0.05, + socket_name=mcp_server.socket_name, + ) + ) + + assert result.timed_out is True + assert result.exit_status is None + + retry_until( + lambda: any(marker in line for line in mcp_pane.capture_pane()), + 2, + raises=True, + ) + + +@pytest.mark.parametrize( + RunCommandStatusIsolationFixture._fields, + RUN_COMMAND_STATUS_ISOLATION_FIXTURES, + ids=[f.test_id for f in RUN_COMMAND_STATUS_ISOLATION_FIXTURES], +) +def test_run_command_reports_status_after_shell_state_change( + mcp_server: Server, + mcp_pane: Pane, + test_id: str, + command: str, + expected_status: int, + expected_output: str | None, +) -> None: + """run_command reports status after user commands mutate shell state.""" + import asyncio + + from libtmux_mcp.tools.pane_tools import run_command + + assert test_id + result = asyncio.run( + run_command( + command=command, + pane_id=mcp_pane.pane_id, + timeout=2.0, + socket_name=mcp_server.socket_name, + ) + ) + + assert result.exit_status == expected_status + assert result.timed_out is False + if expected_output is not None: + assert any(expected_output in line for line in result.output) + + +@pytest.mark.parametrize( + RunCommandPaneTargetFixture._fields, + RUN_COMMAND_PANE_TARGET_FIXTURES, + ids=[f.test_id for f in RUN_COMMAND_PANE_TARGET_FIXTURES], 
+) +def test_run_command_status_option_targets_resolved_pane( + mcp_server: Server, + mcp_window: Window, + mcp_pane: Pane, + test_id: str, + command: str, + expected_status: int, + expected_output: str, +) -> None: + """run_command status storage targets the pane the command ran in.""" + import asyncio + + from libtmux_mcp.tools.pane_tools import run_command + + assert test_id + assert mcp_pane.pane_id is not None + target_pane = mcp_window.split(attach=False) + assert target_pane.pane_id is not None + mcp_window.select_pane(mcp_pane.pane_id) + + target_pane.send_keys("exec env -u TMUX_PANE bash --noprofile --norc", enter=True) + retry_until( + lambda: any("bash-" in line for line in target_pane.capture_pane()), + 3, + raises=True, + ) + + result = None + try: + result = asyncio.run( + run_command( + command=command, + pane_id=target_pane.pane_id, + timeout=5.0, + socket_name=mcp_server.socket_name, + ) + ) + finally: + with contextlib.suppress(libtmux_exc.LibTmuxException): + mcp_window.select_pane(mcp_pane.pane_id) + with contextlib.suppress(libtmux_exc.LibTmuxException): + target_pane.kill() + + assert result is not None + assert result.exit_status == expected_status + assert result.timed_out is False + assert any(expected_output in line for line in result.output) + + +@pytest.mark.parametrize( + RunCommandHistoryFixture._fields, + RUN_COMMAND_HISTORY_FIXTURES, + ids=[f.test_id for f in RUN_COMMAND_HISTORY_FIXTURES], +) +def test_run_command_suppress_history( + mcp_server: Server, + mcp_pane: Pane, + tmp_path: pathlib.Path, + test_id: str, + secret: str, +) -> None: + """run_command suppresses shell history for secret-bearing commands.""" + import asyncio + + from libtmux_mcp.tools.pane_tools import run_command + + assert test_id + histfile = tmp_path / "bash_history" + mcp_pane.send_keys("exec bash --noprofile --norc", enter=True) + retry_until( + lambda: any("bash-" in line for line in mcp_pane.capture_pane()), + 2, + raises=True, + ) + + setup = ( + 
f"HISTFILE={shlex.quote(str(histfile))}; " + "HISTCONTROL=ignorespace; set -o history; " + "history -c; history -w" + ) + asyncio.run( + run_command( + command=setup, + pane_id=mcp_pane.pane_id, + timeout=2.0, + suppress_history=True, + socket_name=mcp_server.socket_name, + ) + ) + asyncio.run( + run_command( + command=f"printf '{secret}\\n'", + pane_id=mcp_pane.pane_id, + timeout=2.0, + suppress_history=True, + socket_name=mcp_server.socket_name, + ) + ) + asyncio.run( + run_command( + command="history -w", + pane_id=mcp_pane.pane_id, + timeout=2.0, + suppress_history=True, + socket_name=mcp_server.socket_name, + ) + ) + + assert secret not in histfile.read_text() + + +def test_run_command_tail_preserves_output(mcp_server: Server, mcp_pane: Pane) -> None: + """run_command output is tail-preserved when max_lines is small.""" + import asyncio + + from libtmux_mcp.tools.pane_tools import run_command + + result = asyncio.run( + run_command( + command=( + "for i in $(seq 1 6); do printf 'RUN_COMMAND_TRUNC_%s\\n' \"$i\"; done" + ), + pane_id=mcp_pane.pane_id, + timeout=5.0, + max_lines=2, + socket_name=mcp_server.socket_name, + ) + ) + + assert result.output_truncated is True + assert result.output_truncated_lines > 0 + assert len(result.output) == 2 + assert any("RUN_COMMAND_TRUNC_6" in line for line in result.output) + + +def test_run_command_tail_preserves_output_with_wrapped_private_prompt( + mcp_server: Server, mcp_pane: Pane +) -> None: + """run_command keeps command output when a long shell prompt wraps internals.""" + import asyncio + + from libtmux_mcp.tools.pane_tools import run_command + + long_prompt = "runner@runnervm123456:/home/runner/work/libtmux-mcp/libtmux-mcp$ " + mcp_pane.cmd("resize-pane", "-x", "80") + mcp_pane.send_keys("exec bash --noprofile --norc", enter=True) + retry_until( + lambda: any("bash-" in line for line in mcp_pane.capture_pane()), + 2, + raises=True, + ) + mcp_pane.send_keys(f"PS1={shlex.quote(long_prompt)}", enter=True) + 
retry_until( + lambda: any(long_prompt.rstrip() in line for line in mcp_pane.capture_pane()), + 2, + raises=True, + ) + + result = asyncio.run( + run_command( + command=( + "for i in $(seq 1 6); do printf 'RUN_COMMAND_WRAP_%s\\n' \"$i\"; done" + ), + pane_id=mcp_pane.pane_id, + timeout=5.0, + max_lines=2, + socket_name=mcp_server.socket_name, + ) + ) + + assert result.output_truncated is True + assert len(result.output) == 2 + assert any("RUN_COMMAND_WRAP_6" in line for line in result.output) + + +class FilterInternalLinesFixture(t.NamedTuple): + """Fixture for legitimate output that resembles wrapper text.""" + + test_id: str + line: str + + +FILTER_KEEP_FIXTURES: list[FilterInternalLinesFixture] = [ + FilterInternalLinesFixture("mentions_mcp_status", "grep -n mcp_status app.log"), + FilterInternalLinesFixture("mentions_set_option", 'echo "tmux set-option -p x"'), + FilterInternalLinesFixture("mentions_wait_for", "run tmux wait-for -S done first"), + FilterInternalLinesFixture("mentions_prefix", "ns=libtmux_mcp_ is reserved"), +] + + +FILTER_WRAPPER_LIKE_KEEP_FIXTURES: list[FilterInternalLinesFixture] = [ + FilterInternalLinesFixture( + "tmux_script_status_option", + "s=$?; tmux set-option -p @s_myapp_status 1", + ), + FilterInternalLinesFixture( + "empty_short_status_prefix", + "s=$?; tmux set-option -p @s_ 1", + ), +] + + +class FilterDropFixture(t.NamedTuple): + """Fixture for private run_command synchronisation fragments.""" + + test_id: str + lines: list[str] + channel: str + status_option: str + + +class FilterCurrentSyncLineFixture(t.NamedTuple): + """Fixture for output after the current run_command sync line.""" + + test_id: str + output_lines: list[str] + + +_CURRENT_ID = "deadbeefdeadbeefdeadbeefdeadbeef" +_PREVIOUS_ID = "feedfacefeedfacefeedfacefeedface" +_SHORT_CURRENT_ID = "e743e5084b" +_SHORT_PREVIOUS_ID = "f00dbeef12" + + +FILTER_DROP_FIXTURES: list[FilterDropFixture] = [ + FilterDropFixture( + "current_wrapped_long_marker", + [ + "RUN_OK", + "∙ }; 
__libtmux_mcp_status=$?; tmux set-option -p " + f"@libtmux_mcp_status_{_CURRENT_ID[:10]}", + f'{_CURRENT_ID[10:]} "$__libtmux_mcp_status"; ' + "tmux wait-for -S libtmux_mcp_run_", + _CURRENT_ID, + ], + f"libtmux_mcp_run_{_CURRENT_ID}", + f"@libtmux_mcp_status_{_CURRENT_ID}", + ), + FilterDropFixture( + "previous_wrapped_long_marker", + [ + "RUN_OK", + "∙ }; __libtmux_mcp_status=$?; tmux set-option -p " + f"@libtmux_mcp_status_{_PREVIOUS_ID[:10]}", + f'{_PREVIOUS_ID[10:]} "$__libtmux_mcp_status"; ' + "tmux wait-for -S libtmux_mcp_run_", + _PREVIOUS_ID, + ], + f"libtmux_mcp_run_{_CURRENT_ID}", + f"@libtmux_mcp_status_{_CURRENT_ID}", + ), + FilterDropFixture( + "current_short_marker", + [ + "RUN_OK", + f'∙ }}; s=$?; tmux set-option -p @s_{_SHORT_CURRENT_ID} "$s"; ' + f"tmux wait-for -S r_{_SHORT_CURRENT_ID}", + ], + f"r_{_SHORT_CURRENT_ID}", + f"@s_{_SHORT_CURRENT_ID}", + ), + FilterDropFixture( + "previous_wrapped_short_marker", + [ + "RUN_OK", + f"∙ }}; s=$?; tmux set-option -p @s_{_SHORT_PREVIOUS_ID[:6]}", + f'{_SHORT_PREVIOUS_ID[6:]} "$s"; tmux wait-for -S r_{_SHORT_PREVIOUS_ID}', + ], + f"r_{_SHORT_CURRENT_ID}", + f"@s_{_SHORT_CURRENT_ID}", + ), + FilterDropFixture( + "previous_targeted_short_marker", + [ + "RUN_OK", + f"∙ }}; s=$?; tmux -L dev set-option -p -t %1 @s_{_SHORT_PREVIOUS_ID[:6]}", + ( + f'{_SHORT_PREVIOUS_ID[6:]} "$s"; ' + f"tmux -L dev wait-for -S r_{_SHORT_PREVIOUS_ID}" + ), + ], + f"r_{_SHORT_CURRENT_ID}", + f"@s_{_SHORT_CURRENT_ID}", + ), +] + + +FILTER_CURRENT_SYNC_KEEP_FIXTURES: list[FilterCurrentSyncLineFixture] = [ + FilterCurrentSyncLineFixture("single_hex_output", ["abcdef1234", "DONE"]), + FilterCurrentSyncLineFixture( + "consecutive_hex_output", + ["abcdef1234", "feedface99", "DONE"], + ), +] + + +@pytest.mark.parametrize( + FilterInternalLinesFixture._fields, + FILTER_KEEP_FIXTURES, + ids=[f.test_id for f in FILTER_KEEP_FIXTURES], +) +def test_filter_run_command_keeps_legitimate_output(test_id: str, line: str) -> None: + """Legitimate 
output without private markers survives filtering.""" + from libtmux_mcp.tools.pane_tools.io import _filter_run_command_internal_lines + + command_id = "deadbeefdeadbeefdeadbeefdeadbeef" + channel = f"libtmux_mcp_run_{command_id}" + status_option = f"@libtmux_mcp_status_{command_id}" + + assert test_id + kept = _filter_run_command_internal_lines( + [line], channel=channel, status_option=status_option + ) + assert kept == [line] + + +@pytest.mark.parametrize( + FilterInternalLinesFixture._fields, + FILTER_WRAPPER_LIKE_KEEP_FIXTURES, + ids=[f.test_id for f in FILTER_WRAPPER_LIKE_KEEP_FIXTURES], +) +def test_filter_run_command_keeps_wrapper_like_output(test_id: str, line: str) -> None: + """Legitimate tmux-looking command output survives filtering.""" + from libtmux_mcp.tools.pane_tools.io import _filter_run_command_internal_lines + + assert test_id + kept = _filter_run_command_internal_lines( + [line], + channel=f"r_{_SHORT_CURRENT_ID}", + status_option=f"@s_{_SHORT_CURRENT_ID}", + ) + assert kept == [line] + + +def test_filter_run_command_drops_sync_line() -> None: + """The joined private synchronisation line is removed from output.""" + from libtmux_mcp.tools.pane_tools.io import _filter_run_command_internal_lines + + command_id = "deadbeefdeadbeefdeadbeefdeadbeef" + channel = f"libtmux_mcp_run_{command_id}" + status_option = f"@libtmux_mcp_status_{command_id}" + sync_line = ( + f"}}; __libtmux_mcp_status=$?; tmux set-option -p {status_option} " + f'"$__libtmux_mcp_status"; tmux wait-for -S {channel}' + ) + kept = _filter_run_command_internal_lines( + ["RUN_OK", sync_line], channel=channel, status_option=status_option + ) + assert kept == ["RUN_OK"] + + +@pytest.mark.parametrize( + FilterCurrentSyncLineFixture._fields, + FILTER_CURRENT_SYNC_KEEP_FIXTURES, + ids=[f.test_id for f in FILTER_CURRENT_SYNC_KEEP_FIXTURES], +) +def test_filter_run_command_keeps_hex_output_after_current_sync_line( + test_id: str, output_lines: list[str] +) -> None: + """Hex-like output 
after the current sync line survives filtering.""" + from libtmux_mcp.tools.pane_tools.io import _filter_run_command_internal_lines + + channel = f"r_{_SHORT_CURRENT_ID}" + status_option = f"@s_{_SHORT_CURRENT_ID}" + sync_line = ( + f'); s=$?; tmux set-option -p {status_option} "$s"; tmux wait-for -S {channel}' + ) + kept = _filter_run_command_internal_lines( + [sync_line, *output_lines], + channel=channel, + status_option=status_option, + ) + assert test_id + assert kept == output_lines + + +@pytest.mark.parametrize( + FilterDropFixture._fields, + FILTER_DROP_FIXTURES, + ids=[f.test_id for f in FILTER_DROP_FIXTURES], +) +def test_filter_run_command_drops_sync_fragments( + test_id: str, lines: list[str], channel: str, status_option: str +) -> None: + """Private synchronisation fragments are removed from output.""" + from libtmux_mcp.tools.pane_tools.io import _filter_run_command_internal_lines + + kept = _filter_run_command_internal_lines( + lines, + channel=channel, + status_option=status_option, + ) + assert test_id + assert kept == ["RUN_OK"] + + def test_capture_pane(mcp_server: Server, mcp_pane: Pane) -> None: """capture_pane returns pane content.""" result = capture_pane( @@ -743,6 +1319,33 @@ def test_clear_pane(mcp_server: Server, mcp_pane: Pane) -> None: ) +def test_clear_pane_uses_libtmux_reset( + mcp_server: Server, mcp_pane: Pane, monkeypatch: pytest.MonkeyPatch +) -> None: + """clear_pane delegates to libtmux's atomic Pane.reset path. + + This test uses monkeypatch because the visible terminal state can + look identical for the old two-IPC implementation and the fixed + one-call libtmux reset; the regression is the call boundary. 
+ """ + from libtmux.pane import Pane as LibtmuxPane + + reset_calls: list[str | None] = [] + + def fake_reset(self: LibtmuxPane) -> LibtmuxPane: + reset_calls.append(self.pane_id) + return self + + monkeypatch.setattr(LibtmuxPane, "reset", fake_reset) + + clear_pane( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + + assert reset_calls == [mcp_pane.pane_id] + + def test_resize_pane_dimensions(mcp_server: Server, mcp_pane: Pane) -> None: """resize_pane resizes a pane with height/width.""" result = resize_pane( @@ -1298,35 +1901,47 @@ def test_search_panes_literal_input_skips_slow_path_probe( assert any(m.pane_id == mcp_pane.pane_id for m in result.matches) +class SearchFastPathFixture(t.NamedTuple): + """Fixture for ``search_panes`` fast-path eligibility cases.""" + + test_id: str + pattern: str + regex: bool + expected_fast_path: bool + + +SEARCH_FAST_PATH_FIXTURES: list[SearchFastPathFixture] = [ + # Literal input with regex metacharacters — the earlier bug's + # target case. Raw input is glob-safe for tmux, fast path. + SearchFastPathFixture("literal_regex_chars", "192.168.1.1", False, True), + # Literal with no metacharacters — always fast path. + SearchFastPathFixture("plain_literal", "plain_marker", False, True), + # Regex with no metacharacters — fast path still fine. + SearchFastPathFixture("plain_regex", "plain_marker", True, True), + # Regex with metacharacters — legitimately slow path. + SearchFastPathFixture("regex_group", r"err(or|no)", True, False), + # Regex dot-star — slow path. + SearchFastPathFixture("regex_dot_star", r".*", True, False), + # tmux format-injection bytes in a literal — MUST fall to slow + # path regardless of regex flag, because tmux's #{C:...} format + # block has no escape for `}` (premature close), `#{` (nested + # format-variable evaluation), or `#(` (format job execution). 
+ SearchFastPathFixture("literal_close_brace", "foo}", False, False), + SearchFastPathFixture("literal_nested_format", "log #{err}", False, False), + SearchFastPathFixture("literal_format_job", "#(printf ok)", False, False), + # Same hazards with regex=True — still slow path; tmux sees the + # raw pattern either way. + SearchFastPathFixture("regex_close_brace", "x}y", True, False), + SearchFastPathFixture("regex_nested_format", "a#{b}", True, False), +] + + @pytest.mark.parametrize( - ("pattern", "regex", "expected_fast_path"), - [ - # Literal input with regex metacharacters — the earlier bug's - # target case. Raw input is glob-safe for tmux, fast path. - ("192.168.1.1", False, True), - # Literal with no metacharacters — always fast path. - ("plain_marker", False, True), - # Regex with no metacharacters — fast path still fine. - ("plain_marker", True, True), - # Regex with metacharacters — legitimately slow path. - (r"err(or|no)", True, False), - # Regex dot-star — slow path. - (r".*", True, False), - # tmux format-injection bytes in a literal — MUST fall to slow - # path regardless of regex flag, because tmux's #{C:...} format - # block has no escape for `}` (premature close) or `#{` (nested - # format-variable evaluation). - ("foo}", False, False), - ("log #{err}", False, False), - # Same hazards with regex=True — still slow path; tmux sees the - # raw pattern either way. - ("x}y", True, False), - ("a#{b}", True, False), - ], + "fixture", + SEARCH_FAST_PATH_FIXTURES, + ids=lambda fixture: fixture.test_id, ) -def test_search_panes_fast_path_decision( - pattern: str, regex: bool, expected_fast_path: bool -) -> None: +def test_search_panes_fast_path_decision(fixture: SearchFastPathFixture) -> None: """Unit-test the ``is_plain_text`` decision on pattern + regex flag. 
Mirrors the exact expression in ``search_panes`` so a future @@ -1337,14 +1952,16 @@ def test_search_panes_fast_path_decision( """ import re as _re + test_id, pattern, regex, expected_fast_path = fixture _regex_meta = _re.compile(r"[\\.*+?{}()\[\]|^$]") - _tmux_format_injection = _re.compile(r"\}|#\{") + _tmux_format_injection = _re.compile(r"\}|#\{|#\(") if _tmux_format_injection.search(pattern): is_plain_text = False elif regex: is_plain_text = not _regex_meta.search(pattern) else: is_plain_text = True + assert test_id assert is_plain_text is expected_fast_path @@ -1425,6 +2042,25 @@ def test_search_panes_nested_format_variable_is_neutralized( assert m.pane_id == mcp_pane.pane_id +def test_search_panes_hash_paren_format_job_is_neutralized( + mcp_server: Server, mcp_session: Session, tmp_path: pathlib.Path +) -> None: + """Literal patterns containing ``#(`` do not start tmux format jobs.""" + marker = tmp_path / "search_panes_format_job_marker" + pattern = f"#(printf ok > {shlex.quote(str(marker))})" + + result = search_panes( + pattern=pattern, + regex=False, + session_name=mcp_session.session_name, + socket_name=mcp_server.socket_name, + ) + + assert isinstance(result.matches, list) + time.sleep(0.5) + assert not marker.exists() + + def test_search_panes_numeric_pane_id_ordering( mcp_server: Server, mcp_session: Session ) -> None: @@ -2897,6 +3533,21 @@ def test_snapshot_pane(mcp_server: Server, mcp_pane: Pane) -> None: assert result.content_truncated_lines == 0 +def test_snapshot_pane_returns_liveness_metadata( + mcp_server: Server, mcp_pane: Pane +) -> None: + """snapshot_pane returns process, dead-pane, and alternate-screen metadata.""" + result = snapshot_pane( + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + + assert result.pane_pid is not None + assert result.pane_pid.isdigit() + assert result.pane_dead is False + assert isinstance(result.alternate_on, bool) + + def test_snapshot_pane_truncates_content(mcp_server: Server, mcp_pane: 
Pane) -> None: """snapshot_pane reports truncation via model fields, not in-band header. @@ -3401,6 +4052,23 @@ def test_display_message_zoomed_flag(mcp_server: Server, mcp_session: Session) - assert result in ("0", "1") +def test_display_message_rejects_format_jobs( + mcp_server: Server, mcp_pane: Pane, tmp_path: pathlib.Path +) -> None: + """display_message rejects tmux format jobs before tmux evaluates them.""" + marker = tmp_path / "display_message_format_job_marker" + + with pytest.raises(ToolError, match=r"#\("): + display_message( + format_string=f"#(printf ok > {shlex.quote(str(marker))})", + pane_id=mcp_pane.pane_id, + socket_name=mcp_server.socket_name, + ) + + time.sleep(0.5) + assert not marker.exists() + + # --------------------------------------------------------------------------- # enter_copy_mode / exit_copy_mode tests # --------------------------------------------------------------------------- @@ -3535,6 +4203,7 @@ def test_paste_text_does_not_leak_named_buffer( # Shell-driving tools: the command the caller sends can reach # arbitrary external state, so the interaction is open-world. ("send_keys", True), + ("run_command", True), ("paste_text", True), ("pipe_pane", True), # Create-style tools: allocate tmux objects only. 
Not open-world @@ -3605,6 +4274,25 @@ def test_respawn_pane_advertises_destructive_non_idempotent() -> None: assert tool.annotations.readOnlyHint is False +def test_clear_pane_advertises_destructive_non_idempotent() -> None: + """``clear_pane`` registers as mutating-tier with destructive hints.""" + import asyncio + + from fastmcp import FastMCP + + from libtmux_mcp.tools import pane_tools + + mcp = FastMCP(name="test-clear-pane-annotations") + pane_tools.register(mcp) + + tool = asyncio.run(mcp.get_tool("clear_pane")) + assert tool is not None, "clear_pane should be registered" + assert tool.annotations is not None, "clear_pane should carry annotations" + assert tool.annotations.destructiveHint is True + assert tool.annotations.idempotentHint is False + assert tool.annotations.readOnlyHint is False + + # --------------------------------------------------------------------------- # Typed-output regression guard # --------------------------------------------------------------------------- diff --git a/tests/test_server.py b/tests/test_server.py index 6835ac37..55d89d05 100644 --- a/tests/test_server.py +++ b/tests/test_server.py @@ -2,6 +2,11 @@ from __future__ import annotations +import json +import os +import subprocess +import sys +import textwrap import typing as t import pytest @@ -82,6 +87,23 @@ class BuildInstructionsFixture(t.NamedTuple): ] +class SafetyLevelFixture(t.NamedTuple): + """Test fixture for server safety-level resolution.""" + + test_id: str + env_value: str | None + expected_level: str + + +SAFETY_LEVEL_FIXTURES: list[SafetyLevelFixture] = [ + SafetyLevelFixture("unset_defaults_mutating", None, TAG_MUTATING), + SafetyLevelFixture("valid_readonly", TAG_READONLY, TAG_READONLY), + SafetyLevelFixture("valid_mutating", TAG_MUTATING, TAG_MUTATING), + SafetyLevelFixture("valid_destructive", TAG_DESTRUCTIVE, TAG_DESTRUCTIVE), + SafetyLevelFixture("invalid_fails_closed", "read", TAG_READONLY), +] + + @pytest.mark.parametrize( 
BuildInstructionsFixture._fields, BUILD_INSTRUCTIONS_FIXTURES, @@ -129,6 +151,80 @@ def test_build_instructions( assert f"Safety level: {expect_safety_in_text}" in result +@pytest.mark.parametrize( + SafetyLevelFixture._fields, + SAFETY_LEVEL_FIXTURES, + ids=[f.test_id for f in SAFETY_LEVEL_FIXTURES], +) +def test_resolve_safety_level( + test_id: str, + env_value: str | None, + expected_level: str, +) -> None: + """Safety env values resolve to the server's effective tier.""" + from libtmux_mcp.server import _resolve_safety_level + + assert test_id + assert _resolve_safety_level(env_value) == expected_level + + +def test_invalid_safety_env_hides_mutating_tools() -> None: + """Invalid ``LIBTMUX_SAFETY`` values expose readonly tools only.""" + code = textwrap.dedent( + """ + import asyncio + import json + + from libtmux_mcp.server import build_mcp_server + + async def main(): + tools = await build_mcp_server().list_tools() + names = {tool.name for tool in tools} + print(json.dumps({ + "list_sessions": "list_sessions" in names, + "send_keys": "send_keys" in names, + "kill_pane": "kill_pane" in names, + })) + + asyncio.run(main()) + """ + ) + env = {**os.environ, "LIBTMUX_SAFETY": "read"} + proc = subprocess.run( + [sys.executable, "-c", code], + check=True, + capture_output=True, + env=env, + text=True, + ) + result = json.loads(proc.stdout) + + assert result == { + "list_sessions": True, + "send_keys": False, + "kill_pane": False, + } + + +def test_run_server_pins_stdio_transport(monkeypatch: pytest.MonkeyPatch) -> None: + """run_server passes an explicit stdio transport to FastMCP.""" + from libtmux_mcp import server as server_mod + + class FakeServer: + transport: str | None = None + + def run(self, *, transport: str | None = None) -> None: + self.transport = transport + + fake = FakeServer() + + monkeypatch.setattr(server_mod, "build_mcp_server", lambda: fake) + + server_mod.run_server() + + assert fake.transport == "stdio" + + def test_base_instructions_content() 
-> None: """_BASE_INSTRUCTIONS contains key guidance for the LLM.""" assert "tmux hierarchy" in _BASE_INSTRUCTIONS @@ -161,6 +257,7 @@ def test_base_instructions_prefer_wait_over_poll() -> None: from the instructions steers agents off the polling-scraper path for command-completion synchronization. """ + assert "run_command" in _BASE_INSTRUCTIONS assert "wait_for_channel" in _BASE_INSTRUCTIONS assert "capture_since" in _BASE_INSTRUCTIONS assert "wait_for_text" in _BASE_INSTRUCTIONS diff --git a/tests/test_session_tools.py b/tests/test_session_tools.py index fe3a2848..b3364fcf 100644 --- a/tests/test_session_tools.py +++ b/tests/test_session_tools.py @@ -72,6 +72,20 @@ def test_create_window(mcp_server: Server, mcp_session: Session) -> None: assert result.window_name == "mcp_test_win" +def test_create_window_returns_active_pane_id( + mcp_server: Server, mcp_session: Session +) -> None: + """create_window returns the new window's active pane id.""" + result = create_window( + session_name=mcp_session.session_name, + window_name="mcp_active_pane_id", + socket_name=mcp_server.socket_name, + ) + + assert result.active_pane_id is not None + assert result.active_pane_id.startswith("%") + + def test_create_window_invalid_direction( mcp_server: Server, mcp_session: Session ) -> None: diff --git a/tests/test_window_tools.py b/tests/test_window_tools.py index c029608a..ba893d51 100644 --- a/tests/test_window_tools.py +++ b/tests/test_window_tools.py @@ -48,6 +48,22 @@ def test_get_window_info(mcp_server: Server, mcp_session: Session) -> None: assert result.session_id == mcp_session.session_id +def test_get_window_info_returns_active_pane_id( + mcp_server: Server, mcp_session: Session +) -> None: + """get_window_info returns the window's active pane id.""" + window = mcp_session.active_window + result = get_window_info( + window_id=window.window_id, + socket_name=mcp_server.socket_name, + ) + active_pane = window.active_pane + + assert result.active_pane_id is not None + 
assert active_pane is not None + assert result.active_pane_id == active_pane.pane_id + + def test_get_window_info_by_index(mcp_server: Server, mcp_session: Session) -> None: """get_window_info resolves by window_index when session is named.""" window = mcp_session.active_window