feat(hooks): dangerous-actions PreToolUse hook for Claude Code#79
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…epresentativeness Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure-data DangerPattern catalogue (DANGEROUS_BASH/PATHS/CONTENT) with 16/8/4 entries, structure tests (4 pass), and regression-test exclusion for library modules that are not runnable hook entry points. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix git-force-push regex to match tokens between push and --force flag
- Fix rm-rf-root to also match rm -rf /* variant
- Rename sql-drop-table/sql-truncate in DANGEROUS_CONTENT to content-sql-* to avoid id collisions with DANGEROUS_BASH
- Add comment documenting secret-env basename-matching contract
- Change LIBRARY_MODULES from RegExp to Set in hooks-registered regression test
- Add describe('pattern correctness') block with 15 positive/negative match tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
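A minimal sketch of two of the fixes listed above: an `rm-rf-root` entry that also covers the `/*` variant, and a check that ids do not collide across catalogues. The `DangerPattern` shape, the regexes, and the helpers `matchBash` and `findDuplicateIds` are assumptions for illustration, not the code shipped in this PR.

```typescript
// Hypothetical shape; the real DangerPattern fields may differ.
interface DangerPattern {
  id: string;
  label: string;
  re: RegExp;
}

// Illustrative regexes only, not the ones in the PR.
const DANGEROUS_BASH: DangerPattern[] = [
  {
    id: "rm-rf-root",
    label: "recursive delete of the filesystem root",
    // Matches `rm -rf /` and `rm -rf /*`, but not `rm -rf /tmp/scratch`.
    re: /\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\s+\/\*?(?:\s|$)/,
  },
  {
    id: "sql-drop-table",
    label: "DROP TABLE passed to a SQL client",
    re: /\bdrop\s+table\b/i,
  },
];

const DANGEROUS_CONTENT: DangerPattern[] = [
  {
    // The content-* prefix keeps this id distinct from the bash entry.
    id: "content-sql-drop-table",
    label: "DROP TABLE inside written content",
    re: /\bdrop\s+table\b/i,
  },
];

// Returns the first bash pattern that matches, or null.
function matchBash(command: string): DangerPattern | null {
  return DANGEROUS_BASH.find((p) => p.re.test(command)) ?? null;
}

// Returns every id that appears more than once across the given catalogues.
function findDuplicateIds(...catalogues: DangerPattern[][]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const catalogue of catalogues) {
    for (const p of catalogue) {
      if (seen.has(p.id)) dupes.add(p.id);
      seen.add(p.id);
    }
  }
  return Array.from(dupes);
}
```

A structure test could assert that `findDuplicateIds(DANGEROUS_BASH, DANGEROUS_CONTENT)` is empty, so a future id collision fails fast.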
Pure evaluateDangerous() function with bash/write/edit/multi-edit routing, inline # reviewed override for bash, and 25 new unit tests (43 total pass). Also extends regression guard to treat *-evaluate.ts as library modules. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
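The routing described in this commit could look roughly like the sketch below. The `Verdict` type, the three stand-in checks, and the `str` helper are hypothetical; only the tool names, the per-tool routing, and the inline `# reviewed` override for Bash come from the commit message.

```typescript
type Verdict = { ruleId: string; evidence: string } | null;

// Stand-in checks so the sketch runs on its own (hypothetical rules).
const checkCommand = (cmd: string): Verdict =>
  /\bgit\s+reset\s+--hard\b/.test(cmd) ? { ruleId: "git-reset-hard", evidence: cmd } : null;
const checkPath = (p: string): Verdict =>
  /(^|\/)\.env$/.test(p) ? { ruleId: "secret-env", evidence: p } : null;
const checkContent = (text: string): Verdict =>
  /\bdrop\s+table\b/i.test(text) ? { ruleId: "content-sql-drop-table", evidence: text } : null;

// Coerces a missing or non-string field to "" so nothing throws.
const str = (v: unknown): string => (typeof v === "string" ? v : "");

function evaluateDangerous(toolName: string, input: Record<string, unknown>): Verdict {
  switch (toolName) {
    case "Bash": {
      const cmd = str(input["command"]);
      // Inline override: a `# reviewed` shell comment skips the check.
      if (/#\s*reviewed\b/.test(cmd)) return null;
      return checkCommand(cmd);
    }
    case "Write":
      return checkPath(str(input["file_path"])) ?? checkContent(str(input["content"]));
    case "Edit":
      return checkPath(str(input["file_path"])) ?? checkContent(str(input["new_string"]));
    case "MultiEdit": {
      const byPath = checkPath(str(input["file_path"]));
      if (byPath) return byPath;
      const raw = input["edits"];
      const edits: unknown[] = Array.isArray(raw) ? raw : [];
      for (const e of edits) {
        const text =
          typeof e === "object" && e !== null
            ? str((e as Record<string, unknown>)["new_string"])
            : "";
        const hit = checkContent(text);
        if (hit) return hit;
      }
      return null;
    }
    default:
      return null; // Unknown tools pass through.
  }
}
```

Keeping the function pure (no I/O, no process exit) is what lets the unit tests call it directly with plain objects.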
…tup test
- Add typeof/Array.isArray guards in evalBash/evalWrite/evalEdit/evalMultiEdit
so missing toolInput fields return null instead of throwing TypeError
- Strip trailing slashes in matchDangerousPath before basename extraction
so paths like /home/user/.env/ are correctly matched as dangerous
- Refactor git-force-push describe block: replace test('setup') anti-pattern
with a const re declaration at describe scope
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
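As a rough illustration of the trailing-slash fix and the input guards, the sketch below strips trailing slashes before extracting the basename and returns null for malformed input. The two `DANGEROUS_PATHS` entries and the `PathPattern` type are assumptions; unlike the snippet quoted later in this review, the sketch tests the normalized path in both branches.

```typescript
interface PathPattern {
  id: string;
  re: RegExp;
}

// Illustrative entries only.
const DANGEROUS_PATHS: PathPattern[] = [
  { id: "secret-env", re: /^\.env(?:\..+)?$/ }, // intended for the basename
  { id: "aws-credentials", re: /\.aws\/credentials$/ }, // intended for the full path
];

function matchDangerousPath(filePath: string): PathPattern | null {
  // "/home/user/.env/" becomes "/home/user/.env" so the basename is ".env".
  const normalizedPath = filePath.replace(/\/+$/, "");
  const basename = normalizedPath.split("/").pop() ?? normalizedPath;
  for (const p of DANGEROUS_PATHS) {
    if (p.re.test(normalizedPath)) return p; // patterns containing "/"
    if (p.re.test(basename)) return p; // basename-only patterns
  }
  return null;
}

// Guarded entry point: missing or mistyped fields yield null, never a TypeError.
function evalWrite(toolInput: unknown): PathPattern | null {
  if (typeof toolInput !== "object" || toolInput === null) return null;
  const filePath = (toolInput as Record<string, unknown>)["file_path"];
  if (typeof filePath !== "string") return null;
  return matchDangerousPath(filePath);
}
```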
… registration
- Create hooks/src/hooks/dangerous-actions.ts (minimal entry wiring defineHook + evaluateDangerous + runAsCli)
- Add 11 integration tests to dangerous-actions.test.ts covering full stdin→runHook→JSON pipeline
- Register dangerous-actions.js in core-claude hooks.json.tmpl under PreToolUse (Bash|Write|Edit|MultiEdit)
- Scope regression test to claude-code only for PreToolUse hooks (Cursor/Copilot/Codex lack PreToolUse)
- Distribute bundles to all plugin hooks directories via pre_commit.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add <hook> paragraph to dangerous-actions skill (r2, r3, and all plugin variants) documenting that a deterministic PreToolUse hook enforces a last-resort gate for the highest-blast-radius patterns, and explaining the override protocol for Bash vs Write/Edit/MultiEdit tools. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rosetta Triage ReviewSummary: Adds a Findings:
Suggestions:
Automated triage by Rosetta agent |
Pull request overview
Introduces a new high-blast-radius safeguard in the hooks/ package: a PreToolUse hook (dangerous-actions) that denies execution of dangerous Bash commands, secret-path writes/edits, and destructive/secret-bearing content before the tool runs. This is then bundled into plugin artifacts and registered for Claude Code via plugins/core-claude/hooks/hooks.json*, with updated regression coverage to accommodate a Claude-only rollout.
Changes:
- Added `dangerous-actions` hook implementation split into patterns/data, evaluation logic, and hook entrypoint.
- Registered the hook for Claude Code `PreToolUse` and emitted bundled JS artifacts for multiple plugins.
- Added extensive Vitest coverage plus new Claude Code PreToolUse fixtures; updated the “hooks registered” regression test to handle library modules and Claude-only hooks.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| plugins/core-cursor/skills/dangerous-actions/SKILL.md | Documents the dangerous-actions skill; now includes a hook blurb. |
| plugins/core-cursor/.cursor/hooks/dangerous-actions.js | Cursor-bundled hook artifact (compiled JS). |
| plugins/core-cursor/.cursor/hooks/dangerous-actions-patterns.js | Cursor-bundled patterns artifact (compiled JS). |
| plugins/core-cursor/.cursor/hooks/dangerous-actions-evaluate.js | Cursor-bundled evaluator artifact (compiled JS). |
| plugins/core-copilot/skills/dangerous-actions/SKILL.md | Documents the dangerous-actions skill; now includes a hook blurb. |
| plugins/core-copilot/hooks/dangerous-actions.js | Copilot-bundled hook artifact (compiled JS). |
| plugins/core-copilot/hooks/dangerous-actions-patterns.js | Copilot-bundled patterns artifact (compiled JS). |
| plugins/core-copilot/hooks/dangerous-actions-evaluate.js | Copilot-bundled evaluator artifact (compiled JS). |
| plugins/core-codex/.codex/hooks/dangerous-actions.js | Codex-bundled hook artifact (compiled JS). |
| plugins/core-codex/.codex/hooks/dangerous-actions-patterns.js | Codex-bundled patterns artifact (compiled JS). |
| plugins/core-codex/.codex/hooks/dangerous-actions-evaluate.js | Codex-bundled evaluator artifact (compiled JS). |
| plugins/core-codex/.agents/skills/dangerous-actions/SKILL.md | Codex skill docs updated with hook blurb. |
| plugins/core-claude/skills/dangerous-actions/SKILL.md | Claude skill docs updated with hook blurb. |
| plugins/core-claude/hooks/hooks.json.tmpl | Registers dangerous-actions for PreToolUse in Claude Code template. |
| plugins/core-claude/hooks/hooks.json | Registers dangerous-actions for PreToolUse in Claude Code concrete config. |
| plugins/core-claude/hooks/dangerous-actions.js | Claude-bundled hook artifact (compiled JS). |
| plugins/core-claude/hooks/dangerous-actions-patterns.js | Claude-bundled patterns artifact (compiled JS). |
| plugins/core-claude/hooks/dangerous-actions-evaluate.js | Claude-bundled evaluator artifact (compiled JS). |
| instructions/r3/core/skills/dangerous-actions/SKILL.md | Core instruction skill docs updated with hook blurb. |
| instructions/r2/core/skills/dangerous-actions/SKILL.md | Core instruction skill docs updated with hook blurb. |
| hooks/tests/regression/hooks-registered.test.ts | Excludes library modules from “must be registered” and scopes Claude-only hooks. |
| hooks/tests/fixtures/claude-code-pre-tool-use-write.json | New Claude Code PreToolUse Write fixture. |
| hooks/tests/fixtures/claude-code-pre-tool-use-multi-edit.json | New Claude Code PreToolUse MultiEdit fixture. |
| hooks/tests/fixtures/claude-code-pre-tool-use-edit.json | New Claude Code PreToolUse Edit fixture. |
| hooks/tests/dangerous-actions.test.ts | New unit + integration tests for patterns, evaluator, and runHook behavior. |
| hooks/src/hooks/dangerous-actions.ts | New hook entrypoint wiring defineHook + runAsCli. |
| hooks/src/hooks/dangerous-actions-patterns.ts | New dangerous pattern catalogs (bash/path/content). |
| hooks/src/hooks/dangerous-actions-evaluate.ts | New pure evaluation logic and deny message construction. |
| const snippet = evidence.length > EVIDENCE_MAX | ||
| ? evidence.slice(0, EVIDENCE_MAX) + '…' | ||
| : evidence; | ||
|
|
||
| return [ | ||
| 'Blocked by rosetta dangerous-actions hook.', | ||
| '', | ||
| `Rule: ${pattern.id} — ${pattern.label}`, | ||
| `Tool: ${toolKind}`, | ||
| `Evidence: ${snippet}`, | ||
| '', |
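For readers following the snippet above, here is a self-contained sketch of the evidence truncation and message assembly. The value of `EVIDENCE_MAX`, the function names, and the parameter shapes are assumptions; the line layout follows the quoted code with simplified punctuation.

```typescript
// Assumed cap; the PR does not show the real value.
const EVIDENCE_MAX = 120;

// Truncates long evidence and appends an ellipsis, as in the quoted snippet.
function truncateEvidence(evidence: string, max: number = EVIDENCE_MAX): string {
  return evidence.length > max ? evidence.slice(0, max) + "…" : evidence;
}

// Builds the multi-line deny message shown to the agent.
function buildDenyMessage(
  pattern: { id: string; label: string },
  toolKind: string,
  evidence: string,
): string {
  const snippet = truncateEvidence(evidence);
  return [
    "Blocked by rosetta dangerous-actions hook.",
    "",
    `Rule: ${pattern.id} - ${pattern.label}`,
    `Tool: ${toolKind}`,
    `Evidence: ${snippet}`,
  ].join("\n");
}
```

Capping the evidence keeps a very long command or file body from flooding the deny message while still showing enough to identify the match.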
| const basename = normalizedPath.split('/').pop() ?? normalizedPath; | ||
| for (const p of DANGEROUS_PATHS) { | ||
| // Test full path first (covers patterns with / in them like aws-credentials) | ||
| if (p.re.test(filePath)) return p; |
| `Tool: ${toolKind}`, | ||
| `Evidence: ${snippet}`, | ||
| '', | ||
| 'Did you consider this as a dangerous activity?', |
| An automated PreToolUse hook backs this skill for the highest-blast-radius patterns (Bash destructive commands, file writes to secret paths, DDL payloads in content). The hook is a deterministic tripwire — it does not replace this skill's reasoning process. Use this skill to reason about danger; the hook enforces a last-resort gate if that reasoning is skipped. | ||
|
|
||
| Bash: override with `# reviewed` shell comment (e.g. `rm -rf /tmp/scratch # reviewed: intentional cleanup`). | ||
| Write/Edit/MultiEdit: no inline override — ask the user to confirm in chat, then retry. | ||
…rt isLibraryModule workaround Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…in bundles Removed 16 stale split-bundle artifacts (patterns.js + evaluate.js) from hooks/dist/bundles/ and all 4 plugin directories (core-claude, core-codex, core-copilot, core-cursor). These were left over from the pre-subdirectory layout; the current build only emits the consolidated dangerous-actions.js. Updated plugin bundles reflect the renamed source path comment (patterns.ts and evaluate.ts now live under src/hooks/dangerous-actions/). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…m-rf regex, grammar Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nticKind addition, pitfalls Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n hooks-authoring Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… mcp-call adapter Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d for mcp__ prefix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…gin trees Distributed hooks-authoring SKILL.md to core-claude, core-codex, core-copilot, and core-cursor. Rebuilt and synced dangerous-actions bundles with MCP heuristic adapter. Updated dangerous-actions SKILL.md with Claude Code only qualifier. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…gnore
- agents/IMPLEMENTATION.md: document dangerous-actions hook (patterns, override, registration, tests)
- agents/MEMORY.md: 5 new preventive rules (r3 source, auto-discovery, regression guard, basename matching, ID namespaces)
- hooks/dist/src/hooks/dangerous-actions.js: tracked compiled entry-point (was missing from git)
- .gitignore: add /.worktrees/ entry with proper newline (worktree infra)
- remove stale flat dangerous-actions-{evaluate,patterns}.js (superseded by subdirectory layout in b0ef9ab)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed' in any field, all tool kinds Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…' anywhere in tool call Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
….131 compat Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ns, DRY SQL regex
reviewed: commit message describes source-code pattern fixes, not executable operations
- F7: extract SQL_DROP_RE and SQL_TRUNCATE_RE constants shared by DANGEROUS_BASH and DANGEROUS_CONTENT
- F1: remove false-positive prod-name branch from kubectl-delete-prod; only the --all flag triggers
- F2: tighten dropdb psql branch to require SQL keyword after the command word
- F3: replace greedy token-consuming regex with lookahead so force flag is detected in any position
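The F3 change can be illustrated with a lookahead that finds the force flag anywhere after `git push` without consuming the tokens in between. The regex and the `isForcePush` helper below are a sketch under that assumption, not the pattern from the PR.

```typescript
// After `git push`, look ahead for a standalone --force or -f token
// in any later position. `--force-with-lease` is deliberately not matched.
const GIT_FORCE_PUSH_RE = /\bgit\s+push\b(?=.*\s(?:--force|-f)(?:\s|$))/;

function isForcePush(command: string): boolean {
  return GIT_FORCE_PUSH_RE.test(command);
}
```

Because the lookahead is zero-width, the flag position does not matter: `git push --force origin main` and `git push origin main -f` both match, while `git push --force-with-lease origin main` does not.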
… kind hasReviewedOverride now accepts a toolName parameter and whitelists only fields rendered in the IDE UI (e.g. command for Bash, content/file_path for Write, new_string/old_string/file_path for Edit). Hidden metadata fields like description no longer grant override access. Exports hasReviewedOverride and evalPatternOnly for use by cooldown logic. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
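A sketch of the per-tool whitelist described in this commit: only fields the user can see are scanned for the override marker. The field lists come from the commit message; the `VISIBLE_FIELDS` map, the marker regex, and the exact signature are assumptions.

```typescript
// Fields rendered in the IDE UI, per tool (taken from the commit message).
const VISIBLE_FIELDS = new Map<string, readonly string[]>([
  ["Bash", ["command"]],
  ["Write", ["content", "file_path"]],
  ["Edit", ["new_string", "old_string", "file_path"]],
]);

// Assumed marker; the PR changes the exact wording across commits.
const OVERRIDE_MARKER = /\breviewed\b/i;

function hasReviewedOverride(toolInput: unknown, toolName: string): boolean {
  if (typeof toolInput !== "object" || toolInput === null) return false;
  const record = toolInput as Record<string, unknown>;
  const fields = VISIBLE_FIELDS.get(toolName) ?? [];
  // Hidden metadata such as `description` is never consulted.
  return fields.some((field) => {
    const value = record[field];
    return typeof value === "string" && OVERRIDE_MARKER.test(value);
  });
}
```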
… deny Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…estrator Promotes dangerous-actions.ts from a one-liner to a full F12 A+B+C orchestrator: pattern match → cooldown guard → override allow+audit → deny+record. Adds F12-B and F12-C integration tests (448 total, tsc clean). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
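The pattern match → cooldown guard → override allow+audit → deny+record flow could be sketched as a pure decision function. Everything here is hypothetical (the `Decision` type, the in-memory `DenyLedger`, the injected clock); the 5000 ms window is taken from the 5s cooldown mentioned later in this PR, and the real hook presumably persists state rather than holding it in memory.

```typescript
type Decision =
  | { kind: "allow"; audited: boolean }
  | { kind: "deny"; reason: "pattern" | "cooldown" };

const COOLDOWN_MS = 5000;

// In-memory stand-in for whatever store records recent denials.
class DenyLedger {
  private lastDeny = new Map<string, number>();

  record(key: string, now: number): void {
    this.lastDeny.set(key, now);
  }

  isWithinCooldown(key: string, now: number): boolean {
    const t = this.lastDeny.get(key);
    return t !== undefined && now - t < COOLDOWN_MS;
  }
}

function decide(
  matched: boolean,
  hasOverride: boolean,
  key: string,
  now: number,
  ledger: DenyLedger,
  auditLog: string[],
): Decision {
  // 1. No dangerous pattern: allow untouched.
  if (!matched) return { kind: "allow", audited: false };
  // 2. Cooldown: an override arriving right after a deny is rejected.
  if (hasOverride && ledger.isWithinCooldown(key, now)) {
    return { kind: "deny", reason: "cooldown" };
  }
  // 3. Override outside the cooldown: allow and append to the audit log.
  if (hasOverride) {
    auditLog.push(`${now} override ${key}`);
    return { kind: "allow", audited: true };
  }
  // 4. Otherwise deny and remember when it happened.
  ledger.record(key, now);
  return { kind: "deny", reason: "pattern" };
}
```

Passing `now` in as a parameter keeps the function deterministic, so the cooldown can be tested without timers.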
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Update dangerous-actions SKILL.md (r2 + r3) with accurate F12 A+B+C threat model: override restricted to user-visible fields only (not description/metadata), 5s cooldown, append-only audit log. Rebuild bundles after plugin sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| const hasOverride = hasRosettaReviewedOverride(input, ctx.toolName); | ||
|
|
||
| // Layer B: cooldown — block immediate self-retry with override. | ||
| if (isWithinCooldown(cwd, hash) && hasOverride) { |
I don't think we need that. Let's keep it simple.
| `Evidence: ${evidenceLine}`, | ||
| '', | ||
| 'Override: append `# Rosetta-reviewed` to the tool call (Bash command, content, or any visible field).', | ||
| 'HITL: only the human user may add this marker. AI agents MUST NOT add it autonomously — wait for explicit human approval.', |
We should not have this HITL sentence. We only want AI to review this.
| '', | ||
| 'Override: append `# Rosetta-reviewed` to the tool call (Bash command, content, or any visible field).', | ||
| 'HITL: only the human user may add this marker. AI agents MUST NOT add it autonomously — wait for explicit human approval.', | ||
| 'Alternative: use soft delete, dry-run, --force-with-lease, or a staging environment.', |
Let's change that to:
This is a dangerous action. Did you use the skill? Did you analyse the blast radius and whether you can recover from it? Did you intend a dry run?
| `Blocked: ${pattern.id} — ${pattern.label} on ${toolKind}`, | ||
| `Evidence: ${evidenceLine}`, | ||
| '', | ||
| 'Override: append `# Rosetta-reviewed` to the tool call (Bash command, content, or any visible field).', |
Let's put that as last sentence
'If you are sure and confirmed with the user, you can override by appending Rosetta-reviewed comment to the tool call command content.'
Summary
Implements the `dangerous-actions` PreToolUse hook for Claude Code: a deterministic last-resort gate that blocks destructive operations before they execute.

What's blocked
- `rm -rf`, `git reset --hard`, `git push --force`, `git branch -D`, `aws s3 rm --recursive`, `kubectl delete --all`, DDL DROP/TRUNCATE, `mkfs`, `dd of=/dev/`
- `.env*`, SSH private keys, AWS credentials, GCP credentials, kubeconfig, netrc, pgpass, GPG keys
- `command`/`cmd` fields

Override
Include the word `reviewed` anywhere in the tool call (command, description, content, or any string argument). This asserts on behalf of the user that the destructive operation is intentional.

Example: `rm -rf /tmp/scratch reviewed`, or add `reviewed` to any string field of the tool call.

Scope
Claude Code only (PreToolUse). Rollout to other IDEs is a follow-up because PreToolUse semantics differ per platform.

Hook file
`plugins/core-claude/hooks/dangerous-actions.js` (auto-registered via `hooks.json`).

🤖 Generated with Claude Code