feat(hooks): dangerous-actions PreToolUse hook for Claude Code #79

Open
sharkich wants to merge 33 commits into v3 from feat/hooks-dangerous-actions

@sharkich sharkich commented May 5, 2026

Summary

Implements the dangerous-actions PreToolUse hook for Claude Code — a deterministic last-resort gate that blocks destructive operations before they execute.

What's blocked

| Category | Examples |
| --- | --- |
| Bash destructive | `rm -rf`, `git reset --hard`, `git push --force`, `git branch -D`, `aws s3 rm --recursive`, `kubectl delete --all`, DDL `DROP`/`TRUNCATE`, `mkfs`, `dd of=/dev/` |
| Secret file writes | `.env*`, SSH private keys, AWS credentials, GCP credentials, kubeconfig, netrc, pgpass, GPG keys |
| Dangerous content | AWS access key IDs, PEM private keys, SQL `DROP`/`TRUNCATE` in payload |
| MCP shell calls | Same bash patterns applied to MCP tool `command`/`cmd` fields |

Override

Include the word `reviewed` anywhere in the tool call (command, description, content, or any string argument). This asserts on behalf of the user that the destructive operation is intentional.

Example: `rm -rf /tmp/scratch reviewed`, or add `reviewed` to any string field of the tool call.

Scope

Claude Code only (PreToolUse). Rollout to other IDEs is a follow-up — PreToolUse semantics differ per platform.

Hook file

plugins/core-claude/hooks/dangerous-actions.js (auto-registered via hooks.json).

🤖 Generated with Claude Code

sharkich and others added 8 commits May 5, 2026 10:08
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…epresentativeness

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure-data DangerPattern catalogue (DANGEROUS_BASH/PATHS/CONTENT) with 16/8/4
entries, structure tests (4 pass), and regression-test exclusion for library
modules that are not runnable hook entry points.
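
A minimal sketch of what a pure-data catalogue and one structure test could look like. The field names `id`, `label`, and `re` follow the snippets quoted later in this review; the entries, the regexes, and the `findDuplicateIds` helper are hypothetical and are not taken from the PR's source.

```typescript
// Hypothetical shape of one catalogue entry (field names assumed from the
// deny-message and path-matching snippets quoted in this PR).
interface DangerPattern {
  id: string;    // expected to be unique across all catalogues
  label: string; // human-readable description used in the deny message
  re: RegExp;    // tested against the command, path, or content
}

// Illustrative entries only; the real catalogues have 16/8/4 entries.
const DANGEROUS_BASH: readonly DangerPattern[] = [
  { id: "git-reset-hard", label: "git reset --hard", re: /\bgit\s+reset\s+--hard\b/ },
  { id: "sql-drop-table", label: "SQL DROP TABLE", re: /\bDROP\s+TABLE\b/i },
];

const DANGEROUS_CONTENT: readonly DangerPattern[] = [
  { id: "content-sql-drop-table", label: "SQL DROP TABLE in payload", re: /\bDROP\s+TABLE\b/i },
];

// Structure check: returns every id that appears more than once.
function findDuplicateIds(...catalogues: ReadonlyArray<readonly DangerPattern[]>): string[] {
  const seen = new Set<string>();
  const dupes: string[] = [];
  for (const catalogue of catalogues) {
    for (const pattern of catalogue) {
      if (seen.has(pattern.id)) dupes.push(pattern.id);
      else seen.add(pattern.id);
    }
  }
  return dupes;
}
```

Keeping the catalogue as plain data means a structure test can iterate over it without running any hook logic.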

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix git-force-push regex to match tokens between push and --force flag
- Fix rm-rf-root to also match rm -rf /* variant
- Rename sql-drop-table/sql-truncate in DANGEROUS_CONTENT to content-sql-* to avoid id collisions with DANGEROUS_BASH
- Add comment documenting secret-env basename-matching contract
- Change LIBRARY_MODULES from RegExp to Set in hooks-registered regression test
- Add describe('pattern correctness') block with 15 positive/negative match tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure evaluateDangerous() function with bash/write/edit/multi-edit routing,
inline # reviewed override for bash, and 25 new unit tests (43 total pass).
Also extends regression guard to treat *-evaluate.ts as library modules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tup test

- Add typeof/Array.isArray guards in evalBash/evalWrite/evalEdit/evalMultiEdit
  so missing toolInput fields return null instead of throwing TypeError
- Strip trailing slashes in matchDangerousPath before basename extraction
  so paths like /home/user/.env/ are correctly matched as dangerous
- Refactor git-force-push describe block: replace test('setup') anti-pattern
  with a const re declaration at describe scope
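
The guard and trailing-slash fixes described in this commit can be sketched roughly as follows. This is an assumption-laden illustration: the two catalogue entries and their regexes are hypothetical, and only the overall shape (type guard, strip trailing slashes, test the full path and the basename) is taken from the commit message and the snippet quoted in the review.

```typescript
interface PathPattern {
  id: string;
  re: RegExp;
}

// Illustrative entries only; the PR describes 8 path patterns.
const DANGEROUS_PATHS: readonly PathPattern[] = [
  { id: "secret-env", re: /^\.env(\..+)?$/ },           // intended for the basename
  { id: "aws-credentials", re: /\.aws\/credentials$/ }, // intended for the full path
];

function matchDangerousPath(filePath: unknown): PathPattern | null {
  // Guard: a missing or non-string field returns null instead of throwing.
  if (typeof filePath !== "string" || filePath.length === 0) return null;

  // Strip trailing slashes so "/home/user/.env/" yields the basename ".env".
  const normalized = filePath.replace(/\/+$/, "");
  const basename = normalized.split("/").pop() ?? normalized;

  for (const p of DANGEROUS_PATHS) {
    // Full path first (patterns containing "/"), then the basename.
    if (p.re.test(normalized) || p.re.test(basename)) return p;
  }
  return null;
}
```

Without the normalization step, `"/home/user/.env/".split("/").pop()` is the empty string, so the basename check would silently pass.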

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… registration

- Create hooks/src/hooks/dangerous-actions.ts (minimal entry wiring defineHook + evaluateDangerous + runAsCli)
- Add 11 integration tests to dangerous-actions.test.ts covering full stdin→runHook→JSON pipeline
- Register dangerous-actions.js in core-claude hooks.json.tmpl under PreToolUse (Bash|Write|Edit|MultiEdit)
- Scope regression test to claude-code only for PreToolUse hooks (Cursor/Copilot/Codex lack PreToolUse)
- Distribute bundles to all plugin hooks directories via pre_commit.py
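
For orientation, the stdin to JSON pipeline exercised by the integration tests can be approximated with the standalone sketch below. It does not use the repository's `defineHook`, `runHook`, or `runAsCli` helpers, whose signatures are not shown in this PR; `evaluate` and `handle` are hypothetical names. The output shape assumes Claude Code's documented PreToolUse decision format (`hookSpecificOutput.permissionDecision`).

```typescript
// Subset of the PreToolUse payload that this sketch relies on.
interface PreToolUseInput {
  tool_name: string;
  tool_input: Record<string, unknown>;
}

// Stand-in evaluator with one hard-coded rule (hypothetical).
function evaluate(input: PreToolUseInput): string | null {
  const cmd = input.tool_input["command"];
  if (
    input.tool_name === "Bash" &&
    typeof cmd === "string" &&
    /\bgit\s+reset\s+--hard\b/.test(cmd)
  ) {
    return "Blocked: git-reset-hard";
  }
  return null;
}

// Takes the raw stdin text and returns the text to print on stdout.
function handle(rawStdin: string): string {
  const input = JSON.parse(rawStdin) as PreToolUseInput;
  const reason = evaluate(input);
  if (reason === null) return ""; // no output: defer to the normal permission flow
  return JSON.stringify({
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      permissionDecision: "deny",
      permissionDecisionReason: reason,
    },
  });
}
```

Keeping `handle` free of process I/O is what makes a full-pipeline test possible without spawning the CLI.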

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add <hook> paragraph to dangerous-actions skill (r2, r3, and all plugin
variants) documenting that a deterministic PreToolUse hook enforces a
last-resort gate for the highest-blast-radius patterns, and explaining
the override protocol for Bash vs Write/Edit/MultiEdit tools.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 5, 2026 10:07
@github-actions github-actions Bot added the enhancement New feature or request label May 5, 2026
github-actions Bot commented May 5, 2026

Rosetta Triage Review

Summary: Adds a PreToolUse hook (dangerous-actions) that intercepts Bash, Write, Edit, and MultiEdit tool calls across all four plugin platforms (core-claude, core-codex, core-copilot, core-cursor), blocking commands and content that match 28 hardcoded dangerous patterns (16 Bash, 8 path, 4 content). Bash operations can be overridden with a # reviewed inline comment; Write/Edit/MultiEdit require user chat confirmation.

Findings:

  • Architecture: Clean three-file SRP split (patterns.ts → evaluate.ts → dangerous-actions.ts) mirrors existing hook shape. Pure evaluateDangerous function is properly unit-testable with no side effects.
  • Test coverage: 54 new Vitest tests covering pattern correctness, unit evaluation, integration (runHook), and override semantics. All 407 existing tests pass.
  • Documentation: SKILL.md updated across r2, r3, core-claude, core-codex, core-copilot, and core-cursor with clear <hook> section describing override mechanism.
  • PR description inconsistency: PR body states "Registered in plugins/core-claude only (other platforms in follow-up PR)" but the diff deploys hook files and registers them in all four platforms. The description should be updated to reflect actual scope.
  • Pattern precision concern: The rm-rf-recursive pattern (/\brm\s+-[rf]{2,}\b/) matches rm -rr and rm -ff, which are not actually dangerous. Consider tightening to require both r and f flags specifically.
  • False-positive risk: Content-based SQL checks (DROP TABLE, TRUNCATE) will block legitimate SQL migration scripts, test fixtures, and DDL files. Users writing DB migrations should be aware they will need to confirm in chat for every such file.
  • Breaking behavior: Previously allowed operations (e.g., git reset --hard, rm -rf /tmp/scratch, SQL migration files) are now intercepted. This is an intentional UX tradeoff but reviewers should validate the override flow is user-friendly.

Suggestions:

  • Update the PR body "Out of scope" / "Registered in" section to reflect that all four platforms are actually included in this PR.
  • Consider tightening rm-rf-recursive pattern to require both r and f in the flag group (e.g., /\brm\s+-[a-zA-Z]*r[a-zA-Z]*f[a-zA-Z]*\b/ or similar) to avoid false positives on non-dangerous flag combos.
  • Consider adding a note in SKILL.md or hook deny message about SQL migration workflow so users know what to expect before hitting the block.
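
One possible way to act on the `rm` suggestion, as a hedged sketch rather than the pattern that was merged: the regex proposed in the suggestion expects `r` to appear before `f`, so it would miss `rm -fr`. Two lookaheads over a single flag cluster avoid the ordering dependency. `ORIGINAL_RE` is the pattern quoted in the findings; `RM_RF_RE` is hypothetical.

```typescript
// Pattern quoted in the triage findings: any two characters from {r, f}.
const ORIGINAL_RE = /\brm\s+-[rf]{2,}\b/;

// Sketch: one short-flag cluster that contains both r (or R) and f, in any order.
const RM_RF_RE = /\brm\s+-(?=[a-zA-Z]*[rR])(?=[a-zA-Z]*f)[a-zA-Z]+\b/;

const samples = ["rm -rf /tmp/x", "rm -fr /tmp/x", "rm -Rfv build", "rm -rr dir", "rm -ff file"];
const results = samples.map((s) => ({
  command: s,
  original: ORIGINAL_RE.test(s),
  tightened: RM_RF_RE.test(s),
}));
```

Limitation of this sketch: flags given separately (`rm -r -f dir`) are not matched, because the lookaheads only inspect one cluster.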

Automated triage by Rosetta agent

Copilot AI left a comment

Pull request overview

Introduces a new high-blast-radius safeguard in the hooks/ package: a PreToolUse hook (dangerous-actions) that denies execution of dangerous Bash commands, secret-path writes/edits, and destructive/secret-bearing content before the tool runs. This is then bundled into plugin artifacts and registered for Claude Code via plugins/core-claude/hooks/hooks.json*, with updated regression coverage to accommodate a Claude-only rollout.

Changes:

  • Added dangerous-actions hook implementation split into patterns/data, evaluation logic, and hook entrypoint.
  • Registered the hook for Claude Code PreToolUse and emitted bundled JS artifacts for multiple plugins.
  • Added extensive Vitest coverage plus new Claude Code PreToolUse fixtures; updated the “hooks registered” regression test to handle library modules and Claude-only hooks.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| plugins/core-cursor/skills/dangerous-actions/SKILL.md | Documents the dangerous-actions skill; now includes a hook blurb. |
| plugins/core-cursor/.cursor/hooks/dangerous-actions.js | Cursor-bundled hook artifact (compiled JS). |
| plugins/core-cursor/.cursor/hooks/dangerous-actions-patterns.js | Cursor-bundled patterns artifact (compiled JS). |
| plugins/core-cursor/.cursor/hooks/dangerous-actions-evaluate.js | Cursor-bundled evaluator artifact (compiled JS). |
| plugins/core-copilot/skills/dangerous-actions/SKILL.md | Documents the dangerous-actions skill; now includes a hook blurb. |
| plugins/core-copilot/hooks/dangerous-actions.js | Copilot-bundled hook artifact (compiled JS). |
| plugins/core-copilot/hooks/dangerous-actions-patterns.js | Copilot-bundled patterns artifact (compiled JS). |
| plugins/core-copilot/hooks/dangerous-actions-evaluate.js | Copilot-bundled evaluator artifact (compiled JS). |
| plugins/core-codex/.codex/hooks/dangerous-actions.js | Codex-bundled hook artifact (compiled JS). |
| plugins/core-codex/.codex/hooks/dangerous-actions-patterns.js | Codex-bundled patterns artifact (compiled JS). |
| plugins/core-codex/.codex/hooks/dangerous-actions-evaluate.js | Codex-bundled evaluator artifact (compiled JS). |
| plugins/core-codex/.agents/skills/dangerous-actions/SKILL.md | Codex skill docs updated with hook blurb. |
| plugins/core-claude/skills/dangerous-actions/SKILL.md | Claude skill docs updated with hook blurb. |
| plugins/core-claude/hooks/hooks.json.tmpl | Registers dangerous-actions for PreToolUse in Claude Code template. |
| plugins/core-claude/hooks/hooks.json | Registers dangerous-actions for PreToolUse in Claude Code concrete config. |
| plugins/core-claude/hooks/dangerous-actions.js | Claude-bundled hook artifact (compiled JS). |
| plugins/core-claude/hooks/dangerous-actions-patterns.js | Claude-bundled patterns artifact (compiled JS). |
| plugins/core-claude/hooks/dangerous-actions-evaluate.js | Claude-bundled evaluator artifact (compiled JS). |
| instructions/r3/core/skills/dangerous-actions/SKILL.md | Core instruction skill docs updated with hook blurb. |
| instructions/r2/core/skills/dangerous-actions/SKILL.md | Core instruction skill docs updated with hook blurb. |
| hooks/tests/regression/hooks-registered.test.ts | Excludes library modules from “must be registered” and scopes Claude-only hooks. |
| hooks/tests/fixtures/claude-code-pre-tool-use-write.json | New Claude Code PreToolUse Write fixture. |
| hooks/tests/fixtures/claude-code-pre-tool-use-multi-edit.json | New Claude Code PreToolUse MultiEdit fixture. |
| hooks/tests/fixtures/claude-code-pre-tool-use-edit.json | New Claude Code PreToolUse Edit fixture. |
| hooks/tests/dangerous-actions.test.ts | New unit + integration tests for patterns, evaluator, and runHook behavior. |
| hooks/src/hooks/dangerous-actions.ts | New hook entrypoint wiring defineHook + runAsCli. |
| hooks/src/hooks/dangerous-actions-patterns.ts | New dangerous pattern catalogs (bash/path/content). |
| hooks/src/hooks/dangerous-actions-evaluate.ts | New pure evaluation logic and deny message construction. |

Comment on lines +21 to +31
const snippet = evidence.length > EVIDENCE_MAX
? evidence.slice(0, EVIDENCE_MAX) + '…'
: evidence;

return [
'Blocked by rosetta dangerous-actions hook.',
'',
`Rule: ${pattern.id} — ${pattern.label}`,
`Tool: ${toolKind}`,
`Evidence: ${snippet}`,
'',
const basename = normalizedPath.split('/').pop() ?? normalizedPath;
for (const p of DANGEROUS_PATHS) {
// Test full path first (covers patterns with / in them like aws-credentials)
if (p.re.test(filePath)) return p;
`Tool: ${toolKind}`,
`Evidence: ${snippet}`,
'',
'Did you consider this as a dangerous activity?',
Comment on lines +43 to +47
An automated PreToolUse hook backs this skill for the highest-blast-radius patterns (Bash destructive commands, file writes to secret paths, DDL payloads in content). The hook is a deterministic tripwire — it does not replace this skill's reasoning process. Use this skill to reason about danger; the hook enforces a last-resort gate if that reasoning is skipped.

Bash: override with `# reviewed` shell comment (e.g. `rm -rf /tmp/scratch # reviewed: intentional cleanup`).
Write/Edit/MultiEdit: no inline override — ask the user to confirm in chat, then retry.

@sharkich sharkich self-assigned this May 5, 2026
sharkich and others added 13 commits May 5, 2026 16:05
…rt isLibraryModule workaround

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…in bundles

Removed 16 stale split-bundle artifacts (patterns.js + evaluate.js) from
hooks/dist/bundles/ and all 4 plugin directories (core-claude, core-codex,
core-copilot, core-cursor). These were left over from the pre-subdirectory
layout; the current build only emits the consolidated dangerous-actions.js.
Updated plugin bundles reflect the renamed source path comment (patterns.ts
and evaluate.ts now live under src/hooks/dangerous-actions/).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…m-rf regex, grammar

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nticKind addition, pitfalls

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n hooks-authoring

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… mcp-call adapter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d for mcp__ prefix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…gin trees

Distributed hooks-authoring SKILL.md to core-claude, core-codex, core-copilot,
and core-cursor. Rebuilt and synced dangerous-actions bundles with MCP heuristic
adapter. Updated dangerous-actions SKILL.md with Claude Code only qualifier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…gnore

- agents/IMPLEMENTATION.md: document dangerous-actions hook (patterns, override, registration, tests)
- agents/MEMORY.md: 5 new preventive rules (r3 source, auto-discovery, regression guard, basename matching, ID namespaces)
- hooks/dist/src/hooks/dangerous-actions.js: tracked compiled entry-point (was missing from git)
- .gitignore: add /.worktrees/ entry with proper newline (worktree infra)
- remove stale flat dangerous-actions-{evaluate,patterns}.js (superseded by subdirectory layout in b0ef9ab)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed' in any field, all tool kinds

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…' anywhere in tool call

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sharkich and others added 12 commits May 7, 2026 10:14
….131 compat

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ns, DRY SQL regex

reviewed: commit message describes source-code pattern fixes, not executable operations

F7: extract SQL_DROP_RE and SQL_TRUNCATE_RE constants shared by DANGEROUS_BASH and DANGEROUS_CONTENT
F1: remove false-positive prod-name branch from kubectl-delete-prod — only --all flag triggers
F2: tighten dropdb psql branch to require SQL keyword after the command word
F3: replace greedy token-consuming regex with lookahead so force flag is detected in any position
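
The lookahead approach named in F3 might look like the sketch below. The regex is an illustration written for this note, not the one in the PR; it assumes that `--force-with-lease` should stay allowed, since the deny message elsewhere in this review lists it as a safer alternative.

```typescript
// Sketch: after `git push`, look ahead for a force flag anywhere on the line.
// `--force(?![-\w])` rejects longer flags such as --force-with-lease;
// `-f\b` covers the short form.
const GIT_FORCE_PUSH_RE = /\bgit\s+push\b(?=.*\s(?:--force(?![-\w])|-f\b))/;

const forcePushSamples: Array<[string, boolean]> = [
  ["git push --force origin main", GIT_FORCE_PUSH_RE.test("git push --force origin main")],
  ["git push origin main --force", GIT_FORCE_PUSH_RE.test("git push origin main --force")],
  ["git push -f origin main", GIT_FORCE_PUSH_RE.test("git push -f origin main")],
  ["git push --force-with-lease origin main", GIT_FORCE_PUSH_RE.test("git push --force-with-lease origin main")],
  ["git push origin feature-fix", GIT_FORCE_PUSH_RE.test("git push origin feature-fix")],
];
```

Because the lookahead scans the rest of the line, a chained command such as `git push origin main && rm -f x` would also match; a stricter version would need to stop at shell separators.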
… kind

hasReviewedOverride now accepts a toolName parameter and whitelists only
fields rendered in the IDE UI (e.g. command for Bash, content/file_path
for Write, new_string/old_string/file_path for Edit). Hidden metadata
fields like description no longer grant override access. Exports
hasReviewedOverride and evalPatternOnly for use by cooldown logic.
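
A rough sketch of a per-tool whitelist of this kind, under stated assumptions: the field lists come from the commit message above, the marker string is taken from the deny message quoted later in this review, and the `VISIBLE_FIELDS` table and function body are hypothetical.

```typescript
// Fields assumed to be rendered in the IDE UI, per the commit message.
const VISIBLE_FIELDS: Record<string, readonly string[]> = {
  Bash: ["command"],
  Write: ["content", "file_path"],
  Edit: ["new_string", "old_string", "file_path"],
};

// Marker text as it appears in the deny message quoted in this review.
const OVERRIDE_MARKER = "Rosetta-reviewed";

function hasReviewedOverride(
  toolInput: Record<string, unknown>,
  toolName: string,
): boolean {
  // Unknown tools get an empty whitelist, so they can never self-override.
  const fields = VISIBLE_FIELDS[toolName] ?? [];
  return fields.some((field) => {
    const value = toolInput[field];
    return typeof value === "string" && value.includes(OVERRIDE_MARKER);
  });
}
```

With this shape, a marker placed in `description` is ignored for Bash because `description` is not in the whitelist.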

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… deny

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…estrator

Promotes dangerous-actions.ts from a one-liner to a full F12 A+B+C
orchestrator: pattern match → cooldown guard → override allow+audit → deny+record.
Adds F12-B and F12-C integration tests (448 total, tsc clean).
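
The four-step flow in this commit message (pattern match, cooldown guard, override allow plus audit, deny plus record) can be sketched as a single decision function. Everything here is illustrative: the in-memory `Map` and array stand in for whatever persistence the real hook uses, the 5 s window comes from the SKILL.md commit in this PR, and `decide` is a hypothetical name.

```typescript
type Decision = { decision: "allow" | "deny"; reason: string };

const COOLDOWN_MS = 5_000;                  // 5 s cooldown, per the SKILL.md update
const lastDeny = new Map<string, number>(); // call hash -> time of last deny
const auditLog: string[] = [];              // stand-in for the append-only audit log

function decide(
  hash: string,               // identifies the tool call (assumed to be a content hash)
  matchedRule: string | null, // id of the matched pattern, or null if none
  hasOverride: boolean,
  now: number,                // injected clock, in milliseconds
): Decision {
  // 1. Pattern match: nothing dangerous, nothing to do.
  if (matchedRule === null) return { decision: "allow", reason: "no dangerous pattern" };

  // 2. Cooldown guard: block an immediate retry that merely adds the override.
  const deniedAt = lastDeny.get(hash);
  const inCooldown = deniedAt !== undefined && now - deniedAt < COOLDOWN_MS;
  if (hasOverride && inCooldown) {
    return { decision: "deny", reason: `${matchedRule}: override retried within cooldown` };
  }

  // 3. Override: allow and leave an audit record.
  if (hasOverride) {
    auditLog.push(`${now} allow ${matchedRule} ${hash}`);
    return { decision: "allow", reason: `${matchedRule}: override accepted` };
  }

  // 4. Deny and record the time so the cooldown can be enforced.
  lastDeny.set(hash, now);
  return { decision: "deny", reason: `${matchedRule}: blocked` };
}
```

Passing the clock in as `now` keeps the function deterministic, which makes the cooldown branch easy to cover in unit tests.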

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Update dangerous-actions SKILL.md (r2 + r3) with accurate F12 A+B+C
threat model: override restricted to user-visible fields only (not
description/metadata), 5s cooldown, append-only audit log. Rebuild
bundles after plugin sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
const hasOverride = hasRosettaReviewedOverride(input, ctx.toolName);

// Layer B: cooldown — block immediate self-retry with override.
if (isWithinCooldown(cwd, hash) && hasOverride) {

I don't think we need that. Let's keep it simple.

`Evidence: ${evidenceLine}`,
'',
'Override: append `# Rosetta-reviewed` to the tool call (Bash command, content, or any visible field).',
'HITL: only the human user may add this marker. AI agents MUST NOT add it autonomously — wait for explicit human approval.',

We should not have this HITL sentence. We only want AI to review this.

'',
'Override: append `# Rosetta-reviewed` to the tool call (Bash command, content, or any visible field).',
'HITL: only the human user may add this marker. AI agents MUST NOT add it autonomously — wait for explicit human approval.',
'Alternative: use soft delete, dry-run, --force-with-lease, or a staging environment.',

Let's change that to:
This is a dangerous action. Did you use the skill? Did you analyse the blast radius and whether you can recover from it? Did you intend a dry run?

`Blocked: ${pattern.id} — ${pattern.label} on ${toolKind}`,
`Evidence: ${evidenceLine}`,
'',
'Override: append `# Rosetta-reviewed` to the tool call (Bash command, content, or any visible field).',

Let's put this as the last sentence:
'If you are sure and have confirmed with the user, you can override by appending a Rosetta-reviewed comment to the tool call's command or content.'

Labels

enhancement New feature or request

3 participants