From d0bc6244f70066998acbf087202c4ecd0e842ad8 Mon Sep 17 00:00:00 2001
From: Alfonso Domenech <adomenechreal@gmail.com>
Date: Tue, 21 Apr 2026 18:56:15 +0200
Subject: [PATCH] fix(ops-suite): enforce adapter pattern, fix queue-status
 auto-chain, add CLAUDE.md

- db-query/db-migrate: replace hardcoded kubectl port-forward with reference to
  port-forward adapter, so non-Kubernetes orchestrators work correctly
- port-forward/adapters/kubernetes.md: add separate section for pod env var
  retrieval (pod_env pattern) vs Kubernetes Secret retrieval
- queue-status: change DLQ auto-chain to queue-triage into a suggestion (Next
  steps), preventing unintended Sonnet invocations on a simple status check
- queue-triage: remove NestJS-specific patterns from Step 6, replace with generic
  grep approach + reference to new consumer-patterns.md loaded on-demand
- queue-triage/references/consumer-patterns.md: new file with framework-specific
  consumer search patterns (NestJS, Spring AMQP, Celery)
- workflow-deploy: fix cross-skill adapter path from relative ../deploy/ to
  ${CLAUDE_PLUGIN_ROOT}/skills/deploy/ to avoid path fragility
- CLAUDE.md: add project-level guidance for future Claude Code instances

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 CLAUDE.md                                     | 96 +++++++++++++++++++
 plugins/ops-suite/skills/db-migrate/SKILL.md  | 17 ++--
 plugins/ops-suite/skills/db-query/SKILL.md    | 10 +-
 .../port-forward/adapters/kubernetes.md       | 12 ++-
 .../ops-suite/skills/queue-status/SKILL.md    | 10 +-
 .../ops-suite/skills/queue-triage/SKILL.md    | 20 +++-
 .../references/consumer-patterns.md           | 93 ++++++++++++++++++
 .../ops-suite/skills/workflow-deploy/SKILL.md |  2 +-
 8 files changed, 234 insertions(+), 26 deletions(-)
 create mode 100644 CLAUDE.md
 create mode 100644 plugins/ops-suite/skills/queue-triage/references/consumer-patterns.md
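
Note for reviewers (not part of the commit): the credential lookup order that
the db-migrate change describes can be sketched in shell as below. This is a
minimal sketch under assumptions. `credentials_source` is a hypothetical helper
that does not exist in ops-suite, the output labels `pod_env` and
`adapter_or_ask` are invented for illustration, and `DATABASE_URL` is only an
example variable name. Only the `pod_env:<VAR_NAME>` value format comes from
the patch.

```shell
# Hypothetical helper, not part of ops-suite: classify a credentials_from value.
credentials_source() {
  case "$1" in
    # pod_env:<VAR_NAME> -> read VAR_NAME from a running app pod via the adapter
    pod_env:?*) printf 'pod_env %s\n' "${1#pod_env:}" ;;
    # unset, empty, or any other value -> adapter credential command, or ask the user
    *) printf 'adapter_or_ask\n' ;;
  esac
}

credentials_source "pod_env:DATABASE_URL"
credentials_source ""
```

The first call prints `pod_env DATABASE_URL` and the second prints
`adapter_or_ask`. A bare `pod_env:` with no variable name falls through to the
second branch, so a malformed config value never produces an empty `printenv`.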
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..6436779
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,96 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## What this repo is
+
+Pure Markdown + YAML skill collection for Claude Code. No build, no tests, no linting — validation is manual (read and review skill files). The only commands you'll run are `git` operations.
+
+## Architecture: Config + Adapters
+
+Every skill follows the same pattern:
+
+```
+config.yaml  →  SKILL.md  →  adapters/<technology>.md
+```
+
+1. The user defines their stack once in `config.yaml` (`orchestrator: kubernetes`, `message_broker: rabbitmq`, etc.)
+2. Each `SKILL.md` reads config and loads the right adapter at runtime
+3. Adapters contain actual CLI commands with `{config.X.Y}` placeholders
+
+**To add support for a new technology**: create `adapters/<tech>.md` inside the skill folder — no changes to `SKILL.md` or config schema needed.
+
+## Plugin layout
+
+```
+plugins/
+  ops-suite/           # Infrastructure: status, logs, deploy, DB, queues
+    skills/<name>/
+      SKILL.md         # Frontmatter + step-by-step instructions
+      adapters/        # One file per supported technology
+      references/      # Deep docs loaded on-demand (keep SKILL.md < 500 lines)
+    commands/          # Command shims (.md) that invoke each skill
+    hooks/             # hooks.json + session-start.sh
+    runtime/           # Shared docs: chaining.md, safety.md, session-state.md
+    config.example.yaml
+    config.yaml        # User's actual config (gitignored)
+  refinery/            # Roadmap refinement: tickets, design, docs, sprints
+  creating-skills/     # Meta-skill for authoring new skills
+.claude-plugin/
+  marketplace.json     # Plugin registry (name, version, source paths)
+```
+
+## SKILL.md frontmatter
+
+Key fields when creating or editing skills:
+
+| Field | Notes |
+|-------|-------|
+| `name` | Kebab-case, becomes the slash-command |
+| `description` | **Most critical** — drives auto-invocation. Start with "Use when…" + natural-language triggers. Max 1024 chars. |
+| `disable-model-invocation: true` | Required for all destructive ops (deploy, migrate, reprocess). Prevents auto-chaining. |
+| `allowed-tools` | Restrict what the skill can call |
+| `model` | Override per-skill (e.g. `haiku` for cheap read-only checks) |
+| `argument-hint` | Autocomplete hint shown to user |
+
+## Session state (ops-suite)
+
+Skills share state through `/tmp/ops-suite-session/`:
+
+- `config.json` — parsed `config.yaml`, written by the `session-start.sh` hook and cached for the session
+- `env.json` — selected environment (planned)
+- `credentials.json` — DB/broker credentials (planned)
+- `port-forwards.json` — active port-forward PIDs (planned)
+
+Step 0 in every skill checks for `/tmp/ops-suite-session/config.json` before re-parsing config.
+
+## Skill chaining rules
+
+**Read-only skills** (no `disable-model-invocation`) can be auto-invoked mid-step:
+```
+Use ops-suite:service-status with arguments: {service} {env_name}.
+Use session state from /tmp/ops-suite-session/ — do not re-ask for environment.
+```
+
+**Destructive skills** (`disable-model-invocation: true`) must be suggested, never auto-invoked:
+```
+Next steps:
+  → Run `/ops-suite:db-migrate {env_name}` to apply pending migrations.
+```
+
+Chain depth is capped at 3.
+
+## Safety classification
+
+| Skill | Type |
+|-------|------|
+| service-status, service-logs, db-query, queue-status, port-forward | read-only — auto-chainable |
+| deploy, db-migrate, queue-reprocess | destructive — suggest only, always ask for explicit confirmation |
+
+## Adding a new skill
+
+1. Run `/creating-skills:creating-skills` — it guides the full authoring process
+2. Create `plugins/<plugin>/skills/<name>/SKILL.md` with the standard frontmatter
+3. Add adapters under `adapters/` for each supported technology
+4. Add a command shim at `plugins/<plugin>/commands/<name>.md`
+5. Test: trigger test (natural language), functional test (`/skill-name`), performance test (SKILL.md < 500 lines)
diff --git a/plugins/ops-suite/skills/db-migrate/SKILL.md b/plugins/ops-suite/skills/db-migrate/SKILL.md
index d00936f..e9cfe55 100644
--- a/plugins/ops-suite/skills/db-migrate/SKILL.md
+++ b/plugins/ops-suite/skills/db-migrate/SKILL.md
@@ -51,18 +51,15 @@ Show the user:
 The migration tool needs a database connection. This typically requires:
 
 1. **Port-forward** to the database using the orchestrator:
-   ```
-   kubectl --context={env.context} port-forward svc/{env.services.database.name} {deploy.local_ports.{env_name}}:{env.services.database.port} -n {env.services.database.namespace || env.namespaces.infra} &
-   ```
-   Note: Use `env.services.database.namespace` if defined — the database/pgbouncer may live in a
-   different namespace than `env.namespaces.infra`.
+   Load `${CLAUDE_PLUGIN_ROOT}/skills/port-forward/adapters/{orchestrator}.md` and use its
+   "Port-forward a service (background)" command for `{env.services.database.name}`,
+   local port `{deploy.local_ports.{env_name}}`, remote port `{env.services.database.port}`.
+   Use `{env.services.database.namespace}` if defined, otherwise `{env.namespaces.infra}`.
 
 2. **Credentials**: First check if `env.services.database.credentials_from` is defined in config:
-   - If `pod_env:<VAR_NAME>`: retrieve from a running app pod:
-     ```
-     kubectl --context={env.context} exec {any_app_pod} -n {env.namespaces.apps} -- printenv <VAR_NAME>
-     ```
-   - Otherwise, use the adapter's "retrieve secret" command or ask the user.
+   - If `pod_env:<VAR_NAME>`: load `${CLAUDE_PLUGIN_ROOT}/skills/port-forward/adapters/{orchestrator}.md`
+     and use its "Retrieve environment variable from a running pod" command to read the variable.
+   - Otherwise, use the migration adapter's credential retrieval command or ask the user.
    Never hardcode or display credentials in plain text.
 
 3. **Set environment variables** as required by the migration tool (from adapter).
diff --git a/plugins/ops-suite/skills/db-query/SKILL.md b/plugins/ops-suite/skills/db-query/SKILL.md
index 7535b52..077724e 100644
--- a/plugins/ops-suite/skills/db-query/SKILL.md
+++ b/plugins/ops-suite/skills/db-query/SKILL.md
@@ -38,11 +38,11 @@ If `$ARGUMENTS` contains an environment name, use it. Otherwise ask the user.
 Check if a port-forward is already active on the expected local port (`deploy.local_ports.{env_name}`).
 
 If not active:
-1. Start a port-forward using the orchestrator:
-   ```
-   kubectl --context={env.context} port-forward svc/{env.services.database.name} {deploy.local_ports.{env_name}}:{env.services.database.port} -n {env.services.database.namespace || env.namespaces.infra} &
-   ```
-2. Verify the connection is working
+1. Load `${CLAUDE_PLUGIN_ROOT}/skills/port-forward/adapters/{orchestrator}.md` and use its
+   "Port-forward a service (background)" command for `{env.services.database.name}`,
+   local port `{deploy.local_ports.{env_name}}`, remote port `{env.services.database.port}`,
+   namespace `{env.services.database.namespace}` if defined, otherwise `{env.namespaces.infra}`.
+2. Verify the connection using the adapter's connection check command.
 
 Retrieve or ask for credentials. Never hardcode credentials.
 
diff --git a/plugins/ops-suite/skills/port-forward/adapters/kubernetes.md b/plugins/ops-suite/skills/port-forward/adapters/kubernetes.md
index ecfabdd..a9680a6 100644
--- a/plugins/ops-suite/skills/port-forward/adapters/kubernetes.md
+++ b/plugins/ops-suite/skills/port-forward/adapters/kubernetes.md
@@ -56,12 +56,22 @@ nc -z localhost {local_port} 2>/dev/null && echo "Connection OK" || echo "Connec
 curl -s -o /dev/null -w '%{http_code}' http://localhost:{local_port}/health
 ```
 
-## Retrieve secret (generic pattern)
+## Retrieve secret (from Kubernetes Secret)
 
 ```bash
 kubectl --context={env.context} get secret {secret_name} -n {namespace} -o jsonpath='{.data.{key}}' | base64 -d
 ```
 
+## Retrieve environment variable from a running pod (pod_env pattern)
+
+Use when `credentials_from: pod_env:<VAR_NAME>` is set in config — reads the variable from a
+running app pod instead of a Kubernetes Secret:
+
+```bash
+kubectl --context={env.context} get pods -n {env.namespaces.apps} -l app={service} -o name | head -1
+kubectl --context={env.context} exec {pod} -n {env.namespaces.apps} -- printenv {VAR_NAME}
+```
+
 ## List secrets in namespace
 
 ```bash
diff --git a/plugins/ops-suite/skills/queue-status/SKILL.md b/plugins/ops-suite/skills/queue-status/SKILL.md
index 9cc63c2..3805fdd 100644
--- a/plugins/ops-suite/skills/queue-status/SKILL.md
+++ b/plugins/ops-suite/skills/queue-status/SKILL.md
@@ -55,7 +55,7 @@ Flag the following conditions:
 
 | Condition | Severity | Action |
 |-----------|----------|--------|
-| DLQ with messages > 0 | Warning | Auto-chain to `queue-triage` (see below) |
+| DLQ with messages > 0 | Warning | Suggest queue-triage (see Step 6) |
 | Queue with 0 consumers | Warning | Check if consumer service is running |
 | Queue with growing message count | Warning | Consumer may be slow or stuck |
 | Queue in "down" or "stopped" state | Critical | Investigate immediately |
@@ -83,7 +83,9 @@ Summary:
   Queues with 0 consumers: {count}
 ```
 
-If any DLQs have messages, automatically triage the first one:
+If any DLQs have messages, suggest next steps:
 
-Use ops-suite:queue-triage with arguments: {dlq_name} {env_name}.
-Use session state from /tmp/ops-suite-session/ — do not re-ask for environment.
+```
+Next steps:
+  → Run `/ops-suite:queue-triage {dlq_name} {env_name}` to diagnose why messages are failing.
+```
diff --git a/plugins/ops-suite/skills/queue-triage/SKILL.md b/plugins/ops-suite/skills/queue-triage/SKILL.md
index 3f73b7d..8e97845 100644
--- a/plugins/ops-suite/skills/queue-triage/SKILL.md
+++ b/plugins/ops-suite/skills/queue-triage/SKILL.md
@@ -98,11 +98,21 @@ python3 scripts/analyze_messages.py {messages_file}
 
 If the codebase is available:
 
-1. **Find the subscription config**: Search for the queue name in config files (e.g. `grep -r "queue_name" src/config/`). Identify the subscription name mapped to this queue.
-2. **Verify subscribe() call exists**: Search for `subscribe('{subscription_name}'` in the codebase. Compare all subscriptions declared in config vs actual `subscribe()` calls — any mismatch means orphaned config.
-3. **Find the consumer handler**: Locate the subscriber class (typically in `application/amqp/`) and read its `onApplicationBootstrap()` method to see which subscriptions are actually registered.
-4. **Check error handling patterns**: Look for try/catch, reject, or nack logic in the handler.
-5. **Check if the error matches a known code path**: Cross-reference with the failure mode from Step 5.
+1. **Find the subscription config**: Search for the queue name in config and source files:
+   ```bash
+   grep -r "{queue_name}" src/ --include="*.ts" --include="*.yaml" --include="*.json" -l
+   ```
+   Identify the subscription name or handler mapped to this queue. Adjust the `--include` globs here and in the next step to match the codebase's language.
+2. **Locate the consumer handler**: Search for the handler that processes this subscription:
+   ```bash
+   grep -r "{subscription_name}" src/ --include="*.ts" -l
+   ```
+3. **Check error handling**: Look for try/catch, reject, or nack logic in the handler.
+4. **Verify the handler is registered**: Check that the subscription is actually active at runtime (not only declared in config). Any mismatch between config and actual subscriptions means orphaned config.
+5. **Cross-reference with the failure mode from Step 5**.
+
+If the codebase uses a specific framework, load `references/consumer-patterns.md` for
+framework-specific search patterns (NestJS + RabbitMQ, Spring AMQP, Celery).
 
 ## Step 6b — Check git history for removed handlers
 
diff --git a/plugins/ops-suite/skills/queue-triage/references/consumer-patterns.md b/plugins/ops-suite/skills/queue-triage/references/consumer-patterns.md
new file mode 100644
index 0000000..669fa18
--- /dev/null
+++ b/plugins/ops-suite/skills/queue-triage/references/consumer-patterns.md
@@ -0,0 +1,93 @@
+# Consumer Patterns — Framework-Specific Search Guide
+
+Load this reference in queue-triage Step 6 when you know the consumer framework.
+
+---
+
+## NestJS + @golevelup/nestjs-rabbitmq
+
+### Find subscription config
+
+Subscriptions are typically declared in a module config file:
+
+```bash
+grep -r "subscriptions" src/config/ --include="*.ts" -A 20
+grep -r "RabbitMQModule.forRoot" src/ --include="*.ts" -A 30
+```
+
+Look for entries like:
+```typescript
+subscriptions: {
+  mySubscriptionName: { routingKey: 'some.routing.key', queue: 'queue_name' }
+}
+```
+
+### Verify subscribe() call exists
+
+```bash
+grep -r "subscribe('" src/ --include="*.ts"
+```
+
+Compare all subscription names in config vs all `subscribe('...')` calls. Any name in config
+without a matching `subscribe()` call = orphaned config (messages go to DLQ with no consumer).
+
+### Find the subscriber class
+
+Consumer classes typically live in `src/**/application/amqp/` or `src/**/infrastructure/amqp/`:
+
+```bash
+find src/ -path "*/amqp/*.ts" -o -path "*/subscribers/*.ts" | head -20
+```
+
+### Check registered subscriptions at bootstrap
+
+Look for `onApplicationBootstrap()` in subscriber classes — this is where `subscribe()` calls are made:
+
+```bash
+grep -r "onApplicationBootstrap" src/ --include="*.ts" -l
+```
+
+Read the method body to see which subscriptions are actually registered at runtime.
+
+### Common failure modes in NestJS
+
+| Pattern | What to search for | Root cause |
+|---------|-------------------|------------|
+| `subscribe()` call missing | `grep -r "subscribe("` returns no match for that name | Handler was removed or renamed |
+| Subscription in config, no handler | Config has an entry, but no `onApplicationBootstrap()` registers it | Orphaned config |
+| Handler throws uncaught error | `grep -r -e nack -e reject src/` finds no error handling in the handler | Missing try/catch in handler |
+| DTO validation fails | Look for `class-validator` decorators in the DTO | Producer changed payload shape |
+
+---
+
+## Spring AMQP (Java/Kotlin)
+
+### Find the listener
+
+```bash
+grep -r "@RabbitListener" src/ --include="*.java" --include="*.kt" -l
+grep -r "queues = " src/ --include="*.java" --include="*.kt"
+```
+
+### Check the binding
+
+```bash
+grep -r "@Bean" src/ --include="*.java" --include="*.kt" -A 5 | grep -A 5 "Queue\|Binding"
+```
+
+---
+
+## Celery (Python)
+
+### Find the task handler
+
+```bash
+grep -r "@app.task\|@celery.task\|@shared_task" . --include="*.py" -l
+grep -r "queue=" . --include="*.py" | grep "{queue_name}"
+```
+
+### Check task routing
+
+```bash
+grep -r "task_routes\|CELERY_ROUTES" . --include="*.py"
+```
diff --git a/plugins/ops-suite/skills/workflow-deploy/SKILL.md b/plugins/ops-suite/skills/workflow-deploy/SKILL.md
index 4d3f29b..f54598a 100644
--- a/plugins/ops-suite/skills/workflow-deploy/SKILL.md
+++ b/plugins/ops-suite/skills/workflow-deploy/SKILL.md
@@ -32,7 +32,7 @@ Store answers as: `{ref}`, `{env_name}`, `{migrations}`, `{rollback}`.
 
 ## Phase B — Pre-flight (read-only, no confirmation needed)
 
-Load the CI adapter file at `../deploy/adapters/{deploy.ci_provider}.md` and extract:
+Load the CI adapter file at `${CLAUDE_PLUGIN_ROOT}/skills/deploy/adapters/{deploy.ci_provider}.md` and extract:
 - The commands to verify and trigger a deployment
 - The rollback command (`{ci_rollback_command}`) — used if a rollback plan is requested