96 changes: 96 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,96 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## What this repo is

Pure Markdown + YAML skill collection for Claude Code. No build, no tests, no linting — validation is manual (read and review skill files). The only commands you'll run are `git` operations.

## Architecture: Config + Adapters

Every skill follows the same pattern:

```
config.yaml → SKILL.md → adapters/<technology>.md
```

1. The user defines their stack once in `config.yaml` (`orchestrator: kubernetes`, `message_broker: rabbitmq`, etc.)
2. Each `SKILL.md` reads config and loads the right adapter at runtime
3. Adapters contain actual CLI commands with `{config.X.Y}` placeholders

**To add support for a new technology**: create `adapters/<tech>.md` inside the skill folder — no changes to `SKILL.md` or config schema needed.
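
As a rough illustration of how the `{config.X.Y}` placeholders in an adapter command map onto values from `config.yaml`, here is a minimal Python sketch. The names `render` and `lookup` and the sample context are hypothetical; in this repo Claude performs the substitution itself while reading the adapter, so this only models the intended semantics, including nested placeholders such as `{deploy.local_ports.{env_name}}`.

```python
import re

# Matches an innermost placeholder such as {env.context} or {env_name}.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")


def lookup(path, context):
    """Walk a dotted path (e.g. 'deploy.local_ports.staging') through nested dicts."""
    node = context
    for part in path.split("."):
        node = node[part]  # raises KeyError if the config lacks the key
    return node


def render(template, context):
    """Substitute placeholders repeatedly so nested ones resolve inside-out."""
    previous = None
    while previous != template:
        previous = template
        template = PLACEHOLDER.sub(
            lambda match: str(lookup(match.group(1), context)), template
        )
    return template


if __name__ == "__main__":
    # Hypothetical context; real values come from the user's config.yaml.
    context = {
        "env_name": "staging",
        "env": {"context": "stg-cluster"},
        "deploy": {"local_ports": {"staging": 5433}},
    }
    print(render("kubectl --context={env.context} port-forward svc/db "
                 "{deploy.local_ports.{env_name}}:5432", context))
```

A missing key raises `KeyError` in this sketch; whether a skill should stop or ask the user in that case is not specified here.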

## Plugin layout

```
plugins/
ops-suite/ # Infrastructure: status, logs, deploy, DB, queues
skills/<name>/
SKILL.md # Frontmatter + step-by-step instructions
adapters/ # One file per supported technology
references/ # Deep docs loaded on-demand (keep SKILL.md < 500 lines)
commands/ # Command shims (.md) that invoke each skill
hooks/ # hooks.json + session-start.sh
runtime/ # Shared docs: chaining.md, safety.md, session-state.md
config.example.yaml
config.yaml # User's actual config (gitignored)
refinery/ # Roadmap refinement: tickets, design, docs, sprints
creating-skills/ # Meta-skill for authoring new skills
.claude-plugin/
marketplace.json # Plugin registry (name, version, source paths)
```

## SKILL.md frontmatter

Key fields when creating or editing skills:

| Field | Notes |
|-------|-------|
| `name` | Kebab-case, becomes the slash-command |
| `description` | **Most critical** — drives auto-invocation. Start with "Use when…" + natural-language triggers. Max 1024 chars. |
| `disable-model-invocation: true` | Required for all destructive ops (deploy, migrate, reprocess). Prevents auto-chaining. |
| `allowed-tools` | Restrict what the skill can call |
| `model` | Override per-skill (e.g. `haiku` for cheap read-only checks) |
| `argument-hint` | Autocomplete hint shown to user |
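
The frontmatter rules in the table can be expressed as a small checker. This is only a sketch for reviewers doing the manual validation this repo relies on: `check_frontmatter` and its `destructive` flag are hypothetical, and the frontmatter is assumed to be already parsed into a dict.

```python
import re

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
MAX_DESCRIPTION_CHARS = 1024  # limit stated in the table above


def check_frontmatter(frontmatter, destructive=False):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []

    name = frontmatter.get("name", "")
    if not KEBAB_CASE.match(name):
        problems.append("name must be kebab-case")

    description = frontmatter.get("description", "")
    if not description.startswith("Use when"):
        problems.append('description should start with "Use when"')
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description exceeds 1024 characters")

    # Destructive operations must opt out of model invocation.
    if destructive and frontmatter.get("disable-model-invocation") is not True:
        problems.append("destructive skills need disable-model-invocation: true")

    return problems


if __name__ == "__main__":
    print(check_frontmatter(
        {"name": "db-migrate", "description": "Use when applying migrations"},
        destructive=True,
    ))
```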

## Session state (ops-suite)

Skills share state through `/tmp/ops-suite-session/`:

- `config.json` — parsed `config.yaml`, written by the `session-start.sh` hook and cached for the session
- `env.json` — selected environment (planned)
- `credentials.json` — DB/broker credentials (planned)
- `port-forwards.json` — active port-forward PIDs (planned)

Step 0 in every skill checks for `/tmp/ops-suite-session/config.json` before re-parsing config.
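
The Step 0 check can be pictured as a cache lookup with a fallback. The sketch below is an assumption about the control flow, not the skills' actual implementation: `load_config` is a hypothetical name, and `parse_config_yaml` stands in for whatever parses `config.yaml` when the cache is absent.

```python
import json
from pathlib import Path

SESSION_DIR = Path("/tmp/ops-suite-session")


def load_config(parse_config_yaml, session_dir=SESSION_DIR):
    """Prefer the cached config.json; fall back to re-parsing config.yaml."""
    cache = Path(session_dir) / "config.json"
    if cache.is_file():
        # Cache written earlier in the session (by the session-start hook).
        return json.loads(cache.read_text())
    # No cache: parse from source. This sketch does not write the cache itself.
    return parse_config_yaml()


if __name__ == "__main__":
    # Hypothetical fallback parser returning a fixed dict.
    print(load_config(lambda: {"orchestrator": "kubernetes"}))
```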

## Skill chaining rules

**Read-only skills** (no `disable-model-invocation`) can be auto-invoked mid-step:
```
Use ops-suite:service-status with arguments: {service} {env_name}.
Use session state from /tmp/ops-suite-session/ — do not re-ask for environment.
```

**Destructive skills** (`disable-model-invocation: true`) must be suggested, never auto-invoked:
```
Next steps:
→ Run `/ops-suite:db-migrate {env_name}` to apply pending migrations.
```

Chain depth is capped at 3.
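
The two chaining rules and the depth cap reduce to a small decision function. This is a sketch under assumptions: `chain_action` is a hypothetical name, and falling back to a suggestion once the cap is reached is an assumed behavior, since the rule above only states that the depth is capped.

```python
MAX_CHAIN_DEPTH = 3  # cap stated above


def chain_action(frontmatter, current_depth):
    """Decide whether the next skill may be auto-invoked or only suggested."""
    if frontmatter.get("disable-model-invocation", False):
        # Destructive skill: never auto-invoke, only suggest.
        return "suggest"
    if current_depth >= MAX_CHAIN_DEPTH:
        # Assumed fallback once the chain is already at the cap.
        return "suggest"
    return "auto-invoke"


if __name__ == "__main__":
    print(chain_action({"name": "service-status"}, current_depth=1))
    print(chain_action({"name": "db-migrate", "disable-model-invocation": True}, current_depth=1))
```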

## Safety classification

| Skill | Type |
|-------|------|
| service-status, service-logs, db-query, queue-status, port-forward | read-only — auto-chainable |
| deploy, db-migrate, queue-reprocess | destructive — suggest only, always ask for explicit confirmation |

## Adding a new skill

1. Run `/creating-skills:creating-skills` — it guides the full authoring process
2. Create `plugins/<plugin>/skills/<name>/SKILL.md` with the standard frontmatter
3. Add adapters under `adapters/` for each supported technology
4. Add a command shim at `plugins/<plugin>/commands/<name>.md`
5. Test three ways: trigger test (a natural-language prompt invokes the skill), functional test (run `/skill-name` directly), performance test (SKILL.md stays under 500 lines)
17 changes: 7 additions & 10 deletions plugins/ops-suite/skills/db-migrate/SKILL.md
@@ -51,18 +51,15 @@ Show the user:
The migration tool needs a database connection. This typically requires:

1. **Port-forward** to the database using the orchestrator:
```
kubectl --context={env.context} port-forward svc/{env.services.database.name} {deploy.local_ports.{env_name}}:{env.services.database.port} -n {env.services.database.namespace || env.namespaces.infra} &
```
Note: Use `env.services.database.namespace` if defined — the database/pgbouncer may live in a
different namespace than `env.namespaces.infra`.
Load `${CLAUDE_PLUGIN_ROOT}/skills/port-forward/adapters/{orchestrator}.md` and use its
"Port-forward a service (background)" command for `{env.services.database.name}`,
local port `{deploy.local_ports.{env_name}}`, remote port `{env.services.database.port}`.
Use `{env.services.database.namespace}` if defined, otherwise `{env.namespaces.infra}`.

2. **Credentials**: First check if `env.services.database.credentials_from` is defined in config:
- If `pod_env:<VAR_NAME>`: retrieve from a running app pod:
```
kubectl --context={env.context} exec {any_app_pod} -n {env.namespaces.apps} -- printenv <VAR_NAME>
```
- Otherwise, use the adapter's "retrieve secret" command or ask the user.
- If `pod_env:<VAR_NAME>`: load `${CLAUDE_PLUGIN_ROOT}/skills/port-forward/adapters/{orchestrator}.md`
and use its "Retrieve environment variable from a running pod (pod_env pattern)" command to read the variable from a running app pod.
- Otherwise, use the migration adapter's credential retrieval command or ask the user.
Never hardcode or display credentials in plain text.

3. **Set environment variables** as required by the migration tool (from adapter).
10 changes: 5 additions & 5 deletions plugins/ops-suite/skills/db-query/SKILL.md
@@ -38,11 +38,11 @@ If `$ARGUMENTS` contains an environment name, use it. Otherwise ask the user.
Check if a port-forward is already active on the expected local port (`deploy.local_ports.{env_name}`).

If not active:
1. Start a port-forward using the orchestrator:
```
kubectl --context={env.context} port-forward svc/{env.services.database.name} {deploy.local_ports.{env_name}}:{env.services.database.port} -n {env.services.database.namespace || env.namespaces.infra} &
```
2. Verify the connection is working
1. Load `${CLAUDE_PLUGIN_ROOT}/skills/port-forward/adapters/{orchestrator}.md` and use its
"Port-forward a service (background)" command for `{env.services.database.name}`,
local port `{deploy.local_ports.{env_name}}`, remote port `{env.services.database.port}`,
namespace `{env.services.database.namespace}` if defined, otherwise `{env.namespaces.infra}`.
2. Verify the connection using the adapter's connection check command.

Retrieve or ask for credentials. Never hardcode credentials.

12 changes: 11 additions & 1 deletion plugins/ops-suite/skills/port-forward/adapters/kubernetes.md
@@ -56,12 +56,22 @@ nc -z localhost {local_port} 2>/dev/null && echo "Connection OK" || echo "Connec
curl -s -o /dev/null -w '%{http_code}' http://localhost:{local_port}/health
```

## Retrieve secret (generic pattern)
## Retrieve secret (from Kubernetes Secret)

```bash
kubectl --context={env.context} get secret {secret_name} -n {namespace} -o jsonpath='{.data.{key}}' | base64 -d
```

## Retrieve environment variable from a running pod (pod_env pattern)

Use when `credentials_from: pod_env:<VAR_NAME>` is set in config — reads the variable from a
running app pod instead of a Kubernetes Secret:

```bash
kubectl --context={env.context} get pods -n {env.namespaces.apps} -l app={service} -o name | head -1
kubectl --context={env.context} exec {pod} -n {env.namespaces.apps} -- printenv {VAR_NAME}
```

## List secrets in namespace

```bash
10 changes: 6 additions & 4 deletions plugins/ops-suite/skills/queue-status/SKILL.md
@@ -55,7 +55,7 @@ Flag the following conditions:

| Condition | Severity | Action |
|-----------|----------|--------|
| DLQ with messages > 0 | Warning | Auto-chain to `queue-triage` (see below) |
| DLQ with messages > 0 | Warning | Suggest queue-triage (see Step 6) |
| Queue with 0 consumers | Warning | Check if consumer service is running |
| Queue with growing message count | Warning | Consumer may be slow or stuck |
| Queue in "down" or "stopped" state | Critical | Investigate immediately |
@@ -83,7 +83,9 @@ Summary:
Queues with 0 consumers: {count}
```

If any DLQs have messages, automatically triage the first one:
If any DLQs have messages, suggest next steps:

Use ops-suite:queue-triage with arguments: {dlq_name} {env_name}.
Use session state from /tmp/ops-suite-session/ — do not re-ask for environment.
```
Next steps:
→ Run `/ops-suite:queue-triage {dlq_name} {env_name}` to diagnose why messages are failing.
```
20 changes: 15 additions & 5 deletions plugins/ops-suite/skills/queue-triage/SKILL.md
@@ -98,11 +98,21 @@ python3 scripts/analyze_messages.py {messages_file}

If the codebase is available:

1. **Find the subscription config**: Search for the queue name in config files (e.g. `grep -r "queue_name" src/config/`). Identify the subscription name mapped to this queue.
2. **Verify subscribe() call exists**: Search for `subscribe('{subscription_name}'` in the codebase. Compare all subscriptions declared in config vs actual `subscribe()` calls — any mismatch means orphaned config.
3. **Find the consumer handler**: Locate the subscriber class (typically in `application/amqp/`) and read its `onApplicationBootstrap()` method to see which subscriptions are actually registered.
4. **Check error handling patterns**: Look for try/catch, reject, or nack logic in the handler.
5. **Check if the error matches a known code path**: Cross-reference with the failure mode from Step 5.
1. **Find the subscription config**: Search for the queue name in config and source files:
```bash
grep -r "{queue_name}" src/ --include="*.ts" --include="*.yaml" --include="*.json" -l
```
Identify the subscription name or handler mapped to this queue.
2. **Locate the consumer handler**: Search for the handler that processes this subscription:
```bash
grep -r "{subscription_name}" src/ --include="*.ts" -l
```
3. **Check error handling**: Look for try/catch, reject, or nack logic in the handler.
4. **Verify the handler is registered**: Check that the subscription is actually active at runtime (not only declared in config). Any mismatch between config and actual subscriptions means orphaned config.
5. **Cross-reference with the failure mode from Step 5**.

If the codebase uses a specific framework, load `references/consumer-patterns.md` for
framework-specific search patterns (NestJS + RabbitMQ, Spring AMQP, Celery).

## Step 6b — Check git history for removed handlers

@@ -0,0 +1,93 @@
# Consumer Patterns — Framework-Specific Search Guide

Load this reference in queue-triage Step 6 when you know the consumer framework.

---

## NestJS + @golevelup/nestjs-rabbitmq

### Find subscription config

Subscriptions are typically declared in a module config file:

```bash
grep -r "subscriptions" src/config/ --include="*.ts" -A 20
grep -r "RabbitMQModule.forRoot" src/ --include="*.ts" -A 30
```

Look for entries like:
```typescript
subscriptions: {
mySubscriptionName: { routingKey: 'some.routing.key', queue: 'queue_name' }
}
```

### Verify subscribe() call exists

```bash
grep -r "subscribe('" src/ --include="*.ts"
```

Compare all subscription names in config vs all `subscribe('...')` calls. Any name in config
without a matching `subscribe()` call = orphaned config (messages go to DLQ with no consumer).
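
The comparison described above is a set difference, and can be sketched in Python as follows. `find_orphaned` is a hypothetical helper: it assumes subscription names appear as string literals directly inside `subscribe(...)`, so names passed through constants or variables would be missed.

```python
import re

# Captures the first string-literal argument of a subscribe(...) call.
SUBSCRIBE_CALL = re.compile(r"subscribe\(\s*['\"]([^'\"]+)['\"]")


def find_orphaned(config_names, source_text):
    """Return config subscription names with no matching subscribe('...') call."""
    subscribed = set(SUBSCRIBE_CALL.findall(source_text))
    return sorted(set(config_names) - subscribed)


if __name__ == "__main__":
    # Hypothetical source snippet and subscription names.
    source = """
    this.amqp.subscribe('orderCreated', this.onOrderCreated);
    this.amqp.subscribe("orderPaid", this.onOrderPaid);
    """
    print(find_orphaned(["orderCreated", "orderPaid", "orderCancelled"], source))
```

Any name this returns is a candidate for orphaned config and still needs a manual check against the subscriber class.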

### Find the subscriber class

Consumer classes typically live in `src/**/application/amqp/` or `src/**/infrastructure/amqp/`:

```bash
find src/ -path "*/amqp/*.ts" -o -path "*/subscribers/*.ts" | head -20
```

### Check registered subscriptions at bootstrap

Look for `onApplicationBootstrap()` in subscriber classes — this is where `subscribe()` calls are made:

```bash
grep -r "onApplicationBootstrap" src/ --include="*.ts" -l
```

Read the method body to see which subscriptions are actually registered at runtime.

### Common failure modes in NestJS

| Pattern | What to search for | Root cause |
|---------|-------------------|------------|
| `subscribe()` call missing | `grep -r "subscribe("` returns no match for that name | Handler was removed or renamed |
| Subscription in config, no handler | Config has entry, no `onApplicationBootstrap` with it | Orphaned config |
| Handler throws uncaught error | `grep -r "nack\|reject" src/` — handler may not handle errors | Missing try/catch in handler |
| DTO validation fails | Look for `class-validator` decorators in the DTO | Producer changed payload shape |

---

## Spring AMQP (Java/Kotlin)

### Find the listener

```bash
grep -r "@RabbitListener" src/ --include="*.java" --include="*.kt" -l
grep -r "queues = " src/ --include="*.java" --include="*.kt"
```

### Check the binding

```bash
grep -r "@Bean" src/ --include="*.java" --include="*.kt" -A 5 | grep -A 5 "Queue\|Binding"
```

---

## Celery (Python)

### Find the task handler

```bash
grep -r "@app.task\|@celery.task\|@shared_task" . --include="*.py" -l
grep -r "queue=" . --include="*.py" | grep "{queue_name}"
```

### Check task routing

```bash
grep -r "task_routes\|CELERY_ROUTES" . --include="*.py"
```
2 changes: 1 addition & 1 deletion plugins/ops-suite/skills/workflow-deploy/SKILL.md
@@ -32,7 +32,7 @@ Store answers as: `{ref}`, `{env_name}`, `{migrations}`, `{rollback}`.

## Phase B — Pre-flight (read-only, no confirmation needed)

Load the CI adapter file at `../deploy/adapters/{deploy.ci_provider}.md` and extract:
Load the CI adapter file at `${CLAUDE_PLUGIN_ROOT}/skills/deploy/adapters/{deploy.ci_provider}.md` and extract:
- The commands to verify and trigger a deployment
- The rollback command (`{ci_rollback_command}`) — used if a rollback plan is requested
