Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .agents/skills/durable-objects/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,9 @@ Fetch the relevant doc page when implementing features.
- `./references/rules.md` - Core rules, storage, concurrency, RPC, alarms
- `./references/testing.md` - Vitest setup, unit/integration tests, alarm testing
- `./references/workers.md` - Workers handlers, types, wrangler config, observability
- `./references/repo-conventions.md` - This repo's Worker/DO conventions: DO call

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this is actual review feedback and here's where we should make a decision on how we want to handle 3rd party vs custom skills.

If we want to use npx skills as our "skill package manager" to easily be able to update skills to their latest version, then I think we should withhold from making any changes to those skills like we are doing here.

An alternative: No 3rd party skills in repo by default

  • We could have a recommended list of skills that might be useful but require the user to install it separately.
  • This puts extra burden on the developer to make sure they have the right "tools" in place when working on the repo, but one could argue that's how we've done things in the past with which IDE you use, etc.

@marius-kilocode might have some thoughts on this as I believe he favors a more bare repo with limited agent pre-configuration to allow for more flexibility in personal development workflows and control.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned this above as well but:

i actually struggled with this more broadly as well - because i also have the actual official cloudflare skill + cloudflare mcp set up globally.

I'd be up for this. I already have my own review agents, pr skills, etc. So this doesn't feel un-natural.

Some of these custom additions for Cloudflare specifically could also go in REVIEWS.md. e.g. insisting on withDORetry

retries, DO stub helpers, sub-module splitting, IO boundaries, and DB-client
lifecycle

Search: `blockConcurrencyWhile`, `idFromName`, `getByName`, `setAlarm`, `sql.exec`

Expand Down
84 changes: 84 additions & 0 deletions .agents/skills/durable-objects/references/repo-conventions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,84 @@
# Repo Conventions — Worker/DO Services in This Monorepo

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment/question, not review feedback:

Something I ran into in the on-call repo was the split between where we put custom-authored skills and 3rd party skills, and does it even matter. For instance we tend to use npx skills as our "skill package manager" for 3rd party skills and install that into .agents/skills as its mostly supported across all agents.

In the on-call repo we had a mixture of custom and 3rd party skills and ended up with:

- .agents/skills → 3rd party skills
- .kilo/skills → custom skills

It feels duplicative though to have skills in two places and maybe we should just put it all in .agents/skills like it seems we're going for in this repo.

No strong opinions, just think it would be good to be consistent across all repos so we don't have to think about it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i actually struggled with this more broadly as well - because i also have the actual official cloudflare skill + cloudflare mcp set up globally.

I'm down for copying the setup in the on-call repo.


Always bias towards following established patterns in existing services. These
are merely guidelines.

## DO call retries

Retry Durable Object stub calls with `withDORetry` from `@kilocode/worker-utils`
(`packages/worker-utils/src/do-retry.ts`) rather than hand-rolling retry/backoff
loops. It creates a fresh stub per attempt (required, since certain errors break a
stub), retries only errors with Cloudflare's documented `.retryable === true`
property, and applies jittered exponential backoff.

```ts
const result = await withDORetry(
() => env.MY_DO.get(env.MY_DO.idFromName(key)), // fresh stub per attempt
stub => stub.getMetadata(),
'getMetadata'
);
```

Services that need service-local log correlation wrap it with their own logger
bound in, e.g. `services/cloud-agent-next/src/utils/do-retry.ts` and
`services/webhook-agent-ingest/src/util/do-retry.ts` — both are thin
logger-binding adapters over the shared `withDORetry`, not reimplementations.
Prefer this pattern (a local `withDORetry` re-export bound to the service logger)
over calling the base helper directly with no logger, and over copying the retry
loop itself.

## DO stub helper

Each DO module must export a `get{ClassName}Stub` helper function (e.g.
`getRigDOStub`) that centralizes how that DO namespace creates instances. Callers
use this helper instead of accessing the namespace binding directly.

## Sub-modules for large DOs

When a Durable Object grows beyond a few hundred lines, extract domain logic into
sub-modules under a `<do-name>/` directory alongside the DO file. For example,
`Town.do.ts` delegates to modules in `town/`:

```
dos/
Town.do.ts # Class definition, RPC methods, alarm loop
town/
agents.ts # Agent CRUD, hook management
beads.ts # Bead CRUD, convoy progress
```

Each sub-module exports plain functions (not classes) that accept `SqlStorage` and
any other required context as arguments. The DO imports them with the
`import * as X` pattern:

```ts
import * as beadOps from './town/beads';
import * as agents from './town/agents';
import * as scheduling from './town/scheduling';

// In the DO class:
beadOps.updateBeadStatus(this.sql, beadId, 'closed', agentId);
agents.getOrCreateAgent(this.sql, 'polecat', rigId, this.townId);
await scheduling.schedulePendingWork(this.schedulingCtx);
```

This keeps the DO class thin (RPC surface + orchestration) while sub-modules own
the business logic. The `import * as X` pattern makes call sites self-documenting —
you can always tell which domain a function belongs to.

## IO boundaries

- Validate data at IO boundaries (HTTP responses, JSON.parse results, SSE
event payloads, subprocess output, upstream responses, persisted session
records) with Zod schemas. Return `unknown` from raw fetch/parse helpers and
`.parse()` in the caller.
- Never use `as` to cast IO data. If the shape is known, define a Zod schema; if
not, use `.passthrough()` or a catch-all schema.

## DB clients

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: File naming/Column naming/SQL queries/HTTP routes sections removed with no home

This deletes the "File naming", "Column naming", "SQL queries" (table-interpolator, no-table-aliasing, Zod row-parsing rules), and "HTTP routes" (no Hono sub-app mounting) sections that an earlier commit in this same PR pointed to as the shared home for these conventions (see this file's own earlier wording and services/mcp-gateway/AGENTS.md's removed "File naming" section, which now assumes this file covers it). services/gastown/AGENTS.md and services/wasteland/AGENTS.md are untouched by this PR and still carry these conventions directly, but services/mcp-gateway/AGENTS.md removed its own copies expecting this file to retain them — leaving those rules (notably the SQL table-aliasing/raw-string guardrails) undocumented for mcp-gateway.


Reply with @kilocode-bot fix it to have Kilo Code address this issue.


See `workers-best-practices` skill (`references/rules.md` → "Repo rule: never
cache DB clients/pools in module scope") for the Hyperdrive/`getWorkerDb`
lifecycle rule shared by all Workers in this repo. Durable Object instance fields
(e.g. a Drizzle/SQLite wrapper over `state.storage`) are exempt — that's
object-local state, not module-scope caching.
33 changes: 33 additions & 0 deletions .agents/skills/specs/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
---
name: specs
description: Business-rule specs for KiloClaw billing/lifecycle/controller/data model/Composio, MCP Gateway auth, model experiments, Security Agent, subscription center, team/enterprise seat billing, Impact affiliate/referrals, Kilo Pass, organization SSO, Stripe early fraud warnings, and coding plans. Load when you need context about the business requirements that guided the implementation.
---

# Business-Rule Specs

Specs in `.specs/` are context as to the original business intention, rules and
invariants of the domains they cover. Consult them for context and flag inconsistencies
to the user if instructions or changes will cause deviations from the original intent.

## Index

| Spec | Governs |
|---|---|
| `.specs/kiloclaw-billing.md` | KiloClaw billing, pricing, invoicing, usage metering, payment flows |
| `.specs/kiloclaw-billing-lifecycle.md` | KiloClaw billing lifecycle — credit-renewal orchestration safety |
| `.specs/kiloclaw-composio.md` | KiloClaw Composio credential provisioning, injection, and sharing |
| `.specs/kiloclaw-controller.md` | KiloClaw controller/machine lifecycle, bootstrap, Docker image |
| `.specs/kiloclaw-datamodel.md` | KiloClaw data model — instance/subscription tables, invariants |
| `.specs/mcp-gateway-auth.md` | Kilo MCP Gateway v1 — protocol surface, ownership, OAuth lifecycle, provider grants, runtime auth |
| `.specs/model-experiments.md` | Model experiment routing, bucketing, lifecycle, prompt retention, and reporting rules |
| `.specs/security-agent.md` | Security Agent Auto Remediation and finding/SLA notification guarantees |
| `.specs/subscription-center.md` | Subscription Center ownership, states, and user-facing behavior |
| `.specs/team-enterprise-seat-billing.md` | Team and Enterprise seat billing, subscription management |
| `.specs/impact-affiliate-tracking.md` | Impact.com affiliate conversion tracking |
| `.specs/impact-referrals.md` | Impact.com Advocate referral programs for KiloClaw and Kilo Pass |
| `.specs/kilo-pass.md` | Kilo Pass states, provider support, credit amounts, eligibility, lifecycle |
| `.specs/organization-sso.md` | Organization SSO enforcement — auth requirements, membership admission/removal, policy inheritance |
| `.specs/stripe-early-fraud-warnings.md` | Stripe Early Fraud Warning enforcement — scope, containment, financial unwinding, remediation |
| `.specs/coding-plans.md` | Coding Plans business rules (RFC 2119 normative language) |

`.specs/template.md` is the authoring template for new specs, not a governed domain.
1 change: 1 addition & 0 deletions .agents/skills/workers-best-practices/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,6 +60,7 @@ mkdir -p /tmp/workers-types-latest && \
| Queues & Workflows | Move async/background work off the critical path |
| Service bindings | Use service bindings for Worker-to-Worker calls — not public HTTP |
| Hyperdrive | Always use Hyperdrive for external PostgreSQL/MySQL connections |
| Repo: DB client lifecycle | Never cache DB clients/pools in module scope (Worker or DO); use `getWorkerDb(...)` per use. DO instance fields are fine for object-local state. See `references/rules.md` |

### Observability

Expand Down
30 changes: 30 additions & 0 deletions .agents/skills/workers-best-practices/references/rules.md
Original file line number Diff line number Diff line change
Expand Up @@ -268,6 +268,36 @@ async fetch(request: Request, env: Env): Promise<Response> {

**Retrieve**: `/hyperdrive/` for current configuration and supported databases.

### Repo rule: never cache DB clients/pools in module scope (Workers or Durable Objects)

This repo's Workers and Durable Objects must not cache database clients, pools, or other transport-owning/request-context-bound SDK objects in module scope. Workers reuse isolates across requests, and Durable Object classes in the same Worker can share module memory across object instances — stale module-scope I/O state causes cross-context runtime failures, not just the generic "stale request state" anti-pattern above.

**Check**: no `const client = new Client(...)` (or Drizzle/pg pool equivalent) declared at module scope, even for "read-only" or "singleton" convenience.

- Create external database clients through the approved per-use helper, `getWorkerDb(...)`; let Hyperdrive own pooling.
- Only cache pure data or context-independent values in module scope (e.g. static config objects, parsed constants).
- Durable Object **instance fields** are fine for object-local state created from constructor inputs — for example, SQLite/Drizzle wrappers over `state.storage`. That is per-object state, not shared module state.
- If an optimization appears to require module-scope client caching, stop and document why lifetime, transport ownership, binding freshness, and Cloudflare runtime behavior make it safe before implementing it.

```ts
// Correct: per-request/per-call helper owns client lifecycle
export async function handler(env: Env) {
const db = getWorkerDb(env.HYPERDRIVE.connectionString);
return db.select().from(users);
}
```

Anti-pattern:

```ts
// Module-scope client — shared across isolate reuse and (for DOs) across instances
const db = drizzle(new Client({ connectionString: env.HYPERDRIVE.connectionString }));

export async function handler() {
return db.select().from(users); // stale/leaked connection risk
}
```

---

## Observability
Expand Down
30 changes: 30 additions & 0 deletions .kilo/command/cloud-pr.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
---

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be good to delete this command. I imagine most folks have their own workflow for this and the repo template serves as a guide for external contributors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

description: Create a pull request following repo conventions
---

Create a pull request for the current branch following the repo conventions below. $ARGUMENTS

Before opening the PR, inspect the branch: review all commits on it (not just the latest), `git status`, `git diff` against the base branch, and remote tracking state.

## Titles

- Format: `type(scope): <description>` (e.g., `feat(auth): add SSO login`)
- Common types: `feat`, `fix`, `refactor`, `docs`, `test`, `chore`, `ci`, `style`, `perf`
- Imperative mood, under 72 characters, no trailing period.

## Descriptions

Follow the PR template in `.github/pull_request_template.md`. Every description must include four sections in order:

1. **`## Summary`** — What changed and why. Outcome-focused, call out architectural changes.
2. **`## Verification`** — Manual verification only. Do not list automated checks such as `pnpm typecheck`, `pnpm test`, `pnpm lint`, `pnpm validate`, CI, or formatting commands here.
3. **`## Visual Changes`** — Before/after screenshots, or `N/A`.
4. **`## Reviewer Notes`** — Risk areas, tricky logic, rollout notes, or `N/A`.

Do not leave HTML comments from the template. Review all commits on the branch when writing the summary.

## Workflow

- Create PRs as **ready for review** by default. Only use `--draft` if explicitly requested.
- When assigning PRs or issues, resolve the GitHub username with `gh api user --jq '.login'`. Never guess usernames.
- Never use `--force`, `--no-verify`, or any other flag that bypasses git hooks without explicit user approval. If a hook or check fails, diagnose and fix it or ask how to proceed — do not silently skip it.
8 changes: 5 additions & 3 deletions .kilo/rules/coding-style.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,11 @@
- KISS: Be wary of over-abstracting code. Do report and ask about violations of DRY, but don't prematurely generalize.
- If trivial, avoid TS classes; use e.g. closures instead
- STRONGLY AVOID coding patterns that cannot be statically checked:
- AVOID typescript's "as" operator
- AVOID typescript's null-forgiving "!"
- INSTEAD TRY where possible typescript's "satisfies", or leverage flow-sensitive typing.
- Use `as` casts sparingly, but do not ban them outright. Prefer `satisfies`, discriminated unions, generics, or flow-sensitive narrowing when TypeScript can be made to understand the type naturally.
- A targeted `as` cast is acceptable when code is at a known boundary where TypeScript has lost information that the surrounding control flow guarantees. For example, inside a platform switch, casting `message` to `Message<SlackEvent>` or `Message<GitHubRawMessage>` is preferable to adding generic `Record<string, unknown>` property helpers just to read known adapter fields.
- Avoid broad casts that hide real uncertainty, especially `as any`, double casts through `unknown`, or casting external/untrusted data without validation. Use runtime validation when the data shape is genuinely unknown, user-controlled, persisted, or coming from an API contract we do not own.
- `as` casts are explicitly permitted inside test files (e.g. `*.test.ts`, `*.spec.ts`, files under `__tests__/`, and other test fixtures/helpers) — they are commonly needed for fixture construction, narrowing partial mocks, and exercising error paths. Production code conventions still apply to non-test code imported by tests.
- AVOID typescript's null-forgiving "!"; prefer explicit checks or flow-sensitive typing.

- Prefer clear NAMES (for e.g. variables, functions and tests) over COMMENTS.
- ONLY add comments about things that are NOT OBVIOUS in context.
Expand Down
Loading
Loading