Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 39 additions & 0 deletions .cursor/rules/adapters.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
---
description: TaskAdapter conventions — fetch_tasks, on_task_started, on_task_completed.
globs: agent_factory/adapters/**/*.py
alwaysApply: false
---

# Adapter Conventions

Each adapter lives in `agent_factory/adapters/<source>.py` and implements `TaskAdapter`:

```python
class TaskAdapter:
def fetch_tasks(self) -> list[Task]: ...
def on_task_started(self, task: Task) -> None: ...
def on_task_completed(self, task: Task, result: AgentResult) -> None: ...
```

## Required behaviours

- **Auth missing → loud failure** at `__init__`. Don't silently fall back to anonymous mode.
- **HTTP timeouts** (15s default for Jira / Notion). Keep them.
- **Lifecycle hooks** wrap upstream calls in try/except and `print` the error — they must **not** raise (a failed comment on Jira shouldn't kill the PR).
- **Task fields** populated: at minimum `title`, `task_type`, `description`. `files` and `acceptance_criteria` are optional but the agent uses them when present.
- **Stash adapter-specific identifiers** as private attrs on the Task (`_jira_key`, `_notion_page_id`).

## When fetching

- Sort tasks deterministically (e.g. by ticket key / created date). Don't rely on API ordering.
- Limit pages — most adapters cap at 50 results per call.

## Adding a new adapter

See `docs/EXTENDING.md` → "Add an adapter". TL;DR:

1. Subclass `TaskAdapter` in `adapters/<source>.py`.
2. Add a branch in `_build_adapter()` (`main.py`).
3. Add a subparser in `_build_argparser()` (`main.py`).
4. Mirror `.env.example` block for credentials.
5. Smoke test (`tests/adapters/test_<source>.py`).
47 changes: 47 additions & 0 deletions .cursor/rules/anthropic-tool-use.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
---
description: Anthropic tool-use conventions — TOOL_DEFINITIONS schemas, response handling.
globs: agent_factory/agent.py,agent_factory/tools.py
alwaysApply: false
---

# Anthropic tool-use conventions

## Tool definitions

Each tool in `tools.py:TOOL_DEFINITIONS` is a dict with:

```python
{
"name": "snake_case",
"description": "<one-line, instructions-to-the-model voice>",
"input_schema": {
"type": "object",
"properties": { ... },
"required": [ ... ],
},
}
```

The `description` is read by the LLM. Write it as instructions, not as docs for humans.

## Adding a new tool

1. Add the schema in `TOOL_DEFINITIONS`.
2. Add a branch in `execute_tool()` that returns a string (or short structured value) — never a multi-MB blob.
3. Cap large outputs (truncate + indicate). The current `read_file` / `run_command` already do this — follow that pattern.
4. Respect `Config.dry_run` if the tool mutates state.
5. Add a smoke test in `tests/` (we'll need to create `tests/`; see `docs/EXTENDING.md`).

## Response handling in the loop

The loop reads `response.content` for both `text` blocks and `tool_use` blocks. After each turn:

- Append the assistant message verbatim.
- For each `tool_use`, run `execute_tool()` and append a `tool_result` with the same `tool_use_id`.
- If `stop_reason == "end_turn"` and there were no `tool_use` blocks → done.
- If there were no tool calls but not `end_turn` → exit early ("model gave up").
- After `max_agent_turns` → exit without PR URL.

## Cost / token usage

`response.usage.input_tokens` and `output_tokens` are sampled per call. `UsageTracker` aggregates. Per-MTok rates are **hardcoded** for Sonnet. If the model changes, fix `COST_PER_MTok` in the same PR.
13 changes: 13 additions & 0 deletions .cursor/rules/general.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
---
description: General coding standards for agentic-coder.
alwaysApply: true
---

# General Rules

- Keep the code small and obvious. This is a tool used unattended overnight; any abstraction must pay for itself.
- No new top-level modules unless a real second use case justifies it.
- Comments / docstrings on every public function. Internal helpers can stay terse if the name is obvious.
- Prefer **intent / why** comments over what comments. The diff shows what.
- No silent failures. Bare `except: pass` is banned outside Jira/Notion lifecycle hooks where the upstream is genuinely best-effort.
- `print()` is the current logging surface. If we adopt `logging`, do it project-wide in one PR.
44 changes: 44 additions & 0 deletions .cursor/rules/git-conventions.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
---
description: Branch naming, commit format, PR text. Base branch `main`.
alwaysApply: true
---

# Git Conventions — agentic-coder

## Base branch

`main` (this is a tool repo, no `develop`).

## Branches

- `pborges/<slug>` — Pablo's work.
- `feat/<slug>` / `fix/<slug>` / `chore/<slug>` — anyone.
- `agent/<slug>` — Cursor-agent-driven work.
- `agentic-coder/<slug>` — autonomous runs against itself (rare; safer to test against `CityCatalyst`).

## Commit messages

Conventional Commits (`<type>(<scope>): summary`).

Recommended `<scope>`: `agent`, `tools`, `adapter`, `scanner`, `cli`, `docs`, `ci`.

Examples:

```
feat(adapter): add linear task source
fix(scanner): wire exclude_glob for console.log rule
docs(playbook): document Cloud Agents kickoff template
chore(deps): pin anthropic to 0.50.x
```

## PRs

- Title ≤ 72 chars, imperative.
- Body: Summary (1–3 sentences) + Changes (bullets) + Commits (optional).
- **Don't open PRs unless explicitly told** in the active task. Push the branch and stop; a human reviews and opens it.

## Merge policy

- **Code in `agent_factory/`, tests, docs/** — any tech-team member can merge after standard review (≥1 approval, CI green).
- **Agentic foundation** (`AGENTS.md`, `CLAUDE.md`, `.cursor/rules/`, `.cursor/skills/`, `prompts/`, `profiles/`) — core-team sign-off required; after approval, anyone merges.
- **Agents** never merge their own PRs.
15 changes: 15 additions & 0 deletions .cursor/rules/os-shell.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
---
description: POSIX shell defaults; `run.sh` is bash-only.
alwaysApply: true
---

# OS / Shell Defaults

This repo ships `run.sh` (bash). Default to POSIX `bash`.

- `bash`-isms allowed inside `run.sh`. Document in comments where they're load-bearing.
- Forward-slash paths.
- `cat <<'EOF'` heredocs for multi-line text.
- Avoid pagers in the agent's `run_command` invocations: append `| cat` or `--no-pager` for git.

Windows users run inside WSL2.
56 changes: 56 additions & 0 deletions .cursor/rules/project-architecture.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
---
description: agentic-coder architecture map — agent loop, tools, adapters, scanner.
alwaysApply: true
---

# agentic-coder Architecture

```
agentic-coder/
├── agent_factory/
│ ├── main.py CLI: markdown / jira / notion / scan / watch
│ ├── agent.py The loop: client.messages.create + tool dispatch
│ ├── config.py Config dataclass + .env loader
│ ├── context.py Read target-repo AGENTS.md / .cursor/* into system prompt
│ ├── tools.py 5 tools (read_file, search_code, list_directory, edit_file, run_command)
│ ├── scanner.py SCAN_CATEGORIES (TODO, console.log, as any, empty catch)
│ ├── watcher.py Polling loop + idle scan
│ ├── preflight.py git / gh / clean tree / remote checks
│ ├── task_parser.py Task dataclass + markdown parser
│ └── adapters/
│ ├── base.py TaskAdapter interface
│ ├── markdown.py Markdown task source
│ ├── jira.py Jira REST adapter
│ └── notion.py Notion API adapter
├── profiles/<repo>.yaml Per-target-repo defaults (NEW)
├── prompts/ System-prompt scaffolds (NEW)
├── tasks/ Markdown task backlogs
└── logs/ Per-task session log (gitignored)
```

## Default model + branch

- Model: `claude-sonnet-4-20250514` (`Config.model`).
- Target base branch: `develop` (`Config.base_branch`).
- Target branch prefix: `agentic-coder/` (`Config.branch_prefix`).
- Max turns per task: `50` (`Config.max_agent_turns`).

## Loop end conditions

- `stop_reason == "end_turn"` and no `tool_use` blocks → success, returns.
- No tool calls and not `end_turn` → exits early.
- `max_agent_turns` reached → returns without PR URL.

## Cost tracking

`UsageTracker` in `agent.py` sums `usage.input_tokens` / `output_tokens` per response. Rates are **hardcoded for Sonnet** (`COST_PER_MTok = {"input": 3.0, "output": 15.0}`). If model changes, this is wrong.

## Path discipline

`logs/` is `Path(__file__).parent.parent / "logs"` — always inside this repo. Not the target repo.

## Gotchas

- `task.repo` field is parsed but never used.
- `scanner.SCAN_CATEGORIES["console.log"].exclude_glob` is defined but **not wired** into `_search`.
- `_extract_pr_url` only matches `github.com` URLs.
47 changes: 47 additions & 0 deletions .cursor/rules/security-baseline.mdc
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
---
description: Security baseline — secrets, shell, gh token leakage, untrusted input.
alwaysApply: true
---

# Security Baseline — agentic-coder

This tool runs commands and edits files in **other repos** on behalf of an LLM. Treat it like a remote-code-execution surface in your own org.

## Secrets

- `.env` is gitignored. Never commit it.
- `.env.example` is the canonical schema; keep it in sync with `Config` defaults.
- Never log full Anthropic / Jira / Notion / GitHub tokens. Mask after the first 4 chars.
- `tools.py:execute_tool` strips `GITHUB_TOKEN` from the env passed to subprocess. Keep that — don't "simplify" it away.

## Shell (`run_command`)

- `tools.py:run_command` accepts arbitrary shell input from the model. The current blocklist is small. **Improve, don't shrink.**
- Keep blocking: `rm -rf /`, force-pushes to `main`/`develop`/`master`, `git config --global` edits.
- Cap on output: 10k chars per call (already enforced).
- Cap on time: 120s (already enforced).
- Do not extend `run_command` with elevation (`sudo`).

## Tool dispatch

- Dry-run mode (`Config.dry_run`) must skip both `edit_file` and `run_command`. Adding a new mutating tool? It must respect dry-run.

## Preflight checks

- `preflight.py` must continue to refuse to operate on a non-clean working tree, on the wrong base branch, or with `gh` unauthenticated.
- If a check is bypassed, surface `--force` explicitly — never make bypass the default.

## Adapter credentials

- Jira / Notion adapters fail loudly when credentials are missing — do not fall back to anonymous mode.
- HTTP adapters time out (Jira: 15s default, Notion: 15s default) — keep them.

## Commit messages and PR bodies (model-authored)

- Strip credentials and bearer tokens from the model's text output before posting (defensive — the model shouldn't emit them, but the `gh pr create` path is the last barrier).
- PR body cap: keep ≤ 4k chars to avoid surprise on GitHub UI.

## Cost guardrails

- Cost tracking is informational, not a hard cap. The hard cap today is `max_agent_turns=50`. Do not raise without a reason in the PR body.
- For autonomous scenarios (`watch`), consider a daily budget: log per-day spend and stop if exceeded.
45 changes: 45 additions & 0 deletions .cursor/skills/commit-message-standards/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
---
name: commit-message-standards
description: Generate Conventional Commits messages for agentic-coder.
---

# commit-message-standards — agentic-coder

Conventional Commits, ≤72 chars per line, imperative summary.

```
<type>(<scope>): <imperative summary>

<body — explain WHY, wrap at 72>

<footer — Refs: #N>
```

Recommended `<scope>`:
- `agent` — `agent.py`, system prompt, loop changes.
- `tools` — `tools.py`, tool definitions, `execute_tool`.
- `adapter` — anything in `adapters/`.
- `scanner` — `scanner.py`.
- `cli` — `main.py`, `run.sh`, argparse.
- `docs` — README, AGENTS.md, CLAUDE.md, docs/.
- `ci`, `chore`, `deps`.

Examples:

```
feat(adapter): add linear task source

Mirrors the Jira adapter shape. Reads LINEAR_API_KEY and
LINEAR_TEAM_ID from .env. On task complete, posts a comment with
the PR URL via the Linear GraphQL API.
```

```
fix(scanner): wire exclude_glob for console.log rule

The SCAN_CATEGORIES dict defined `exclude_glob` for the
`console.log` rule but `_search` ignored it, so test files were
counted as production violations. Pass it to rg via --glob '!…'.
```

Anti-patterns: `wip`, `update`, `cleanup`, multi-purpose commits.
44 changes: 44 additions & 0 deletions .cursor/skills/pull-request-standards/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
---
name: pull-request-standards
description: Draft PRs for agentic-coder. Base branch is main.
---

# pull-request-standards — agentic-coder

## Derive context

- Owner / repo: `git remote get-url origin` → `Open-Earth-Foundation/agentic-coder`.
- Head: `git rev-parse --abbrev-ref HEAD`.
- Base: **`main`** (this is a tool repo).

## Title

- ≤72 chars, imperative.
- Conventional-Commit-flavoured: `feat(adapter): add linear task source`.

## Body

```markdown
## Summary
1–3 sentences: what changed and why.

## Changes
- bullet list

## Verification
- how this was tested locally (e.g. `./run.sh task 1` against a test repo)

## Compatibility notes (if applicable)
- CLI surface changes
- .env additions / removals
```

## Push policy

Branch is assumed to be already pushed. Don't `git push` unless explicitly asked.

## Who merges

- **Code in `agent_factory/`, tests, docs/** — any tech-team member after standard review (≥1 approval, CI green).
- **Agentic foundation** (`AGENTS.md`, `CLAUDE.md`, `.cursor/rules/`, `.cursor/skills/`, `prompts/`, `profiles/`) — core-team sign-off required; then anyone merges.
- **Agents** never merge their own PRs and do not open PRs unless explicitly told. Open the PR when told to, then stop.
18 changes: 18 additions & 0 deletions .cursorignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# Files / dirs Cursor should NOT index for embeddings or context.

**/__pycache__/
**/.pytest_cache/
**/.venv/
**/venv/

# Logs (gitignored, large)
logs/

# Local env / secrets
.env
.env.local
**/credentials*.json

# OS noise
.DS_Store
**/.DS_Store
Loading