diff --git a/.cursor/rules/adapters.mdc b/.cursor/rules/adapters.mdc new file mode 100644 index 0000000..d55f394 --- /dev/null +++ b/.cursor/rules/adapters.mdc @@ -0,0 +1,39 @@ +--- +description: TaskAdapter conventions — fetch_tasks, on_task_started, on_task_completed. +globs: agent_factory/adapters/**/*.py +alwaysApply: false +--- + +# Adapter Conventions + +Each adapter lives in `agent_factory/adapters/.py` and implements `TaskAdapter`: + +```python +class TaskAdapter: + def fetch_tasks(self) -> list[Task]: ... + def on_task_started(self, task: Task) -> None: ... + def on_task_completed(self, task: Task, result: AgentResult) -> None: ... +``` + +## Required behaviours + +- **Auth missing → loud failure** at `__init__`. Don't silently fall back to anonymous mode. +- **HTTP timeouts** (15s default for Jira / Notion). Keep them. +- **Lifecycle hooks** wrap upstream calls in try/except and `print` the error — they must **not** raise (a failed comment on Jira shouldn't kill the PR). +- **Task fields** populated: at minimum `title`, `task_type`, `description`. `files` and `acceptance_criteria` are optional but the agent uses them when present. +- **Stash adapter-specific identifiers** as private attrs on the Task (`_jira_key`, `_notion_page_id`). + +## When fetching + +- Sort tasks deterministically (e.g. by ticket key / created date). Don't rely on API ordering. +- Limit pages — most adapters cap at 50 results per call. + +## Adding a new adapter + +See `docs/EXTENDING.md` → "Add an adapter". TL;DR: + +1. Subclass `TaskAdapter` in `adapters/.py`. +2. Add a branch in `_build_adapter()` (`main.py`). +3. Add a subparser in `_build_argparser()` (`main.py`). +4. Mirror `.env.example` block for credentials. +5. Smoke test (`tests/adapters/test_.py`). diff --git a/.cursor/rules/anthropic-tool-use.mdc b/.cursor/rules/anthropic-tool-use.mdc new file mode 100644 index 0000000..7deb67b --- /dev/null +++ b/.cursor/rules/anthropic-tool-use.mdc @@ -0,0 +1,47 @@ +--- +description: Anthropic tool-use conventions — TOOL_DEFINITIONS schemas, response handling. +globs: agent_factory/agent.py,agent_factory/tools.py +alwaysApply: false +--- + +# Anthropic tool-use conventions + +## Tool definitions + +Each tool in `tools.py:TOOL_DEFINITIONS` is a dict with: + +```python +{ + "name": "snake_case", + "description": "", + "input_schema": { + "type": "object", + "properties": { ... }, + "required": [ ... ], + }, +} +``` + +The `description` is read by the LLM. Write it as instructions, not as docs for humans. + +## Adding a new tool + +1. Add the schema in `TOOL_DEFINITIONS`. +2. Add a branch in `execute_tool()` that returns a string (or short structured value) — never a multi-MB blob. +3. Cap large outputs (truncate + indicate). The current `read_file` / `run_command` already do this — follow that pattern. +4. Respect `Config.dry_run` if the tool mutates state. +5. Add a smoke test in `tests/` (we'll need to create `tests/`; see `docs/EXTENDING.md`). + +## Response handling in the loop + +The loop reads `response.content` for both `text` blocks and `tool_use` blocks. After each turn: + +- Append the assistant message verbatim. +- For each `tool_use`, run `execute_tool()` and append a `tool_result` with the same `tool_use_id`. +- If `stop_reason == "end_turn"` and there were no `tool_use` blocks → done. +- If there were no tool calls but not `end_turn` → exit early ("model gave up"). +- After `max_agent_turns` → exit without PR URL. + +## Cost / token usage + +`response.usage.input_tokens` and `output_tokens` are sampled per call. `UsageTracker` aggregates. Per-MTok rates are **hardcoded** for Sonnet. If the model changes, fix `COST_PER_MTok` in the same PR. diff --git a/.cursor/rules/general.mdc b/.cursor/rules/general.mdc new file mode 100644 index 0000000..52f35cb --- /dev/null +++ b/.cursor/rules/general.mdc @@ -0,0 +1,13 @@ +--- +description: General coding standards for agentic-coder. +alwaysApply: true +--- + +# General Rules + +- Keep the code small and obvious. This is a tool used unattended overnight; any abstraction must pay for itself. +- No new top-level modules unless a real second use case justifies it. +- Comments / docstrings on every public function. Internal helpers can stay terse if the name is obvious. +- Prefer **intent / why** comments over what comments. The diff shows what. +- No silent failures. Bare `except: pass` is banned outside Jira/Notion lifecycle hooks where the upstream is genuinely best-effort. +- `print()` is the current logging surface. If we adopt `logging`, do it project-wide in one PR. diff --git a/.cursor/rules/git-conventions.mdc b/.cursor/rules/git-conventions.mdc new file mode 100644 index 0000000..74236f9 --- /dev/null +++ b/.cursor/rules/git-conventions.mdc @@ -0,0 +1,44 @@ +--- +description: Branch naming, commit format, PR text. Base branch `main`. +alwaysApply: true +--- + +# Git Conventions — agentic-coder + +## Base branch + +`main` (this is a tool repo, no `develop`). + +## Branches + +- `pborges/` — Pablo's work. +- `feat/` / `fix/` / `chore/` — anyone. +- `agent/` — Cursor-agent-driven work. +- `agentic-coder/` — autonomous runs against itself (rare; safer to test against `CityCatalyst`). + +## Commit messages + +Conventional Commits (`(): summary`). + +Recommended ``: `agent`, `tools`, `adapter`, `scanner`, `cli`, `docs`, `ci`. + +Examples: + +``` +feat(adapter): add linear task source +fix(scanner): wire exclude_glob for console.log rule +docs(playbook): document Cloud Agents kickoff template +chore(deps): pin anthropic to 0.50.x +``` + +## PRs + +- Title ≤ 72 chars, imperative. +- Body: Summary (1–3 sentences) + Changes (bullets) + Commits (optional). +- **Don't open PRs unless explicitly told** in the active task. Push the branch and stop; a human reviews and opens it. + +## Merge policy + +- **Code in `agent_factory/`, tests, docs/** — any tech-team member can merge after standard review (≥1 approval, CI green). +- **Agentic foundation** (`AGENTS.md`, `CLAUDE.md`, `.cursor/rules/`, `.cursor/skills/`, `prompts/`, `profiles/`) — core-team sign-off required; after approval, anyone merges. +- **Agents** never merge their own PRs. diff --git a/.cursor/rules/os-shell.mdc b/.cursor/rules/os-shell.mdc new file mode 100644 index 0000000..96210c4 --- /dev/null +++ b/.cursor/rules/os-shell.mdc @@ -0,0 +1,15 @@ +--- +description: POSIX shell defaults; `run.sh` is bash-only. +alwaysApply: true +--- + +# OS / Shell Defaults + +This repo ships `run.sh` (bash). Default to POSIX `bash`. + +- `bash`-isms allowed inside `run.sh`. Document in comments where they're load-bearing. +- Forward-slash paths. +- `cat <<'EOF'` heredocs for multi-line text. +- Avoid pagers in the agent's `run_command` invocations: append `| cat` or `--no-pager` for git. + +Windows users run inside WSL2. diff --git a/.cursor/rules/project-architecture.mdc b/.cursor/rules/project-architecture.mdc new file mode 100644 index 0000000..2097a77 --- /dev/null +++ b/.cursor/rules/project-architecture.mdc @@ -0,0 +1,56 @@ +--- +description: agentic-coder architecture map — agent loop, tools, adapters, scanner. +alwaysApply: true +--- + +# agentic-coder Architecture + +``` +agentic-coder/ +├── agent_factory/ +│ ├── main.py CLI: markdown / jira / notion / scan / watch +│ ├── agent.py The loop: client.messages.create + tool dispatch +│ ├── config.py Config dataclass + .env loader +│ ├── context.py Read target-repo AGENTS.md / .cursor/* into system prompt +│ ├── tools.py 5 tools (read_file, search_code, list_directory, edit_file, run_command) +│ ├── scanner.py SCAN_CATEGORIES (TODO, console.log, as any, empty catch) +│ ├── watcher.py Polling loop + idle scan +│ ├── preflight.py git / gh / clean tree / remote checks +│ ├── task_parser.py Task dataclass + markdown parser +│ └── adapters/ +│ ├── base.py TaskAdapter interface +│ ├── markdown.py Markdown task source +│ ├── jira.py Jira REST adapter +│ └── notion.py Notion API adapter +├── profiles/.yaml Per-target-repo defaults (NEW) +├── prompts/ System-prompt scaffolds (NEW) +├── tasks/ Markdown task backlogs +└── logs/ Per-task session log (gitignored) +``` + +## Default model + branch + +- Model: `claude-sonnet-4-20250514` (`Config.model`). +- Target base branch: `develop` (`Config.base_branch`). +- Target branch prefix: `agentic-coder/` (`Config.branch_prefix`). +- Max turns per task: `50` (`Config.max_agent_turns`). + +## Loop end conditions + +- `stop_reason == "end_turn"` and no `tool_use` blocks → success, returns. +- No tool calls and not `end_turn` → exits early. +- `max_agent_turns` reached → returns without PR URL. + +## Cost tracking + +`UsageTracker` in `agent.py` sums `usage.input_tokens` / `output_tokens` per response. Rates are **hardcoded for Sonnet** (`COST_PER_MTok = {"input": 3.0, "output": 15.0}`). If model changes, this is wrong. + +## Path discipline + +`logs/` is `Path(__file__).parent.parent / "logs"` — always inside this repo. Not the target repo. + +## Gotchas + +- `task.repo` field is parsed but never used. +- `scanner.SCAN_CATEGORIES["console.log"].exclude_glob` is defined but **not wired** into `_search`. +- `_extract_pr_url` only matches `github.com` URLs. diff --git a/.cursor/rules/security-baseline.mdc b/.cursor/rules/security-baseline.mdc new file mode 100644 index 0000000..3ddaf66 --- /dev/null +++ b/.cursor/rules/security-baseline.mdc @@ -0,0 +1,47 @@ +--- +description: Security baseline — secrets, shell, gh token leakage, untrusted input. +alwaysApply: true +--- + +# Security Baseline — agentic-coder + +This tool runs commands and edits files in **other repos** on behalf of an LLM. Treat it like a remote-code-execution surface in your own org. + +## Secrets + +- `.env` is gitignored. Never commit it. +- `.env.example` is the canonical schema; keep it in sync with `Config` defaults. +- Never log full Anthropic / Jira / Notion / GitHub tokens. Mask after the first 4 chars. +- `tools.py:execute_tool` strips `GITHUB_TOKEN` from the env passed to subprocess. Keep that — don't "simplify" it away. + +## Shell (`run_command`) + +- `tools.py:run_command` accepts arbitrary shell input from the model. The current blocklist is small. **Improve, don't shrink.** + - Keep blocking: `rm -rf /`, force-pushes to `main`/`develop`/`master`, `git config --global` edits. + - Cap on output: 10k chars per call (already enforced). + - Cap on time: 120s (already enforced). +- Do not extend `run_command` with elevation (`sudo`). + +## Tool dispatch + +- Dry-run mode (`Config.dry_run`) must skip both `edit_file` and `run_command`. Adding a new mutating tool? It must respect dry-run. + +## Preflight checks + +- `preflight.py` must continue to refuse to operate on a non-clean working tree, on the wrong base branch, or with `gh` unauthenticated. +- If a check is bypassed, surface `--force` explicitly — never make bypass the default. + +## Adapter credentials + +- Jira / Notion adapters fail loudly when credentials are missing — do not fall back to anonymous mode. +- HTTP adapters time out (Jira: 15s default, Notion: 15s default) — keep them. + +## Commit messages and PR bodies (model-authored) + +- Strip credentials and bearer tokens from the model's text output before posting (defensive — the model shouldn't emit them, but the `gh pr create` path is the last barrier). +- PR body cap: keep ≤ 4k chars to avoid surprise on GitHub UI. + +## Cost guardrails + +- Cost tracking is informational, not a hard cap. The hard cap today is `max_agent_turns=50`. Do not raise without a reason in the PR body. +- For autonomous scenarios (`watch`), consider a daily budget: log per-day spend and stop if exceeded. diff --git a/.cursor/skills/commit-message-standards/SKILL.md b/.cursor/skills/commit-message-standards/SKILL.md new file mode 100644 index 0000000..76dc6ea --- /dev/null +++ b/.cursor/skills/commit-message-standards/SKILL.md @@ -0,0 +1,45 @@ +--- +name: commit-message-standards +description: Generate Conventional Commits messages for agentic-coder. +--- + +# commit-message-standards — agentic-coder + +Conventional Commits, ≤72 chars per line, imperative summary. + +``` +(): + + + +