From 082f913e5d16549042ed6f8f0d5681e15f8a54c6 Mon Sep 17 00:00:00 2001
From: saagpatel <saagarpatel08@gmail.com>
Date: Sun, 7 Jun 2026 14:38:23 -0700
Subject: [PATCH] chore: remove agent scratchpad and internal planning files
 from public repo

---
 .codex/verify.commands    |   3 -
 AGENTS.md                 |  81 ---------
 CASE-STUDY.md             | 248 --------------------------
 DEMO-PLAN.md              |  95 ----------
 DEMO-SCRIPT.md            |  64 -------
 IMPLEMENTATION-ROADMAP.md | 365 --------------------------------------
 6 files changed, 856 deletions(-)
 delete mode 100644 .codex/verify.commands
 delete mode 100644 AGENTS.md
 delete mode 100644 CASE-STUDY.md
 delete mode 100644 DEMO-PLAN.md
 delete mode 100644 DEMO-SCRIPT.md
 delete mode 100644 IMPLEMENTATION-ROADMAP.md
diff --git a/.codex/verify.commands b/.codex/verify.commands
deleted file mode 100644
index aaa9499..0000000
--- a/.codex/verify.commands
+++ /dev/null
@@ -1,3 +0,0 @@
-# codex-os-managed
-python3 -m pytest -q -p no:cacheprovider
-ruff check src/ tests/
diff --git a/AGENTS.md b/AGENTS.md
deleted file mode 100644
index 1be1e7b..0000000
--- a/AGENTS.md
+++ /dev/null
@@ -1,81 +0,0 @@
-<!-- portfolio-context:start -->
-# Portfolio Context
-
-## What This Project Is
-
-A portfolio audit and operator tool — a "GitHub portfolio operating system" — for
-developers with many repositories. It clones every repo on a GitHub account, runs 12
-analyzers across completeness and interest dimensions, assigns letter grades, achievement
-badges, and dual-axis scores, preserves historical state in SQLite, and generates aligned
-JSON / Markdown / HTML / Excel-workbook / control-center surfaces so you can decide what to
-finish, fix, or safely ignore. Published on PyPI as `github-repo-auditor`. The day-to-day
-operating surfaces are the Excel workbook and the read-only `audit triage --control-center`
-queue.
-
-## Current State
-
-The audit tool, portfolio-truth layer, risk/security overlay, Action Sync proposal
-lane, and local `audit serve`/desktop-consumer surfaces are active. The public
-remote is `canonical`; `origin` remains a stale private archive and should not be
-used for PRs. Do not trust hardcoded status or test-count claims in handoff text:
-rerun the local gates and inspect `output/portfolio-truth-latest.json` for the
-current Portfolio OS state.
-
-## Stack
-
-- Language: Python 3.11+
-- GitHub API: REST v3 + GraphQL (raw `requests`)
-- Excel: `openpyxl` + committed workbook template; PDF: `fpdf2`
-- AI narrative: Anthropic Claude API; complexity analysis: Radon; CLI output: Rich
-- Storage: SQLite history warehouse
-- `pyproject.toml` is the canonical dependency definition (`requirements.txt` is a synced mirror)
-
-## How To Run
-
-```bash
-# install (editable, dev + optional extras)
-pip install -e ".[dev,serve,semantic,config]"
-
-# core operator loop
-audit run <github-username> --doctor               # preflight diagnostics
-audit run <github-username> --html                 # full audit + workbook + dashboard
-audit triage <github-username> --control-center    # read-only operator queue
-audit report <github-username> --portfolio-truth   # regenerate workspace truth layer
-python -m src.cli --portfolio-truth --portfolio-truth-include-security <github-username>  # demo truth + security overlay
-audit serve                                        # local web UI at http://127.0.0.1:8080/
-
-# tests + gates
-python3 -m pytest -q -p no:cacheprovider           # full suite
-python3 -m ruff check src/ tests/                  # lint
-make demo                                          # token-free sample run from fixture
-make workbook-gate                                 # workbook invariant check (workbook code only)
-```
-
-## Known Risks
-
-- **Dual remote**: push/PR to the `canonical` remote (public), NOT `origin` — `origin` is a
-  stale private archive with unrelated history, so PRs against it fail with "no common
-  history". Solo repo, so merges land with `gh ... --admin`.
-- **The context-quality metric is gameable**: injecting generic-filler context blocks zeros
-  the flag while lying about resumability. Only real harvested content counts — this block
-  was hand-authored for exactly that reason.
-- **Five parallel render surfaces** (Excel workbook, Markdown, HTML, review-pack, handoff)
-  carry a parity tax: every new signal must be threaded through all five (the motivation for
-  the deferred Arc F renderer simplification).
-- **Partial reruns fail closed**: `--repos` / `--incremental` require a compatible full
-  baseline; the stored baseline contract rejects mismatched portfolio context rather than
-  emitting a misleading partial.
-- **Manual workbook signoff**: the final release step is opening the generated `standard`
-  workbook in desktop Excel and recording the outcome with `make workbook-signoff`.
-
-## Next Recommended Move
-
-For Portfolio OS demo readiness, refresh `portfolio-truth-latest.json` with
-`--portfolio-truth-include-security`, refresh `audit triage --control-center`,
-then launch PortfolioCommandCenter with `pnpm demo:desktop`. Current proof
-points: 129 projects, 63 open high/critical Dependabot-alert repos, and Weekly
-Digest says to start with codexkit. After demo readiness is settled, continue
-with the highest-signal live queue item from the current control-center output
-rather than reviving old roadmap counts.
-
-<!-- portfolio-context:end -->
diff --git a/CASE-STUDY.md b/CASE-STUDY.md
deleted file mode 100644
index c4c3f4b..0000000
--- a/CASE-STUDY.md
+++ /dev/null
@@ -1,248 +0,0 @@
-# Operator OS — A Multi-Agent Control Plane for a 129-Repo Portfolio
-
-> A case study in turning portfolio sprawl into a single source of truth, and
-> coordinating two autonomous coding agents against it without stepping on each
-> other.
-
-This document describes a working system — six local services plus a two-agent
-coordination model — that I run over my own development portfolio. The metrics
-below are pulled verbatim from a real `portfolio-truth-latest.json` snapshot
-(schema `0.5.0`), not illustrative numbers.
-
----
-
-## The problem: `git log` lies about your portfolio
-
-If you ship fast and start often, you accumulate repositories faster than you can
-remember them. The naive way to take inventory — walk every repo and read its
-`git log` — answers the wrong question. A recent commit tells you *something
-happened*; it does not tell you whether the project is **healthy, drifting,
-blocked, or safe to ignore**.
-
-Concretely, `git log` across 100+ repos can't answer:
-
-- Which repos have **open high/critical security alerts** right now?
-- Which ones are **ship-ready but haven't shipped**?
-- Which have **tests and CI**, and which are one bad refactor from silent breakage?
-- Which were last touched by **me**, by **Claude Code**, or by **Codex** — and is
-  that drift expected?
-- Which are genuinely **stale** versus merely quiet between releases?
-
-A timestamp is a fact with no judgement attached. The portfolio needed a layer
-that turns raw git/GitHub facts into a *graded, precedence-resolved, historical*
-picture — one trustworthy artifact every other tool can consume. That artifact is
-the spine of the whole system.
-
----
-
-## The system: truth → state → events → surface
-
-The Operator OS is five data services and one desktop shell, arranged as a
-one-directional pipeline. Each layer has exactly one job and a clean contract with
-the next.
-
-```mermaid
-flowchart TB
-    subgraph AGENTS["Coordination layer — two builder lanes + a dispatcher"]
-        CAI["Claude.ai<br/>dispatcher / PM<br/>(handoffs only)"]
-        CC["Claude Code<br/>builder lane"]
-        CX["Codex<br/>builder lane"]
-    end
-
-    subgraph L1["1 · TRUTH"]
-        AUD["GithubRepoAuditor<br/>Python · 12 analyzers · precedence matrix"]
-        TJSON[("portfolio-truth-latest.json<br/>+ dated history snapshots")]
-    end
-
-    subgraph L2["2 · SHARED STATE"]
-        BDB["bridge-db<br/>SQLite (WAL) · 23 MCP tools<br/>caller-owned writes"]
-    end
-
-    subgraph L3["3 · EVENTS"]
-        NH["notification-hub<br/>FastAPI · classify → suppress → route"]
-        OUT["macOS push · Slack · JSONL log"]
-    end
-
-    subgraph L4["4 · SURFACES"]
-        PCC["PortfolioCommandCenter<br/>Tauri 2 desktop"]
-        PH["portfolio-health<br/>MCP overlay"]
-        CT["cost-tracker<br/>MCP overlay"]
-    end
-
-    CC -->|work across 129 repos| AUD
-    CX -->|work across 129 repos| AUD
-    AUD --> TJSON
-    TJSON --> PCC
-
-    CC -->|log activity / pick up handoff| BDB
-    CX -->|log activity / pick up handoff| BDB
-    CAI -->|dispatch handoff| BDB
-    BDB -->|assigned work| CC
-    BDB -->|assigned work| CX
-
-    BDB -->|watched activity| NH
-    NH --> OUT
-
-    BDB -->|activity join| PH
-    PH --> PCC
-    CT -->|record cost| BDB
-```
-
-### The components
-
-| Layer | Component | Stack | One job |
-|---|---|---|---|
-| **Truth** | **GithubRepoAuditor** | Python 3.11+, SQLite history warehouse, Rich CLI | Scan every repo, run 12 analyzers, resolve a precedence matrix, emit one canonical `portfolio-truth-latest.json` + dated history. |
-| **Shared state** | **bridge-db** | SQLite (WAL), MCP over stdio, FTS5 | Single store for cross-agent state: activity, handoffs, snapshots, cost, long-lived context. 23 tools; every write is ownership-gated by `caller`. |
-| **Events** | **notification-hub** | Python 3.12, FastAPI, localhost-only | Turn agent/tool events into *routed* notifications: deterministic classify → dedup/quiet-hours/rate-limit suppress → deliver. |
-| **Surface** | **PortfolioCommandCenter** | Tauri 2 (Rust shell) + React 18 + TS strict + Vite 6 | A signed desktop app that reads the truth snapshot read-only and renders the portfolio, weekly digest, and security burndown. |
-| **Overlay** | **portfolio-health** | Python, MCP, SQLite FTS5 | Join project memory against bridge-db activity to answer "what's active / stale / ship-ready-but-unshipped." |
-| **Overlay** | **cost-tracker** | Python, MCP, `ccusage` | Live agent spend: today, per-session, monthly trend, top projects, threshold alerts — persisted back into bridge-db. |
-
-**Why this shape works:** the auditor is the *only* writer of truth, so every
-surface agrees by construction. bridge-db is the *only* writer of shared agent
-state, so two agents never disagree about who owns what. There is no shared
-daemon — each MCP client spawns its own bridge-db process over stdio, and SQLite
-WAL mode plus a busy-timeout makes concurrent writes safe without a coordinator.
-
----
-
-## Real metrics from the truth snapshot
-
-Every number here is read directly from the canonical
-`portfolio-truth-latest.json` (schema `0.5.0`). It is regenerated on demand; this
-is one real snapshot.
-
-### Portfolio shape — 129 projects
-
-| Dimension | Breakdown |
-|---|---|
-| **Total projects** | **129** (128 git repos, 1 non-git working dir) |
-| **Activity status** | 21 recent · 91 active · 5 stale · 12 archived |
-| **Lifecycle** | 107 active · 6 maintenance · 3 dormant · 12 archived · 1 uncataloged |
-| **Recency** | 91 repos touched in the last 7 days · 123 within 30 days · 127 within 90 days · median **4 days** since last meaningful activity |
-
-The recency curve is the punchline: **only 2 of 129 repos** are older than 90 days.
-This isn't a graveyard of abandoned projects — it's an actively churning portfolio,
-which is *exactly* why a timestamp-only view is useless. Almost everything looks
-"recent." The auditor's job is to grade what "recent" actually means.
-
-### Health & risk
-
-| Dimension | Breakdown |
-|---|---|
-| **Risk tier** | 61 baseline · 27 moderate · 29 elevated · 12 deferred |
-| **Security posture** | **63 repos** carry open high/critical Dependabot alerts; **49 repos** are currently classified as security-risk items |
-| **Tests present** | 101 / 129 (78%) |
-| **CI present** | 81 / 129 (63%) |
-| **License present** | 100 / 129 (78%) |
-| **Context quality** | 67 minimum-viable · 29 standard · 16 full · 17 boilerplate |
-
-That **63** is the single most valuable number the desktop demo puts on screen
-and the one `git log` can never give you: a precise, current count of repos with
-live high/critical security exposure, ready to be burned down. The truth layer
-separately marks **49** repos as active security-risk items after applying its
-portfolio risk rules.
-
-### Agent attribution — who built what
-
-The truth file records a `tool_provenance` for each repo. Across 129 projects:
-
-| Builder | Repos attributed |
-|---|---|
-| **Claude Code** | **53** |
-| **Codex** | **22** |
-| GPT (other) | 12 |
-| Unknown / human-seeded | 42 |
-
-**75 of 129 repos** are attributable to the two autonomous coding agents this
-control plane coordinates. That coordination is the other half of the story.
-
----
-
-## The coordination model: two agents, one control plane
-
-Claude Code and Codex both write code across the same 129-repo portfolio. Left
-uncoordinated, two autonomous agents on a shared filesystem are a merge-conflict
-machine. The Operator OS keeps them out of each other's way with three rules.
-
-### 1 · Lanes — ownership by area, enforced at the write boundary
-
-Work is partitioned into **lanes**, and bridge-db enforces lane ownership at the
-data layer: every write tool checks the `caller` and rejects writes to state the
-caller doesn't own. The recognized writers are `cc` (Claude Code), `codex`,
-`claude_ai`, and two ops services. A repo's build provenance, CI workflows, and
-sync code belong to one lane; another agent reads them but does not mutate them.
-The boundary is structural, not a polite convention — an agent *cannot* clobber
-another lane's state even if it tries.
-
-### 2 · Handoffs — a dispatcher hands work down, builders pick it up
-
-The handoff protocol mirrors a PM-and-engineers org:
-
-- **Claude.ai dispatches.** Only the `claude_ai` caller may `create_handoff` — it
-  is the planning/PM seat and never writes code directly.
-- **Builders pick up.** `cc` or `codex` calls `pick_up_handoff` to claim a unit of
-  work, then `clear_handoff` when it's done.
-- **State is shared, not messaged.** Handoffs live in bridge-db, so a builder
-  starting a fresh session reads its assigned work from the store instead of
-  needing the originating conversation. Context survives session boundaries.
-
-### 3 · Push policy — feature branches, never `main`, merge server-side
-
-The hard rule across every repo: **agents never push to `main`/`master`.** It's
-enforced by a pre-tool hook, not trusted to the model. The workflow:
-
-- Each unit of work happens on a **feature branch** (`docs/...`, `feat/...`,
-  `fix/...`).
-- Commits are small, conventional, and verified (compile + test) before they land.
-- When a branch is ready, it merges through a **server-side merge** (e.g. a
-  reviewed PR merge) rather than a local push to a protected branch — which also
-  keeps the push-to-main guard satisfied without weakening it.
-- Repos can carry **distinct push targets** (a public mirror vs. a private
-  origin), so "where does this land" is per-repo, never assumed.
-
-The result: two agents, hundreds of branches, zero pushes to protected branches,
-and a truth layer that tells you — after the fact — exactly which agent touched
-which repo.
-
----
-
-## What this demonstrates
-
-Beyond the portfolio itself, the build exercises a set of platform-engineering
-patterns:
-
-- **One-writer-per-fact architecture.** Truth has a single producer (the auditor);
-  shared state has a single mutation path with ownership gating (bridge-db). Every
-  consumer agrees by construction — no reconciliation logic anywhere downstream.
-- **Contracts over coupling.** Layers communicate through versioned artifacts
-  (`schema_version`) and typed load commands, so the desktop shell can render a
-  snapshot it never has to understand how to compute.
-- **Deterministic before probabilistic.** Notification urgency is decided by
-  keyword rules and explicit policy, not an LLM call — fast, free, and auditable.
-  The agents reason; the plumbing does not.
-- **Safety enforced at the boundary, not requested politely.** No-push-to-main,
-  caller-owned writes, localhost-only daemons, and secrets read from the OS
-  keychain (never from repo files) are all structural guarantees.
-- **Local-first and private by default.** Every service binds to loopback or runs
-  over stdio. Nothing in this control plane requires a hosted backend.
-
----
-
-## Component reference
-
-| Component | Role in the pipeline | Interface |
-|---|---|---|
-| GithubRepoAuditor | Produces canonical portfolio truth + history | CLI, JSON/HTML/Markdown/Excel outputs |
-| bridge-db | Shared cross-agent state | MCP (stdio), 23 tools, SQLite WAL |
-| notification-hub | Event classification + routed delivery | Localhost HTTP intake + bridge file watcher |
-| PortfolioCommandCenter | Desktop visualization of truth | Tauri 2 app, read-only truth consumer |
-| portfolio-health | Active/stale/unshipped overlay | MCP (stdio), 5 tools, FTS5 |
-| cost-tracker | Agent spend visibility | MCP (stdio), 6 tools, `ccusage` + bridge-db |
-
----
-
-*Metrics in this document are drawn from a real `portfolio-truth-latest.json`
-snapshot (schema 0.5.0). Paths are shown home-relative; this is a sanitized,
-public write-up of a private local system.*
diff --git a/DEMO-PLAN.md b/DEMO-PLAN.md
deleted file mode 100644
index 7aa7503..0000000
--- a/DEMO-PLAN.md
+++ /dev/null
@@ -1,95 +0,0 @@
-# Demo Plan — Operator OS in 90 Seconds
-
-A shot-by-shot script for a screen recording that makes a hiring manager
-*understand the system* — not just see a pretty dashboard. The throughline:
-**`git log` can't grade a portfolio; this can — and two agents act on it.**
-
-The demo is driven entirely by **PortfolioCommandCenter** (the Tauri 2 desktop
-shell), because it renders the truth artifact every other layer produces. Five
-tabs, one header action, one closing line.
-
----
-
-## What the viewer should walk away knowing
-
-1. There's a **single source of truth** over 129 repos — graded, not just dated.
-2. It surfaces the number `git log` can't: **63 repos with open high/critical
-   Dependabot alerts**, plus which package bump clears each advisory group.
-3. The weekly digest gives one current decision: **start with codexkit**.
-
-If those three land in 90 seconds, the demo worked.
-
-Current screenshot proof for the five-tab local demo is archived in
-[`docs/demo-proof/2026-06-07/`](docs/demo-proof/2026-06-07/).
-
----
-
-## Pre-record setup (off-camera)
-
-Do this before hitting record so the app opens warm and current:
-
-1. **Refresh the producer artifacts** so the snapshot is today's:
-   ```sh
-   # in the auditor repo — flags FIRST, then username, run via python -m
-   python -m src.cli --portfolio-truth --portfolio-truth-include-security <user>
-   python -m src.cli triage <user> --control-center
-   ```
-2. **Launch the desktop shell** with `pnpm demo:desktop` from
-   `../PortfolioCommandCenter`.
-3. Confirm the header shows the correct **output directory** and a fresh
-   `generated_at`.
-4. Set window to a **clean 1920×1080 capture**; hide the macOS menu bar clutter.
-
-> **Privacy callout (this is for a public audience):** the Portfolio tab lists
-> real repo names. Before publishing, either (a) scroll/zoom to the **aggregate
-> counts and risk columns** rather than individual rows, or (b) blur repo-name
-> cells in post. Show the *shape* of the portfolio, not the contents.
-
----
-
-## The 90-second shot list
-
-| Time | Screen | Action | Line to land |
-|---|---|---|---|
-| **0:00–0:10** | App launch / **Portfolio** tab | Open cold. Let the full 129-row table paint. | *"Every repo I've ever started — 129 of them — in one graded view. Not a commit log. A judgement."* |
-| **0:10–0:28** | **Portfolio** tab | Sort by risk tier; point at the columns: risk, context quality, registry status, **tool**, open high/critical alert count. | *"Each repo carries a risk tier, a context-quality grade, and who built it. `git log` gives you a timestamp; this tells you what the timestamp means."* |
-| **0:28–0:48** | **Risk + Security** tab | Filter to elevated-risk; show the posture counts (117 scanned / 63 with open high-critical / 65 critical / 191 high). | *"63 of 129 repos have an open high or critical Dependabot alert. That's the number a timestamp can never give you."* |
-| **0:48–1:02** | **Burndown** tab | Show the advisory-grouped fix list — one package bump → the repos it clears. | *"And it's actionable: each advisory is grouped by the single dependency bump that burns it down across every affected repo."* |
-| **1:02–1:14** | **Trends** → **Weekly Digest** | Flash the risk/security drift chart across snapshots, then the digest's headline + decision + next-step. | *"It keeps history, so I can see drift over time — and today it says: start with codexkit."* |
-| **1:14–1:26** | Header **Run auditor** action | Click **Run auditor** (fast); show the views reload on completion. | *"This isn't a static export. I regenerate the truth live, right from the app."* |
-| **1:26–1:30** | Back on **Portfolio**, point at the **tool** column | Rest on the Claude Code / Codex attribution. | *"And it knows which agent built what — because two of them work this portfolio under one control plane."* |
-
-Total: **90 seconds**, seven beats, one number that sticks (**63**).
-
----
-
-## Optional extended cut (~2:30) — the coordination story
-
-If the audience is technical and you have extra runway, append a second act that
-shows the *control plane*, not just the dashboard:
-
-| Time | What to show | Point |
-|---|---|---|
-| +0:00–0:25 | A terminal split: Claude Code on a `feat/...` branch in one repo, Codex on a `fix/...` branch in another. | Two autonomous agents, different lanes, same portfolio. |
-| +0:25–0:50 | bridge-db handoff flow: a dispatched handoff being **picked up**, then **cleared** (via the MCP tools or the bridge markdown). | Work is shared state, not chat history — it survives session boundaries. |
-| +0:50–1:10 | A blocked push to `main` (the pre-tool guard firing), then the same work landing via a **server-side merge**. | Safety is enforced at the boundary, not requested politely. |
-| +1:10–1:30 | A **notification-hub** event arriving (macOS push) after a session completes. | Events are classified and routed deterministically — no LLM in the plumbing. |
-
----
-
-## Recording checklist
-
-- [ ] Artifacts regenerated today (`generated_at` is current in the header).
-- [ ] Window at 1920×1080, menu-bar/desktop clutter hidden.
-- [ ] Individual repo names blurred or kept off-frame; show aggregates.
-- [ ] No terminal scrollback exposing absolute home paths, tokens, or hostnames.
-- [ ] The number **63** is on screen and called out by voice.
-- [ ] Closing line names both agents (Claude Code + Codex) and "one control plane."
-- [ ] Final cut ≤ 90 seconds for the core demo.
-
----
-
-*This plan drives PortfolioCommandCenter against a real
-`portfolio-truth-latest.json` snapshot (schema 0.5.0). Keep individual repo names
-out of the published frame — show the system's shape, not the portfolio's
-contents.*
diff --git a/DEMO-SCRIPT.md b/DEMO-SCRIPT.md
deleted file mode 100644
index d827848..0000000
--- a/DEMO-SCRIPT.md
+++ /dev/null
@@ -1,64 +0,0 @@
-# Demo Voiceover Script — Operator OS (90 seconds)
-
-Record-ready teleprompter script for the demo defined in [DEMO-PLAN.md](DEMO-PLAN.md).
-Drive **PortfolioCommandCenter**; read the **bold spoken lines**; do the
-`[SCREEN]` action just before each line. Total spoken ≈ 180 words ≈ 72 s of
-speech, leaving ~18 s of breathing room inside a 90 s cut.
-
-**Delivery:** conversational, confident, ~150 wpm. Pause on the dashes. Land hard
-on the word **sixty-three** — that's the line that sells the whole system.
-
----
-
-### 0:00 — Cold open · app launch / Portfolio tab
-`[SCREEN]` Open the app cold. Let the full 129-row table paint. Don't narrate the loading.
-
-> **"This is every repo I've ever started — a hundred and twenty-nine of them — in one graded view. Not a commit log. A judgment call on every single one."**
-
-### 0:10 — Portfolio columns
-`[SCREEN]` Sort by risk tier. Slowly run the cursor across the columns: risk · context quality · registry status · tool · open-alert count.
-
-> **"Each repo carries a risk tier, a context-quality grade, and which agent built it. `git log` gives you a timestamp. This tells you what that timestamp actually means."**
-
-### 0:28 — Risk + Security tab
-`[SCREEN]` Switch to Risk + Security. Filter to elevated risk; let the security posture counts fill the frame.
-
-> **"And here's the number a timestamp can never give you — sixty-three of these repos have a live, high-or-critical security alert. Right now."**
-
-### 0:48 — Burndown tab
-`[SCREEN]` Switch to Burndown. Hover one advisory group so the "repos cleared by this bump" list expands.
-
-> **"It's not just a count — it's a fix list. Every advisory is grouped by the one dependency bump that clears it across every repo it touches."**
-
-### 1:02 — Trends → Weekly Digest
-`[SCREEN]` Flash the Trends drift chart (2–3 s), then cut to the Weekly Digest: headline, decision, next-step.
-
-> **"It keeps history, so I can watch risk drift over time. And every week it hands me one headline, one decision, one next move."**
-
-### 1:14 — Run auditor (header action)
-`[SCREEN]` Click **Run auditor** (fast mode). Let the views visibly reload on completion.
-
-> **"This isn't a static export. I regenerate the truth live, right from the app."**
-
-### 1:26 — Close · back on Portfolio, rest on the tool column
-`[SCREEN]` Return to Portfolio. Rest the cursor on the Claude Code / Codex attribution column. Hold for the final beat.
-
-> **"And it knows which agent built what — because two of them work this portfolio, under one control plane."**
-
-`[SCREEN]` Hold 1 s on the full table, then cut.
-
----
-
-## Pickup lines (swap in if a beat runs long or you want a different close)
-
-- **Tighter cold open:** *"A hundred and twenty-nine repos. One question: which ones are actually worth finishing? This answers it."*
-- **Alt security beat:** *"Sixty-three repos with live high-or-critical alerts — and the exact bump that clears each one."*
-- **Alt close (coordination-forward):** *"Two autonomous agents, hundreds of branches, one source of truth keeping them honest."*
-
-## Numbers cheat-sheet (say these exactly)
-- **129** total repos · **63** with open high/critical Dependabot alerts · **49** classified security-risk items
-- Agent attribution: **Claude Code 53 · Codex 22** (the two coordinated lanes)
-- Recency: **91** repos touched in the last 7 days — "almost everything looks recent, which is *why* a timestamp is useless"
-
-*Pairs with DEMO-PLAN.md. Keep individual repo names blurred or off-frame when
-recording — show the system's shape, not the portfolio's contents.*
diff --git a/IMPLEMENTATION-ROADMAP.md b/IMPLEMENTATION-ROADMAP.md
deleted file mode 100644
index df1ef77..0000000
--- a/IMPLEMENTATION-ROADMAP.md
+++ /dev/null
@@ -1,365 +0,0 @@
-# GitHub Repo Auditor — Implementation Roadmap
-
-## Architecture
-
-### System Overview
-```
-[CLI Entry] → [GitHub API Client] → [Repo Fetcher (clone)] → [Analyzer Engine] → [Report Generator]
-                    ↓                        ↓                       ↓                     ↓
-              [Rate Limiter]          [/tmp/audit-repos/]      [Per-Repo Scores]     [output/*.json + *.md]
-```
-
-**Flow:**
-1. CLI accepts username + optional token
-2. GitHub API fetches all repos (paginated, handles 100+ repos)
-3. Each repo is shallow-cloned to a temp directory
-4. Analyzer engine runs 10+ dimension checks per repo
-5. Results aggregated into JSON + Markdown report
-6. Temp clones cleaned up
-
-### File Structure
-```
-github-repo-auditor/
-├── src/
-│   ├── __init__.py
-│   ├── cli.py                # argparse entry point
-│   ├── github_client.py      # API calls: list repos, get commit stats, get languages
-│   ├── cloner.py             # Shallow clone + cleanup
-│   ├── analyzers/
-│   │   ├── __init__.py
-│   │   ├── base.py           # BaseAnalyzer abstract class
-│   │   ├── readme.py         # README quality scoring
-│   │   ├── structure.py      # Project structure analysis
-│   │   ├── code_quality.py   # TODO/FIXME counts, entry points, build configs
-│   │   ├── testing.py        # Test presence, framework detection
-│   │   ├── cicd.py           # GitHub Actions / CI detection
-│   │   ├── dependencies.py   # Lockfile detection, staleness signals
-│   │   ├── activity.py       # Commit recency, frequency (via API)
-│   │   └── completeness.py   # Overall completeness heuristic
-│   ├── scorer.py             # Aggregates analyzer results into per-repo score
-│   └── reporter.py           # Generates JSON + Markdown output
-├── output/                   # Generated reports land here
-├── requirements.txt
-├── CLAUDE.md
-├── IMPLEMENTATION-ROADMAP.md
-└── README.md
-```
-
-### Data Model
-
-No database. All data flows through in-memory Python dataclasses and is written out as JSON.
-
-```python
-from dataclasses import dataclass
-from datetime import datetime
-from typing import Optional
-
-@dataclass
-class RepoMetadata:
-    name: str
-    full_name: str
-    description: Optional[str]
-    language: Optional[str]
-    languages: dict[str, int]          # language -> bytes
-    private: bool
-    fork: bool
-    archived: bool
-    created_at: datetime
-    updated_at: datetime
-    pushed_at: datetime
-    default_branch: str
-    stars: int
-    forks: int
-    open_issues: int
-    size_kb: int
-    html_url: str
-    clone_url: str
-    topics: list[str]
-
-@dataclass
-class AnalyzerResult:
-    dimension: str                      # e.g., "readme", "testing", "structure"
-    score: float                        # 0.0 – 1.0
-    max_score: float                    # always 1.0
-    findings: list[str]                 # human-readable notes
-    details: dict                       # dimension-specific structured data
-
-@dataclass
-class RepoAudit:
-    metadata: RepoMetadata
-    analyzer_results: list[AnalyzerResult]
-    overall_score: float                # weighted composite 0.0 – 1.0
-    completeness_tier: str              # "shipped", "functional", "wip", "skeleton", "abandoned"
-    flags: list[str]                    # e.g., ["no-readme", "no-tests", "stale-2yr"]
-
-@dataclass
-class AuditReport:
-    username: str
-    generated_at: datetime
-    total_repos: int
-    repos_audited: int                  # excludes forks if --skip-forks
-    tier_distribution: dict[str, int]   # tier -> count
-    average_score: float
-    audits: list[RepoAudit]
-```
-
-### API Contracts
-
-**GitHub REST API v3:**
-
-| Endpoint | Method | Auth | Rate Limit | Purpose |
-|----------|--------|------|------------|---------|
-| `/users/{username}/repos` | GET | Token (optional) | 60/hr unauth, 5000/hr auth | List all public repos |
-| `/user/repos` | GET | Token (required) | 5000/hr | List all repos including private |
-| `/repos/{owner}/{repo}/languages` | GET | Token (optional) | 60/hr unauth, 5000/hr auth | Language breakdown by bytes |
-| `/repos/{owner}/{repo}/commits` | GET | Token (optional) | 60/hr unauth, 5000/hr auth | Recent commit activity |
-| `/repos/{owner}/{repo}/stats/commit_activity` | GET | Token (optional) | 60/hr unauth, 5000/hr auth | Weekly commit counts (last year) |
-| `/repos/{owner}/{repo}/stats/contributors` | GET | Token (optional) | 60/hr unauth, 5000/hr auth | Contributor commit counts |
-| `/repos/{owner}/{repo}/topics` | GET | Token (optional) | 60/hr unauth, 5000/hr auth | Repo topics |
-
-**Pagination:** All list endpoints return a `Link` header with `rel="next"`. Fetch pages until there is no `next` link.
-
-**Auth header:** `Authorization: token {GITHUB_TOKEN}` — read from `GITHUB_TOKEN` env var.
-
-**Rate limit handling:** Check the `X-RateLimit-Remaining` header. If it is below 10, sleep until the `X-RateLimit-Reset` timestamp.
-
-### Dependencies
-```bash
-pip install requests python-dateutil
-```
-
-That's it. Two dependencies. Everything else is stdlib.
-
----
-
-## Scope Boundaries
-
-**In scope:**
-- Fetch all repos (public + private with token) for a given GitHub username
-- Shallow clone each repo and run local file analysis
-- Score across 9 dimensions (see Analyzer Dimension Specifications below)
-- Classify each repo into a completeness tier
-- Generate JSON report (machine-readable, PCC-compatible)
-- Generate Markdown summary report (human-readable)
-- Handle 100+ repos gracefully with progress output
-- Skip forks optionally via `--skip-forks` flag
-
-**Out of scope:**
-- Web UI or dashboard (output is files only)
-- Running actual test suites or build commands
-- Dependency vulnerability scanning (just detect presence of lockfiles)
-- GitHub Actions run history analysis
-- Cross-repo dependency detection
-- Organization repos (user repos only)
-
-**Deferred:**
-- Integration with project-registry.md reconciliation (Phase 2)
-- PCC import format generation (Phase 2)
-- Historical trend tracking across multiple audit runs (future)
-
-## Security & Credentials
-- GitHub token read from `GITHUB_TOKEN` environment variable — never passed as CLI arg, never logged
-- Token is optional for public-only audits, required for private repos
-- Cloned repos are written to a temp directory and cleaned up after analysis
-- No data leaves the machine — all analysis is local
-
----
-
-## Analyzer Dimension Specifications
-
-Each analyzer scores 0.0–1.0. The overall score is a weighted average.
-
-### 1. README Quality (`readme.py`) — Weight: 15%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| README exists | 0.2 | `README.md` or `README` or `README.rst` in root |
-| Has project description (>50 chars first section) | 0.2 | Parse first heading + paragraph |
-| Has installation/setup instructions | 0.2 | Look for headings containing "install", "setup", "getting started", "usage" |
-| Has usage examples or screenshots | 0.2 | Look for code blocks or image references |
-| Length > 500 chars | 0.1 | Character count |
-| Has badges | 0.1 | `![` patterns in first 10 lines |
-
-### 2. Project Structure (`structure.py`) — Weight: 10%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Has `.gitignore` | 0.2 | File exists |
-| Has `src/` or `lib/` or language-standard structure | 0.3 | Directory detection based on primary language |
-| Has config file (package.json, Cargo.toml, pyproject.toml, etc.) | 0.3 | File exists by known names |
-| Has LICENSE file | 0.1 | `LICENSE` or `LICENSE.md` in root |
-| Not a flat dump (>1 directory depth) | 0.1 | Directory tree depth analysis |
-
-### 3. Code Quality Signals (`code_quality.py`) — Weight: 15%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Has identifiable entry point | 0.3 | `main.py`, `index.ts`, `src/main.rs`, `main.go`, `App.tsx`, etc. |
-| TODO/FIXME density < 5 per 1000 LOC | 0.2 | Grep + LOC count |
-| Has type definitions (if applicable) | 0.2 | `.ts` files, Python type hints, Rust types |
-| No large generated/vendored files | 0.15 | Detect `vendor/`, `node_modules/` committed, files >1MB |
-| Has meaningful commit messages (last 10) | 0.15 | Via API: check messages aren't all "update" or "fix" |
-
-### 4. Testing (`testing.py`) — Weight: 15%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Test directory or test files exist | 0.4 | `test/`, `tests/`, `__tests__/`, `*_test.*`, `*_spec.*`, `test_*.*` |
-| Test framework configured | 0.3 | jest in package.json, pytest in pyproject.toml, etc. |
-| Test count > 0 (heuristic) | 0.3 | Count files matching test patterns |
-
-### 5. CI/CD (`cicd.py`) — Weight: 10%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| `.github/workflows/` exists with YAML files | 0.5 | Directory + file check |
-| Alternative CI config (`.travis.yml`, `Jenkinsfile`, `.circleci/`, `Dockerfile`) | 0.3 | File exists |
-| Has build script in package.json / Makefile | 0.2 | Parse for "build", "test" scripts |
-
-### 6. Dependency Management (`dependencies.py`) — Weight: 10%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Has lockfile (`package-lock.json`, `yarn.lock`, `Cargo.lock`, `poetry.lock`, `Pipfile.lock`) | 0.4 | File exists |
-| Has dependency manifest (package.json, requirements.txt, Cargo.toml, go.mod) | 0.4 | File exists |
-| Dependencies count is reasonable (not 0, not 500+) | 0.2 | Parse manifest for dep count |
-
-### 7. Activity & Recency (`activity.py`) — Weight: 15%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Last push within 6 months | 0.3 | `pushed_at` from API |
-| Last push within 1 year | 0.2 | `pushed_at` from API (cumulative with the 6-month check; a push 6–12 months old earns only this 0.2) |
-| More than 10 commits total | 0.2 | Contributor stats API |
-| Commits in last 3 months | 0.2 | Commit activity API |
-| Not archived | 0.1 | `archived` field from API |
-
-### 8. Documentation Beyond README (`completeness.py`) — Weight: 5%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Has `docs/` directory or wiki-style files | 0.3 | Directory check |
-| Has CHANGELOG or HISTORY file | 0.3 | File exists |
-| Has CONTRIBUTING guide | 0.2 | File exists |
-| Has inline code comments (sampling) | 0.2 | Sample 5 largest files, check comment density |
-
-### 9. Build/Run Readiness (`completeness.py`) — Weight: 5%
-| Check | Points | Detection |
-|-------|--------|-----------|
-| Has Dockerfile or docker-compose | 0.3 | File exists |
-| Has Makefile or build script | 0.3 | File exists |
-| Has environment example (.env.example, .env.sample) | 0.2 | File exists |
-| Has deployment config (Vercel, Netlify, fly.toml, etc.) | 0.2 | File exists |
-
----
-
-## Completeness Tier Classification
-
-Based on overall weighted score:
-
-| Tier | Score Range | Description |
-|------|-------------|-------------|
-| **Shipped** | 0.75 – 1.0 | Production-ready or clearly complete. README, tests, CI, recent activity. |
-| **Functional** | 0.55 – <0.75 | Works but rough edges. Missing tests or CI. Has clear entry point. |
-| **WIP** | 0.35 – <0.55 | Active development, partially built. Some structure, some code, incomplete. |
-| **Skeleton** | 0.15 – <0.35 | Scaffolded but barely started. Boilerplate only. |
-| **Abandoned** | 0.0 – <0.15 | No meaningful content, no recent activity, or just a README. |
-
-**Override rules:**
-- If `archived == true` and score > 0.5 → cap tier at "Functional" (archived = not actively shipped)
-- If `fork == true` → add flag `"forked"`, reduce activity weight to 5%
-- If last push > 2 years ago → add flag `"stale-2yr"`, cap tier at "WIP" regardless of score
-- If repo has 0 files beyond README → force tier to "Skeleton"
-
----
-
-## Phase 0: Foundation (Day 1)
-
-**Objective:** Working CLI that fetches repos from GitHub API, clones them, and outputs raw metadata JSON.
-
-**Tasks:**
-1. Scaffold project structure per file tree above — **Acceptance:** All directories and `__init__.py` files exist
-2. Implement `github_client.py` — list repos with pagination, rate limit handling — **Acceptance:** `python -m src.cli saagpatel` prints repo names to stdout
-3. Implement `cloner.py` — shallow clone to temp dir, cleanup after — **Acceptance:** Repos appear in `/tmp/audit-repos/`, are removed after script exits
-4. Implement `cli.py` with argparse — **Acceptance:** `python -m src.cli --help` shows usage; `python -m src.cli saagpatel --token $GITHUB_TOKEN` runs end-to-end
-5. Write `RepoMetadata` dataclass and populate from API response — **Acceptance:** `output/raw_metadata.json` contains all repos with all fields populated
-
-**Verification checklist:**
-- [ ] `python -m src.cli saagpatel` → prints list of all public repos
-- [ ] `python -m src.cli saagpatel --token $GITHUB_TOKEN` → includes private repos
-- [ ] `output/raw_metadata.json` exists and is valid JSON with all repos
-- [ ] No repos left in temp directory after script completes
-- [ ] Rate limit handling works (check `X-RateLimit-Remaining` logged)
-
-**Risks:**
-- GitHub stats endpoints return 202 (computing) on first call: Retry with exponential backoff (3 attempts, 2s/4s/8s)
-- Rate limit hit with 100+ repos: Implement sleep-until-reset using `X-RateLimit-Reset` header
-
----
-
-## Phase 1: Analyzer Engine (Day 1–2)
-
-**Objective:** All 9 analyzer dimensions implemented, producing per-repo scores.
-
-**Tasks:**
-1. Implement `BaseAnalyzer` abstract class with `analyze(repo_path: Path, metadata: RepoMetadata) -> AnalyzerResult` — **Acceptance:** Interface defined, type-checked
-2. Implement all 9 analyzers per dimension specs above — **Acceptance:** Each returns `AnalyzerResult` with score, findings, details
-3. Implement `scorer.py` — weighted aggregation + tier classification with override rules — **Acceptance:** `RepoAudit` objects have `overall_score` and `completeness_tier` populated
-4. Wire analyzers into CLI pipeline: fetch → clone → analyze → score — **Acceptance:** `python -m src.cli saagpatel` produces scored results for all repos
-5. Add `--verbose` flag that prints per-dimension scores per repo — **Acceptance:** Verbose output shows all 9 dimension scores per repo
-
-**Verification checklist:**
-- [ ] Run against 3 repos of varying quality → scores feel intuitive (high for complete, low for skeletons)
-- [ ] Override rules work: archived repos capped, stale repos flagged
-- [ ] `--verbose` shows per-dimension breakdown
-- [ ] No crashes on empty repos, repos with no code, or repos with unusual structures
-
-**Risks:**
-- Analyzer crashes on unexpected file structures: Wrap each analyzer in try/except, return score 0.0 with finding "analysis failed: {error}"
-- Large repos slow down analysis: Set max file scan limit (500 files per repo, skip binary files)
-
----
-
-## Phase 2: Report Generation (Day 2–3)
-
-**Objective:** Full JSON + Markdown reports with summary statistics, tier distribution, and per-repo breakdowns.
-
-**Tasks:**
-1. Implement JSON report output — **Acceptance:** `output/audit-report-{username}-{date}.json` matches `AuditReport` schema exactly
-2. Implement Markdown report with:
-   - Summary table (total repos, tier distribution, average score)
-   - Tier-grouped repo lists with scores and key flags
-   - Per-repo detail sections (expandable in Markdown viewers)
-   — **Acceptance:** `output/audit-report-{username}-{date}.md` renders cleanly in GitHub/VS Code preview
-3. Add `--skip-forks` flag — **Acceptance:** Fork repos excluded from analysis and report when flag set
-4. Add `--output-dir` flag — **Acceptance:** Reports written to specified directory
-5. Add progress bar using stderr prints — **Acceptance:** Shows `[12/47] Analyzing repo-name...` during run
-6. Add PCC-compatible JSON export — flat array of objects with fields matching PCC project schema (name, status, score, url, last_activity, tier, flags) — **Acceptance:** `output/pcc-import-{username}-{date}.json` is importable into PCC
-
-**Verification checklist:**
-- [ ] JSON report validates against `AuditReport` dataclass
-- [ ] Markdown report renders with proper tables and formatting
-- [ ] `--skip-forks` correctly excludes forked repos
-- [ ] `--output-dir /custom/path` writes reports there
-- [ ] Progress output shows on stderr (not mixed with stdout)
-- [ ] PCC import file has flat structure ready for dashboard import
-
-**Risks:**
-- Markdown table formatting breaks with long repo names: Truncate names to 40 chars in tables
-- JSON serialization fails on datetime objects: Use `.isoformat()` for all datetimes
-
----
-
-## Phase 3: Polish & Reconciliation (Day 3)
-
-**Objective:** Cross-reference with local project-registry.md, add summary stats, handle edge cases.
-
-**Tasks:**
-1. Add `--registry` flag accepting path to project-registry.md — **Acceptance:** Report includes "On GitHub but not in registry" and "In registry but not on GitHub" sections
-2. Registry parser: extract project names and statuses from markdown — **Acceptance:** Parses the registry format used at `~/Projects/project-registry.md`
-3. Add summary statistics to report: most active repos, most neglected, highest/lowest scored, language distribution — **Acceptance:** Summary section in Markdown report has all stats
-4. Handle edge cases: empty repos, repos with only a README, repos with >10k files, binary-only repos — **Acceptance:** No crashes, appropriate tier assignments
-5. Write README.md for the auditor tool itself — **Acceptance:** Complete with usage, examples, output format docs
-
-**Verification checklist:**
-- [ ] Full audit run against `saagpatel` completes without errors
-- [ ] Registry reconciliation correctly identifies gaps in both directions
-- [ ] Summary statistics are accurate (spot-check 3 repos manually)
-- [ ] README documents all CLI flags and output formats
-- [ ] Tool audits itself and scores > 0.6
-
-**Risks:**
-- Registry format varies: Build a lenient parser that handles common markdown table and list formats
-- Too many API calls for large accounts: Cache API responses to `output/.cache/` with 1-hour TTL