Merged
129 changes: 129 additions & 0 deletions CASE-STUDY.md
@@ -0,0 +1,129 @@
# Operator OS: A Multi-Agent Control Plane Over A Software Portfolio

Operator OS is a local-first control plane for AI-assisted builders: it turns a sprawling repo portfolio and multiple coding agents into verified truth, visible risk, and one operator-approved next move.

GitHub Repo Auditor is the truth engine behind the first public wedge. It began as a repository auditor, but the stronger product shape is a portfolio operating layer: one system that can say which projects are healthy, which are drifting, which are blocked, what changed, and what deserves attention next.

## Who This Is For

- Solo builders with dozens of repos and not enough trust in their own backlog.
- Staff engineers and technical leads who need decision-grade visibility across experimental, internal, and production projects.
- AI-native developers using Codex, Claude Code, ChatGPT, or similar tools across many workstreams.
- Devtools teams studying how agent-created work should be verified, prioritized, and governed.

## The Problem

AI coding tools make it easier to start and modify projects. They do not automatically make it easier to know what is true afterward.

Once a portfolio has enough projects and enough agent-touched work, normal tools flatten the wrong things:

- `git log` shows activity, not whether the work matters.
- GitHub alerts show risk, not which fix clears the most portfolio pain.
- Notes and handoffs preserve intent, but can drift from the current repo state.
- Agent transcripts are useful history, but they are not proof.
- Dashboards can look polished while hiding stale or private source data.

Operator OS starts from a stricter premise: local evidence wins. Every product surface should be traceable back to files, commands, generated artifacts, or explicit operator approval.

## Before And After

| Before | After |
| --- | --- |
| A long list of repos | A portfolio truth snapshot with risk, readiness, context quality, and security posture |
| Agent work scattered across chats | Agent provenance and follow-through visible in operator surfaces |
| Weekly review rebuilt from memory | Weekly command-center artifacts generated from current audit facts |
| Security alerts handled repo by repo | Advisory-grouped burndown showing the dependency bump that clears risk across repos |
| Handoffs as stale prose | Restart-safe handoffs that say what was checked, what must be rechecked, and what not to touch |
| Automation as blind trust | Dry-run-first proposals, explicit approvals, and evidence-backed execution gates |

## System Shape

```mermaid
flowchart LR
A["Local repos"] --> B["GitHub Repo Auditor"]
B --> C["Portfolio truth JSON"]
B --> D["Workbook / HTML / Markdown outputs"]
C --> E["Portfolio Command Center"]
D --> E
E --> F["Operator decision"]
F --> G["Manual or gated follow-through"]
```

The public wedge keeps this system deliberately narrow:

- `GithubRepoAuditor` produces portfolio truth and weekly/operator artifacts.
- `PortfolioCommandCenter` reads those artifacts and presents the operating view.
- Fixture or sanitized data drives the public demo.
- Private systems remain private implementation references, not public data sources.

The broader local machine adds other surfaces in private use:

- `bridge-db` for compact cross-agent receipts and state coordination.
- `personal-ops` for private inbox, planning, approvals, and local operator workflows.
- `notification-hub` for local event routing, review queues, and noise control.
- `SecondBrain` for private synthesized knowledge and source-grounded lessons.
- Codex and ChatGPT Pro workflow docs for advisory-only model review.

Those adjacent systems are useful because they prove the operating model under real pressure. They are not required for the public demo.

## What The Demo Shows

The public-safe demo should show the Portfolio Command Center running over fixture or sanitized portfolio truth:

1. A full portfolio table with risk, status, context quality, tool provenance, and security columns.
2. A risk/security tab that turns raw alert counts into portfolio-level attention.
3. A burndown tab that groups advisories by the fix that clears the most risk.
4. Trend charts that show whether risk is improving or getting worse.
5. A weekly digest that gives one headline, one decision, and one next move.
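
The burndown idea in item 3 can be sketched in a few lines. This is a minimal illustration, not the auditor's implementation: the alert rows, field names (`repo`, `package`, `fixed_in`), and package names are all hypothetical. It groups alerts by the dependency bump that would clear them and ranks the bumps by how many repos they reach.

```python
from collections import defaultdict

# Hypothetical alert rows; field names and values are assumptions for
# illustration, not the auditor's real schema or real advisories.
alerts = [
    {"repo": "repo-a", "package": "example-lib", "fixed_in": "2.0.1"},
    {"repo": "repo-b", "package": "example-lib", "fixed_in": "2.0.1"},
    {"repo": "repo-b", "package": "other-lib", "fixed_in": "1.4.0"},
    {"repo": "repo-c", "package": "example-lib", "fixed_in": "2.0.1"},
]


def burndown(alerts):
    """Group alerts by the bump that clears them, widest repo reach first."""
    groups = defaultdict(lambda: {"repos": set(), "alerts": 0})
    for alert in alerts:
        key = (alert["package"], alert["fixed_in"])
        groups[key]["repos"].add(alert["repo"])
        groups[key]["alerts"] += 1
    # Rank by repos reached, then by alert count, then by name for stability.
    ranked = sorted(
        groups.items(),
        key=lambda item: (-len(item[1]["repos"]), -item[1]["alerts"], item[0]),
    )
    return [
        {"package": pkg, "fixed_in": ver,
         "repos": sorted(group["repos"]), "alerts": group["alerts"]}
        for (pkg, ver), group in ranked
    ]


plan = burndown(alerts)
for row in plan:
    print(f'{row["package"]} -> {row["fixed_in"]}: '
          f'{row["alerts"]} alerts across {len(row["repos"])} repos')
```

Ranking by repos reached rather than raw alert count is one possible reading of "clears the most risk"; a severity-weighted score would be an equally plausible choice.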

The private local proof package for the 2026-06-07 five-tab demo lives under `docs/demo-proof/2026-06-07/`. It proves the live local demo, but it is not the public publishing package because it may reveal real local portfolio details.

For public sharing, use the fixture-backed package under `docs/demo-proof/public-fixture/`.

## What Stays Private

The product should not expose raw local operating state. These surfaces are private by design:

- Local Codex sessions, memories, reports, hooks, secrets, config, and SQLite state.
- Gmail, Calendar, Drive, task, approval, and daemon state from `personal-ops`.
- Raw SecondBrain captures, conversation exports, vault history, and private notes.
- Real Notion databases, project rows, tokens, API traces, and live write receipts.
- `bridge-db` live SQLite contents, handoffs, snapshots, receipts, recall logs, and activity rows.
- `notification-hub` events, Slack routing, local queue state, and review logs.
- Private repo names, local absolute paths, branch state, and security findings unless they are intentionally sanitized.

The productizable asset is the pattern: local truth, bounded context, visible provenance, approval gates, and operator-facing decisions.

## Why It Is Hard To Copy

The moat is not a chart. Charts are easy.

The hard part is the lived-in operating discipline:

- One canonical truth contract feeding multiple views.
- Generated artifacts that agree with each other instead of becoming separate stories.
- Dry-run-first action flows that preserve human approval.
- Explicit stale-state handling instead of cheerful lies.
- Agent role boundaries: advisory models advise; local agents verify and execute.
- Restart-safe handoffs that force the next session back to current evidence.
- Private-by-default local operation with public-safe fixture demos.

Most products start with a dashboard and bolt trust on later. Operator OS starts with trust and lets the dashboard expose it.
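
The dry-run-first and stale-state points in the list above can be sketched as a small gate. Everything here is an assumption made for illustration: the `Proposal` shape, the 24-hour freshness window, and the status names are hypothetical, not the project's real approval flow.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed freshness window; the real system may use a different rule.
MAX_EVIDENCE_AGE = timedelta(hours=24)


@dataclass
class Proposal:
    """Hypothetical proposal record, for illustration only."""
    action: str
    evidence_checked_at: datetime      # when the supporting audit facts were generated
    approved_by: Optional[str] = None  # set only by an explicit operator decision


def gate(proposal, now, execute=False):
    """Return (status, message); nothing runs without fresh evidence and approval."""
    if now - proposal.evidence_checked_at > MAX_EVIDENCE_AGE:
        return "stale", f"re-audit before acting on: {proposal.action}"
    if not execute:
        return "dry-run", f"would run: {proposal.action}"
    if proposal.approved_by is None:
        return "blocked", f"needs operator approval: {proposal.action}"
    return "approved", f"running for {proposal.approved_by}: {proposal.action}"


now = datetime.now(timezone.utc)
proposal = Proposal("bump example-lib in repo-a", now - timedelta(hours=1))
print(gate(proposal, now))                # dry run is the default
print(gate(proposal, now, execute=True))  # still refused until approved
```

Checking staleness before anything else means even an approved proposal is sent back to current evidence once its supporting facts age out.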

## Public Wedge

The first wedge is Portfolio Command Center:

- simple enough to demo in 90 seconds;
- grounded in concrete repo facts;
- visually understandable to developers immediately;
- impressive without needing private email, calendar, Notion, or agent transcripts;
- extensible into the broader Operator OS story.

## Demo Links

- [90-second demo plan](DEMO-PLAN.md)
- [Fixture demo source](fixtures/demo/sample-report.json)
- [Public fixture proof package](docs/demo-proof/public-fixture/README.md)
- [Private local demo proof package](docs/demo-proof/2026-06-07/README.md)
- [Portfolio Command Center](../PortfolioCommandCenter/README.md)
122 changes: 122 additions & 0 deletions DEMO-PLAN.md
@@ -0,0 +1,122 @@
# Portfolio Command Center Demo Plan

This is the public-safe demo plan for the Operator OS wedge.

The demo should prove one thing quickly: a serious builder can turn repo sprawl and agent-touched work into verified truth, visible risk, and one next move.

## Demo Thesis

Git history tells you what changed. Portfolio Command Center tells you what the change means.

## Demo Modes

Use one of two modes:

| Mode | Use for | Data source | Public-safe |
| --- | --- | --- | --- |
| Fixture mode | Public recording, docs, external sharing | `fixtures/demo/sample-report.json` via `make demo` | Yes |
| Live local mode | Private operator proof and internal review | `output/portfolio-truth-latest.json` from the real local portfolio | No, unless redacted |

Default to fixture mode for anything public.

## Fixture Demo Setup

From this repo:

```sh
make demo
```

Expected outputs include:

- `output/demo/demo-report.json`
- `output/demo/demo-workbook.xlsx`
- `output/demo/dashboard-*.html`
- `output/demo/operator-control-center-demo.json`
- `output/demo/operator-control-center-demo.md`
- `output/demo/portfolio-truth-latest.json`
- `output/demo/portfolio-warehouse.db`
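
A quick way to confirm the listed files exist before opening the app is a small check like the one below. It is a sketch, not a script that ships with the repo: the function name `missing_outputs` is hypothetical, and the patterns are copied from the list above.

```python
from pathlib import Path

# Patterns copied from the expected-outputs list; the dashboard entry
# keeps its wildcard and is matched with glob.
EXPECTED = [
    "demo-report.json",
    "demo-workbook.xlsx",
    "dashboard-*.html",
    "operator-control-center-demo.json",
    "operator-control-center-demo.md",
    "portfolio-truth-latest.json",
    "portfolio-warehouse.db",
]


def missing_outputs(output_dir, patterns=EXPECTED):
    """Return the patterns that match no file in output_dir."""
    root = Path(output_dir)
    if not root.is_dir():
        # Nothing was generated at all, so every pattern is missing.
        return list(patterns)
    return [pattern for pattern in patterns if not any(root.glob(pattern))]


missing = missing_outputs("output/demo")
print("missing demo outputs:", ", ".join(missing) if missing else "none")
```

Run it from the repo root after `make demo`; an empty result means every listed artifact was produced.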

Then launch the desktop shell from the sibling app:

```sh
cd ../PortfolioCommandCenter
pnpm install
pnpm demo:desktop
```

In the app header, set the output directory to:

```text
../GithubRepoAuditor/output/demo
```

If the recording needs live-shaped data, create a sanitized output directory first. Do not point a public recording at the private live `output/` directory.

## 90-Second Arc

| Time | Frame | Spoken line |
| --- | --- | --- |
| 0:00-0:10 | Portfolio table | "This is the problem AI builders are about to have: not one repo, but a portfolio of agent-touched work." |
| 0:10-0:25 | Risk/context/status columns | "A commit timestamp is not enough. I need to know which projects are healthy, blocked, risky, stale, or worth ignoring." |
| 0:25-0:42 | Risk + Security | "The control plane turns raw alerts and project facts into an attention map, so risk stops hiding in individual repos." |
| 0:42-0:58 | Burndown | "The useful question is not just 'what is broken?' It is 'which fix clears the most portfolio pain?'" |
| 0:58-1:12 | Trends | "Because it keeps history, I can tell whether the portfolio is improving or just getting noisier." |
| 1:12-1:25 | Weekly Digest | "Every week, the system reduces the mess to one headline, one decision, and one next move." |
| 1:25-1:30 | Return to Portfolio | "That is Operator OS: verified truth for builders using agents at portfolio scale." |

## Must-Land Product Points

- The app is reading generated artifacts, not a hand-maintained spreadsheet.
- Portfolio truth, weekly digest, burndown, and charts come from the same evidence chain.
- The operator remains in charge.
- Public demo data is fixture-backed or sanitized.
- Private local systems are implementation references, not public data sources.

## Do Not Show Publicly

- Real private repo names.
- Local absolute paths under the user's home directory.
- Real GitHub security alert details.
- Notion database rows or page IDs.
- Gmail, Calendar, Drive, Slack, or task data.
- Codex sessions, memories, hook logs, or SQLite databases.
- SecondBrain raw captures or conversation exports.
- Tokens, cookies, env values, terminal scrollback, hostnames, or account settings.

## Redaction Checklist

Before publishing:

- [ ] Confirm the app is using `output/demo` or another sanitized output directory.
- [ ] Confirm no private repo names are readable.
- [ ] Confirm no terminal panes or local paths are visible.
- [ ] Confirm no account names, tokens, hostnames, or private URLs are visible.
- [ ] Confirm screenshots and video frames do not expose Notion, email, calendar, Slack, or SecondBrain.
- [ ] Confirm any local live proof package is described as private/local evidence only.
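
Part of this checklist can be backed by a mechanical scan of the text artifacts. The sketch below is an assumption-heavy illustration: the two patterns (home-directory paths and GitHub-style token prefixes) are examples only, the helper names are hypothetical, and a clean scan does not replace the manual review of screenshots and video frames.

```python
import re
from pathlib import Path

# Assumed patterns for illustration; a real check needs a broader, reviewed set.
LEAK_PATTERNS = {
    "home path": re.compile(r"(?:/Users/|/home/)[\w.-]+"),
    "GitHub token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}\b"),
}


def scan_text(text):
    """Return sorted (label, match) pairs for anything that looks private."""
    hits = set()
    for label, pattern in LEAK_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.add((label, match.group(0)))
    return sorted(hits)


def scan_dir(output_dir, suffixes=(".json", ".md", ".html")):
    """Scan text artifacts under output_dir; binary files (.xlsx, .db) are skipped."""
    findings = {}
    for path in sorted(Path(output_dir).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            hits = scan_text(path.read_text(encoding="utf-8", errors="replace"))
            if hits:
                findings[str(path)] = hits
    return findings


print(scan_text('{"path": "/Users/alice/projects/repo-a"}'))
```

Calling `scan_dir("output/demo")` before recording gives a quick signal; private repo names cannot be caught by a generic pattern and still need a human pass.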

## Verification Checklist

Run:

```sh
make demo
python scripts/validate_proof_package.py docs/demo-proof/public-fixture/proof-package.json
```

For the desktop shell:

```sh
cd ../PortfolioCommandCenter
pnpm typecheck
pnpm test
pnpm build
```

Visual verification is complete only after the app is opened against the fixture output and the Portfolio, Risk + Security, Burndown, Trends, and Weekly Digest tabs all render without private data.

## Final Public Framing

Use this closing sentence:

> Operator OS is the missing control plane for AI-assisted builders: it turns scattered agent work and repo sprawl into verified truth, visible risk, and one operator-approved next move.
28 changes: 26 additions & 2 deletions README.md
@@ -109,6 +109,10 @@ Treat campaign/writeback, GitHub Projects, Notion sync, catalog overrides, score

- Safe demo path: run `make demo` after a local clone to generate sample artifacts from the committed fixture without a GitHub token.
- Demo fixture: [fixtures/demo/sample-report.json](fixtures/demo/sample-report.json)
- Operator OS case study: [CASE-STUDY.md](CASE-STUDY.md)
- Public-safe recording plan: [DEMO-PLAN.md](DEMO-PLAN.md)
- Product brief: [docs/product/operator-os-product-brief.md](docs/product/operator-os-product-brief.md)
- Public fixture proof package: [docs/demo-proof/public-fixture/README.md](docs/demo-proof/public-fixture/README.md)
- Product modes: [docs/modes.md](docs/modes.md)
- Web UI operator guide: [docs/audit-serve.md](docs/audit-serve.md)
- CLI migration (flat → subcommand): [docs/audit-cli-migration.md](docs/audit-cli-migration.md)
@@ -192,6 +196,8 @@ pip install "github-repo-auditor[serve]"
### Try the safe demo

The demo uses committed fixture data and writes only to `output/demo/`.
Use this path for public recordings and screenshots. The live local portfolio
output is private operator evidence unless it has been intentionally sanitized.

```bash
git clone https://github.com/saagpatel/GithubRepoAuditor.git
@@ -204,8 +210,10 @@ Expected outputs include `output/demo/demo-report.json`,
`output/demo/demo-workbook.xlsx`, `output/demo/dashboard-*.html`,
`output/demo/operator-control-center-demo.json`,
`output/demo/operator-control-center-demo.md`,
-`output/demo/portfolio-truth-latest.json`, and
-`output/demo/portfolio-warehouse.db`.
+`output/demo/portfolio-truth-latest.json`,
+`output/demo/weekly-command-center-sample-user-2026-04-12.json`,
+`output/demo/security-burndown-sample-user-2026-04-12.json`,
+`output/demo/pending-proposals.json`, and `output/demo/portfolio-warehouse.db`.

To browse the same fixture in the local web UI:

@@ -214,6 +222,9 @@ pip install -e ".[serve]"
audit serve --output-dir output/demo
```

To record the Portfolio Command Center wedge from the same fixture, follow
[DEMO-PLAN.md](DEMO-PLAN.md) and point the desktop app at `output/demo/`.

### Quick start (subcommand form)

```bash
@@ -431,6 +442,19 @@ Common fixes:

There is also a longer operator guide in [docs/operator-troubleshooting.md](docs/operator-troubleshooting.md).

## Proof Packages

Cross-repo done-state proof now uses `proof-package.v1`; see
[docs/proof-package-contract.md](docs/proof-package-contract.md). The first
concrete package is the PortfolioCommandCenter five-tab local demo proof at
[docs/demo-proof/2026-06-07/proof-package.json](docs/demo-proof/2026-06-07/proof-package.json).

Validate a package with:

```bash
python scripts/validate_proof_package.py docs/demo-proof/2026-06-07/proof-package.json
```

## License

MIT
19 changes: 19 additions & 0 deletions docs/demo-proof/2026-06-07/SUMMARY.md
@@ -0,0 +1,19 @@
# PortfolioCommandCenter Demo Proof Package

This package proves the 2026-06-07 local five-tab PortfolioCommandCenter demo.
GithubRepoAuditor is the evidence producer because it owns the portfolio truth
and generated screenshots; PortfolioCommandCenter is the subject repo being
demonstrated.

Status: passed.

Key proof points:

- Portfolio tab: 129 projects.
- Risk + Security tab: 117 scanned repos, 63 of them with open high/critical alerts.
- Burndown tab: advisory-grouped fix guidance.
- Trends tab: risk and high/critical history charts.
- Weekly Digest tab: current decision ends with `Start with codexkit.`

Use `proof-package.json` for the machine-readable claim-to-evidence map and
`README.md` for the narrative proof summary.
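
To make "claim-to-evidence map" concrete, here is a minimal sketch of the kind of check such a map allows. The package shape below is hypothetical: the keys `claims` and `evidence` and the file names are assumptions for illustration, and the actual `proof-package.v1` contract is defined in `docs/proof-package-contract.md` and may differ.

```python
from pathlib import Path

# Hypothetical package shape; NOT the real proof-package.v1 schema.
package = {
    "status": "passed",
    "claims": [
        {"claim": "Portfolio tab renders", "evidence": ["screenshots/portfolio.png"]},
        {"claim": "Trends tab renders", "evidence": []},
    ],
}


def unsupported_claims(package, root):
    """Return claims that cite no evidence or cite files missing under root."""
    problems = []
    for entry in package["claims"]:
        files = entry.get("evidence", [])
        missing = [name for name in files if not (Path(root) / name).is_file()]
        if not files or missing:
            problems.append({"claim": entry["claim"], "missing": missing})
    return problems


for problem in unsupported_claims(package, "docs/demo-proof/2026-06-07"):
    print("unsupported:", problem["claim"], problem["missing"])
```

The point of the sketch is the direction of the check: a "passed" status is only as strong as the evidence files each claim can actually point to.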