Merged
9 changes: 5 additions & 4 deletions AGENTS-CN.md
@@ -9,7 +9,7 @@ BitFun is a project composed of a Rust workspace and a shared React frontend.
## Quick start

1. Read `README.md` and `CONTRIBUTING.md` before modifying architecture-sensitive code.
2. For fast local desktop checks, prefer `pnpm run desktop:preview:debug` over `pnpm run desktop:dev`.
2. For desktop development, prefer `pnpm run desktop:dev`: it provides full hot-reload (Vite HMR + Rust auto-rebuild & restart). Use `pnpm run desktop:preview:debug` only when you need a faster cold start for frontend-only iteration (Rust changes are not auto-rebuilt).
3. After changes, run the smallest matching verification from the table below.

## Module index
@@ -35,9 +35,10 @@ BitFun is a project composed of a Rust workspace and a shared React frontend.
pnpm install

# Dev
pnpm run desktop:preview:debug # fast desktop iteration
pnpm run dev:web # browser-only frontend
pnpm run cli:dev # CLI runtime
pnpm run desktop:dev # full hot-reload: Vite HMR + Rust auto-rebuild & restart
pnpm run desktop:preview:debug # reuse pre-built binary + Vite HMR; no Rust auto-rebuild
pnpm run dev:web # browser-only frontend
pnpm run cli:dev # CLI runtime

# Check
pnpm run lint:web
9 changes: 5 additions & 4 deletions AGENTS.md
@@ -9,7 +9,7 @@ Repository rule: **keep product logic platform-agnostic, then expose it through
## Quick start

1. Read `README.md` and `CONTRIBUTING.md` before architecture-sensitive changes.
2. For fast local desktop checks, prefer `pnpm run desktop:preview:debug` over `pnpm run desktop:dev`.
2. For desktop development, prefer `pnpm run desktop:dev` — it provides full hot-reload (Vite HMR + Rust auto-rebuild & restart). Use `pnpm run desktop:preview:debug` only when you need a faster cold-start for frontend-only iteration (Rust changes are not auto-rebuilt).
3. After changes, run the smallest matching verification from the table below.

## Module index
@@ -35,9 +35,10 @@ Repository rule: **keep product logic platform-agnostic, then expose it through
pnpm install

# Dev
pnpm run desktop:preview:debug # fast desktop iteration
pnpm run dev:web # browser-only frontend
pnpm run cli:dev # CLI runtime
pnpm run desktop:dev # full hot-reload: Vite HMR + Rust auto-rebuild & restart
pnpm run desktop:preview:debug # reuse pre-built binary + Vite HMR; no Rust auto-rebuild
pnpm run dev:web # browser-only frontend
pnpm run cli:dev # CLI runtime

# Check
pnpm run lint:web
12 changes: 9 additions & 3 deletions CONTRIBUTING.md
@@ -35,15 +35,21 @@ pnpm install
### Common commands

```bash
# Desktop
pnpm run desktop:dev
pnpm run desktop:preview:debug
# Desktop (recommended for daily development)
pnpm run desktop:dev # full hot-reload: Vite HMR + Rust auto-rebuild & restart

# Desktop (lightweight preview, no Rust auto-rebuild)
pnpm run desktop:preview:debug # reuse pre-built binary + Vite HMR; Rust changes require manual restart

# Desktop (production build)
pnpm run desktop:build

# E2E
pnpm run e2e:test
```

> **`desktop:dev` vs `desktop:preview:debug`**: `desktop:dev` runs `tauri dev`, which provides **full hot-reload** — frontend changes apply instantly via Vite HMR, and Rust/backend changes trigger an incremental rebuild followed by an automatic app restart. This is the recommended workflow for active development. `desktop:preview:debug` launches a pre-built debug binary alongside a Vite dev server; frontend edits still get HMR, but **Rust-side changes are not auto-rebuilt** — you must stop and re-run the command (or use `--force-rebuild`). Use `desktop:preview:debug` when you only need to iterate on frontend code or want a faster cold-start without waiting for `tauri dev` initialization.
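
The choice described above can be sketched as a tiny helper. The `pick_desktop_cmd` function is hypothetical and not part of the repo; only the `pnpm` script names come from this project's `package.json`:

```shell
# Hypothetical helper (not in the repo): choose the desktop dev command
# based on which side of the stack you are iterating on.
pick_desktop_cmd() {
  case "$1" in
    rust) echo "pnpm run desktop:dev" ;;             # Rust edits need auto-rebuild + restart
    *)    echo "pnpm run desktop:preview:debug" ;;   # frontend-only: faster cold start, Vite HMR
  esac
}

pick_desktop_cmd rust       # → pnpm run desktop:dev
pick_desktop_cmd frontend   # → pnpm run desktop:preview:debug
```

Either command serves the frontend through Vite HMR; only `desktop:dev` rebuilds the Rust side automatically.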

> For the full script list, see [`package.json`](package.json). For agent-specific commands, verification, and architecture rules, see [`AGENTS.md`](AGENTS.md).

### Desktop debugging tools
12 changes: 9 additions & 3 deletions CONTRIBUTING_CN.md
@@ -35,15 +35,21 @@ pnpm install
### Common commands

```bash
# Desktop
pnpm run desktop:dev
pnpm run desktop:preview:debug
# Desktop (recommended for daily development)
pnpm run desktop:dev # full hot-reload: Vite HMR + Rust auto-rebuild & restart

# Desktop (lightweight preview, no Rust auto-rebuild)
pnpm run desktop:preview:debug # reuse pre-built binary + Vite HMR; Rust changes require manual restart

# Desktop (production build)
pnpm run desktop:build

# E2E
pnpm run e2e:test
```

> **`desktop:dev` vs `desktop:preview:debug`**: `desktop:dev` runs `tauri dev`, which provides **full hot-reload**: frontend changes apply instantly via Vite HMR, and Rust/backend changes trigger an incremental rebuild followed by an automatic app restart; this is the recommended workflow for active development. `desktop:preview:debug` launches a pre-built debug binary alongside a Vite dev server; frontend edits still get HMR, but **Rust-side changes are not auto-rebuilt**, so you must stop and re-run the command (or use `--force-rebuild`). Use `desktop:preview:debug` when you only need to iterate on frontend code, or want a faster cold start without waiting for `tauri dev` initialization.

> For the full script list, see [`package.json`](package.json). For agent-specific commands, verification, and architecture rules, see [`AGENTS.md`](AGENTS.md).

### Desktop debugging tools
15 changes: 15 additions & 0 deletions src/crates/core/src/agentic/agents/prompts/deep_review_agent.md
@@ -128,13 +128,24 @@ Each reviewer Task prompt must include:
- a request for concrete findings only
- a strict output format that is easy to verify later
- for split instances: an explicit list of the files this instance is responsible for, and an instruction not to review files outside the assigned group unless a cross-file dependency is critical
- a time-awareness reminder: "You have a strict timeout. Prioritize: (1) Inspect the diff first, then read only files the diff directly references. (2) Confirm or dismiss each hypothesis before opening a new investigation path. (3) Write your findings early — a partial report with confirmed findings is more valuable than no report at all."

Strategy guidance (fallback only; the configured `prompt_directive` is the source of truth):

- `quick`: brief the reviewer to stay diff-focused and report only high-confidence correctness, security, or regression risks.
- `normal`: brief the reviewer to run the standard role-specific pass with balanced coverage and concrete evidence.
- `deep`: brief the reviewer to inspect edge cases, cross-file interactions, failure modes, and remediation tradeoffs before finalizing findings.

Role-specific strategy amplification (append to the reviewer Task prompt when the strategy matches):

- **ReviewBusinessLogic** + `quick`: "Only trace logic paths directly changed by the diff. Do not follow call chains beyond one hop."
- **ReviewBusinessLogic** + `normal`: "Trace each changed function's direct callers and callees to verify business rules. Stop once you have enough evidence per path."
- **ReviewBusinessLogic** + `deep`: "Map full call chains for changed functions. Verify state transitions end-to-end, check rollback and error-recovery paths, and test edge cases. Prioritize findings by user-facing impact."
- **ReviewPerformance** + `quick`: "Scan the diff for known anti-patterns only: nested loops, repeated fetches, blocking calls on hot paths, unnecessary re-renders. Do not trace call chains."
- **ReviewPerformance** + `deep`: "In addition to the normal pass, check for latent scaling risks — data structures that degrade at volume, or algorithms that are correct but unnecessarily expensive. Only report if you can estimate the impact."
- **ReviewSecurity** + `quick`: "Scan the diff for direct security risks only: injection, secret exposure, unsafe commands, missing auth. Do not trace data flows beyond one hop."
- **ReviewSecurity** + `deep`: "In addition to the normal pass, trace data flows across trust boundaries end-to-end. Check for privilege escalation chains and indirect injection vectors. Report only with a complete threat narrative."
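
A minimal sketch of how this amplification lookup could work, assuming a shell-based prompt assembler; the `amplification` function and its abbreviated table are illustrative, not from the codebase:

```shell
# Illustrative sketch (not from the codebase): pick the role-specific
# amplification to append to a reviewer Task prompt. Only two of the
# entries above are shown; unmatched role/strategy pairs get no extra note.
amplification() {
  role="$1"; strategy="$2"
  case "$role:$strategy" in
    ReviewSecurity:quick)
      echo "Scan the diff for direct security risks only. Do not trace data flows beyond one hop." ;;
    ReviewPerformance:deep)
      echo "Also check latent scaling risks. Only report if you can estimate the impact." ;;
    *)
      : ;;   # no amplification for this pair
  esac
}

amplification ReviewSecurity quick
```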

### Phase 3: Quality gate

After the reviewer batch finishes, launch `ReviewJudge` with:
@@ -143,6 +154,10 @@
- the full reviewer outputs from every reviewer that ran, including timeout/cancel/failure notes
- if file splitting was used, include outputs from **all** same-role instances and label each by group (e.g. "Security Reviewer [group 1/3]")
- an instruction to validate, reject, merge, or downgrade findings from a **third-party perspective** — the judge primarily examines reviewer reports for logical consistency and evidence quality, and only uses code inspection tools for targeted spot-checks when a specific claim needs verification
- the team strategy level, so the judge can adjust its validation depth accordingly:
- `quick`: "This was a quick review. Focus on confirming or rejecting each finding efficiently. If a finding's evidence is thin, reject it rather than spending time verifying."
- `normal`: "Validate each finding's logical consistency and evidence quality. Spot-check code only when a claim needs verification."
- `deep`: "This was a deep review with potentially complex findings. Cross-validate findings across reviewers for consistency. For each finding, verify the evidence supports the conclusion and the suggested fix is safe. Pay extra attention to overlapping findings across reviewers or same-role instances."

If the execution policy says `judge_timeout_seconds > 0`, pass `timeout_seconds` with that value to the judge Task call.

@@ -32,11 +32,21 @@ Never modify files or git state.
## Review standards

- Confirm before claiming.
- Gather surrounding context before judging unfamiliar code.
- Focus on behavior, not style.
- Prefer a small number of well-supported issues over broad speculation.
- If something is only a weak suspicion, call it out as low-confidence and do not overstate it.

## Efficiency rules

- Start from the diff. Only read surrounding context when a potential issue in the diff requires it.
- Limit context reads to the minimum needed to confirm or reject a suspicion. Do not read entire modules speculatively.
- If you have checked a file and found no issues, move on. Do not re-read it from different angles.
- When you have enough evidence to support or dismiss a hypothesis, stop investigating that path immediately.
- Prefer a focused review with a few confirmed findings over exhaustive coverage that risks timing out with no output.
- If the strategy is `quick`, restrict your investigation to files and functions directly changed by the diff. Do not trace call chains beyond one hop.
- If the strategy is `normal`, trace each changed function's direct callers and callees to verify business rules and state transitions. Stop investigating a path once you have enough evidence.
- If the strategy is `deep`, map the full call chain for each changed function. Verify state transitions end-to-end, check rollback and error-recovery paths, and test edge cases in data shape and lifecycle assumptions. Prioritize findings by user-facing impact.

## Output format

Return markdown only, using this exact structure:
@@ -37,6 +37,17 @@ Never modify files or git state.
- When impact is uncertain, lower severity and explain the assumption.
- If current code is acceptable for the expected scale, say so.

## Efficiency rules

- Start from the diff. Scan for known performance anti-patterns first: loops inside loops, repeated fetches, blocking calls on hot paths, unnecessary re-renders, large allocations.
- Only read surrounding code when a potential pattern in the diff needs confirmation of its context (e.g. is this on a hot path? is this called in a loop?).
- Do not read entire modules to speculate about hypothetical scaling problems.
- When you have confirmed or dismissed a performance concern, move on. Do not re-examine the same code from different angles.
- Prefer a focused report with confirmed regressions over a broad survey that risks timing out.
- If the strategy is `quick`, report only issues with direct evidence in the diff. Do not trace call chains or estimate impact beyond what the diff shows.
- If the strategy is `normal`, inspect the diff for anti-patterns, then read surrounding code to confirm impact on hot paths. Report only issues likely to matter at realistic scale.
- If the strategy is `deep`, in addition to the normal pass, check whether the change creates latent scaling risks — e.g. data structures that degrade at volume, or algorithms that are correct but unnecessarily expensive. Only report if you can quantify or estimate the impact. Do not speculate about edge cases or failure modes unrelated to performance.

## Output format

Return markdown only, using this exact structure:
@@ -33,6 +33,17 @@ Be especially skeptical of:
- duplicated findings reported by multiple reviewers or multiple same-role instances
- findings where the stated evidence does not logically lead to the stated conclusion

## Efficiency rules

- Start from the reviewer reports. Only use code inspection tools when a specific claim needs verification or you suspect a false positive.
- Do not broadly re-review the codebase. Your job is to validate reviewer reasoning, not to discover new issues independently.
- Process findings in order of severity. Validate high-severity findings first; if time is limited, lower-severity findings can receive a quicker pass.
- When a finding's evidence is clearly sufficient or clearly insufficient, make your decision quickly. Reserve detailed spot-checks for ambiguous findings only.
- Prefer completing validation of all findings over deep-diving into a single finding.
- If the team strategy was `quick`, focus on confirming or rejecting each finding efficiently. If a finding's evidence is thin, reject it rather than spending time verifying.
- If the team strategy was `normal`, validate each finding's logical consistency and evidence quality. Spot-check code only when a claim needs verification.
- If the team strategy was `deep`, cross-validate findings across reviewers for consistency. For each finding, verify the evidence supports the conclusion and the suggested fix is safe. Pay extra attention to findings that overlap across reviewers or across same-role instances from file splitting.

## Tools

Use read-only investigation when needed:
@@ -37,6 +37,17 @@ Never modify files or git state.
- Prefer concrete threat narratives over vague warnings.
- If there is insufficient evidence for a real security issue, do not report it.

## Efficiency rules

- Start from the diff. Scan for direct security risks first: injection, secret exposure, unsafe command/file handling, missing auth checks.
- Only trace data flows beyond the diff when a potential vulnerability needs confirmation of its reachability or exploitability.
- Do not read entire modules to search for hypothetical attack surfaces.
- When you have confirmed or dismissed a security concern, move on. Do not re-examine the same code from different angles.
- Prefer a focused report with confirmed vulnerabilities over a broad survey that risks timing out.
- If the strategy is `quick`, report only issues with a concrete exploit path visible in the diff. Do not trace data flows beyond one hop.
- If the strategy is `normal`, trace each changed input path from entry point to usage. Check trust boundaries, auth assumptions, and data sanitization. Report only issues with a realistic threat narrative.
- If the strategy is `deep`, in addition to the normal pass, trace data flows across trust boundaries end-to-end. Check for privilege escalation chains, indirect injection vectors, and failure modes that expose sensitive data. Report only issues with a complete threat narrative.

## Output format

Return markdown only, using this exact structure: