Skip to content

fix(uffs-core): drop defensive Vec<u16> clone in path-only / path-sorted top-N hot path (Phase 6c, refs #193)#282

Merged
githubrobbi merged 1 commit into
mainfrom
chore/phase-6c-clone-taxonomy
May 19, 2026
Merged

fix(uffs-core): drop defensive Vec<u16> clone in path-only / path-sorted top-N hot path (Phase 6c, refs #193)#282
githubrobbi merged 1 commit into
mainfrom
chore/phase-6c-clone-taxonomy

Conversation

@githubrobbi
Copy link
Copy Markdown
Collaborator

Phase 6 sub-phase 6c deliverable per phase_6_ownership_borrowing_allocation_implementation_plan.md §5c + §3.2 (local plan, gitignored under /docs/dev/).

What this changes

Two surgical fixes inside the per-drive top-N execution path of uffs-core::search:

Site File Refactor
path_only_top_n inner-loop ext-id iteration crates/uffs-core/src/search/query/path_only_top_n.rs:516 Drop search_filters.resolved_ext_ids.clone() → borrow immutably
path_sorted_top_n inner-loop ext-id iteration crates/uffs-core/src/search/query/path_sorted_top_n.rs:252 Drop search_filters.resolved_ext_ids.clone() → borrow immutably

Both sites had a defensive .clone() justified by an inline comment anticipating a future re-borrow of search_filters inside the loop body. That re-borrow never landed — the inner-loop body reads only drive.*, hide_system, hide_ads, and writes to local candidates, never touching search_filters. The immutable borrow of resolved_ext_ids is naturally released when the inner loop ends, so the next outer iteration's call to resolve_ext_ids_for_drive (which needs &mut search_filters) compiles without conflict.

Why this is the right shape

Phase 6c — the clone taxonomy audit — classifies all 253 prod .clone() sites by:

  • α — Arc clone for actor-model task dispatch (KEEP — Cargo.toml [workspace.lints.clippy] clone_on_ref_ptr = "deny" already enforces the explicit Arc::clone(&x) form)
  • β — genuine ownership-fence (KEEP — caller has &T, API needs T)
  • γ — error / log context carry-along (KEEP — error paths are cold)
  • δ — hot-path anti-pattern (FIX — restructure ownership)
  • ε#[cfg(test)] only (out of scope)

These two sites are the only cat-δ candidates surfaced by the audit that are surgically fixable without an API redesign. Two larger cat-δ opportunities are documented in the findings doc as follow-up phases:

  • Future-AParsedColumns::push_record(_expanded) taking &ParsedRecord forces a String clone per record in the MFT parse hot path. Refactor to consume by value would change a pub signature; deferred to a dedicated sub-phase gated on the §6g bench numbers.
  • Future-BFastResolver::resolve_cached returns String; migrating the path cache to Arc<str> would replace per-cache-hit String clones with refcount bumps. API-level change; deferred.

The remaining 251 sites are legitimate by construction and documented in the per-crate breakdown of the (gitignored) findings doc at docs/dev/baseline/2026-05-12/phase_6_clone_taxonomy_findings.md.

Verification

  • just lint-pre-push ✅ (89s): every gate green — fmt, file-size, gates-drift, hooks-drift, workflow-drift, fast-drift, manifest-drift, commit-subjects, vet, vet-audit-discipline, machete, typos, reuse, cargo-check, lint-ci, lint-prod, lint-tests, rustdoc, doc-tests, tests, smoke, deny, lint-ci-windows.
  • cargo nextest run -p uffs-core ✅ 817 passed, 3 skipped, 0 failed — same numbers as pre-fix.
  • Search-suite coverage: 374 search-tests pass; the affected paths are exercised by parallel_drive_scan_merges_global_top_n_across_drives, parallel_drive_scan_respects_limit_smaller_than_per_drive_count, match_all_explicit_limit_caps_results, unlimited_returns_more_than_capped, match_all_none_limit_is_unlimited, top_n_sort_by_system_flag.
  • Strict-clippy posture unchanged. No new #[allow] / #[expect] introduced; cargo clippy --workspace --all-targets --message-format=json emits zero diagnostics for the 9 clone-family lints across the workspace.

Behavior & contracts

  • Both affected functions are pub(crate) (select_path_only_top_n / select_path_sorted_top_n). Zero public-API surface change.
  • Observable behavior — top-N rows, ordering, candidate selection, filter precedence — is unchanged. Only the per-query allocation count drops by one Vec<u16> per participating drive (~7 small allocs per query in typical 7-drive configurations).
  • Comment text at each site is replaced with a multi-line block documenting the new (correct) borrow invariant and cross-referencing Phase 6c so the reason-text survives future code review.

Allocation savings

Per top-N query with extension filtering active:

  • Before: N_drives × Vec<u16>::with_capacity(N_ext_ids) allocations per query (7 small allocs in the typical 7-drive case)
  • After: zero — the per-drive Vec<u16> lives in search_filters.resolved_ext_ids and is borrowed immutably

The per-query overhead is small but real, and the Vec was long enough lived (the full inner-loop scan duration) to defeat the small-block allocator's fast path. §6g bench refresh will quantify the wall-clock delta.

Strict-lint posture

  • No suppression hacks.
  • Both edits are 8-line comment blocks + a single-character source change (drop .clone()).
  • No new #[allow] / #[expect] reasons added anywhere.

refs #193

…ted top-N hot path (Phase 6c)

Remove the `search_filters.resolved_ext_ids.clone()` allocation at the
top of the per-drive inner loop in both `path_only_top_n` and
`path_sorted_top_n`.  The comment anticipated a re-borrow of
`search_filters` inside the loop that never landed — the inner-loop
body reads only `drive.*`, `hide_system`, `hide_ads`, and writes to
local `candidates`, never touching `search_filters`.  The immutable
borrow of `resolved_ext_ids` is released when the inner loop ends,
freeing the next outer iteration's call to
`resolve_ext_ids_for_drive` (which needs `&mut search_filters`).

Phase 6 sub-phase 6c — category-δ (hot-path anti-pattern) per
`docs/dev/architecture/code_clean/phase_6_ownership_borrowing_allocation_implementation_plan.md`
§3.2.  The two sites surface in the per-query top-N execution path
(targeted-query hot path per plan §1.2 — `uffs-core::search` HIGH
priority).

Allocation saved: one `Vec<u16>` per drive per top-N query.  In
typical configurations (1-10 resolved extension IDs across 7 drives)
that is 7 small allocs per query, ~80-280 bytes total.  Per-query
overhead is small but real, and the `Vec` lives long enough to
defeat the small-block allocator's fast path.

Verification
------------

- `cargo check -p uffs-core` ✅
- `cargo clippy -p uffs-core --all-targets` ✅ (strict-lint posture
  unchanged; no new `#[allow]` / `#[expect]`)
- `cargo nextest run -p uffs-core` ✅ (817/817 tests pass, 3 skipped
  — same numbers as pre-fix)
- The per-drive top-N tests (`parallel_drive_scan_merges_global_top_n_across_drives`,
  `parallel_drive_scan_respects_limit_smaller_than_per_drive_count`,
  `match_all_explicit_limit_caps_results`,
  `unlimited_returns_more_than_capped`,
  `match_all_none_limit_is_unlimited`,
  `top_n_sort_by_system_flag`) exercise this path and all pass.

Strict-lint posture
-------------------

No suppression hacks.  The borrow checker accepts the change as-is
because the inner loop body genuinely does not touch `search_filters`.
The comment is updated to document the new (correct) invariant and
cross-reference the Phase 6c sub-phase.

Behavior & contracts
--------------------

No public API change; both functions are `pub(crate)` (`select_path_only_top_n` / `select_path_sorted_top_n`).  Observable
behavior (top-N rows, ordering, candidate selection) is unchanged —
only the per-query allocation count drops by one `Vec<u16>` per
participating drive.

refs #193
@githubrobbi githubrobbi merged commit 82a368e into main May 19, 2026
20 checks passed
@githubrobbi githubrobbi deleted the chore/phase-6c-clone-taxonomy branch May 19, 2026 14:24
githubrobbi added a commit that referenced this pull request May 19, 2026
…s-link (Phase 6f, refs #193) (#285)

Adds the workspace allocation contract as a standalone document at
`docs/architecture/code-quality/allocation_policy.md` and a §Allocation
policy section in `CONTRIBUTING.md` summarising the rule and linking
to the new doc.

Mirrors the Phase 5e `panic_policy.md` pattern (PR #278): a single
contract page captures the rule, the five-category decision tree
(α / β / γ / δ / ε), the lint posture, the per-site annotation
contract, the audit cadence, the workspace cross-references, and the
append-only decisions log.

Phase 6 sub-phase 6f deliverable per the phase-6 ownership /
borrowing / allocation plan §5f.

Contents
--------

The new `allocation_policy.md` documents:

1. The one-line rule: hot paths never allocate defensively; cold
   paths may; every prod `.clone()` / `format!()` / `to_owned()`
   must fit one of five blessed categories.
2. The five categories with patterns, verdicts, and worked
   examples from the workspace:
   - **α — Arc clone**:    KEEP (explicit `Arc::clone(&x)` form)
   - **β — Ownership fence**: KEEP (caller has `&T`, API needs `T`)
   - **γ — Error / log context**: KEEP (cold path)
   - **δ — Hot-path anti-pattern**: FIX (refactor, never suppress)
   - **ε — Test helper**:   KEEP (out of scope)
3. The lint posture: 5 clone-family lints at `deny` level
   (`redundant_clone`, `clone_on_ref_ptr`, `cloned_instead_of_copied`,
   `inefficient_to_string`, `unnecessary_to_owned`) plus the 7
   companion lints at `warn`.
4. The per-site annotation contract: every prod allocation site
   must satisfy one of three justification shapes (self-evident
   α, β/γ reason text, or δ refactor comment).
5. The audit cadence: `bash scripts/dev/clone_alloc_audit.sh` (the
   helper shipped in PR #281) produces the workspace inventory;
   absolute counts must not regress.
6. Workspace cross-references between this doc, `panic_policy.md`,
   `lint-posture.md`, `Cargo.toml [workspace.lints]`, and
   `clippy.toml`.
7. Append-only decisions log capturing every cat-δ refactor that
   has landed in Phase 6 so future contributors can trace the
   policy's evolution.

`CONTRIBUTING.md` gains a §Allocation policy section directly below
the existing §Panic policy section, summarising the rule + five
categories in roughly 200 words and linking to the full doc.

Verification
------------

- `just lint-pre-push` ✅ (95s): every gate green (fmt, file-size,
  gates-drift, hooks-drift, workflow-drift, fast-drift, manifest-drift,
  commit-subjects, vet, vet-audit-discipline, machete, typos, reuse,
  cargo-check, lint-ci, lint-prod, lint-tests, rustdoc, doc-tests,
  tests, smoke, deny, lint-ci-windows).
- Markdown links validated by the `typos` + `reuse` gates plus a
  manual check against `docs/architecture/code-quality/` (existing
  `panic_policy.md` and `lint-posture.md` cross-links resolve).
- Doc structure mirrors `panic_policy.md` 1:1 so contributors who
  know one know the other.

Behavior & contracts
--------------------

- No code change.  Documentation-only.
- The lint posture described in §2 is already in force as of
  Phase 6a's `Cargo.toml` (5 deny-level + 7 warn-level
  clone-family lints).  This commit codifies the *contract* behind
  the lints, not their values.
- The decisions log in §7 records PRs #281 / #282 / #283 / #284 as
  the cat-δ refactors that have landed; future entries follow the
  same append-only convention as `panic_policy.md` §7.

Strict-lint posture
-------------------

No code change; lint posture unchanged.

refs #193
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant