Adding tests for diverse_search, diverse_priority_queue, and a small transient suite. by JordanMaples · Pull Request #972 · microsoft/DiskANN

JordanMaples · 2026-04-22T22:10:46Z

I had Copilot take a look at some of the test coverage numbers and write up some unit tests to address some
gaps.

diverse_search was only gaining test coverage in diskann-providers, so unit tests were added directly in
diskann. The test in providers will remain as it is an integration test and requires tight coupling with inmem.
Added some transient error tests that were gaps highlighted following the implementation of the flaky tests
added in Migrate test_flaky_consolidate from diskann_providers to diskann #942.
Added some extra validation in diverse_priority_queue.

- Amend test_insert_inner_queue_full to verify Case 3 local queue cleanup: assert evicted item is removed from its attribute's local queue, not just from the global queue. - Add test_post_process_empty_queue: confirm post_process is a no-op on an empty queue without panicking. - Add test_post_process_no_trimming_needed: confirm post_process leaves queues unchanged when all local queues are already within diverse_results_k. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Enable flaky search_accessor in test provider Strategy so transient_ids are respected during search (matching existing prune_accessor behavior). - Add search_tolerates_transient_errors: exercises allow_transient in ExpandBeam during beam expansion with flaky get_element calls. - Add insert_tolerates_transient_prune_errors: exercises transient error handling during robust_prune_list when building an index one-by-one. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add 6 tests exercising Diverse search through DiskANNIndex::search using the in-memory test provider and synthetic Grid data: - diverse_params_accessors: verify Diverse/DiverseSearchParams getters - diverse_search_returns_results: basic search produces sorted distances - diverse_search_produces_multiple_labels_2d: results span multiple labels - diverse_search_produces_multiple_labels_3d: same for 3D grid - diverse_search_limits_per_label_count: diversity reduces per-label concentration - diverse_search_sorted_distances_various_params: sorted distances across configs Tests are feature-gated behind experimental_diversity_search. Note: strict per-attribute cardinality (diverse_results_k) is tested at the queue level in diverse_priority_queue_test. Search-level tests verify end-to-end behavior because NeighborPriorityQueue::remove can leave orphaned items in the global queue during Case 2 eviction, weakening the strict guarantee at the search integration level. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Beam width variations (1, 2, 4) mirroring grid_search patterns - Empty attribute provider: all get() → None yields 0 results - diverse_results_l = 0 (div_k=0): documents panic via #[should_panic] - diverse_results_l = 1: tightest non-degenerate local queue config - diverse_results_k > k: over-provisioned diversity (no-op post_process) - result_count <= k invariant across many (k, l, div_k) combos - Refactored run_diverse_search to support optional beam_width param Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR expands unit test coverage in diskann for experimental diverse search and for transient (non-critical) error handling paths, and adjusts the in-memory test provider strategy to optionally inject transient accessor failures during search.

Changes:

Added/expanded tests for DiverseNeighborQueue post-processing and eviction bookkeeping.
Added a new transient error test suite covering search beam-expansion and insert-time pruning tolerance.
Added crate-level diverse search tests (behind experimental_diversity_search) and enabled flaky accessors in the test provider strategy.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
diskann/src/neighbor/diverse_priority_queue.rs	Adds stronger assertions and new post_process edge-case tests for the diverse priority queue.
diskann/src/graph/test/provider.rs	Enables transient/flaky accessor injection via `Strategy::with_transient` for search/prune accessors.
diskann/src/graph/test/cases/transient_errors.rs	New tests ensuring transient accessor failures are tolerated during search and insert pruning.
diskann/src/graph/test/cases/mod.rs	Registers the new test modules (diverse_search behind feature flag; transient_errors always).
diskann/src/graph/test/cases/diverse_search.rs	New crate-level tests exercising diverse search behavior and parameter accessors.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-22T22:15:57Z

+// ─── diverse_results_l boundary: div_l = 0 panics ────────────────────────────
+
+#[test]
+#[should_panic]
+fn diverse_search_div_l_zero_panics() {
+    // When diverse_results_k * l_value / k_value rounds to 0, local queues
+    // have capacity 0. The first insert with an attribute triggers Case 2
+    // (local queue "full" at size 0 == capacity 0), which indexes at
+    // diverse_results_l - 1 = usize::MAX. This is an unchecked edge case.
+    let (index, attr) = setup_diverse_grid(Grid::Two, 4, 3);
+
+    let query = vec![0.0f32; 2];
+    // k=10, l=5, div_k=1 → diverse_results_l = 1 * 5 / 10 = 0
+    let _result = run_diverse_search(&index, &query, 10, 10, 0, attr);


diverse_search_div_l_zero_panics is currently driven by diverse_results_k = 0 (run_diverse_search(..., 10, 10, 0, ...)), not by the documented rounding case (diverse_results_k * l / k == 0 with diverse_results_k > 0). This makes the test misleading and effectively encodes “invalid input causes an internal underflow panic”. Prefer adding/using input validation (e.g., reject diverse_results_k == 0 with a clear error) and updating this test to assert the validation behavior rather than #[should_panic] on an internal implementation detail.

Suggested change

// ─── diverse_results_l boundary: div_l = 0 panics ────────────────────────────

#[test]

#[should_panic]

fn diverse_search_div_l_zero_panics() {

// When diverse_results_k * l_value / k_value rounds to 0, local queues

// have capacity 0. The first insert with an attribute triggers Case 2

// (local queue "full" at size 0 == capacity 0), which indexes at

// diverse_results_l - 1 = usize::MAX. This is an unchecked edge case.

let (index, attr) = setup_diverse_grid(Grid::Two, 4, 3);

let query = vec![0.0f32; 2];

// k=10, l=5, div_k=1 → diverse_results_l = 1 * 5 / 10 = 0

let _result = run_diverse_search(&index, &query, 10, 10, 0, attr);

// ─── Input validation: diverse_results_k must be non-zero ────────────────────

#[test]

fn diverse_search_rejects_zero_diverse_results_k() {

let (index, attr) = setup_diverse_grid(Grid::Two, 4, 3);

let query = vec![0.0f32; 2];

let panic = std::panic::catch_unwind(|| {

let _result = run_diverse_search(&index, &query, 10, 10, 0, attr);

})

.expect_err("diverse_results_k == 0 should be rejected by input validation");

let panic_message = if let Some(message) = panic.downcast_ref::<&str>() {

*message

} else if let Some(message) = panic.downcast_ref::<String>() {

message.as_str()

} else {

panic!("validation panic should use a string message");

};

assert!(

panic_message.contains("diverse_results_k")

&& (panic_message.contains("non-zero") || panic_message.contains("zero")),

"unexpected panic message for invalid diverse_results_k: {panic_message}",

);

This one is a pretty valid comment. it seems to me that diverse_results_k should be a NonZeroUsize, it's not really clear to me why we'd allow 0 as a value.

- Fix inaccurate comment about transient IDs in prune test - Relax over-constrained assertion in diverse queue eviction test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

codecov-commenter · 2026-04-22T22:38:50Z

Codecov Report

❌ Patch coverage is 97.46835% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.44%. Comparing base (56e3b7f) to head (912b605).

Files with missing lines	Patch %	Lines
diskann/src/graph/test/cases/transient_errors.rs	97.36%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #972      +/-   ##
==========================================
- Coverage   90.58%   89.44%   -1.15%     
==========================================
  Files         449      450       +1     
  Lines       83779    83857      +78     
==========================================
- Hits        75891    75002     -889     
- Misses       7888     8855     +967

Flag	Coverage Δ
miri	`89.44% <97.46%> (-1.15%)`	⬇️
unittests	`89.28% <97.46%> (-1.27%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
diskann/src/graph/test/provider.rs	`94.41% <100.00%> (+0.01%)`	⬆️
diskann/src/graph/test/cases/transient_errors.rs	`97.36% <97.36%> (ø)`

... and 38 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

JordanMaples and others added 5 commits April 22, 2026 14:18

fmt

83b0124

JordanMaples requested review from a team and Copilot April 22, 2026 22:10

Copilot started reviewing on behalf of JordanMaples April 22, 2026 22:11 View session

Copilot AI reviewed Apr 22, 2026

View reviewed changes

Address PR review comments (#972)

912b605

- Fix inaccurate comment about transient IDs in prune test - Relax over-constrained assertion in diverse queue eviction test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding tests for diverse_search, diverse_priority_queue, and a small transient suite.#972

Adding tests for diverse_search, diverse_priority_queue, and a small transient suite.#972
JordanMaples wants to merge 6 commits intomainfrom
jordanmaples/misc_remaining_tests

JordanMaples commented Apr 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI Apr 22, 2026

Uh oh!

JordanMaples Apr 22, 2026

Uh oh!

codecov-commenter commented Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

JordanMaples commented Apr 22, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

JordanMaples Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Apr 22, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants