Adding tests for diverse_search, diverse_priority_queue, and a small transient suite.#972
Adding tests for diverse_search, diverse_priority_queue, and a small transient suite.#972JordanMaples wants to merge 6 commits intomainfrom
Conversation
- Amend test_insert_inner_queue_full to verify Case 3 local queue cleanup: assert evicted item is removed from its attribute's local queue, not just from the global queue. - Add test_post_process_empty_queue: confirm post_process is a no-op on an empty queue without panicking. - Add test_post_process_no_trimming_needed: confirm post_process leaves queues unchanged when all local queues are already within diverse_results_k. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Enable flaky search_accessor in test provider Strategy so transient_ids are respected during search (matching existing prune_accessor behavior). - Add search_tolerates_transient_errors: exercises allow_transient in ExpandBeam during beam expansion with flaky get_element calls. - Add insert_tolerates_transient_prune_errors: exercises transient error handling during robust_prune_list when building an index one-by-one. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add 6 tests exercising Diverse search through DiskANNIndex::search using the in-memory test provider and synthetic Grid data: - diverse_params_accessors: verify Diverse/DiverseSearchParams getters - diverse_search_returns_results: basic search produces sorted distances - diverse_search_produces_multiple_labels_2d: results span multiple labels - diverse_search_produces_multiple_labels_3d: same for 3D grid - diverse_search_limits_per_label_count: diversity reduces per-label concentration - diverse_search_sorted_distances_various_params: sorted distances across configs Tests are feature-gated behind experimental_diversity_search. Note: strict per-attribute cardinality (diverse_results_k) is tested at the queue level in diverse_priority_queue_test. Search-level tests verify end-to-end behavior because NeighborPriorityQueue::remove can leave orphaned items in the global queue during Case 2 eviction, weakening the strict guarantee at the search integration level. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Beam width variations (1, 2, 4) mirroring grid_search patterns - Empty attribute provider: all get() → None yields 0 results - diverse_results_l = 0 (div_k=0): documents panic via #[should_panic] - diverse_results_l = 1: tightest non-degenerate local queue config - diverse_results_k > k: over-provisioned diversity (no-op post_process) - result_count <= k invariant across many (k, l, div_k) combos - Refactored run_diverse_search to support optional beam_width param Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR expands unit test coverage in diskann for experimental diverse search and for transient (non-critical) error handling paths, and adjusts the in-memory test provider strategy to optionally inject transient accessor failures during search.
Changes:
- Added/expanded tests for
DiverseNeighborQueuepost-processing and eviction bookkeeping. - Added a new transient error test suite covering search beam-expansion and insert-time pruning tolerance.
- Added crate-level diverse search tests (behind
experimental_diversity_search) and enabled flaky accessors in the test provider strategy.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann/src/neighbor/diverse_priority_queue.rs | Adds stronger assertions and new post_process edge-case tests for the diverse priority queue. |
| diskann/src/graph/test/provider.rs | Enables transient/flaky accessor injection via Strategy::with_transient for search/prune accessors. |
| diskann/src/graph/test/cases/transient_errors.rs | New tests ensuring transient accessor failures are tolerated during search and insert pruning. |
| diskann/src/graph/test/cases/mod.rs | Registers the new test modules (diverse_search behind feature flag; transient_errors always). |
| diskann/src/graph/test/cases/diverse_search.rs | New crate-level tests exercising diverse search behavior and parameter accessors. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // ─── diverse_results_l boundary: div_l = 0 panics ──────────────────────────── | ||
|
|
||
| #[test] | ||
| #[should_panic] | ||
| fn diverse_search_div_l_zero_panics() { | ||
| // When diverse_results_k * l_value / k_value rounds to 0, local queues | ||
| // have capacity 0. The first insert with an attribute triggers Case 2 | ||
| // (local queue "full" at size 0 == capacity 0), which indexes at | ||
| // diverse_results_l - 1 = usize::MAX. This is an unchecked edge case. | ||
| let (index, attr) = setup_diverse_grid(Grid::Two, 4, 3); | ||
|
|
||
| let query = vec![0.0f32; 2]; | ||
| // k=10, l=5, div_k=1 → diverse_results_l = 1 * 5 / 10 = 0 | ||
| let _result = run_diverse_search(&index, &query, 10, 10, 0, attr); |
There was a problem hiding this comment.
diverse_search_div_l_zero_panics is currently driven by diverse_results_k = 0 (run_diverse_search(..., 10, 10, 0, ...)), not by the documented rounding case (diverse_results_k * l / k == 0 with diverse_results_k > 0). This makes the test misleading and effectively encodes “invalid input causes an internal underflow panic”. Prefer adding/using input validation (e.g., reject diverse_results_k == 0 with a clear error) and updating this test to assert the validation behavior rather than #[should_panic] on an internal implementation detail.
| // ─── diverse_results_l boundary: div_l = 0 panics ──────────────────────────── | |
| #[test] | |
| #[should_panic] | |
| fn diverse_search_div_l_zero_panics() { | |
| // When diverse_results_k * l_value / k_value rounds to 0, local queues | |
| // have capacity 0. The first insert with an attribute triggers Case 2 | |
| // (local queue "full" at size 0 == capacity 0), which indexes at | |
| // diverse_results_l - 1 = usize::MAX. This is an unchecked edge case. | |
| let (index, attr) = setup_diverse_grid(Grid::Two, 4, 3); | |
| let query = vec![0.0f32; 2]; | |
| // k=10, l=5, div_k=1 → diverse_results_l = 1 * 5 / 10 = 0 | |
| let _result = run_diverse_search(&index, &query, 10, 10, 0, attr); | |
| // ─── Input validation: diverse_results_k must be non-zero ──────────────────── | |
| #[test] | |
| fn diverse_search_rejects_zero_diverse_results_k() { | |
| let (index, attr) = setup_diverse_grid(Grid::Two, 4, 3); | |
| let query = vec![0.0f32; 2]; | |
| let panic = std::panic::catch_unwind(|| { | |
| let _result = run_diverse_search(&index, &query, 10, 10, 0, attr); | |
| }) | |
| .expect_err("diverse_results_k == 0 should be rejected by input validation"); | |
| let panic_message = if let Some(message) = panic.downcast_ref::<&str>() { | |
| *message | |
| } else if let Some(message) = panic.downcast_ref::<String>() { | |
| message.as_str() | |
| } else { | |
| panic!("validation panic should use a string message"); | |
| }; | |
| assert!( | |
| panic_message.contains("diverse_results_k") | |
| && (panic_message.contains("non-zero") || panic_message.contains("zero")), | |
| "unexpected panic message for invalid diverse_results_k: {panic_message}", | |
| ); |
There was a problem hiding this comment.
This one is a pretty valid comment. it seems to me that diverse_results_k should be a NonZeroUsize, it's not really clear to me why we'd allow 0 as a value.
- Fix inaccurate comment about transient IDs in prune test - Relax over-constrained assertion in diverse queue eviction test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #972 +/- ##
==========================================
- Coverage 90.58% 89.44% -1.15%
==========================================
Files 449 450 +1
Lines 83779 83857 +78
==========================================
- Hits 75891 75002 -889
- Misses 7888 8855 +967
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
I had Copilot take a look at some of the test coverage numbers and write up some unit tests to address some
gaps.
diskann. The test in providers will remain as it is an integration test and requires tight coupling with inmem.
added in Migrate test_flaky_consolidate from diskann_providers to diskann #942.