feat(test): add WHERE-prefix view infrastructure for cross-command testing (#5505) #5508
…sting

Introduces the foundational infrastructure proposed in RFC opensearch-project#5505 to catch bugs where PPL commands break when preceded by a `where` filter. This is a common gap: individual-command tests pass but combined queries fail.

## What changed

### New extended test indices

- `accounts_extended.json` / `bank_extended.json`: copies of the top two test datasets with `"_is_real":true` on every real record plus 3 synthetic records with `"_is_real":false` that should never appear in results
- Corresponding index mapping files that add a `_is_real` boolean field alongside the existing schema

### Infrastructure additions

- `TestsConstants`: new `TEST_INDEX_ACCOUNT_EXTENDED` and `TEST_INDEX_BANK_EXTENDED` constants
- `TestUtils`: `getAccountExtendedIndexMapping()` and `getBankExtendedIndexMapping()` mapping helpers
- `SQLIntegTestCase.Index`: `ACCOUNT_EXTENDED` and `BANK_EXTENDED` enum entries wiring constants → mapping → data file
- `PPLIntegTestCase.sourceView(extendedIndex, query)`: helper that builds the view chain `source=<ext> | where _is_real | fields - _is_real | <query>`

### Proof-of-concept parametrization in `FieldsCommandIT`

`testBasicFieldSelection`, `testMultipleFieldSelection`, and `testSpecialDataTypes` are converted to `@ParameterizedTest` tests that run twice: once directly against the base index, and once through the WHERE-prefix view of the extended index. The assertions are identical because the fake `_is_real=false` rows are filtered out before they reach the user query.

## Why `fields - fieldname` works without Calcite

`hasEnhancedFieldFeatures` (AstBuilder.java) gates only wildcards and space-delimited fields, not the MINUS/exclusion form, so the view chain is safe to use in the default (non-Calcite) test mode.

Signed-off-by: Radhakrishnan Pachyappan <gingeekrishna@gmail.com>
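A minimal sketch of how a helper like `sourceView(extendedIndex, query)` could assemble the view chain quoted in the commit message. The class name `SourceViewSketch`, the index name `accounts_extended`, and the sample `fields firstname, lastname` query are placeholders for illustration; the actual helper lives in `PPLIntegTestCase` and may be written differently.

```java
// Hypothetical stand-in for the helper described above; not the PR's actual code.
final class SourceViewSketch {

  /** Builds: source=<ext> | where _is_real | fields - _is_real | <query> */
  static String sourceView(String extendedIndex, String query) {
    return String.format(
        "source=%s | where _is_real | fields - _is_real | %s", extendedIndex, query);
  }

  public static void main(String[] args) {
    // The index name and user query are assumed values for the demo.
    System.out.println(sourceView("accounts_extended", "fields firstname, lastname"));
  }
}
```

The filter and the exclusion both come before the user query, so the query under test only ever sees real rows with the original schema.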
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds “extended” versions of the Account and Bank test indices (with an `_is_real` flag) and updates PPL integration tests to run the same assertions against both the base indices and a `where _is_real`-prefixed view to better catch multi-command chain issues.
Changes:
- Added extended index mappings and NDJSON datasets for `account` and `bank`.
- Introduced constants and index enum entries for the new extended indices.
- Parameterized `FieldsCommandIT` to run tests against both direct sources and a WHERE-prefixed view that filters out synthetic rows.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| integ-test/src/test/resources/indexDefinitions/bank_extended_index_mapping.json | New mapping for extended bank index including _is_real and extra fields used by extended dataset. |
| integ-test/src/test/resources/indexDefinitions/account_extended_index_mapping.json | New mapping for extended account index including _is_real and fielddata/keyword subfields for test use. |
| integ-test/src/test/resources/bank_extended.json | New NDJSON bulk dataset for the extended bank index with real + synthetic rows. |
| integ-test/src/test/resources/accounts_extended.json | New large NDJSON bulk dataset for the extended account index with real + synthetic rows. |
| integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java | Adds a helper to build a WHERE-prefixed “view” source over an extended index. |
| integ-test/src/test/java/org/opensearch/sql/ppl/FieldsCommandIT.java | Parameterizes tests to run against both base indices and an extended-index filtered view. |
| integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java | Adds constants for extended index names. |
| integ-test/src/test/java/org/opensearch/sql/legacy/TestUtils.java | Adds mapping loaders for the new extended index mapping files. |
| integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java | Registers ACCOUNT_EXTENDED and BANK_EXTENDED in the Index enum for test setup/loading. |
- Fix UTF-8 BOM in accounts_extended.json: PowerShell Out-File added a
BOM (0xEF 0xBB 0xBF) which breaks bulk NDJSON ingestion; regenerated
with UTF8Encoding(false) so the file starts with the literal {
- Deduplicate @MethodSource providers: basicFieldSelectionSources and
multipleFieldSelectionSources were identical; replaced both with a
shared accountIndexSources() stream; added bankIndexSources() for bank
tests; both reuse the sourceView() helper
- Remove unused label parameter: dropped the label string from
Arguments.of() and test method signatures; the display name now shows
the full query source string via @ParameterizedTest(name="querySource={0}")
- Add sourceView(String) overload to PPLIntegTestCase: the single-arg
form returns the view prefix without a trailing pipe so callers can
append commands with " | <command>"; the two-arg form delegates to it
Signed-off-by: Radhakrishnan Pachyappan <gingeekrishna@gmail.com>
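The BOM fix in the first bullet can be guarded against with a small check. The sketch below is an illustration only, not code from this PR: `BomCheckSketch` and its methods are hypothetical names, and the sample bulk line is an assumed stand-in for the first line of an NDJSON file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the PR: detects and strips a UTF-8 BOM.
final class BomCheckSketch {
  // The three bytes named in the commit message: 0xEF 0xBB 0xBF.
  private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

  /** Returns true when the content begins with the UTF-8 byte order mark. */
  static boolean startsWithBom(byte[] content) {
    return content.length >= UTF8_BOM.length
        && Arrays.equals(Arrays.copyOf(content, UTF8_BOM.length), UTF8_BOM);
  }

  /** Returns the content without a leading BOM; unchanged if none is present. */
  static byte[] stripBom(byte[] content) {
    return startsWithBom(content)
        ? Arrays.copyOfRange(content, UTF8_BOM.length, content.length)
        : content;
  }

  public static void main(String[] args) {
    // Assumed sample: a bulk action line with a BOM prepended.
    byte[] json = "{\"index\":{}}".getBytes(StandardCharsets.UTF_8);
    byte[] withBom = new byte[UTF8_BOM.length + json.length];
    System.arraycopy(UTF8_BOM, 0, withBom, 0, UTF8_BOM.length);
    System.arraycopy(json, 0, withBom, UTF8_BOM.length, json.length);

    System.out.println(startsWithBom(withBom));
    System.out.println(new String(stripBom(withBom), StandardCharsets.UTF_8));
  }
}
```

A check like this could run over the test resources before ingestion, so a regenerated dataset that picks up a BOM again fails fast instead of breaking the bulk load.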
Persistent review updated to latest commit ec626fb
…dex type names
- Add INDEX_VIEWS map and sourceViews(baseIndex) to PPLIntegTestCase so
new views only need to be registered in one place
- Remove unused sourceView(extendedIndex, query) overload
- Update FieldsCommandIT to delegate to sourceViews() instead of
building streams inline
- Fix ACCOUNT_EXTENDED and BANK_EXTENDED type names to avoid collisions
with ACCOUNT ("account") and BANK ("account")
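One way the single-registration idea could look, as a sketch under stated assumptions: the map contents, the index names, the `Stream<String>` return type, and the `source=` prefix on the direct form are all guesses made for illustration. Only the names `INDEX_VIEWS`, `sourceViews(baseIndex)`, and `sourceView(String)` come from the commit messages.

```java
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical stand-in for the registry added to PPLIntegTestCase.
final class SourceViewsSketch {
  // Base index -> extended index. The index names here are assumed values.
  private static final Map<String, String> INDEX_VIEWS =
      Map.of(
          "accounts", "accounts_extended",
          "bank", "bank_extended");

  /** View prefix without a trailing pipe, so callers append " | <command>". */
  static String sourceView(String extendedIndex) {
    return "source=" + extendedIndex + " | where _is_real | fields - _is_real";
  }

  /** Every query source a test should run against for the given base index. */
  static Stream<String> sourceViews(String baseIndex) {
    String direct = "source=" + baseIndex;
    String extended = INDEX_VIEWS.get(baseIndex);
    // Indices without a registered view fall back to the direct source only.
    return extended == null ? Stream.of(direct) : Stream.of(direct, sourceView(extended));
  }

  public static void main(String[] args) {
    sourceViews("bank").forEach(src -> System.out.println(src + " | head 3"));
  }
}
```

With this shape, registering a new view means adding one map entry, and a test's `@MethodSource` provider can simply return `sourceViews(...)` for its base index.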
Thanks for the review @Swiddis! Addressed all four comments in the latest commit:
Persistent review updated to latest commit 8bd33a8
Part of #5505
## What does this PR do?

Implements phase 1 of the RFC: foundational infrastructure for WHERE-prefix view testing, with a proof-of-concept parametrization of `FieldsCommandIT`.

The core idea: extend the top test indices with a `_is_real` boolean flag, then run the same test assertions through the view chain `source=<ext> | where _is_real | fields - _is_real | <query>`.

This catches bugs where commands fail only when preceded by a `where` filter, a gap identified in issues like #5482 and #5483.

## Changes

### New extended test datasets

- `accounts_extended.json`: 1000 real account records (`_is_real: true`) + 3 synthetic records (`_is_real: false`)
- `bank_extended.json`: 7 real bank records (`_is_real: true`) + 3 synthetic records (`_is_real: false`)
- `_is_real: boolean` (added to the corresponding index mappings)

### Infrastructure

| File | Change |
|---|---|
| `TestsConstants.java` | `TEST_INDEX_ACCOUNT_EXTENDED`, `TEST_INDEX_BANK_EXTENDED` constants |
| `TestUtils.java` | `getAccountExtendedIndexMapping()`, `getBankExtendedIndexMapping()` helpers |
| `SQLIntegTestCase.Index` | `ACCOUNT_EXTENDED`, `BANK_EXTENDED` enum entries |
| `PPLIntegTestCase.java` | `sourceView(extendedIndex, query)` helper method |

### Proof-of-concept: `FieldsCommandIT`

Three tests converted to `@ParameterizedTest`, running on both the direct index and the WHERE-prefix view:

- `testBasicFieldSelection` (schema assertions)
- `testMultipleFieldSelection` (row assertions with `head 3`)
- `testSpecialDataTypes` (all 7 bank birthdate rows)

## Why `fields - fieldname` works without Calcite

`hasEnhancedFieldFeatures()` in `AstBuilder.java` gates only wildcards and space-delimited fields, not the MINUS/exclusion form, so the view chain is safe in the default (non-Calcite) test mode.
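To illustrate why the same assertions can hold for both sources, the sketch below emulates the `where _is_real | fields - _is_real` prefix on in-memory rows. It is a plain-Java model of the intended semantics, not the PPL engine; `ViewFilterSketch`, the `firstname` field, and the sample values are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the WHERE-prefix view; not the PPL engine.
final class ViewFilterSketch {

  /** Emulates `where _is_real | fields - _is_real` over a list of rows. */
  static List<Map<String, Object>> applyView(List<Map<String, Object>> rows) {
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      if (Boolean.TRUE.equals(row.get("_is_real"))) { // where _is_real
        Map<String, Object> copy = new LinkedHashMap<>(row);
        copy.remove("_is_real"); // fields - _is_real
        out.add(copy);
      }
    }
    return out;
  }

  /** Builds one extended-index row; the field name is an assumed example. */
  static Map<String, Object> row(String firstname, boolean isReal) {
    Map<String, Object> r = new LinkedHashMap<>();
    r.put("firstname", firstname);
    r.put("_is_real", isReal);
    return r;
  }

  public static void main(String[] args) {
    List<Map<String, Object>> extended =
        List.of(row("Amber", true), row("Synthetic", false), row("Hattie", true));
    // Only the real rows survive, and the flag column is gone.
    System.out.println(applyView(extended));
  }
}
```

If a command downstream of the prefix returns a synthetic row or exposes `_is_real`, the view-based run of a test fails while the direct run still passes, which is the signal this infrastructure is meant to produce.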