Skip to content

[CALCITE-7457] VALUES and SELECT produce different validation results for the same expression#4858

Open
dssysolyatin wants to merge 1 commit intoapache:mainfrom
dssysolyatin:fix-top-level-values-validation
Open

[CALCITE-7457] VALUES and SELECT produce different validation results for the same expression#4858
dssysolyatin wants to merge 1 commit intoapache:mainfrom
dssysolyatin:fix-top-level-values-validation

Conversation

@dssysolyatin
Copy link
Copy Markdown
Contributor

@dssysolyatin dssysolyatin commented Apr 1, 2026

Jira Link

CALCITE-7457

Changes Proposed

  • VALUES now routes through validateQuery, triggering TableConstructorNamespace.validateImpl validation (which calls validateValues and inferUnknownTypes). The fix is a validateCall override in SqlValuesOperator.
  • inferUnknownTypes no longer descends into subqueries (node.isA(SqlKind.QUERY)). Each query handles its own inferUnknownTypes during its own validation.
  • Added additional tests inside existing tests which can be dropped. Just to show that now SELECT and VALUES behaves the same way

@dssysolyatin dssysolyatin marked this pull request as ready for review April 1, 2026 09:01
@dssysolyatin dssysolyatin changed the title [CALCITE-7457] Top-level VALUES and SELECT produce different validation results for the same expression [CALCITE-7457] VALUES and SELECT produce different validation results for the same expression Apr 1, 2026
@dssysolyatin dssysolyatin added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Apr 2, 2026

/** Test case for
* <a href="https://issues.apache.org/jira/browse/CALCITE-7457">[CALCITE-7457]
* Top-level VALUES and SELECT produce different validation results
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javadoc should be removed "Top-level"?

@xiedeyantu
Copy link
Copy Markdown
Member

I have no other comments except for the above minor one, you can feel free to squash and merge.

expr("^period (date '1-2-3', date '1-2-3')\n"
+ " overlaps (date '1-2-3', date '1-2-3', date '1-2-3')^")
.fails("(?s).*Cannot apply 'OVERLAPS' to arguments of type .*");
.fails("(?s).*Unequal number of entries in ROW expressions.*");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a regression, there is not ROW expression in the original program. The user will have difficulties finding the location of this error.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that the original query does not contain a ROW. However, during parsing, the OVERLAPS arguments are wrapped inside a ROW prior to any validation

Currently, VALUES follows the proper execution order:
a. Operand inference
b. Operand validation

Operand inference throws an exception, but it probably should not. It should probably simply leave the types unfilled. Operand validation would then fail with something like

Cannot apply 'OVERLAPS' to arguments of type '<RECORDTYPE(DATE EXPR$0, DATE EXPR$1)> OVERLAPS <RECORDTYPE(DATE EXPR$0, DATE EXPR$1, DATE EXPR$2)>'. Supported form(s): '(<DT>, <DT>) OVERLAPS (<DT>, <DT>)'
'(<DT>, <DT>) OVERLAPS (<DT>, <INTERVAL>)'
'(<DT>, <INTERVAL>) OVERLAPS (<DT>, <DT>)'
'(<DT>, <INTERVAL>) OVERLAPS (<DT>, <INTERVAL>)'

Regarding the regression, I'm not sure it is regression, since SELECT already had the same behavior. Also SELECT * FROM VALUES()/INSERT/UPDATE/DELETE behave the same as they call validateValues which first:

  1. Does operand inference
  2. And then does operand validation

Ironically, VALUES expression (when VALUES is top level node) were the exception.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why no checks were run in CI?
I would expect this error using ROW to appear in many more cases.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is because I add LGMT-will-merge-soon after first approve, it run CI one more time but with specific CI tests

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please run all the tests to check that there aren't other tests affected by the ROW message?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those all ran before I added the LGTM-will-merge-soon label. Anyway, I made a small tweak based on #4858 (comment). Re-running now

.columnType("DECIMAL(3, 1)");
expr("values 1.0 + NULL")
.columnType("DECIMAL(2, 1)");
.columnType("DECIMAL(3, 1)");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should this inferred type be different?
Something is off.

Copy link
Copy Markdown
Contributor Author

@dssysolyatin dssysolyatin Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explained above, in the SELECT section, why this happens. I'm not saying it's the correct behavior and I would expect that 1.0 + NULL = NULL with type DECIMAL(2,1) (common not null values type) in this case, but currently it works like that

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you say SELECT 1.0 + NULL already was DECIMAL(3,1), but when used in VALUES the result was different?

Copy link
Copy Markdown
Contributor Author

@dssysolyatin dssysolyatin Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, if you open task description - https://issues.apache.org/jira/browse/CALCITE-7457. You can see test case for main branch

@dssysolyatin dssysolyatin removed the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Apr 11, 2026
@dssysolyatin dssysolyatin force-pushed the fix-top-level-values-validation branch from 3b86cfa to 2eb8f6d Compare April 11, 2026 14:54
@sonarqubecloud
Copy link
Copy Markdown

@mihaibudiu mihaibudiu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Apr 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

LGTM-will-merge-soon Overall PR looks OK. Only minor things left.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants