HIVE-29413: Avoid code duplication by updating getPartCols method for iceberg tables by ramitg254 · Pull Request #6413 · apache/hive

ramitg254 · 2026-04-07T12:51:34Z

What changes were proposed in this pull request?

Used getEffectivePartCols() in as many places as possible to avoid code duplication.

Why are the changes needed?

getPartCols() does not support Iceberg tables.

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI tests and a local build.
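
To make the description above concrete, here is a minimal sketch of the idea of a second accessor that falls back to the native partition columns. `SketchTable` and its two lists are hypothetical stand-ins, not Hive's real `Table` class, and the sketch assumes that the native partition-key list is empty for an Iceberg-like table while a storage handler reports the actual keys.

```java
import java.util.List;

public class EffectivePartColsSketch {

    /** Hypothetical, simplified table model; NOT Hive's real Table class. */
    static class SketchTable {
        // Partition keys as stored for natively partitioned tables.
        private final List<String> metastorePartKeys;
        // Keys reported by a storage handler; null when the table is natively partitioned (assumption).
        private final List<String> handlerPartKeys;

        SketchTable(List<String> metastorePartKeys, List<String> handlerPartKeys) {
            this.metastorePartKeys = List.copyOf(metastorePartKeys);
            this.handlerPartKeys = handlerPartKeys == null ? null : List.copyOf(handlerPartKeys);
        }

        boolean hasNonNativePartitionSupport() {
            return handlerPartKeys != null;
        }

        /** Original accessor: only sees the native partition keys. */
        List<String> getPartCols() {
            return metastorePartKeys;
        }

        /** Second accessor: prefers the handler's keys and falls back to the native ones. */
        List<String> getEffectivePartCols() {
            return hasNonNativePartitionSupport() ? handlerPartKeys : getPartCols();
        }
    }

    public static void main(String[] args) {
        SketchTable nativeTable = new SketchTable(List.of("dt"), null);
        SketchTable icebergLike = new SketchTable(List.of(), List.of("dt"));

        System.out.println(nativeTable.getEffectivePartCols());
        System.out.println(icebergLike.getPartCols());
        System.out.println(icebergLike.getEffectivePartCols());
    }
}
```

In this model, callers that switch to `getEffectivePartCols()` behave the same for native tables and additionally see the handler-reported keys for the Iceberg-like table.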

deniskuzZ · 2026-04-09T20:12:19Z

@ramitg254 please take a look: 9e7535c. I would suggest following a similar approach.

ramitg254 · 2026-04-10T05:36:07Z

> 9e7535c

But there we create a separate method, getEffectivePartCols(), and leave getPartCols() as it is. As per our discussion on that closed PR, we shouldn't do that and should only update getPartCols().

deniskuzZ · 2026-04-10T06:50:49Z

> 9e7535c
>
> But there we create a separate method, getEffectivePartCols(), and leave getPartCols() as it is. As per our discussion on that closed PR, we shouldn't do that and should only update getPartCols().

Where did I say that? The ask was to keep the original method unchanged. Same here.

ramitg254 · 2026-04-10T07:08:55Z

Oh, I got confused by this comment: #6337 (comment), in which getSupportedPartCols() was just a separate method, similar to getEffectivePartCols().

ramitg254 · 2026-04-10T07:18:52Z

I am fine with that earlier approach as well, but recently I saw https://issues.apache.org/jira/browse/HIVE-29525, so I thought we should have unified getPartCols() and getCols() methods that give results similar to those for native Hive tables, as a first step towards solving it. The plan logic can then be taken care of later, when that ticket is addressed.
So I was first focusing on making getPartCols() unified for Iceberg tables as well.

Please share your thoughts on this idea.

deniskuzZ · 2026-04-24T17:46:43Z

  private static int calculatePartPrefix(Table tbl, Set<String> partSpecKeys) {
    int partPrefixToDrop = 0;
-    for (FieldSchema fs : tbl.getPartCols()) {
+    for (FieldSchema fs : tbl.getEffectivePartCols()) {


any tests covering this for iceberg?

ramitg254 · 2026-04-25

I am not aware of any. I made this change because of #6413 (comment).
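
As an illustration of what a test for the Iceberg case might pin down, here is a small sketch. The diff above shows only the loop header, so the loop body here is an assumption: it counts how many leading partition columns appear in the partition spec and stops at the first one that does not. The class name and the use of plain column-name strings instead of `FieldSchema` are hypothetical simplifications.

```java
import java.util.List;
import java.util.Set;

public class PartPrefixSketch {

    /**
     * Counts the leading partition columns whose names are present in the spec.
     * ASSUMPTION: this mirrors the intent suggested by the name partPrefixToDrop;
     * the real method body is not shown in the review diff.
     */
    static int calculatePartPrefix(List<String> partColNames, Set<String> partSpecKeys) {
        int partPrefixToDrop = 0;
        for (String name : partColNames) {
            if (!partSpecKeys.contains(name)) {
                break; // the prefix ends at the first column missing from the spec
            }
            partPrefixToDrop++;
        }
        return partPrefixToDrop;
    }

    public static void main(String[] args) {
        List<String> partCols = List.of("year", "month", "day");
        System.out.println(calculatePartPrefix(partCols, Set.of("year", "month")));
        System.out.println(calculatePartPrefix(partCols, Set.of("month")));
    }
}
```

Under this assumption, the result depends entirely on which column list is iterated, which is why the choice between getPartCols() and getEffectivePartCols() would be observable in a test.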

deniskuzZ · 2026-04-24T17:47:38Z

          } else  {
            // partition spec is not specified but column schema can have partitions specified
-            for(FieldSchema f : targetTable.getPartCols()) {
+            for(FieldSchema f : targetTable.getEffectivePartCols()) {


do we really need this? tests?

ramitg254 · 2026-04-25

#6413 (comment)

deniskuzZ · 2026-04-24T17:48:59Z

      List<String> cols = new ArrayList<String>();
      if (qbp.getAnalyzeRewrite() != null) {
-        List<FieldSchema> partitionCols = tab.getPartCols();
+        List<FieldSchema> partitionCols = tab.getEffectivePartCols();


we don't even enter here, see if above - !tab.hasNonNativePartitionSupport()

ramitg254 · 2026-04-25

#6413 (comment)

deniskuzZ · 2026-04-24T17:49:55Z

            }
          } else {
-            partColSchema.addAll(tbl.getPartCols());
+            partColSchema.addAll(tbl.getEffectivePartCols());


is this needed? tests?

ramitg254 · 2026-04-25

#6413 (comment)

deniskuzZ · 2026-04-24T17:53:28Z

So many getPartCols to getEffectivePartCols changes make me wonder if we even need getEffectivePartCols. Maybe we just need to drop the partitionCols list from getCols()?
cc @kasakrisz

ramitg254 · 2026-04-25T06:57:41Z

@deniskuzZ I was replacing getPartCols() with getEffectivePartCols() in most places because we should eventually move to this generic common method.
The only places where I left getPartCols() are those where the logic is broken for Iceberg tables because getCols() returns partition columns as well.
Updating getCols() would cause many changes, so we should handle that in a separate ticket, where it will be easy to replace the remaining getPartCols() calls.
So for now I am switching to the newer method wherever it does not break any test. Later, once the getCols() logic is updated and getPartCols() is no longer needed, getEffectivePartCols() can be renamed to getPartCols() and everything will come down to a single method.

deniskuzZ · 2026-04-25T08:37:37Z

> @deniskuzZ I was replacing getPartCols() with getEffectivePartCols() in most places because we should eventually move to this generic common method. The only places where I left getPartCols() are those where the logic is broken for Iceberg tables because getCols() returns partition columns as well. Updating getCols() would cause many changes, so we should handle that in a separate ticket, where it will be easy to replace the remaining getPartCols() calls. So for now I am switching to the newer method wherever it does not break any test. Later, once the getCols() logic is updated and getPartCols() is no longer needed, getEffectivePartCols() can be renamed to getPartCols() and everything will come down to a single method.

@ramitg254 I like the idea of having a single getPartCols() method.

> The only places where I left getPartCols() are those where the logic is broken for Iceberg tables because getCols() returns partition columns as well.

Since you've already identified them, why not apply the getCols() patch by stripping partition columns in the same PR and reuse getPartCols() everywhere?
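
The "stripping partition columns" suggestion can be sketched roughly as follows. This is only an illustration under stated assumptions: columns are modelled as plain name strings, partition columns are identified purely by case-insensitive name match, and `dataColsOnly` is a hypothetical helper rather than anything in Hive.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StripPartColsSketch {

    /**
     * Returns the columns of allCols that are not partition columns, preserving order.
     * ASSUMPTION: a column is a partition column exactly when its name matches
     * (case-insensitively) one of partCols.
     */
    static List<String> dataColsOnly(List<String> allCols, List<String> partCols) {
        Set<String> partNames = new HashSet<>();
        for (String p : partCols) {
            partNames.add(p.toLowerCase(Locale.ROOT));
        }
        List<String> result = new ArrayList<>();
        for (String c : allCols) {
            if (!partNames.contains(c.toLowerCase(Locale.ROOT))) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // An Iceberg-like table whose full column list also contains the partition column.
        List<String> allCols = List.of("id", "name", "dt");
        List<String> partCols = List.of("dt");
        System.out.println(dataColsOnly(allCols, partCols));
    }
}
```

Whether a name-based filter is sufficient for every Iceberg partitioning scheme is not settled by this thread; the sketch only shows the shape of the change being proposed.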

ramitg254 · 2026-04-25T12:15:04Z

I was planning to, but updating getCols() alone will cause test failures in every q file that runs a describe command on Iceberg tables. Query plans will also be affected, since the stats logic currently takes getCols() into account, and there are 90+ occurrences of it in the code, so it would lead to breakage as well. That is why I thought it would be better to handle it as a separate change.

deniskuzZ · 2026-04-25T13:13:00Z

> I was planning to, but updating getCols() alone will cause test failures in every q file that runs a describe command on Iceberg tables. Query plans will also be affected, since the stats logic currently takes getCols() into account, and there are 90+ occurrences of it in the code, so it would lead to breakage as well. That is why I thought it would be better to handle it as a separate change.

I guess that was the main intent: to integrate Iceberg partition handling into the existing code with minimal workarounds and code duplication.

Maybe I’m missing something, but, unfortunately, I don’t see much value in the current state of the PR, sorry.
It doesn’t seem to enable any missing partition optimizations (there are no q-test changes), including the one mentioned above in HIVE-29525, and instead appears to be more of a partial refactor.

Let’s see what Krisztian thinks about it.


sonarqubecloud · 2026-05-27T23:42:33Z

Quality Gate passed

Issues
30 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
1.3% Duplication on New Code

See analysis details on SonarQube Cloud

asf-ci-hive added tests pending tests unstable and removed tests pending labels Apr 7, 2026

ramitg254 force-pushed the HIVE-29413 branch from 0d2baee to d97e174 Compare April 8, 2026 13:31

asf-ci-hive added tests pending tests unstable and removed tests unstable tests pending labels Apr 8, 2026

ramitg254 force-pushed the HIVE-29413 branch from d97e174 to 9e87b12 Compare April 8, 2026 18:21

asf-ci-hive added tests pending tests unstable and removed tests unstable tests pending labels Apr 8, 2026

ramitg254 force-pushed the HIVE-29413 branch from 9e87b12 to 565a2eb Compare April 9, 2026 10:03

asf-ci-hive added tests pending tests unstable and removed tests unstable tests pending labels Apr 9, 2026

ramitg254 mentioned this pull request Apr 9, 2026

HIVE-29413: Avoid code duplication by updating getPartCols method for iceberg tables #6337

Closed

asf-ci-hive added tests pending and removed tests unstable labels Apr 9, 2026

asf-ci-hive added tests unstable and removed tests pending labels Apr 9, 2026

asf-ci-hive added tests pending and removed tests unstable labels Apr 10, 2026

deniskuzZ reviewed Apr 24, 2026


ramitg254 added 20 commits May 27, 2026 11:30

HIVE-29413: Avoid code duplication by updating getPartCols method for iceberg tables

2ce98d5

commit-2

af646e7

corrected bucket-map-join test

d5f101c

corrected update statements

b248996

corrected load, partition evolution tests

70567f8

refractored

44d22f8

# Conflicts: # ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsAutoGatherContext.java

addressed sonar issues

24d37a8

# Conflicts: # ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java

updated table api and usage

1dca652

introduced index optimization

1b9709f

corrected implementation

aad2a39

updated describe implementation and outputs

27c649e

updated api and test

11c279d

updated update implementation

5b4e476

updated partition pruning and query rewriting

63c2ac2

changes related to metatable

2742cf8

corrected alter and semantic analyzer implementation

24542d1

updated merge implementation and test output

14467d8

updated ctas create and tests output

ebb9e01

updated stats autogather and test output

18243b1

updated getPartitionKeys

fef6436


3 participants
