Skip to content

DATE/TIME label and rendering for date-only / time-only fields#5521

Open
vinaykpud wants to merge 5 commits into
opensearch-project:mainfrom
vinaykpud:fix/ai/cluster-all
Open

DATE/TIME label and rendering for date-only / time-only fields#5521
vinaykpud wants to merge 5 commits into
opensearch-project:mainfrom
vinaykpud:fix/ai/cluster-all

Conversation

@vinaykpud

@vinaykpud vinaykpud commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Summary

Sibling to opensearch-project/OpenSearch#22045. Two sql/core changes that complete the PPL DATE / TIME story for analytics-engine columns whose format mapping is logically date-only or time-only:

  • The user-visible response schema labels them as date / time (not timestamp).
  • The bucket / scalar values render as YYYY-MM-DD / HH:mm:ss (no 00:00:00 suffix on dates, no 1970-01-01 prefix on times).
  • ADDDATE / SUBDATE / ADDTIME / SUBTIME over those columns preserve DATE / TIME as the return type instead of widening to TIMESTAMP.

The OpenSearch sandbox PR introduces the DateOnlyType / TimeOnlyType UDT markers and the DateFormatClassifier. This PR teaches sql/core to recognize those markers in three places.

Commits

TIME-typed list elements format as HH:mm:ss

AnalyticsExecutionEngine.toExprValue strips the 1970-01-01[ T] prefix from list()-aggregation elements so stats list(<TIME-typed field>) returns ["19:36:22"] instead of ["1970-01-01 19:36:22"]. The list path bypasses the sandbox-side ArrowValues scalar formatter (DataFusion's list_merge doesn't carry column-level type hints), so the regex strip lives here.

span() output column type for date/time UDT

  • OpenSearchTypeFactory.convertAnalyticsEngineRelDataTypeToExprType: DateOnlyTypeDATE, TimeOnlyTypeTIME (schema label).
  • AnalyticsExecutionEngine.toExprValue: scalar values strip the midnight time suffix for DateOnlyType and the 1970-01-01 prefix for TimeOnlyType.

Preserve DATE/TIME return type for ADDDATE/SUBDATE/ADDTIME/SUBTIME (cherry-pick from @mch2)

  • New OpenSearchTypeFactory.isDateExprType / isTimeExprType helpers recognize the UDT markers as logically DATE / TIME.
  • PPLReturnTypes.TIME_APPLY_RETURN_TYPE and AddSubDateFunction use these helpers so ADDDATE(date_col, N) reports DATE (not TIMESTAMP).

Query shapes fixed

source=date_formats | head 1 | eval d = basic_date | fields d
# was: schema d:timestamp, value "1984-04-12 00:00:00"
# now: schema d:date,      value "1984-04-12"

source=date_formats | head 1 | eval t = hour_minute_second | fields t
# was: schema t:timestamp, value "1970-01-01 09:00:00"
# now: schema t:time,      value "09:00:00"

source=calcs | head 1 | eval l = list(time1) | fields l
# was: ["1970-01-01 19:36:22"]
# now: ["19:36:22"]

source=date_formats | head 1 | eval r = ADDDATE(basic_date, 5) | fields r
# was: schema r:timestamp
# now: schema r:date         (return type follows operand type)

Test plan

  • :core:check (compile + unit + spotless + jacoco) green.
  • Manual verification of all 4 query shapes against a Variant-B cluster running OpenSearch PR #22045.
  • Existing IT tests; the 18 sandbox-side ITs that depend on this PR are gated @AwaitsFix until this lands.

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

PR Reviewer Guide 🔍

(Review updated until commit 0f781da)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Possible Issue

The regex DATE_WITH_MIDNIGHT_TIME matches timestamps with fractional seconds (e.g., "1984-04-12 00:00:00.123"), but the code strips everything after the date, discarding the fractional part. If a DateOnlyType scalar arrives with non-zero fractional seconds at midnight (e.g., "1984-04-12 00:00:00.001"), the date extraction is correct, but if the time is not exactly midnight due to precision issues, the date might be off by one day. This could occur if upstream rounding or timezone conversion produces a timestamp slightly before or after midnight.

if (type instanceof DateOnlyType && value instanceof String s) {
  var m = DATE_WITH_MIDNIGHT_TIME.matcher(s);
  if (m.matches()) {
    return ExprValueUtils.stringValue(m.group(1));
  }
}
Possible Issue

The stripEpochDatePrefixInList method only strips the epoch date prefix from string elements. If the list contains nested lists or other complex types that might also need the same transformation, they are passed through unchanged. This could lead to inconsistent rendering if TIME-typed nested structures exist. The scenario depends on whether DataFusion or the analytics engine can produce such nested TIME lists.

private static List<Object> stripEpochDatePrefixInList(List<?> list) {
  List<Object> out = new ArrayList<>(list.size());
  for (Object element : list) {
    if (element instanceof String s) {
      var m = EPOCH_DATE_TIME_PREFIX.matcher(s);
      out.add(m.matches() ? m.group(1) : s);
    } else {
      out.add(element);
    }
  }
  return out;
}

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

PR Code Suggestions ✨

Latest suggestions up to 0f781da

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Handle nested lists recursively

The method stripEpochDatePrefixInList doesn't handle nested lists. If a list
contains another list with time strings, those nested elements won't be processed.
Consider adding recursive handling if nested structures are possible in your data
model.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [255-266]

 private static List<Object> stripEpochDatePrefixInList(List<?> list) {
   List<Object> out = new ArrayList<>(list.size());
   for (Object element : list) {
     if (element instanceof String s) {
       var m = EPOCH_DATE_TIME_PREFIX.matcher(s);
       out.add(m.matches() ? m.group(1) : s);
+    } else if (element instanceof List<?> nestedList) {
+      out.add(stripEpochDatePrefixInList(nestedList));
     } else {
       out.add(element);
     }
   }
   return out;
 }
Suggestion importance[1-10]: 5

__

Why: This is a valid enhancement that would handle nested lists containing time strings. However, without evidence that nested lists are present in the data model or requirements, this may be premature optimization. The score reflects potential value if nested structures exist.

Low
Handle non-matching timestamp formats explicitly

Add explicit handling for when the regex patterns don't match. Currently, if the
pattern doesn't match, the code falls through to fromObjectValue, which may not
handle the malformed timestamp string correctly. Consider logging a warning or
throwing an exception for unexpected formats.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [236-248]

 if (type instanceof DateOnlyType && value instanceof String s) {
   var m = DATE_WITH_MIDNIGHT_TIME.matcher(s);
   if (m.matches()) {
     return ExprValueUtils.stringValue(m.group(1));
   }
+  // Fall through to fromObjectValue for non-matching formats
 }
 // TimeOnlyType scalar — strip 1970-01-01 prefix off the Timestamp(ms) wire.
 if (type instanceof TimeOnlyType && value instanceof String s) {
   var m = EPOCH_DATE_TIME_PREFIX.matcher(s);
   if (m.matches()) {
     return ExprValueUtils.stringValue(m.group(1));
   }
+  // Fall through to fromObjectValue for non-matching formats
 }
Suggestion importance[1-10]: 3

__

Why: While adding comments about fall-through behavior could improve code clarity, the current implementation already handles non-matching formats correctly by falling through to fromObjectValue. The suggestion adds minimal value beyond documentation.

Low

Previous suggestions

Suggestions up to commit 2b8bb61
CategorySuggestion                                                                                                                                    Impact
General
Handle null list parameter

Add null check for the list parameter to prevent NullPointerException when calling
list.size(). The method should handle null input gracefully or document that null is
not allowed.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [255-266]

 private static List<Object> stripEpochDatePrefixInList(List<?> list) {
+  if (list == null) {
+    return new ArrayList<>();
+  }
   List<Object> out = new ArrayList<>(list.size());
   for (Object element : list) {
     if (element instanceof String s) {
       var m = EPOCH_DATE_TIME_PREFIX.matcher(s);
       out.add(m.matches() ? m.group(1) : s);
     } else {
       out.add(element);
     }
   }
   return out;
 }
Suggestion importance[1-10]: 5

__

Why: Adding a null check for the list parameter would prevent potential NullPointerException and improve robustness. However, the method is private and called from toExprValue where the value comes from ExprValueUtils.fromObjectValue, making null input unlikely in practice.

Low
Suggestions up to commit 9353fcd
CategorySuggestion                                                                                                                                    Impact
Possible issue
Conditionally strip epoch prefix

The list processing applies epoch date stripping unconditionally to all lists,
regardless of element type. This could incorrectly modify legitimate timestamp
strings that happen to start with "1970-01-01". Consider checking the RelDataType to
determine if the list contains TIME-typed elements before applying the
transformation.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [249-251]

-if (value instanceof List<?> list) {
+if (value instanceof List<?> list && type.getComponentType() != null 
+    && type.getComponentType() instanceof TimeOnlyType) {
   return ExprValueUtils.collectionValue(stripEpochDatePrefixInList(list));
 }
+if (value instanceof List<?> list) {
+  return ExprValueUtils.collectionValue(list);
+}
Suggestion importance[1-10]: 7

__

Why: Valid concern about unconditionally stripping epoch prefixes from all lists. The suggestion to check type.getComponentType() for TimeOnlyType would prevent incorrect modifications to legitimate timestamp strings, improving correctness and type safety.

Medium
General
Optimize matcher instance creation

The method creates a new Matcher instance for every string element in the list. For
large lists, this can be inefficient. Consider reusing a single Matcher instance by
calling reset(CharSequence) for each string, or use Pattern.matcher() only when
necessary.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [255-266]

+private static List<Object> stripEpochDatePrefixInList(List<?> list) {
+  List<Object> out = new ArrayList<>(list.size());
+  for (Object element : list) {
+    if (element instanceof String s) {
+      var m = EPOCH_DATE_TIME_PREFIX.matcher(s);
+      out.add(m.matches() ? m.group(1) : s);
+    } else {
+      out.add(element);
+    }
+  }
+  return out;
+}
 
-
Suggestion importance[1-10]: 3

__

Why: The suggestion mentions optimization but provides identical code in existing_code and improved_code. While reusing Matcher instances could improve performance for large lists, the suggestion doesn't demonstrate the actual implementation, making it only marginally useful.

Low
Suggestions up to commit 05a2106
CategorySuggestion                                                                                                                                    Impact
General
Restrict list processing to TIME types

The list processing applies epoch date stripping to all lists regardless of their
element type. This could incorrectly modify date strings in non-TIME typed lists.
Consider checking the RelDataType to ensure this transformation only applies to
TIME-typed collections.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [249-251]

-if (value instanceof List<?> list) {
+if (value instanceof List<?> list && type.getComponentType() instanceof TimeOnlyType) {
   return ExprValueUtils.collectionValue(stripEpochDatePrefixInList(list));
+} else if (value instanceof List<?> list) {
+  return ExprValueUtils.fromObjectValue(value);
 }
Suggestion importance[1-10]: 7

__

Why: Valid concern about applying epoch date stripping to all lists. However, the improved_code has issues: type.getComponentType() may not exist or return the expected type for all list types. The suggestion correctly identifies a potential bug where non-TIME lists could be incorrectly modified, but the implementation needs refinement.

Medium
Handle nested lists recursively

The method doesn't handle nested lists. If a TIME-typed list contains nested lists
with epoch-prefixed timestamps, those nested elements won't be processed. Consider
adding recursive handling for nested list structures.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [255-266]

 private static List<Object> stripEpochDatePrefixInList(List<?> list) {
   List<Object> out = new ArrayList<>(list.size());
   for (Object element : list) {
     if (element instanceof String s) {
       var m = EPOCH_DATE_TIME_PREFIX.matcher(s);
       out.add(m.matches() ? m.group(1) : s);
+    } else if (element instanceof List<?> nestedList) {
+      out.add(stripEpochDatePrefixInList(nestedList));
     } else {
       out.add(element);
     }
   }
   return out;
 }
Suggestion importance[1-10]: 5

__

Why: While adding recursive handling for nested lists could be useful, there's no evidence in the PR or test cases that nested lists are expected or supported in this context. The suggestion adds complexity for an unproven use case, though it wouldn't break existing functionality if nested lists aren't present.

Low
Suggestions up to commit f94cc28
CategorySuggestion                                                                                                                                    Impact
General
Prevent unintended string transformations in lists

The list processing applies epoch date prefix stripping unconditionally to all
lists, regardless of element type. This could cause unintended transformations for
non-TIME lists containing strings that match the pattern. Consider checking the
RelDataType to ensure this transformation only applies to TIME-typed lists.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [249-251]

-if (value instanceof List<?> list) {
+if (value instanceof List<?> list && type.getComponentType() instanceof TimeOnlyType) {
   return ExprValueUtils.collectionValue(stripEpochDatePrefixInList(list));
 }
+if (value instanceof List<?> list) {
+  return ExprValueUtils.collectionValue(list);
+}
Suggestion importance[1-10]: 7

__

Why: Valid concern about unconditional list processing. The current implementation strips epoch date prefixes from all string lists, which could affect non-TIME lists. Checking type.getComponentType() would make the transformation more targeted and prevent unintended side effects.

Medium
Suggestions up to commit c23f1d5
CategorySuggestion                                                                                                                                    Impact
Possible issue
Conditional list transformation by type

The list processing applies epoch date stripping unconditionally to all lists,
regardless of element type. This could incorrectly modify date strings in non-TIME
typed lists. Consider checking the element type or list type before applying the
transformation.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [249-251]

-if (value instanceof List<?> list) {
+if (value instanceof List<?> list && type.getComponentType() instanceof TimeOnlyType) {
   return ExprValueUtils.collectionValue(stripEpochDatePrefixInList(list));
+} else if (value instanceof List<?> list) {
+  return ExprValueUtils.collectionValue(list);
 }
Suggestion importance[1-10]: 8

__

Why: The current implementation strips epoch date prefixes from all lists unconditionally, which could incorrectly modify date strings in non-TIME typed lists. The suggestion correctly identifies this issue and proposes checking type.getComponentType() before transformation, preventing potential data corruption.

Medium
General
Handle non-matching date patterns explicitly

Add an explicit return or fall-through handling when the pattern doesn't match for
DateOnlyType. Currently, if the string doesn't match the expected format, it falls
through to fromObjectValue, which may not handle the malformed date correctly.

core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [236-241]

 if (type instanceof DateOnlyType && value instanceof String s) {
   var m = DATE_WITH_MIDNIGHT_TIME.matcher(s);
   if (m.matches()) {
     return ExprValueUtils.stringValue(m.group(1));
   }
+  return ExprValueUtils.stringValue(s);
 }
Suggestion importance[1-10]: 5

__

Why: While the suggestion adds explicit handling for non-matching patterns, the current fall-through to fromObjectValue is likely intentional for handling edge cases. The improvement is minor and may not be necessary if malformed dates are expected to be handled by the default path.

Low

@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit c8ca661

@vinaykpud vinaykpud changed the title Fix/ai/cluster all [core] DATE/TIME label and rendering for date-only / time-only fields Jun 8, 2026
@vinaykpud vinaykpud changed the title [core] DATE/TIME label and rendering for date-only / time-only fields DATE/TIME label and rendering for date-only / time-only fields Jun 8, 2026
@vinaykpud vinaykpud force-pushed the fix/ai/cluster-all branch from c8ca661 to c23f1d5 Compare June 8, 2026 09:15
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit c23f1d5

@vinaykpud vinaykpud force-pushed the fix/ai/cluster-all branch from c23f1d5 to f94cc28 Compare June 8, 2026 09:17
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit f94cc28

@songkant-aws

Copy link
Copy Markdown
Collaborator

LGTM. @vinaykpud Could you also fix the DCO check? I'll wait for analytics-engine side type merge to see how CI passes then.

@vinaykpud vinaykpud force-pushed the fix/ai/cluster-all branch from f94cc28 to ec7bf56 Compare June 8, 2026 22:56
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit ec7bf56

@vinaykpud vinaykpud force-pushed the fix/ai/cluster-all branch from ec7bf56 to 05a2106 Compare June 9, 2026 00:29
vinaykpud added 2 commits June 9, 2026 00:31
Companion to OpenSearch fix/ai/datetime-clusters @ 9009736c2dc.
list(<TIME-typed field>) now returns "HH:mm:ss[.fraction]" without the
1970-01-01 epoch-date prefix.

The analytics-engine path rewrites PPL list() to DataFusion's list_merge,
so the legacy ListAggFunction never fires. Instead, AnalyticsExecutionEngine
now post-processes List-typed cells in the result conversion: when an
element string matches "1970-01-01[ T]HH:mm:ss[.fraction]", only the time
portion is kept. Scalar cells are untouched, preserving the wider
timestamp-stringification regression baseline.

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
Recognize the new sandbox DateOnlyType / TimeOnlyType UDT markers in:
- OpenSearchTypeFactory.convertAnalyticsEngineRelDataTypeToExprType:
  DateOnlyType → ExprCoreType.DATE, TimeOnlyType → ExprCoreType.TIME so the
  user-visible response schema labels span() bucket columns as `date` /
  `time` instead of `timestamp`.
- AnalyticsExecutionEngine.toExprValue: when the column carries a
  DateOnlyType marker, strip the trailing ` HH:MM:SS` from the
  Timestamp(ms)-formatted wire value so dates render as `YYYY-MM-DD`.
  Symmetric handling for TimeOnlyType strips the `1970-01-01 ` prefix.

Pairs with the sandbox schema-builder change in
opensearch-project/OpenSearch@b69c5ff8888.

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 05a2106

… date/time columns

On the analytics-engine route a 'date'/'time' column is a TIMESTAMP-backed
DateOnlyType/TimeOnlyType marker, so operand-conditional return-type inference
mis-read it as TIMESTAMP — ADDDATE(date_col, N) reported (and rendered) TIMESTAMP
instead of DATE. Add isDateExprType/isTimeExprType helpers recognizing those
markers (off the general conversion path, no substrait round-trip risk) and use
them in AddSubDateFunction and PPLReturnTypes.TIME_APPLY_RETURN_TYPE. Fixes the
6 *Null column-operand cases end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
@vinaykpud vinaykpud force-pushed the fix/ai/cluster-all branch from 05a2106 to 9353fcd Compare June 9, 2026 00:31
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 9353fcd

Comment trims on top of d12407b (Marc Handalian's cherry-pick) — collapse
the multi-line Javadoc on isDateExprType/isTimeExprType to one-line case
names, and drop the inline narration at the two call sites
(PPLReturnTypes.TIME_APPLY_RETURN_TYPE,
AddSubDateFunction#getReturnTypeInference) that restated the helper
contract. Behavior unchanged.

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
@vinaykpud vinaykpud force-pushed the fix/ai/cluster-all branch from 9353fcd to 2b8bb61 Compare June 9, 2026 01:11
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 2b8bb61

Signed-off-by: Vinay Krishna Pudyodu <vinkrish.neo@gmail.com>
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Persistent review updated to latest commit 0f781da

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants