Skip to content

fix(glows): write L2 hist labels as strings and normalize global attr…#3196

Open
davidt0x wants to merge 4 commits into
IMAP-Science-Operations-Center:devfrom
davidt0x:fix/istp_glows_hist
Open

fix(glows): write L2 hist labels as strings and normalize global attr…#3196
davidt0x wants to merge 4 commits into
IMAP-Science-Operations-Center:devfrom
davidt0x:fix/istp_glows_hist

Conversation

@davidt0x
Copy link
Copy Markdown
Collaborator

Summary

This PR fixes the remaining GLOWS hist metadata issues around label serialization and global attribute typing.

It is scoped to imap_glows_l2_hist, consistent with the June validation focus under parent issue #2577 and the GLOWS validation parent issue #2751.

What Changed

Fix bins_label and flags_label serialization

bins_label and flags_label were previously being created in GLOWS L2 as scalar -1 placeholders with no explicit dimensions. That caused the written CDF variables to be emitted as integer types even though their metadata describes them as string labels.

This PR fixes that by writing them as real string arrays:

  • bins_label is now built from the stringified bin coordinate values
  • flags_label is now built from the ordered bad-time flag names
  • both variables now use their own explicit dimensions:
    • bins_label
    • flags_label

This follows the bug report and expected fix direction in #2807.

Share the bad-time flag ordering

The bad-time flag names are now defined once as a shared BAD_TIME_FLAG_NAMES constant and reused in:

  • PipelineSettings bad-time flag extraction
  • GLOWS L2 flags_label construction

That keeps the label ordering locked to the same processing order used by the pipeline.

Update flags_label metadata width

The L2 attrs now use:

  • bins_label FORMAT = A4
  • flags_label FORMAT = A42

A42 was chosen because the longest current bad-time flag name is 42 characters. A regression test now verifies that the shared flag names still fit within the declared format width.

Normalize flight_software_version global attr

flight_software_version was being written as a numeric global attribute, which the validator flagged as problematic.

This PR preserves the value content but normalizes it to a scalar string before assigning it to the L2 output attrs, so it is written as a character global attribute instead of CDF_INT8.

Rechecked bad_time_flag_occurrences

As part of this work, I revalidated the current dev write path for bad_time_flag_occurrences.

Result:

  • it already writes as CDF_UINT2
  • no product code change was needed for that variable in this PR

Testing

Added written-CDF regression coverage in test_glows_l2.py to verify:

  • bins_label is emitted as CDF_CHAR
  • flags_label is emitted as CDF_CHAR
  • flags_label contains the shared bad-time flag names in pipeline order
  • the current flag names do not exceed the declared A42 width
  • bad_time_flag_occurrences is still written as an integer CDF variable
  • flight_software_version is written as a string global attribute

davidt0x added 2 commits May 13, 2026 23:22
… typing

Clean up the remaining GLOWS L2 histogram metadata issues around label
variables and global attribute typing.

The current L2 hist write path was creating `bins_label` and
`flags_label` as scalar `-1` placeholders with no explicit dimensions.
That caused the emitted CDF variables to be written as integer types
even though the metadata declares them as string labels. This matches the
bug described in issue IMAP-Science-Operations-Center#2807.

Fix this by writing both label variables as real string arrays:
- `bins_label` now uses the stringified bin coordinate values with its
  own `bins_label` dimension, matching the existing L1A/L1B pattern
- `flags_label` now uses the ordered bad-time flag names with its own
  `flags_label` dimension instead of a scalar integer placeholder

To keep the label ordering consistent across the GLOWS pipeline, extract
the bad-time flag names into a shared `BAD_TIME_FLAG_NAMES` constant and
reuse it in both:
- `PipelineSettings` bad-time flag extraction
- GLOWS L2 `flags_label` construction

Update the L2 attrs to match the written string labels:
- keep `bins_label FORMAT` as `A4`
- widen `flags_label FORMAT` from `A4` to `A42` so the longest shared
  bad-time flag name fits without truncation

Also normalize `flight_software_version` to a string before assigning it
as an L2 global attribute. The value content is preserved, but this
avoids writing it as a numeric global attribute (`CDF_INT8`), which the
validator flags as problematic for this metadata.

As part of this work, rechecked `bad_time_flag_occurrences` on current
`dev`. It is already being written as `CDF_UINT2`, so no product code
change is made there in this commit.

Add written-CDF regression coverage for GLOWS L2 hist to verify:
- `bins_label` is emitted as `CDF_CHAR` with string bin labels
- `flags_label` is emitted as `CDF_CHAR` with the shared bad-time flag
  names in the expected order
- `flags_label FORMAT` remains wide enough for the longest current flag
  name
- `bad_time_flag_occurrences` remains an integer CDF variable
- `flight_software_version` is written as a string global attribute

This change is intentionally limited to the GLOWS `hist` metadata issues
covered by issue IMAP-Science-Operations-Center#2807 and the remaining related global-attribute typing
warning. It does not include the separate GLOWS fill-value fixes for
`histogram_flag_array` and `number_of_bins`.
@ahotasu ahotasu requested a review from laspsandoval May 14, 2026 16:51
Copy link
Copy Markdown
Contributor

@laspsandoval laspsandoval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks David. It looks good.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR resolves remaining GLOWS L2 hist CDF metadata issues by ensuring label coordinates are written as string arrays (not integer placeholders) and by normalizing a problematic global attribute to a string type for ISTP/SPDF validation compliance.

Changes:

  • Write bins_label and flags_label as real string arrays with explicit dimensions, and update flags_label FORMAT width to A42.
  • Centralize bad-time flag ordering via a shared BAD_TIME_FLAG_NAMES constant and derive FLAG_LENGTH from it.
  • Normalize flight_software_version to a scalar string before writing L2 global attributes; add regression tests that validate emitted CDF types/values.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
imap_processing/tests/glows/test_glows_l2.py Adds regression test that writes a CDF and verifies label/global-attr typing and label ordering.
imap_processing/glows/l2/glows_l2.py Implements string label arrays, shared flag-name usage, and global-attr normalization for L2 hist output.
imap_processing/glows/l1b/glows_l1b_data.py Reuses shared bad-time flag ordering when extracting pipeline bad-time flags.
imap_processing/glows/init.py Introduces BAD_TIME_FLAG_NAMES and derives FLAG_LENGTH from it to keep ordering/length consistent.
imap_processing/cdf/config/imap_glows_l2_variable_attrs.yaml Updates flags_label FORMAT to A42 to accommodate longest flag name.
Comments suppressed due to low confidence (1)

imap_processing/cdf/config/imap_glows_l2_variable_attrs.yaml:15

  • Typo in CATDESC: "daily-occurence" should be "daily-occurrence".
  CATDESC: Flag names for daily-occurence counters of L1B flags

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +229 to +255
bins_label_info = cdf_file.varinq("bins_label")
bins_label_attrs = cdf_file.varattsget("bins_label")
bins_label_values = cdf_file.varget("bins_label")
flags_label_info = cdf_file.varinq("flags_label")
flags_label_attrs = cdf_file.varattsget("flags_label")
flags_label_values = cdf_file.varget("flags_label")
bad_time_info = cdf_file.varinq("bad_time_flag_occurrences")
bad_time_attrs = cdf_file.varattsget("bad_time_flag_occurrences")
global_attrs = cdf_file.globalattsget()

assert bins_label_info.Data_Type_Description == "CDF_CHAR"
assert bins_label_attrs["FORMAT"] == "A4"
assert list(bins_label_values[:5]) == ["0", "1", "2", "3", "4"]

assert flags_label_info.Data_Type_Description == "CDF_CHAR"
assert flags_label_attrs["FORMAT"] == "A42"
assert list(flags_label_values) == list(BAD_TIME_FLAG_NAMES)
assert max(len(name) for name in BAD_TIME_FLAG_NAMES) <= int(
flags_label_attrs["FORMAT"][1:]
), (
"Update flags_label FORMAT in imap_glows_l2_variable_attrs.yaml "
"if a flag name exceeds A42."
)

assert bad_time_info.Data_Type_Description == "CDF_UINT2"
assert bad_time_attrs["FORMAT"] == "I5"
assert global_attrs["flight_software_version"] == ["131329"]
Comment thread imap_processing/cdf/config/imap_glows_l2_variable_attrs.yaml Outdated
@tech3371
Copy link
Copy Markdown
Contributor

once that minor type is fixed, we can merge this!

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants