Skip to content

Fix ISTP validation FILLVAL errors#3191

Open
davidt0x wants to merge 15 commits into
IMAP-Science-Operations-Center:devfrom
davidt0x:fix/istp_fillval
Open

Fix ISTP validation FILLVAL errors#3191
davidt0x wants to merge 15 commits into
IMAP-Science-Operations-Center:devfrom
davidt0x:fix/istp_fillval

Conversation

@davidt0x
Copy link
Copy Markdown
Collaborator

@davidt0x davidt0x commented May 13, 2026

Summary

This PR fixes the remaining known L2 ISTP FILLVAL findings by aligning emitted CDF variable types, FILLVALs, and FORMAT metadata across several instruments and products. This is PR is in service of the general metadata ISTP cleanup issue #2577

In most cases the underlying science values were already correct, but the written CDF metadata did not match the serialized variable type. In a few cases, the better fix was to restore the intended dtype before write-out instead of only changing attrs.

Why

Running the ISTP/SPDF CDF command line checker against all L2 files produced a set of non-standard FILLVAL findings caused by a few recurring patterns:

  • signed vs unsigned fill sentinels
  • attrs describing a different type than the actual serialized CDF variable
  • integer support variables widening to float after NaN is introduced for missing values
  • array padding/re-expansion changing dtype based on the fill value

This PR fixes those mismatches so the emitted L2 products are ISTP-compliant at the file level.

What Changed

SWE

  • Update L2 acq_duration to use the correct CDF_UINT4 fill sentinel: 4294967295.
  • Add a regression test that opens the written L2 CDF and verifies both dtype and FILLVAL.

SWAPI

  • Align esa_energy FILLVAL and FORMAT with the emitted floating-point support variable. Set VALIDMAX to 21000.

IDEX

  • Align spin_phase attrs with the serialized CDF_DOUBLE output in L2A.

HIT

  • Update dynamic_threshold_state to use CDF_UINT1 semantics with FILLVAL = 255.

MAG

  • Align vector and quality fill metadata with the emitted CDF types for:
    • b_dsrf
    • b_gse
    • b_gsm
    • b_rtn
    • b_srf
    • quality_flags
    • quality_bitmask

CoDICE

  • LO species:
    • Restore nso_esa_step and nso_spin_sector to uint8 before write-out.
    • Replace non-finite values with the integer fill sentinel instead of leaving these support vars widened to float.
  • LO / HI direct-events:
    • Fix data_quality fill metadata.
    • Fix LO direct-event spin_sector fill/format to match the emitted type.

GLOWS

  • Make number_of_bins explicitly uint16 with FILLVAL = 65535.
  • Preserve histogram_flag_array as uint8 during L2 padding/re-expansion instead of silently widening based on the fill value.
  • Add regression coverage to ensure float arrays such as photon_flux and ecliptic_lon remain floating-point through the same shared padding path.

Testing

  • Added/updated CDF-level regression tests that inspect the written files, not just the in-memory datasets.
  • Focused local pytest runs were completed for the touched SWE, SWAPI, IDEX, HIT, MAG, CoDICE, and GLOWS paths during development.

Notes

  • This PR is scoped to the remaining known L2 FILLVAL findings
  • No broad science-algorithm changes are intended; the goal is to make emitted CDF metadata and serialized variable types agree.

davidt0x added 8 commits May 11, 2026 16:42
SWE L2 writes acq_duration as CDF_UINT4, but the L2 variable
attribute template still used the signed int64 fill sentinel. When the
CDF is written, cdflib coerces FILLVAL to the variable's actual CDF
type, so the negative sentinel wraps to 0 for the emitted UINT4 field.

That produces a non-standard L2 CDF and causes SPDF to report:

  acq_duration
    FILLVAL value of '0' is non-standard.
    The recommended value is '4294967295'.

Fix the L2 metadata template to use the correct UINT4 fill value
(4294967295) for acq_duration.

Also extend the existing SWE L2 end-to-end test to inspect the written
CDF directly with cdflib and verify that acq_duration is emitted as
CDF_UINT4 with the expected FILLVAL. This gives us a regression test on
the actual serialized CDF metadata, not just the in-memory dataset.

This change is intentionally scoped to SWE L2. L1 metadata is unchanged.
Several products were writing variables with CDF types that did not match the
configured FILLVAL and FORMAT metadata. The science values were otherwise fine,
but the emitted CDFs failed SPDF/ISTP checks because the metadata still
described the pre-coercion or intended type rather than the serialized type.

Update the staged products so their attrs match what is actually written:

- CoDICE lo-direct-events
  - change data_quality to use the UINT2 fill value
  - change spin_sector to use floating-point fill/format because the written
    variable is CDF_DOUBLE after invalid positions are represented as NaN

- CoDICE hi-direct-events
  - change data_quality to use the UINT2 fill value

- CoDICE lo-sw-species / lo-nsw-species
  - restore nso_esa_step and nso_spin_sector to uint8 before write-out
  - replace non-finite values with the integer fill sentinel and recast to
    CDF_UINT1 semantics instead of leaving these support vars widened to float

- HIT L2
  - correct dynamic_threshold_state FILLVAL from the signed sentinel to the
    UINT1 sentinel expected by the emitted CDF

- IDEX
  - treat spin_phase as floating-point metadata in the attr template so its
    FILLVAL and FORMAT match the serialized CDF_DOUBLE variable

- MAG L2
  - correct vector fill metadata to use the standard floating-point fill value
    used by the emitted magnetic field vectors

- SWAPI
  - correct esa_energy FILLVAL and FORMAT to the floating-point form used by
    the written support variable

Also add CDF-level regression coverage for the products exercised in this
change:
- CoDICE tests now inspect the written CDF metadata for nso_esa_step,
  nso_spin_sector, lo-direct-events spin_sector, and direct-events data_quality
- HIT tests now verify dynamic_threshold_state is written as CDF_UINT1 with
  FILLVAL 255
- IDEX tests now verify spin_phase is written as CDF_DOUBLE with the expected
  floating-point fill value

This change is intentionally focused on the staged fill-value/type mismatches.
It does not include the separate SWE fix or the remaining CoDICE epoch_delta_*
cleanup.
GLOWS L2 re-expands daily lightcurve arrays back to STANDARD_BIN_COUNT
before writing the final CDF. That padding step was using
`np.full(..., fillval)` without controlling the dtype, which let NumPy
choose a default signed integer type based on the fill value.

That caused two metadata/type mismatches in the emitted CDF:

- `histogram_flag_array` is logically an 8-bit flag array, but padding
  with the fill value widened it away from `uint8`
- `number_of_bins` was written from a plain Python integer, so it did
  not reliably preserve the intended unsigned-16 representation

SPDF then flagged the resulting files because the serialized CDF types
and FILLVALs no longer matched the intended product schema.

Fix this in two parts:

1. Make the GLOWS L2 schema explicit
   - declare `histogram_flag_array` as `CDF_UINT1`
   - declare `number_of_bins` as `CDF_UINT2`
   - update `number_of_bins` to use the `65535` UINT2 fill sentinel

2. Preserve array dtypes when rebuilding padded bin arrays
   - write `number_of_bins` explicitly as `np.uint16`
   - when padding bin-dimensioned daily lightcurve arrays, start from the
     existing array dtype instead of letting `np.full` infer one from the
     fill value
   - if the configured fill value fits exactly in the current integer
     dtype, keep that dtype
   - only widen when the fill value cannot be represented without
     changing value

This keeps integer support arrays like `histogram_flag_array` in their
intended unsigned type while still allowing float arrays such as
`photon_flux` and `ecliptic_lon` to remain floating-point.

Add regression coverage for both the in-memory and written-CDF paths:
- verify `create_l2_dataset()` keeps `photon_flux` and `ecliptic_lon`
  as floating arrays after padding
- verify the emitted CDF writes
  - `histogram_flag_array` as `CDF_UINT1` with `FILLVAL=255`
  - `number_of_bins` as `CDF_UINT2` with `FILLVAL=65535`
  - `photon_flux` and `ecliptic_lon` as floating-point variables with
    the expected real fill value
CoDICE hi-omni and hi-sectored were still carrying
epoch_delta_minus/epoch_delta_plus through L1B as integer-valued support
variables even though the current schema intent for those fields is
real-valued temporal support data. That mismatch then propagated into the
serialized L2 products and conflicted with the float-style ISTP metadata
needed for the written CDFs.

Update the Hi L1B path to normalize epoch_delta_minus and
epoch_delta_plus to float64 before CDF write-out for:
- imap_codice_l1b_hi-omni
- imap_codice_l1b_hi-sectored

Also update the Hi L2 variable-attribute templates so the final products
describe those variables consistently with the written type:
- FILLVAL: real fill sentinel (-1.0e31)
- FORMAT: floating-point format (F19.1)

This keeps the fix split cleanly by responsibility:
- L1B now emits the support variables with the intended dtype
- L2 no longer advertises them with integer fill/format metadata

Test coverage is updated in two layers:

1. L1B metadata regression
   - add a parameterized CDF-level test for hi-omni and hi-sectored
   - assert epoch_delta_minus and epoch_delta_plus are written as
     CDF_DOUBLE with FILLVAL -1.0e31

2. L2 regression and metadata verification
   - keep the existing validation-based L2 science regression tests
   - add a separate parameterized write-path test that generates fresh
     Hi L1B inputs, runs L2, writes the output CDF, and verifies the
     emitted epoch_delta variables are CDF_DOUBLE with the expected
     float fill value

The generated-L1B L2 metadata test is intentional: the checked-in Hi L1B
validation artifacts still predate this dtype fix, so they cannot by
themselves verify the regenerated L1B -> L2 path. Science regression
coverage stays anchored to the historical validation files, while the new
metadata test proves the current pipeline output is correct.

This change is limited to CoDICE Hi epoch_delta support-variable typing
and metadata. It does not refresh the checked-in validation artifacts.
histogram_flag_array:
<<: *lightcurve_defaults
CATDESC: Bad-angle flags for histogram bins
CDF_DATA_TYPE: CDF_UINT1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be uint8? I don't know enough about this key CDF_DATA_TYPE though. This is new one. similar comment for below.

Copy link
Copy Markdown
Collaborator Author

@davidt0x davidt0x May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, CDF_UINT1 is the CDF-side spelling for an unsigned 8-bit integer, so this is the explicit uint8 form for histogram_flag_array. The same idea applies to number_of_bins below: that one is CDF_UINT2 because it needs a uint16 range and the ISTP-recommended fill sentinel is 65535.

Based on the code, I think uint8 is the appropriate data type as well:

  • histogram_flag_array is already produced as np.uint8 in L1B
  • L2 preserves that and explicitly casts the OR-reduced result back to np.uint8

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does adding CDF_DATA_TYPE cause cdflib to cast the variable to this specified type when writing the CDF? I believe that typically, the variable data type informs cdflib what to set the CDF_DATA_TYPE attribute to.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, cdflib.xarray_to_cdf treats CDF_DATA_TYPE as a write-time override when building the CDF variable spec. In this case the data is also preserved as uint8, so we are not relying only on the attr. The regression test opens the written CDF and verifies histogram_flag_array is actually written as CDF_UINT1 with a uint8(255) fill.

Comment thread imap_processing/codice/codice_l1b.py Outdated
l1b_dataset = l1a_dataset.copy(deep=True)

if descriptor in ["hi-omni", "hi-sectored"]:
l1b_dataset = _cast_epoch_delta_vars_to_float64(l1b_dataset)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember @jtniehof and @andrkoval , they want epoch delta to be int type rather than float.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I made epoch_delta_plus / epoch_delta_minus float in the hi-omni and hi-sectored path is consistency with how CoDICE is already handling these variables elsewhere.

These two support variables were already effectively real-valued in other CoDICE products:

  • the CoDICE L1A attr template already defines them with real-valued metadata (FILLVAL = -1.0E31, float FORMAT)
  • the lo-direct-events and hi-direct-events products were already emitting them as CDF_DOUBLE

hi-omni and hi-sectored were the inconsistent pair still carrying them through L1B as integer arrays, which then conflicted with the float-style metadata needed for the written L2 CDFs.

So the change here was not meant to redefine the variable semantics from scratch. It was meant to make hi-omni / hi-sectored line up with the existing CoDICE behavior for the same epoch_delta_* support vars and eliminate the mixed integer/float treatment across products.

Maybe I went the wrong way though, if the CoDICE decision is that epoch_delta_* should remain integer everywhere, I think the cleaner fix would be to make that a broader product-wide change and update the direct-event paths and upstream attrs consistently, rather than leaving hi-omni / hi-sectored as the only products behaving differently.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting... I see this comment. No need to respond to my individual comments regarding this.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can make them integer valued on all the CoDICE products if that is better?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will revert these changes and address epoch_delta inconsistency in another PR.

for species in species_list:
dataset[species].data[half_spin_boundary] = np.nan

for var in ["nso_esa_step", "nso_spin_sector"]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lacoak21 , can you check this one since CoDICE was picking up using specific fillval for their data.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context, with these two variables, the intent is to preserve them as integer support variables in the final L2 product. They are indices / enumerated support values, not continuous measured quantities.

What was happening before is that older packet versions do not include these fields, so missing values were represented with NaN upstream. Once NaN is introduced, NumPy promotes the array to float, and that widened dtype propagated all the way to the written CDF. That is what led to the non-standard FILLVAL / FORMAT issues here.

So the change in this PR is not redefining the variable type. It is restoring the intended type before write-out:

  • replace non-finite values with the integer fill sentinel
  • cast back to uint8
  • write them out as integer support vars again

In other words, this fix is meant to undo float promotion caused by missing-data handling, not to introduce a new convention for these CoDICE variables.

Comment thread imap_processing/glows/l2/glows_l2.py Outdated
fillval = var_attrs["FILLVAL"]
padded = np.full(GlowsConstants.STANDARD_BIN_COUNT, fillval)
padded[:n_bins] = value
value_array = np.asarray(value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxinelasp or @mstrumik , can you guys check if this update looks ok?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context, the GLOWS change is limited to preserving the existing dtype during the daily-lightcurve padding step. histogram_flag_array now stays uint8 instead of widening based on the fill value, while float arrays like photon_flux and ecliptic_lon still remain floating-point. I also added regression checks on the written CDF output for both cases.

@tech3371
Copy link
Copy Markdown
Contributor

Thanks @davidt0x. This PR is looking good in general. I pinged couple people on code changes.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aligns several emitted CDF variable dtypes, FILLVALs, and related metadata with ISTP expectations across L2 products, with added regression tests that inspect written CDF files.

Changes:

  • Updates CDF metadata templates for SWE, SWAPI, IDEX, HIT, MAG, GLOWS, and CoDICE products.
  • Restores/normalizes selected runtime dtypes before CDF write-out for GLOWS and CoDICE.
  • Adds CDF-level metadata regression coverage and ignores generated _version.py.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
.gitignore Ignores generated dynamic-versioning file.
imap_processing/cdf/config/imap_codice_l2-hi-direct-events_variable_attrs.yaml Updates direct-event data_quality fill metadata.
imap_processing/cdf/config/imap_codice_l2-hi-omni_variable_attrs.yaml Updates HI omni epoch delta fill/format metadata.
imap_processing/cdf/config/imap_codice_l2-hi-sectored_variable_attrs.yaml Updates HI sectored epoch delta fill/format metadata.
imap_processing/cdf/config/imap_codice_l2-lo-direct-events_variable_attrs.yaml Updates LO direct-event data_quality and spin_sector metadata.
imap_processing/cdf/config/imap_glows_l2_variable_attrs.yaml Adds explicit unsigned CDF types/fills for GLOWS support variables.
imap_processing/cdf/config/imap_hit_l2_variable_attrs.yaml Updates HIT dynamic threshold fill metadata.
imap_processing/cdf/config/imap_idex_l1b_variable_attrs.yaml Updates IDEX spin phase metadata for floating output.
imap_processing/cdf/config/imap_mag_l2_variable_attrs.yaml Updates MAG vector fill metadata.
imap_processing/cdf/config/imap_swapi_variable_attrs.yaml Updates SWAPI ESA energy fill/format metadata.
imap_processing/cdf/config/imap_swe_l2_variable_attrs.yaml Updates SWE acquisition duration fill metadata.
imap_processing/codice/codice_l1b.py Casts CoDICE HI epoch deltas to float64.
imap_processing/codice/codice_l2.py Restores CoDICE LO NSO support variables to uint8.
imap_processing/glows/l2/glows_l2.py Preserves integer/floating dtypes during GLOWS padding.
imap_processing/tests/codice/test_codice_hi_l2.py Adds regenerated HI L1B/L2 metadata regression tests.
imap_processing/tests/codice/test_codice_l1b.py Adds CoDICE L1B epoch delta metadata checks.
imap_processing/tests/codice/test_codice_l2.py Adds CoDICE L2 CDF dtype/fill checks.
imap_processing/tests/glows/test_glows_l2.py Adds GLOWS dtype/fill regression coverage.
imap_processing/tests/hit/test_hit_l2.py Adds HIT written-CDF metadata assertions.
imap_processing/tests/idex/conftest.py Makes mocked spin phase floating-point.
imap_processing/tests/idex/test_idex_l2a.py Adds IDEX spin phase CDF metadata assertions.
imap_processing/tests/mag/test_mag_l2.py Adds MAG vector CDF metadata assertions.
imap_processing/tests/swapi/test_swapi_l2.py Adds SWAPI ESA energy CDF metadata assertions.
imap_processing/tests/swe/test_swe_l2.py Adds SWE acquisition duration CDF metadata assertions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread imap_processing/tests/mag/test_mag_l2.py Outdated
Comment thread imap_processing/tests/codice/test_codice_hi_l2.py Outdated
davidt0x added 2 commits May 13, 2026 19:05
Clean up the files touched by the ISTP fill-value work so they pass the
current pre-commit hooks without changing the intended product behavior.

- simplify the GLOWS L2 refactor to keep only the shared
  daily-lightcurve bin padding helper extracted
- add complete numpydoc sections for the new CoDICE L1B helper
- shorten test setup in the CoDICE L1B metadata regression to satisfy
  line-length checks

This is a style and maintainability cleanup only. It keeps the
underlying fill-value and dtype fixes intact while making the updated
files pass ruff, numpydoc-validation, and mypy.
Comment thread imap_processing/cdf/config/imap_codice_l2-hi-sectored_variable_attrs.yaml Outdated
DICT_KEY: SPASE>Support>SupportQuantity:DataQuality
FIELDNAM: Data Quality
FILLVAL: *uint8_fillval
FILLVAL: *uint16_fillval
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VALIDMIN/MAX for this variable are for a uint8 data type, so are you sure that it has been changed to a uint16? If so, update VALIDMIN/MAX

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, the written data_quality variable is currently CDF_UINT2, so the FILLVAL has to use the uint16 sentinel 65535. I left VALIDMIN/MAX as 0/255 intentionally because those describe the valid flag values, not the full storage range. I thought the the fill value should remain outside that valid range.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into the direct-events data_quality metadata mismatch as a possible follow-up, but I’m not including it in this PR as it is already getting a bit complex.

What is happening now:

  • in the direct-events L1A builder, data_quality is initialized together with num_events as a (epoch, priority) uint16 array filled with 65535
  • later, the actual data_quality values assigned into that array are just the unpacked suspect bit, so the real data values are only 0 or 1
  • because the backing array is uint16, the written CDF type becomes CDF_UINT2, and the matching L2 metadata/tests on this branch were updated to use FILLVAL = 65535

Why this looks odd:

  • in the rest of CoDICE, data_quality is treated as a simple 0/1 flag with uint8-style metadata (FILLVAL = 255, VALIDMAX = 1)
  • so direct-events are currently an outlier, most likely because data_quality was grouped into the same shared 2D initialization path as num_events, which actually does need a larger integer range

What the cleaner fix would be:

  • keep num_events as uint16
  • split data_quality out so it is initialized as uint8 with fill 255
  • update the direct-events L2 metadata/tests so data_quality writes as CDF_UINT1 again

I verified that the source suspect values are already uint8, so this would be a small structural cleanup rather than a science-algorithm change. I’m leaving it for a separate PR to keep the current branch focused. Also, maybe its better left as uint16 for something downstream? I think I will leave these YAML config changes as it though since we can be confident the VALIDMAX won't exceed 255.

Comment on lines +263 to +264
FILLVAL: *real_fillval
FORMAT: F8.1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a variable that is an index to be an integer type.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. This one is an index semantically, and representing missing/invalid sectors with NaN is what widened it to float. I’ll change this path to use the integer fill sentinel for invalid spin sectors and keep the written variable as an integer type instead of making the metadata follow the float promotion.

histogram_flag_array:
<<: *lightcurve_defaults
CATDESC: Bad-angle flags for histogram bins
CDF_DATA_TYPE: CDF_UINT1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does adding CDF_DATA_TYPE cause cdflib to cast the variable to this specified type when writing the CDF? I believe that typically, the variable data type informs cdflib what to set the CDF_DATA_TYPE attribute to.

Comment on lines 116 to 117
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like these should also be updated.

Comment thread imap_processing/codice/codice_l1b.py Outdated
Comment thread imap_processing/codice/codice_l1b.py Outdated
l1b_dataset = l1a_dataset.copy(deep=True)

if descriptor in ["hi-omni", "hi-sectored"]:
l1b_dataset = _cast_epoch_delta_vars_to_float64(l1b_dataset)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, interesting... I see this comment. No need to respond to my individual comments regarding this.

davidt0x added 5 commits May 14, 2026 17:00
Preserve spin_sector as an unsigned integer index in the LO direct-event
L2 output by replacing invalid sectors with the uint8 fill sentinel instead
of NaN. This prevents NumPy from widening the index variable to float and
keeps the written CDF type/FILLVAL aligned with the variable semantics.

Update the regression assertion to verify spin_sector is written as
CDF_UINT1 with FILLVAL 255.
Update esa_energy VALIDMIN/VALIDMAX to floating-point values so the
metadata is consistent with the CDF_DOUBLE variable type, real-valued
FILLVAL, and float FORMAT.

This preserves the existing 65535 upper bound for now. That value is
probably a legacy/raw-storage bound rather than the most scientifically
meaningful maximum for esa_energy, so a future update should use an
instrument-approved value from the SWAPI scientists.
Remove the CoDICE Hi-only `epoch_delta_minus` / `epoch_delta_plus`
float-casting changes from the ISTP fill-value branch.

This branch had picked up an additional CoDICE-specific workaround that:
- cast `epoch_delta_*` to `float64` in the Hi L1B path
- changed the Hi omni and Hi sectored L2 attr templates to float-style
  `FILLVAL` / `FORMAT`
- added float-specific regression tests around that behavior

That work is no longer intended to ship as part of the fill-value PR.
The final CoDICE `epoch_delta_*` datatype decision will be handled in a
separate follow-up PR.

Revert the branch-local workaround by:
- removing `_cast_epoch_delta_vars_to_float64()` from `codice_l1b.py`
- removing the `hi-omni` / `hi-sectored` call site in
  `process_codice_l1b()`
- restoring the Hi omni and Hi sectored `epoch_delta_*` L2 metadata to
  the current `upstream/dev` state
- deleting the float-specific CoDICE epoch-delta regression tests from
  `test_codice_l1b.py` and `test_codice_hi_l2.py`

No new epoch-delta behavior is introduced here. This change only removes
the extra branch complexity so `fix/istp_fillval` goes back to its
intended scope.

Validated locally with:
- `pytest -q imap_processing/tests/codice/test_codice_l1b.py`
- `pytest -q imap_processing/tests/codice/test_codice_hi_l2.py`
- `pytest -q imap_processing/tests/codice/test_codice_l2.py -k 'lo_de or hi_de or sw_species'`
- `pre-commit` on the touched CoDICE files
Margaret says that 21000 is a better VALIDMAX for esa_energy on SWAPI.
UNITS: eV / q
VALIDMAX: 65535
VALIDMIN: 0
VALIDMAX: 21000.0
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmplummer, I think you had some concerns the VALIDMAX on this one being 65535 (int16 max). I checked in with @mmshaw and she said 21000 would be a better value.

@davidt0x davidt0x requested a review from tmplummer May 15, 2026 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants