-
Notifications
You must be signed in to change notification settings - Fork 34
Fix ISTP validation FILLVAL errors #3191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Changes from all commits
19148a0
faf20af
c2d1cc7
19c6c36
3acd5ed
d3a64b6
e650200
0c4f750
8b01919
2ace40d
5cb9124
0570a7f
cde1dba
d93f105
20e4bfa
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -483,6 +483,7 @@ flux_uncertainties: | |
| histogram_flag_array: | ||
| <<: *lightcurve_defaults | ||
| CATDESC: Bad-angle flags for histogram bins | ||
| CDF_DATA_TYPE: CDF_UINT1 | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should this be uint8? I don't know enough about this key
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, Based on the code, I think uint8 is the appropriate data type as well:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So does adding CDF_DATA_TYPE cause
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, cdflib.xarray_to_cdf treats CDF_DATA_TYPE as a write-time override when building the CDF variable spec. In this case the data is also preserved as uint8, so we are not relying only on the attr. The regression test opens the written CDF and verifies histogram_flag_array is actually written as CDF_UINT1 with a uint8(255) fill. |
||
| DICT_KEY: SPASE>Support>SupportQuantity:DataQuality | ||
| FIELDNAM: Bad-angle flags for histogram | ||
| FILLVAL: 255 | ||
|
|
@@ -522,9 +523,10 @@ ecliptic_lat: | |
| number_of_bins: | ||
| <<: *support_data_defaults | ||
| CATDESC: Number of bins in histogram | ||
| CDF_DATA_TYPE: CDF_UINT2 | ||
| DICT_KEY: SPASE>Support>SupportQuantity:Other | ||
| FIELDNAM: Number of bins in histogram | ||
| FILLVAL: -9223372036854775808 | ||
| FILLVAL: *max_uint16 | ||
| FORMAT: I4 | ||
| LABLAXIS: No. of bins | ||
| UNITS: ' ' | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -107,14 +107,14 @@ esa_energy: | |
| DEPEND_1: esa_step | ||
| DICT_KEY: SPASE>Particle>ParticleType:Ion,ParticleQuantity:EnergyPerCharge | ||
| FIELDNAM: ESA Energy | ||
| FILLVAL: -9223372036854775808 | ||
| FORMAT: I5 | ||
| FILLVAL: -1.0000000E+31 | ||
| FORMAT: F8.1 | ||
| LABLAXIS: Energy(eV) | ||
| LABL_PTR_1: esa_step_label | ||
| SCALETYP: linear | ||
| UNITS: eV / q | ||
| VALIDMAX: 65535 | ||
| VALIDMIN: 0 | ||
| VALIDMAX: 21000.0 | ||
|
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @tmplummer, I think you had some concerns the VALIDMAX on this one being 65535 (int16 max). I checked in with @mmshaw and she said 21000 would be a better value. |
||
| VALIDMIN: 0.0 | ||
| VAR_TYPE: support_data | ||
|
|
||
| metadata_default: &metadata_default | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -563,6 +563,20 @@ def process_lo_species_intensity( | |
| for species in species_list: | ||
| dataset[species].data[half_spin_boundary] = np.nan | ||
|
|
||
| for var in ["nso_esa_step", "nso_spin_sector"]: | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @lacoak21 , can you check this one since CoDICE was picking up using specific fillval for their data.
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For context, with these two variables, the intent is to preserve them as integer support variables in the final L2 product. They are indices / enumerated support values, not continuous measured quantities. What was happening before is that older packet versions do not include these fields, so missing values were represented with So the change in this PR is not redefining the variable type. It is restoring the intended type before write-out:
In other words, this fix is meant to undo float promotion caused by missing-data handling, not to introduce a new convention for these CoDICE variables. |
||
| if var in dataset: | ||
| fillval = dataset[var].attrs["FILLVAL"] | ||
| restored_values = dataset[var].data.astype(np.float64, copy=True) | ||
| restored_values = np.nan_to_num( | ||
| restored_values, nan=fillval, posinf=fillval, neginf=fillval | ||
| ) | ||
| restored_values = np.clip(np.rint(restored_values), 0, 255).astype(np.uint8) | ||
| dataset[var] = xr.DataArray( | ||
| restored_values, | ||
| dims=dataset[var].dims, | ||
| attrs=dataset[var].attrs, | ||
| ) | ||
|
|
||
| return dataset | ||
|
|
||
|
|
||
|
|
@@ -1093,6 +1107,10 @@ def process_lo_direct_events(dependencies: ProcessingInputCollection) -> xr.Data | |
| l2_dataset["position"].dims, | ||
| elevation_angle.astype(np.float32), | ||
| ) | ||
| spin_sector_attrs = cdf_attrs.get_variable_attributes( | ||
| "spin_sector", check_schema=False | ||
| ) | ||
| spin_sector_fillval = np.uint8(spin_sector_attrs["FILLVAL"]) | ||
| # Convert spin_sector to spin_angle in degrees | ||
| # Use equation from section 11.2.2 of algorithm document | ||
| # Shift all spin sectors for all positions 13 - 24 adding 12 and mod 24 | ||
|
|
@@ -1104,13 +1122,16 @@ def process_lo_direct_events(dependencies: ProcessingInputCollection) -> xr.Data | |
| ) | ||
| l2_dataset["spin_angle"] = l2_dataset["spin_sector"].astype(np.float32) * 15.0 + 7.5 | ||
|
|
||
| # Set spin angle and sector to NaN for invalid positions (>23) | ||
| # Preserve spin_sector as an integer index while marking invalid sectors. | ||
| invalid_spin_sector = ~np.isfinite(original_spin_sector) | ( | ||
| original_spin_sector > 23 | ||
| ) | ||
| l2_dataset["spin_angle"] = xr.where( | ||
| (original_spin_sector > 23), np.nan, l2_dataset["spin_angle"] | ||
| invalid_spin_sector, np.nan, l2_dataset["spin_angle"] | ||
| ) | ||
| l2_dataset["spin_sector"] = xr.where( | ||
| (original_spin_sector > 23), np.nan, l2_dataset["spin_sector"] | ||
| ) | ||
| invalid_spin_sector, spin_sector_fillval, l2_dataset["spin_sector"] | ||
| ).astype(np.uint8) | ||
| # convert apd energy to physical units | ||
| # Set the gain labels based on gain values | ||
| gains = l2_dataset["gain"].values.ravel() | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VALIDMIN/MAX for this variable are for a
uint8data type, so are you sure that it has been changed to a uint16? If so, update VALIDMIN/MAXThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, the written
data_qualityvariable is currentlyCDF_UINT2, so theFILLVALhas to use theuint16sentinel65535. I leftVALIDMIN/MAXas0/255intentionally because those describe the valid flag values, not the full storage range. I thought the the fill value should remain outside that valid range.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into the direct-events
data_qualitymetadata mismatch as a possible follow-up, but I’m not including it in this PR as it is already getting a bit complex.What is happening now:
data_qualityis initialized together withnum_eventsas a(epoch, priority)uint16array filled with65535data_qualityvalues assigned into that array are just the unpackedsuspectbit, so the real data values are only0or1uint16, the written CDF type becomesCDF_UINT2, and the matching L2 metadata/tests on this branch were updated to useFILLVAL = 65535Why this looks odd:
data_qualityis treated as a simple0/1flag withuint8-style metadata (FILLVAL = 255,VALIDMAX = 1)data_qualitywas grouped into the same shared 2D initialization path asnum_events, which actually does need a larger integer rangeWhat the cleaner fix would be:
num_eventsasuint16data_qualityout so it is initialized asuint8with fill255data_qualitywrites asCDF_UINT1againI verified that the source
suspectvalues are alreadyuint8, so this would be a small structural cleanup rather than a science-algorithm change. I’m leaving it for a separate PR to keep the current branch focused. Also, maybe its better left as uint16 for something downstream? I think I will leave these YAML config changes as it though since we can be confident the VALIDMAX won't exceed 255.