Skip to content

release: Final edits for v2#40

Open
icweaver wants to merge 17 commits into
mainfrom
fixup
Open

release: Final edits for v2#40
icweaver wants to merge 17 commits into
mainfrom
fixup

Conversation

@icweaver

@icweaver icweaver commented Jun 5, 2026

Copy link
Copy Markdown
Member

Final polish for v2 release based on notes from 05 Jun 2026 office hour:

  1. build: Update ASDF standard 1.2.0 --> 1.6.0
  2. fix: Don't print empty block lists to file
  3. fix: Typo asdf/library --> asdf_library
  4. fix: Warn + preserve unknown tags
  5. feat: Materialize ucs4/ascii array data as string views
  6. fix: Catch ndarray version + float16 incompatibility

Try this PR

  1. Install Julia: https://julialang.org/downloads/

  2. Create a new environment:

    > julia
    
    julia> # Press ] to enter pkg mode
    
    pkg> activate my-env
  3. Install this branch:

    (my-env) pkg> add ASDF#fixup
  4. Load a sample asdf file:

    (my-env) pkg> # Press backspace to return to Julia REPL
    
    julia> using ASDF
    
    julia> load("path to your asdf file")

    Sample asdf files are available here.

    Documentation preview: https://juliaastro.org/ASDF.jl/previews/PR40/

@codecov

codecov Bot commented Jun 5, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.82%. Comparing base (23d2c02) to head (218d524).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #40      +/-   ##
==========================================
+ Coverage   99.79%   99.82%   +0.03%     
==========================================
  Files           1        1              
  Lines         479      583     +104     
==========================================
+ Hits          478      582     +104     
  Misses          1        1              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@icweaver

icweaver commented Jun 5, 2026

Copy link
Copy Markdown
Member Author

Hi @braingram, thanks for meeting with us today! I've started collecting my disorganized notes from office hours here. Please feel free to add/modify the to-do list above. I'll see if I can give you collaborator access too

@icweaver icweaver marked this pull request as draft June 5, 2026 18:07
@icweaver icweaver changed the title build: Update package + spec version release: Final edits for v2 Jun 5, 2026
@icweaver icweaver mentioned this pull request Jun 6, 2026
10 tasks
Base automatically changed from datatype to main June 6, 2026 00:03
@icweaver icweaver added this to the Release v2 milestone Jun 6, 2026
@icweaver icweaver linked an issue Jun 6, 2026 that may be closed by this pull request
10 tasks
@icweaver icweaver force-pushed the fixup branch 3 times, most recently from f512166 to 03cfeed Compare June 7, 2026 16:59
@icweaver

icweaver commented Jun 7, 2026

Copy link
Copy Markdown
Member Author

Okie doke, things look to be working alright for the test files I've been working with, including for Roman Level 3 data products like this one.

I've organized changes into the 6 self-contained commits summarized in the PR description. Do folks see any remaining issues that should be addressed here before making our first v2 release? @eschnett @braingram @cgarling

@icweaver icweaver marked this pull request as ready for review June 7, 2026 17:42
@cgarling

cgarling commented Jun 7, 2026

Copy link
Copy Markdown
Member

I'm afraid I don't really have time to give you a review on this, but I love seeing the progress!

@braingram

braingram commented Jun 7, 2026

Copy link
Copy Markdown
Collaborator

Thanks for putting this together! Testing it with a roman level 2 (cal) file, reading it in and writing it back out with:

af = load("data/r0000101001001001001_0001_wfi01_f158_cal.asdf"; extensions=true, validate_checksum=false)
save("foo.asdf", af)

Attempting to read the file into the python asdf implementation fails for a few reasons:

  • the extensions in history aren't tagged, this causes an exception in python asdf but that's more an issue for the python library: Assumed tag for file metadata leads to error asdf-format/asdf#2063 For maximum compatibility, can these be tagged like the input file?
  • the ndarray tag (1.0.0) doesn't match the ASDF_STANDARD version (1.6.0) for non-float16 arrays. This results in the python library treating the arrays as "tagged" objects and not arrays causing the wcs to fail to deserialize as a float32 array for the polynomial coefficients gets a 1.0.0 tag. Since the file is written with 1.6.0 core schemas can the arrays be tagged at 1.1.0?
  • the file also fails validation due to the formatting of some floats leading them to be read in as strings. For YAML 1.1 the expectation is that the value for e will be prefixed with a sign: https://yaml.org/type/float.html For example:
spatial_x: -5.204446308234682e7

loads in as a string instead of a float. I'm not familiar enough with julia to propose a fix but what would be needed to add a sign before the e value (-5.204446308234682e+7)? EDIT: thanks for fixing my mistake :) Yes the sign should be after the e (before the value).

@icweaver

icweaver commented Jun 8, 2026

Copy link
Copy Markdown
Member Author

This is awesome, thanks for the quick feedback, Brett!

the extensions in history aren't tagged

Ah, I see that my tag handling was too narrow (was only handling custom/unknown tags). I've opened it up to preserve other tags too now (2736a52)

the ndarray tag (1.0.0) doesn't match the ASDF_STANDARD version (1.6.0) for non-float16 arrays

Roger, I've changed the behavior to automatically bump all instances of ndarray-1.0.0 to ndarray-1.1.0 to match the ASDF 1.6.0 spec now. (d50298b).

This seemed like the most straightforward approach to me, but I think I am seeing that the python impl actually keeps existing ndarray-1.0.0 tags unchanged. Is there a preference in behavior?

For YAML 1.1 the expectation is that the e will be prefixed with a sign

oof, that is a def a sharp edge, thanks for pointing this out. I'll sit down and give this some thought tomorrow to make sure I follow the spec. Thanks for the link!

what would be needed to add a sign before the e (-5.204446308234682+e7)?

Just to make sure I am following, did you mean putting the sign after the e, not before, i.e., -5.204446308234682e+7 instead of -5.204446308234682+e7? The latter seems to read as a string for me instead of a float when I try it in the python impl

@braingram

braingram commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Thanks for fixing my mistake on the scientific notation format. I updated the comment above.

@icweaver

icweaver commented Jun 9, 2026

Copy link
Copy Markdown
Member Author

That was an interesting rabbit hole. I think I have a fix in place for the scientific notation now (adcfa53)

I've added some of the lore as comments in the source file for future me / other maintainers, but the basic summary IIUC is:

  • In contrast to Python, Julia does not include a + sign in its sci notation output: 1.0e20 # Outputs 1.0e20
  • This is all well and good under YAML 1.2, but under the YAML 1.1 spec used by the current Python impl of ASDF, this is an invalid format, which will lead to subtle roundtripping bugs when writing/reading files between the two implementations.
  • A few ways around this are:
    1. Parse the final file for e and replace with e+. Easy to rig up for simple files, but can be brittle/slow for more complex data (binaries, comments, etc.).
    2. Hook into YAML.jl's printing for floats to do this replacement for us. This is simple to implement, but would be an example of type piracy.
    3. Go with approach ii, but on our own types. Avoids type piracy, but requires a bit of additional code on our end to write up and maintain.

I went with Option iii, with the hope that the tests and comments included help mitigate the extra complexity introduced. Does this seem reasonable to y'all?

@cgarling

cgarling commented Jun 9, 2026

Copy link
Copy Markdown
Member

Yeah from a design perspective I like extending YAML._print on your own YAMLScalar type, looks like a good solution to me

@braingram

Copy link
Copy Markdown
Collaborator

Thanks! I just testing this with 2 example roman L2 and L3 files and I'm able to:

  • read in the files
  • inspect the tag on the "roman" key to verify the type of file
  • write out the data to a new file
  • read in that new file in python (asdf/roman_datamodels) and validate it without issue!

Excellent work! It's exciting to see this library so far along.

@icweaver

Copy link
Copy Markdown
Member Author

Thanks, Brett!

I've added a couple more bookkeeping thing like a very brief Python <--> Julia interop page and a little bit of clean up on our tree display

I'll give things another week to settle / receive any more feedback from folks before getting this merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Unknown tags and roundtripping Files produce asdf/library key which looks like it should be asdf_library Release version 2.0

3 participants