feat(tests): add HETA 1.2.0 parquet size checks and GeoJSON parity validation #640
Open
ari-nz wants to merge 2 commits into
Pull request overview
Adds end-to-end test updates for HETA 1.2.0 outputs by expanding expected result artifacts to include the new parquet polygon exports and validating parquet↔GeoJSON feature parity.
Changes:
- Extend `SPOT_0_EXPECTED_RESULT_FILES` / `SPOT_1_EXPECTED_RESULT_FILES` to include `tissue_qc`, `tissue_segmentation`, and `cell_classification` parquet outputs (now 12 expected files).
- Update GUI/CLI e2e tests to assert 12 downloaded result files instead of 9.
- Add parquet↔GeoJSON parity assertions by comparing parquet row counts to GeoJSON `features` counts for the three paired outputs.
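The parity assertion in the last item can be sketched roughly as follows. This is a minimal illustration, not the PR's actual test code: the helper names (`count_geojson_features`, `count_parquet_rows`, `assert_parity`) and the `count_rows` hook are hypothetical, the file-name pattern is inferred from the expected-file list, and `pd.read_parquet` needs a parquet engine such as `pyarrow` to be installed.

```python
import json
from pathlib import Path

# Output prefixes that have both a GeoJSON and a parquet polygon export.
PAIRED_OUTPUTS = ("tissue_qc", "tissue_segmentation", "cell_classification")


def count_geojson_features(path: Path) -> int:
    """Return the number of entries in the GeoJSON "features" array."""
    with path.open(encoding="utf-8") as fh:
        return len(json.load(fh)["features"])


def count_parquet_rows(path: Path) -> int:
    """Return the number of rows in a parquet file."""
    import pandas as pd  # imported lazily; requires a parquet engine such as pyarrow

    return len(pd.read_parquet(path))


def assert_parity(results_dir: Path, count_rows=count_parquet_rows) -> None:
    """Assert that each parquet export has as many rows as its GeoJSON twin has features.

    count_rows is a hypothetical hook so the row counter can be swapped out,
    for example when no parquet engine is available.
    """
    for prefix in PAIRED_OUTPUTS:
        geojson_path = results_dir / f"{prefix}_geojson_polygons.json"
        parquet_path = results_dir / f"{prefix}_parquet_polygons.parquet"
        features = count_geojson_features(geojson_path)
        rows = count_rows(parquet_path)
        assert rows == features, (
            f"{prefix}: parquet has {rows} rows but GeoJSON has {features} features"
        )
```

Comparing counts only catches dropped or duplicated polygons; it does not verify that the geometries themselves match.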
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/constants_test.py | Updates expected output file lists and byte-size tolerances to include the three new parquet outputs for both production and staging. |
| tests/aignostics/application/gui_test.py | Adjusts expected result file count to 12 and adds parquet↔GeoJSON parity validation after download. |
| tests/aignostics/application/cli_test.py | Adjusts expected result file count to 12 and adds parquet↔GeoJSON parity validation after execution/download. |
Comment on lines 85 to +97
SPOT_0_EXPECTED_RESULT_FILES = [
    ("tissue_qc_segmentation_map_image.tiff", 1642856, 10),
    ("tissue_qc_geojson_polygons.json", 259955, 10),
    ("tissue_segmentation_geojson_polygons.json", 887003, 10),
    ("readout_generation_slide_readouts.csv", 303217, 10),
    ("readout_generation_cell_readouts.csv", 1658344, 10),
    ("cell_classification_geojson_polygons.json", 11218951, 10),
    ("tissue_segmentation_segmentation_map_image.tiff", 2945078, 10),
    ("tissue_segmentation_csv_class_information.csv", 452, 10),
    ("tissue_qc_csv_class_information.csv", 285, 10),
    ("tissue_qc_segmentation_map_image.tiff", 470150, 10),
    ("tissue_qc_geojson_polygons.json", 171251, 10),
    ("tissue_segmentation_geojson_polygons.json", 185516, 10),
    ("readout_generation_slide_readouts.csv", 300205, 10),
    ("readout_generation_cell_readouts.csv", 2417117, 10),
    ("cell_classification_geojson_polygons.json", 16673412, 10),
    ("tissue_segmentation_segmentation_map_image.tiff", 527264, 10),
    ("tissue_segmentation_csv_class_information.csv", 443, 10),
    ("tissue_qc_csv_class_information.csv", 286, 10),
    ("tissue_qc_parquet_polygons.parquet", 34346, 10),
    ("tissue_segmentation_parquet_polygons.parquet", 39185, 10),
    ("cell_classification_parquet_polygons.parquet", 5476364, 10),
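Each tuple in the list above reads as (file name, expected size in bytes, tolerance). A minimal sketch of how such an entry might be checked, assuming the third element is a percentage tolerance around the expected size; that interpretation and the helper name `size_within_tolerance` are assumptions, not taken from the PR:

```python
def size_within_tolerance(actual_bytes: int, expected_bytes: int, tolerance_percent: float) -> bool:
    """Return True if actual_bytes lies within +/- tolerance_percent of expected_bytes."""
    allowed_deviation = expected_bytes * tolerance_percent / 100
    return abs(actual_bytes - expected_bytes) <= allowed_deviation


# Example using one of the new parquet entries from the list above.
name, expected, tolerance = ("tissue_qc_parquet_polygons.parquet", 34346, 10)
assert size_within_tolerance(34346, expected, tolerance)      # exact match
assert size_within_tolerance(37000, expected, tolerance)      # about 7.7 % larger
assert not size_within_tolerance(40000, expected, tolerance)  # about 16.5 % larger
```

A relative tolerance of this kind lets small run-to-run differences in output size pass while still flagging empty or truncated files.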
Comment on lines +443 to +444
assert len(files_in_results_dir) == 12, (
    f"Expected 12 files in {results_dir}, but found {len(files_in_results_dir)}: "
Comment on lines +1108 to +1109
assert len(files_in_dir) == 12, (
    f"Expected 12 files in {results_dir}, but found {len(files_in_dir)}: {[f.name for f in files_in_dir]}"
bd5f44a to 1a3e050
…_test file count SPOT_0_EXPECTED_RESULT_FILES updated with 3 new parquet artifacts (tissue_qc, tissue_segmentation, cell_classification) from a HETA 1.2.0 run. gui_test updated to assert 12 result files and validate parquet↔GeoJSON row count parity for all 3 paired outputs.
47de64d to 4bf84bb
Summary
Adds validation for the 3 new parquet outputs introduced in HETA 1.2.0 (`tissue_qc`, `tissue_segmentation`, `cell_classification`). `cell_detection` parquet outputs are intentionally excluded as they are being removed from the pipeline.
- Update `SPOT_0_EXPECTED_RESULT_FILES` and `SPOT_1_EXPECTED_RESULT_FILES` to include the 3 new parquet entries (12 files total)
- Update `cli_test.py` and `gui_test.py` to assert 12 result files instead of 9
- Parity check: `len(pd.read_parquet(...))` must equal `len(geojson["features"])` for each paired output

Test plan