Update differ to output MCF files #1998
Code Review
This pull request refactors the import_differ tool to generate diffs in MCF format and a consolidated JSON summary, replacing the previous CSV-based outputs. It introduces a direct runner mode for executing a Java-based differ via subprocess and updates the validation logic to consume these new formats. Feedback from the review highlights a mismatch in the glob pattern used to locate MCF diff files, a regression in defensive error handling during JSON parsing, and confusing logic regarding the runner_mode flag mapping where the local mode triggers the Java runner instead of the native Python implementation.
866f95e to aca06b8
for diff_type in [
        Diff.ADDED.name, Diff.DELETED.name, Diff.MODIFIED.name
]:
    df_type = diff_df[diff_df[Column.diff_type.name] == diff_type]
If the obs nodes are already being filtered by diff type, can we write them to separate MCF files too?
That may help with other analysis, and it would let us drop the diffType property from each node.
Updated to create three separate files: nodes-added.mcf, nodes-deleted.mcf, and nodes-modified.mcf.
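For readers following the thread, here is a rough sketch of what writing one MCF file per diff type could look like. It is not the code from this PR. The `Diff` enum values, the `node_to_mcf` and `write_diff_mcf` helpers, the list-of-dicts input, and the simplified `Node:` / `prop: value` layout are all assumptions made for illustration. Only the three file names come from the reply above.

```python
import enum
from pathlib import Path


class Diff(enum.Enum):
    # Hypothetical stand-in for the differ's Diff enum; values are assumed.
    ADDED = 1
    DELETED = 2
    MODIFIED = 3


def node_to_mcf(node: dict) -> str:
    """Serialize one node as a simplified MCF block.

    The diff_type key is skipped because the file name already encodes it.
    """
    lines = [f"Node: {node['dcid']}"]
    for prop, value in node.items():
        if prop in ("dcid", "diff_type"):
            continue
        lines.append(f"{prop}: {value}")
    return "\n".join(lines)


def write_diff_mcf(nodes: list, output_dir: Path) -> dict:
    """Write nodes-added.mcf, nodes-deleted.mcf, and nodes-modified.mcf.

    Returns a mapping from diff type name to the file path written.
    A diff type with no nodes produces an empty file.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for diff in Diff:
        matching = [n for n in nodes if n["diff_type"] == diff.name]
        body = "\n\n".join(node_to_mcf(n) for n in matching)
        path = output_dir / f"nodes-{diff.name.lower()}.mcf"
        path.write_text(body + "\n" if matching else "")
        paths[diff.name] = path
    return paths
```

Because each file holds a single diff type, downstream analysis can read just the file it needs, and the nodes themselves no longer have to carry a diffType property.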