Map driven demand parsing by dylanjmcconnell · Pull Request #56 · Open-ISP/isp-trace-parser

dylanjmcconnell · 2026-06-18T01:00:27Z

This PR tries to mirror what has been done to date for the resource mapping (and removes the regex extractors and their tests completely).

There are some slight differences: demand_trace_metadata.py has an extra function compared with the resource trace version, _expand_lookup(), which unpacks the different demand dimensions from the YAML file.

Main changes

New mappings/2024/demand.yaml: includes scenarios (raw AEMO code → IASR display name), poe_levels and demand_types. The subregion axis is sourced from topography.yaml (as previously discussed in ADR-001, and as used with the resource metadata).
New demand_trace_metadata.py with build() and internal _expand_lookup(). The YAML is option-keyed, so _expand_lookup() first expands the dimensions into a stem-keyed dict; build() then resolves each filename via a single dict lookup. Same dict shape / pattern as resource_trace_metadata.build(), namely dict[Path, dict].
demand_traces.py now uses demand_trace_metadata.build() (same pattern as in solar_traces.py / wind_traces.py). It is called once at the top of parse_demand_traces, and the metadata dict is passed down into restructure_demand_file (which now looks up its row instead of regex-parsing the filename).
Deletions: metadata_extractors.py, mappings/2024/demand_scenario_mapping.yaml (folded into demand.yaml), tests/test_trace_file_meta_data_extraction.py.
New tests: tests/test_demand_trace_metadata.py
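
For readers unfamiliar with the "option-keyed YAML expanded into a stem-keyed dict" idea, here is a minimal sketch of what such an expansion could look like. It is not the code in this PR: the mapping contents, the subregion list, the field names in each row and the function name `expand_lookup` are all assumptions made for illustration, and the real demand.yaml schema may differ.

```python
from itertools import product

# Hypothetical, trimmed-down stand-in for the parsed demand.yaml.
# Keys are raw codes as they appear in filenames, values are display names.
demand_mapping = {
    "scenarios": {"STEP_CHANGE": "Step Change", "PROGRESSIVE_CHANGE": "Progressive Change"},
    "poe_levels": {"POE10": "POE10", "POE50": "POE50"},
    "demand_types": {"OPSO_MODELLING": "OPSO_MODELLING"},
}
subregions = ["CNSW", "VIC"]  # assumed to be read from topography.yaml


def expand_lookup(mapping: dict, subregions: list[str]) -> dict[str, dict]:
    """Expand every combination of dimensions into one row per filename stem.

    The key is the stem with the reference year removed, i.e.
    "<subregion>_<scenario>_<poe>_<demand_type>" (assumed layout).
    """
    lookup = {}
    combos = product(
        subregions,
        mapping["scenarios"].items(),
        mapping["poe_levels"].items(),
        mapping["demand_types"].items(),
    )
    for subregion, (scen_code, scen_name), (poe_code, poe), (type_code, dtype) in combos:
        key = f"{subregion}_{scen_code}_{poe_code}_{type_code}"
        lookup[key] = {
            "subregion": subregion,
            "scenario": scen_name,
            "poe": poe,
            "demand_type": dtype,
        }
    return lookup


lookup = expand_lookup(demand_mapping, subregions)
```

Building the full cross product up front means each file later costs one dict lookup, and a filename that does not correspond to any known combination fails loudly instead of being half-parsed.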

Notes:

Demand pipeline now mirrors solar/wind shape (pre-built metadata dict passed via functools.partial).
Same dict shape returned (filename-keyed, with metadata dicts as values); probably to be replaced eventually with a pydantic model.
No remaining imports of metadata_extractors or demand_scenario_mapping.yaml.
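
As a rough illustration of the "pre-built metadata dict passed via functools.partial" shape mentioned above, the sketch below binds a metadata dict to a per-file worker once and then maps it over the files. The worker name `restructure_file`, its return value and the sample filename are hypothetical; the actual restructure_demand_file does more than this.

```python
from functools import partial
from pathlib import Path


def restructure_file(path: Path, file_metadata: dict[Path, dict]) -> dict:
    """Hypothetical per-file worker: looks up its metadata row by path
    instead of regex-parsing the filename."""
    row = file_metadata[path]
    return {"file": path.name, **row}


# Assumed example file and metadata, built once up front.
files = [Path("VIC_RefYear_2018_STEP_CHANGE_POE10_OPSO_MODELLING.csv")]
file_metadata = {
    files[0]: {"subregion": "VIC", "scenario": "Step Change", "reference_year": 2018},
}

# Bind the shared dict once; the resulting callable takes only the path,
# so it can be handed to map() or to a process pool's map.
worker = partial(restructure_file, file_metadata=file_metadata)
results = list(map(worker, files))
```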

Things to come soon:

Remove the output-filename change (currently unnecessary / not used, given hive partitioning and related changes ~6 months ago).
Use typed pydantic models instead of dicts, made somewhat easier by removing / simplifying the filename changes.

… dict) (Note: metadata is no longer derived from the filename inside the function (via regex) but looked up from a stem-keyed dict built once by demand_trace_metadata.build(), so that dict needs to be passed in to the function.) Mirrors the resource trace approach.

codecov · 2026-06-18T01:02:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines	Coverage Δ
src/isp_trace_parser/demand_trace_metadata.py	`100.00% <100.00%> (ø)`
src/isp_trace_parser/demand_traces.py	`97.91% <100.00%> (+3.79%)`	⬆️

nick-gorman

It looks good Dylan. I got into the weeds on readability, but feel free to just ignore.

nick-gorman · 2026-06-18T02:10:23Z

+    for path in files:
+        subregion, sep, after = path.stem.partition("_RefYear_")
+        if not sep:
+            raise ValueError(f"Unexpected trace filename: {path.name}")
+        year_str, _, rest = after.partition("_")
+        key = f"{subregion}_{rest}"
+        if not year_str.isdigit() or not rest or key not in lookup:
+            raise ValueError(f"Unexpected trace filename: {path.name}")
+        file_metadata[path] = {**lookup[key], "reference_year": int(year_str)}
+    return file_metadata


I actually found the code in the for loop pretty hard to understand. This is total overkill, but I was curious on how it might be made clearer, so here's what Claude and I came up with. Please, just treat as a comment for you to take or leave as you please.

Suggested change

-    for path in files:
-        subregion, sep, after = path.stem.partition("_RefYear_")
-        if not sep:
-            raise ValueError(f"Unexpected trace filename: {path.name}")
-        year_str, _, rest = after.partition("_")
-        key = f"{subregion}_{rest}"
-        if not year_str.isdigit() or not rest or key not in lookup:
-            raise ValueError(f"Unexpected trace filename: {path.name}")
-        file_metadata[path] = {**lookup[key], "reference_year": int(year_str)}
-    return file_metadata
+    for path in files:
+        reference_year, dimension_key = _parse_filename(path)
+        if dimension_key not in lookup:
+            raise ValueError(f"Unexpected trace filename: {path.name}")
+        file_metadata[path] = {
+            **lookup[dimension_key],
+            "reference_year": reference_year,
+        }
+    return file_metadata
+
+
+def _parse_filename(path: Path) -> tuple[int, str]:
+    """Split a demand filename into its reference year and dimension key.
+
+    `<subregion>_RefYear_<year>_<rest>` -> `(year, "<subregion>_<rest>")`: the
+    reference year is pulled out and the surviving dimension fields are rejoined
+    into the key that `_expand_lookup` builds.
+    """
+    name = path.stem  # filename minus the .csv suffix
+    subregion, stamp, after = name.partition("_RefYear_")
+    year, _, rest = after.partition("_")
+    if not stamp or not rest or not year.isdigit():
+        raise ValueError(f"Unexpected trace filename: {path.name}")
+    return int(year), f"{subregion}_{rest}"

This also might be clearer. Anyway, I'll stop now.

    def _parse_filename(path: Path) -> tuple[int, str]:
        """Split a demand filename into its reference year and dimension key.

        `<subregion>_RefYear_<year>_<remaining_dimensions>` ->
        `(year, "<subregion>_<remaining_dimensions>")`: the reference year is
        pulled out and the surviving dimension fields are rejoined into the key
        that `_expand_lookup` builds.
        """
        match = re.fullmatch(r"(.+)_RefYear_(\d{4})_(.+)", path.stem)
        if not match:
            raise ValueError(f"Unexpected trace filename: {path.name}")
        subregion, year, remaining_dimensions = match.groups()
        return int(year), f"{subregion}_{remaining_dimensions}"
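
To make the behaviour of the regex variant concrete, here is a small self-contained check of it against a few sample filenames. The filenames are made up for illustration (assumed to follow the `<subregion>_RefYear_<year>_<rest>` layout), and the helper `is_rejected` is hypothetical. One difference worth noting: `\d{4}` insists on exactly four digits, whereas the `isdigit()` check in the partition-based version accepts a year of any length.

```python
import re
from pathlib import Path


def _parse_filename(path: Path) -> tuple[int, str]:
    """Regex variant from the review comment, reproduced so it can be run."""
    match = re.fullmatch(r"(.+)_RefYear_(\d{4})_(.+)", path.stem)
    if not match:
        raise ValueError(f"Unexpected trace filename: {path.name}")
    subregion, year, remaining_dimensions = match.groups()
    return int(year), f"{subregion}_{remaining_dimensions}"


def is_rejected(name: str) -> bool:
    """Hypothetical helper: True if the parser raises ValueError for `name`."""
    try:
        _parse_filename(Path(name))
    except ValueError:
        return True
    return False


# Assumed well-formed example filename.
parsed = _parse_filename(Path("VIC_RefYear_2018_STEP_CHANGE_POE10_OPSO_MODELLING.csv"))

rejected = {
    "no_marker": is_rejected("VIC_2018_STEP_CHANGE_POE10_OPSO_MODELLING.csv"),
    "two_digit_year": is_rejected("VIC_RefYear_18_STEP_CHANGE_POE10_OPSO_MODELLING.csv"),
    "nothing_after_year": is_rejected("VIC_RefYear_2018.csv"),
}
```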

Thanks Nick - yeah think you are right .. will make some changes (..probably what you've suggested)

Address review feedback from Nick (#56) - Simplify the parse loop: drop redundant `if not sep` check - Rename for clarity - Removed synthetic rejoin

dylanjmcconnell · 2026-06-22T01:27:54Z

Good call / catch on the readability @nick-gorman - I started to basically implement your first suggestion (but wanted to keep sep rather than stamp; I think sep is used elsewhere and it made more sense to me than stamp).

But in doing that I realized there was a bit of redundancy in what I had, and ended up tightening / clarifying (hopefully) the loop rather than adding an extra helper function. Specifically:

dropped the first (and redundant) if not sep check (that case is caught, with the same error message, by the isdigit() check on the year)
renamed a couple of vars (e.g. location_prefix / dimensions_suffix) so the two halves read as literal filename slices
switched to a tuple key so the lookup mirrors how resource_trace_metadata keys off the stem (it's only internal, but avoids making an arbitrary key)
updated the docstring

Loop reads as:

    for path in files:
        location_prefix, _, after = path.stem.partition("_RefYear_")
        refyear, _, dimensions_suffix = after.partition("_")
        key = (location_prefix, dimensions_suffix)
        if not refyear.isdigit() or key not in lookup:
            raise ValueError(f"Unexpected trace filename: {path.name}")
        file_metadata[path] = {**lookup[key], "reference_year": int(refyear)}
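
The claim that the separate `if not sep` check is redundant can be checked directly: when `_RefYear_` is absent, `str.partition` returns `(stem, "", "")`, so `refyear` is the empty string and `"".isdigit()` is `False`. The sketch below wraps the loop in a function so it can be run on its own; the wrapper name `build_file_metadata`, the helper `is_rejected`, the one-entry lookup and the sample filenames are assumptions for illustration only.

```python
from pathlib import Path


def build_file_metadata(files: list[Path], lookup: dict[tuple[str, str], dict]) -> dict[Path, dict]:
    """The tightened loop from the comment above, wrapped so it is runnable."""
    file_metadata = {}
    for path in files:
        location_prefix, _, after = path.stem.partition("_RefYear_")
        refyear, _, dimensions_suffix = after.partition("_")
        key = (location_prefix, dimensions_suffix)
        # A missing "_RefYear_" leaves refyear == "", and "".isdigit() is False,
        # so no separate separator check is needed.
        if not refyear.isdigit() or key not in lookup:
            raise ValueError(f"Unexpected trace filename: {path.name}")
        file_metadata[path] = {**lookup[key], "reference_year": int(refyear)}
    return file_metadata


def is_rejected(name: str, lookup: dict) -> bool:
    """Hypothetical helper: True if the loop raises ValueError for `name`."""
    try:
        build_file_metadata([Path(name)], lookup)
    except ValueError:
        return True
    return False


# Assumed one-entry lookup keyed by (location_prefix, dimensions_suffix).
lookup = {
    ("VIC", "STEP_CHANGE_POE10_OPSO_MODELLING"): {"subregion": "VIC", "scenario": "Step Change"},
}
good = Path("VIC_RefYear_2018_STEP_CHANGE_POE10_OPSO_MODELLING.csv")
metadata = build_file_metadata([good], lookup)

missing_marker = is_rejected("VIC_2018_STEP_CHANGE_POE10_OPSO_MODELLING.csv", lookup)
unknown_dimensions = is_rejected("VIC_RefYear_2018_NOT_A_SCENARIO.csv", lookup)
```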

(Noting some of this will change with the eventual move to dataclasses / a pydantic model rather than a plain dict.)

Will merge now, but will keep it in mind and maybe revisit down the track as the 2026 ISP version is added and/or dataclasses are introduced.

dylanjmcconnell added 10 commits June 17, 2026 15:44

Add new mapping file for demand data

e99b396

Update demand yaml to correctly map IASR names

d03195b

Deleted old scenario mapping

0b1e430

Completely removed old regex extractors

f9d4013

Added tests for demand_trace_metadata

7bb55b3

Updated docstring for restructure_demand_file

cddc83e

Minor edit to doc str

e6b4fca

Minor change to doc string

5e7e280

Added demand_trace_metadata.py - similar to resource_trace_metadata.py

cec23b8

dylanjmcconnell requested a review from nick-gorman June 18, 2026 01:00

nick-gorman approved these changes Jun 18, 2026

dylanjmcconnell added 2 commits June 21, 2026 11:07

Simplify and tighten demand filename parsing

efbdbb2

Address review feedback from Nick (#56) - Simplify the parse loop: drop redundant `if not sep` check - Rename for clarity - Removed synthetic rejoin

clarified name (renamed year--> refyear)

126bb9e

dylanjmcconnell merged commit 089e800 into main Jun 22, 2026
18 checks passed

dylanjmcconnell deleted the map-driven-demand-parsing branch June 22, 2026 01:29

dylanjmcconnell merged 12 commits into main from map-driven-demand-parsing
