
Migrate reVeal2ReEDS pipeline to Hourlize #80

Open: kodiobika wants to merge 9 commits into main from ko/reveal2reeds

Conversation

@kodiobika commented May 1, 2026

Contributor

Summary

This PR updates hourlize to include the reVeal2ReEDS pipeline, which replaces EER data center demand projections with user-specified data center demand projections from reVeal or other similarly formatted sources.

Technical details

  • Adds hourlize/reveal2reeds/reveal2reeds.py, which contains the functions that execute the pipeline

Validation, testing, and comparison report(s)

As a test, I ran the updated hourlize workflow with the EER IRA_low scenario, replacing its data center demand projections with the central scenario from the reVeal results. The differences in overall CONUS-wide load over time are below.

[Figure: eer_vs_reveal]

A comparison between the national reference case in main and the national reference case using this updated load profile is below. The overall increase in load results in greater capacity buildout and generation, which is predominantly wind and solar with some batteries and gas.

results-0529_Main_USA_defaults1,0601_r2R_USA_defaults.pptx

Checklist for author

Details to double-check

  • Charge code provided to reviewers
  • Included comparison reports for appropriate test cases
  • Code formatting standardized
  • Reusable functions used where possible instead of copy/pasted code

General information to guide review

  • Zero impact on results of default case
  • No large data file(s) added/modified
  • No substantive impact on runtime for full-US reference case
  • No substantive impact on folder size for full-US reference case
  • No change to process flow (runbatch.py, d_solve_iterate.py)
  • No change to code organization
  • No change to package requirements (environment.yml or Project.toml)

Did you use LLM tools (chatbot or copilot) in the preparation of this PR? If so, describe how

No

Tag points of contact here if you would like additional review of the relevant parts of the model

@kodiobika requested a review from clairehalloran May 4, 2026 17:05
@kodiobika self-assigned this May 4, 2026
@clairehalloran requested a review from SoLaraS2 June 12, 2026 20:05

@SoLaraS2 left a comment

Collaborator

My comments are mostly questions I had while walking through the code. They may be less helpful if the "issues" I raise are outweighed by other considerations that require the code to stay as it is. Looking good though! :D

Comment thread hourlize/load.py
for sector in replace_sectors:
    # Skip 'data centers' sector, as it was already processed above
    if sector == 'Data Centers':
        continue

Collaborator

I don't understand why the data center (DC) replacement can't happen inside this preexisting loop. Is this a coding convention? I don't see a reason on the processing side: it sits at the same level as the existing loop, so we aren't skipping or double-calculating anything. Ignore this if it's just how it has to be :)
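A minimal sketch of what folding the data center case into the loop might look like. The helper functions and the second sector name are hypothetical stand-ins, not names from hourlize/load.py; only `replace_sectors` and the 'Data Centers' label come from the diff above.

```python
# Hypothetical stand-ins for the per-sector replacement logic;
# neither function exists in hourlize/load.py.
def replace_data_centers(sector):
    return f"{sector}: replaced with reVeal projections"


def replace_generic(sector):
    return f"{sector}: replaced with generic profile"


# 'Transportation' is an illustrative second sector, not taken from the PR.
replace_sectors = ["Data Centers", "Transportation"]

results = []
for sector in replace_sectors:
    if sector == "Data Centers":
        # Special case handled inside the loop rather than before it.
        results.append(replace_data_centers(sector))
    else:
        results.append(replace_generic(sector))
```

Whether this is preferable depends on how much data-center-specific setup the real code needs before the loop starts.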

'weather_datetime',
'sector',
'subsector',
'dispatch_feeder'

Collaborator

Actually, the "dispatch_feeder" level is generally summed over. I spoke about this with Anne Hamilton when I first started working with the EER files, so you should be fine to drop it as a level distinction in general. I thought I'd share this bit of info! (It doesn't always make a huge difference, but to get exactly the same results as the current scripts, it would probably be best to sum over it.)
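As a rough illustration of summing over dispatch_feeder, the sketch below groups on the other three levels from the diff. The `value` column name and all numbers are assumptions made up for the example, not the actual EER file layout.

```python
import pandas as pd

# Hypothetical EER-style load table. Level names follow the diff above;
# the 'value' column and every number here are invented for illustration.
df = pd.DataFrame({
    "weather_datetime": pd.to_datetime([
        "2012-01-01 00:00", "2012-01-01 00:00",
        "2012-01-01 01:00", "2012-01-01 01:00",
    ]),
    "sector": ["commercial"] * 4,
    "subsector": ["data center it"] * 4,
    "dispatch_feeder": ["feeder_a", "feeder_b", "feeder_a", "feeder_b"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# Group on every level except dispatch_feeder so its rows are added together.
levels = ["weather_datetime", "sector", "subsector"]
summed = df.groupby(levels, as_index=False)["value"].sum()
```

After the groupby, each timestamp appears once per sector and subsector, and the feeder column is gone.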

# by multiplying the propagation factors by national data
# center demand for the model year.
national_data_center_demand_hourly = pd.DataFrame(
index=df_load['weather_datetime'].drop_duplicates()

Collaborator

I tried to trace where df_load comes from at this point. If it has repeated timestamps, that is likely because of dispatch_feeder, which makes a bigger difference when we consider the rest-of-economy subsectors, so I generally sum them to get unique timestamps rather than dropping duplicates. But if you are sure that there are no duplicates of this nature by this point, what else would make this step necessary?
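The difference between dropping and summing repeated timestamps can be sketched as follows. The frame contents and the `value` column are hypothetical; only `df_load`, `weather_datetime`, and `dispatch_feeder` are names from the PR.

```python
import pandas as pd

# Hypothetical frame with one timestamp repeated across two dispatch feeders.
df_load = pd.DataFrame({
    "weather_datetime": pd.to_datetime([
        "2012-01-01 00:00", "2012-01-01 00:00", "2012-01-01 01:00",
    ]),
    "dispatch_feeder": ["feeder_a", "feeder_b", "feeder_a"],
    "value": [1.0, 2.0, 5.0],
})

# Dropping duplicate timestamps is harmless when only the set of hours is needed,
# for example to build an index.
unique_hours = df_load["weather_datetime"].drop_duplicates()

# Dropping duplicate rows keyed on the timestamp keeps only the first feeder's load.
dropped = (
    df_load.drop_duplicates(subset="weather_datetime")
    .set_index("weather_datetime")["value"]
)

# Summing keeps all of the load while still giving one row per timestamp.
summed = df_load.groupby("weather_datetime")["value"].sum()
```

In this toy data the first hour carries 3.0 after summing but only 1.0 after dropping rows, which is the discrepancy the comment is warning about.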

@clairehalloran left a comment

Collaborator

Thanks for integrating this feature! I tested this on the HPC and it worked for me.

I'd like to see some documentation added about this in a README in hourlize/input/load similar to the one in hourlize/input/config. In particular, it would be good to clarify that this pipeline expects total data center load, and then splits that into IT and cooling profiles based on EER's assumption of an annual PUE of 1.15. (I think I followed that logic correctly, let me know if not.)
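For the README, the annual arithmetic behind that split could be sketched like this. It assumes the standard definition PUE = total facility energy / IT energy and treats all non-IT overhead as cooling; the function name is hypothetical, and the actual pipeline may instead apply the hourly proportions from `cooling_proportions_source`.

```python
PUE = 1.15  # annual power usage effectiveness quoted in the review comment


def split_total_load(total_mwh: float, pue: float = PUE) -> tuple[float, float]:
    """Split total data center energy into IT and cooling parts.

    Assumes PUE = total / IT and that all non-IT overhead is cooling.
    """
    it_mwh = total_mwh / pue
    cooling_mwh = total_mwh - it_mwh
    return it_mwh, cooling_mwh


# 115 MWh of total load splits into roughly 100 MWh IT and 15 MWh cooling.
it_mwh, cooling_mwh = split_total_load(115.0)
```

Under these assumptions the cooling share of total load is 1 - 1/1.15, or about 13%.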

"scenario": "central",
"national_demand_source": "/projects/largeload/geospatial/runs/random_forest_base_weights_01_09_2026/downscaling_2026-01-07_agg64/eer_national_central/eer_national_central_downscaled_projections.csv",
"cooling_proportions_source": "/projects/largeload/reVeal2ReEDS/files/{scenario}_dc_cooling_prop.csv",
"propagation_source": "/projects/largeload/reVeal2ReEDS/files/weather_year_propagation.csv",

Collaborator

@SoLaraS2 do the original cooling_proportions_source and propagation_source files exist on the HPC in the /projects/eerload directory? It would be better to point to those locations than to a location in the largeload project.

"timezone": "Etc/GMT+6",
"regional_scope": "state"
"model_years": [2025, 2030, 2035, 2040, 2045, 2050],
"scenario": "central",

Collaborator

It would be good to add a README in hourlize/inputs/load that explains these config options. In particular, does scenario here refer to EER baseline vs. IRA low vs. 100by2050 (which EER calls "central") or central vs. high data center demand?

3 participants