Migrate reVeal2ReEDS pipeline to Hourlize#80
SoLaraS2
left a comment
These comments are mostly questions I had while walking through the code. They may be less helpful if the "issues" I raised are outweighed by other constraints that require the code to stay as it is. Looking good though! :D
| for sector in replace_sectors: | ||
| # Skip 'data centers' sector, as it was already processed above | ||
| if sector == 'Data Centers': | ||
| continue |
I don't understand why the DC replacement can't happen inside this preexisting loop. Is this a coding convention? I don't see a reason on the processing side: it's at the same level as the existing loop, so we aren't skipping or double-calculating anything. Ignore if this is just how it has to be :)
| 'weather_datetime', | ||
| 'sector', | ||
| 'subsector', | ||
| 'dispatch_feeder' |
Actually, the "dispatch_feeder" level is generally summed over. I spoke about this with Anne Hamilton when I first started working with the EER files, so you should be fine to drop it as a level distinction in general. I thought I'd share this bit of info! (It doesn't always make a huge difference, but to get exactly the same results as the current scripts, it'd probably be best to sum over it.)
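A minimal sketch of what summing over dispatch_feeder could look like, assuming a long-format frame with the four columns from the snippet above plus a value column. The column name `load_mw` and the toy values are hypothetical, not taken from the PR:

```python
import pandas as pd

# Toy long-format load data; 'load_mw' is a hypothetical value column name.
df_load = pd.DataFrame({
    'weather_datetime': pd.to_datetime(
        ['2012-01-01 00:00'] * 2 + ['2012-01-01 01:00'] * 2
    ),
    'sector': ['commercial'] * 4,
    'subsector': ['data centers'] * 4,
    'dispatch_feeder': ['feeder_a', 'feeder_b'] * 2,
    'load_mw': [1.0, 2.0, 3.0, 4.0],
})

# Group by every level except dispatch_feeder so the feeders are summed together.
group_cols = ['weather_datetime', 'sector', 'subsector']
df_summed = df_load.groupby(group_cols, as_index=False)['load_mw'].sum()
```

After this step each (timestamp, sector, subsector) combination appears once, and dispatch_feeder is no longer a column.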
| # by multiplying the propagation factors by national data | ||
| # center demand for the model year. | ||
| national_data_center_demand_hourly = pd.DataFrame( | ||
| index=df_load['weather_datetime'].drop_duplicates() |
I tried to trace where df_load comes from at this point. If it has repeated timestamps, it's likely because of dispatch_feeder, which makes a bigger difference when we consider the rest-of-economy subsectors, so I generally sum them to get unique timestamps rather than dropping duplicates. But if you are absolutely sure there are no duplicates of this nature by this point, what else would make this step necessary?
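To illustrate the difference between the two approaches, here is a small sketch on toy data (the `load_mw` column and the values are hypothetical). In the snippet above, drop_duplicates is only used to build an index, so no load is lost there; the difference matters wherever load values are carried along with the timestamps:

```python
import pandas as pd

# Two feeders reporting for each of two hours; 'load_mw' is a hypothetical column.
df_load = pd.DataFrame({
    'weather_datetime': pd.to_datetime([
        '2012-01-01 00:00', '2012-01-01 00:00',
        '2012-01-01 01:00', '2012-01-01 01:00',
    ]),
    'dispatch_feeder': ['feeder_a', 'feeder_b', 'feeder_a', 'feeder_b'],
    'load_mw': [1.0, 2.0, 3.0, 4.0],
})

# Dropping duplicate timestamps keeps only the first row for each hour.
dropped = df_load.drop_duplicates(subset='weather_datetime')

# Summing by timestamp keeps the load from every feeder.
summed = df_load.groupby('weather_datetime', as_index=False)['load_mw'].sum()

total_dropped = dropped['load_mw'].sum()
total_summed = summed['load_mw'].sum()
```

Both results have unique timestamps, but only the summed version preserves the total load across feeders.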
clairehalloran
left a comment
Thanks for integrating this feature! I tested this on the HPC and it worked for me.
I'd like to see some documentation added about this in a README in hourlize/input/load similar to the one in hourlize/input/config. In particular, it would be good to clarify that this pipeline expects total data center load, and then splits that into IT and cooling profiles based on EER's assumption of an annual PUE of 1.15. (I think I followed that logic correctly, let me know if not.)
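For the README, the PUE logic could be illustrated with something like the sketch below. It uses the standard definition PUE = total facility energy / IT energy and assumes, for illustration only, that all non-IT energy is attributed to cooling and that the split is applied to an annual total. The function name is hypothetical, and the actual pipeline reads cooling proportions from cooling_proportions_source rather than applying a single constant:

```python
PUE = 1.15  # annual power usage effectiveness cited in the comment above


def split_total_load(total_mwh, pue=PUE):
    """Split total data center energy into (IT, cooling) using PUE = total / IT.

    Assumes all non-IT overhead is cooling, which is a simplification.
    """
    it_mwh = total_mwh / pue
    cooling_mwh = total_mwh - it_mwh
    return it_mwh, cooling_mwh


# 115 MWh of total load at a PUE of 1.15 is about 100 MWh IT and 15 MWh cooling.
it_mwh, cooling_mwh = split_total_load(115.0)
```

Under these assumptions, cooling is 1 - 1/1.15, or roughly 13%, of total data center load.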
| "scenario": "central", | ||
| "national_demand_source": "/projects/largeload/geospatial/runs/random_forest_base_weights_01_09_2026/downscaling_2026-01-07_agg64/eer_national_central/eer_national_central_downscaled_projections.csv", | ||
| "cooling_proportions_source": "/projects/largeload/reVeal2ReEDS/files/{scenario}_dc_cooling_prop.csv", | ||
| "propagation_source": "/projects/largeload/reVeal2ReEDS/files/weather_year_propagation.csv", |
@SoLaraS2 do the original cooling_proportions_source and propagation_source files exist on the HPC in the /projects/eerload directory? It would be better to point to those locations than a location in the largeloads project.
| "timezone": "Etc/GMT+6", | ||
| "regional_scope": "state" | ||
| "model_years": [2025, 2030, 2035, 2040, 2045, 2050], | ||
| "scenario": "central", |
It would be good to add a README in hourlize/inputs/load that explains these config options. In particular, does scenario here refer to EER baseline vs. IRA low vs. 100by2050 (which EER calls "central") or central vs. high data center demand?
Summary
This PR updates hourlize to include the reVeal2ReEDS pipeline, which replaces EER data center demand projections with user-specified data center demand projections from reVeal or other similarly formatted sources.
Technical details
hourlize/reveal2reeds/reveal2reeds.py, which contains the functions that execute the pipeline
Validation, testing, and comparison report(s)
As a test, I ran the updated hourlize workflow with the EER IRA_low scenario, replacing its data center demand projections with the central scenario from the reVeal results. The differences in overall CONUS-wide load over time are below.
A comparison between the national reference case in main and the national reference case using this updated load profile is below. The overall increase in load results in greater capacity buildout and generation, which is predominantly wind and solar with some batteries and gas.
results-0529_Main_USA_defaults1,0601_r2R_USA_defaults.pptx
Checklist for author
Details to double-check
General information to guide review
Did you use LLM tools (chatbot or copilot) in the preparation of this PR? If so, describe how
No
Tag points of contact here if you would like additional review of the relevant parts of the model