3 changes: 3 additions & 0 deletions .gitignore
@@ -53,3 +53,6 @@ slurm*.out
 # plots
 **/plots
 zones/*/*.gpkg
+
+# state_policies generated outputs (regenerated by data_processing.py;
+state_policies/outputs/
107 changes: 107 additions & 0 deletions state_policies/README.md
@@ -0,0 +1,107 @@
# Overview

This folder produces three CSV files that ReEDS consumes as state-level
renewable / clean energy policy inputs:

| Output | Description |
| -------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `outputs/rps_fraction.csv`       | Required RPS fraction of retail sales by state and year, plus voluntary RPS and Nova Scotia (NS) rows. Columns: `t, st, rps_all, rps_solar, rps_wind`. |
| `outputs/ces_fraction.csv`       | Required CES fraction of retail sales by state and year. Columns: `*t, st, Value`. |
| `outputs/hydrofrac_policy.csv`   | State-level fraction of existing hydro / non-RE generation that already counts toward each state's RPS_All and CES targets. Columns: `*st, RPS_All, CES`. |

Both `rps_fraction.csv` and `ces_fraction.csv` are produced as piecewise-linear
ramps between policy "change points", so the year-over-year trajectory is smooth
even when the underlying LBNL data jumps (for example when a state has a 2050
target with otherwise flat interim values). The un-interpolated values are
saved in `outputs/intermediate outputs/` for diagnostic purposes and are used
by the comparison plots described below.
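
As a rough illustration of the ramping idea, the sketch below linearly interpolates between change points in a raw series. It is not the `piecewise_interpolate` function from `data_processing.py`, whose body is not shown here. The function name `ramp_between_change_points`, the change-point rule (a year whose value differs from the previous year by more than `tolerance`), and the sample values are all assumptions made for the example.

```python
def ramp_between_change_points(values_by_year, tolerance=1e-9):
    """Linearly interpolate between years where the raw value changes.

    Hypothetical helper for illustration only; not the script's own logic.
    """
    years = sorted(values_by_year)
    # The first year is always a change point; later years qualify only when
    # the value moves by more than `tolerance` relative to the previous year.
    change_points = [years[0]]
    for prev, year in zip(years, years[1:]):
        if abs(values_by_year[year] - values_by_year[prev]) > tolerance:
            change_points.append(year)

    ramped = {}
    for year in years:
        # Locate the change points that bracket this year.
        left = max(cp for cp in change_points if cp <= year)
        later = [cp for cp in change_points if cp > year]
        if not later:
            # Hold the value flat after the last change point.
            ramped[year] = values_by_year[left]
            continue
        right = later[0]
        weight = (year - left) / (right - left)
        ramped[year] = values_by_year[left] + weight * (
            values_by_year[right] - values_by_year[left]
        )
    return ramped


# Made-up raw series: flat at 0.2, then a jump to 0.7 in 2030.
raw = {2025: 0.2, 2026: 0.2, 2027: 0.2, 2028: 0.2, 2029: 0.2, 2030: 0.7}
ramped = ramp_between_change_points(raw)
for year in sorted(ramped):
    print(year, round(ramped[year], 6))
```

With the sample data, the flat 0.2 followed by a jump to 0.7 in 2030 becomes an even ramp of 0.1 per year from 2025 to 2030.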

# Running the script

From this folder:

```
python data_processing.py
```

The script reads everything from `inputs/` and writes the three CSVs above to
`outputs/`, plus diagnostic intermediates to `outputs/intermediate outputs/`.

# Input files

All located in `inputs/`:

| Input | Description |
| --- | --- |
| `RPS & CES Targets and Demand_June 2026.xlsx` | Annual LBNL state RPS / CES dataset, provided by Galen Barbose. Source: https://emp.lbl.gov/projects/renewables-portfolio. The script reads three sheets from this file: `Statewide Sales`, `RPS & CES Demand (GWh)`, and `Non-RE Accounting`. Note: the `Non-RE Accounting` sheet is not in the public workbook. Galen sends it separately, and we paste it into this file as the `Non-RE Accounting` tab so the workbook is self-contained. |
| `nrel-green-power-data-v2024.xlsx` | NLR Green Power Data (formerly NREL), used for the voluntary RPS row. Source: https://www.nlr.gov/analysis/voluntary-power-procurement. |
| `RPS_nonUS.csv` | Non-US RPS data (currently only Nova Scotia, `NS`). |
| `hierarchy.csv` | Region hierarchy from a recent ReEDS Reference run, used to map BAs to states for the hydrofrac calculation. |

Review comment (Contributor): Would you give a version number or date rather than "recent" to make things clearer down the road?

| `gen_ann.csv` | Annual generation by tech and BA from a recent ReEDS Reference run, used to compute hydro / non-RE shares for the hydrofrac calculation. |

Review comment (Contributor): Same here.
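
Because the script reads every input from `inputs/`, a quick pre-flight check can catch a missing or misnamed file before a run. The sketch below is a hypothetical helper, not part of `data_processing.py`; the function name `missing_inputs` is an assumption, and the file names are the ones listed in the input table above.

```python
from pathlib import Path

# File names as listed in the input table; update them alongside the
# `filename` and `filename_voluntary` parameters when inputs change.
EXPECTED_INPUTS = [
    "RPS & CES Targets and Demand_June 2026.xlsx",
    "nrel-green-power-data-v2024.xlsx",
    "RPS_nonUS.csv",
    "hierarchy.csv",
    "gen_ann.csv",
]


def missing_inputs(inputs_dir="inputs", expected=EXPECTED_INPUTS):
    """Return the expected file names that are absent from `inputs_dir`."""
    folder = Path(inputs_dir)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input files:", ", ".join(missing))
    else:
        print("All expected input files are present.")
```

Run it from `state_policies/` so that the relative `inputs` path resolves the same way it does for `data_processing.py`.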


# Annual update procedure

The LBNL state RPS / CES dataset is refreshed roughly once per year. To pick up
a new release:

1. Download the new LBNL RPS dataset from
https://emp.lbl.gov/projects/renewables-portfolio and place it in
`inputs/` (e.g. `RPS & CES Targets and Demand_June 2026.xlsx`). To get the `Non-RE Accounting`
sheet, request it from Galen Barbose and paste it back into the workbook
as a sheet named `Non-RE Accounting`.
2. Open `data_processing.py` and update the parameters at the top of the file:
- `filename` — point to the new file.
- `Salessheetname`, `Salessheet_usecols`, `Salessheet_skiprows`, `Salessheet_nrows` — match the new sheet's name and header layout.
- `RPSsheetname`, `RPSsheet_usecols`, `RPSsheet_skiprows`, `RPSsheet_nrows` — match the new sheet's name and header layout.
- `Hydrosheet_*` — re-check the row counts on the `Non-RE Accounting` sheet (e.g. whether a totals row was added at the bottom of the RPS block).
- Confirm `hydro_year` is still appropriate for the latest ReEDS Reference run.
3. Check https://www.nlr.gov/analysis/voluntary-power-procurement for a newer
NLR Green Power Data release (e.g. `nrel-green-power-data-v2025.xlsx`). If a
newer file exists, download it into `inputs/`, update `filename_voluntary` in
`data_processing.py`, and bump `Voluntarysheet_nrows` to match the number of
historical years in the new file. Also update the hardcoded `<= 2024` year
cap in the voluntary projection block (search for `rps_series[rps_series['t'] <= 2024]`) to the new last historical year. The NLR file is updated on a
different schedule from the LBNL workbook, so this step is independent and
may be skipped if no newer NLR release is available.
4. (Optional, but recommended for PR review) Stage the previous run's outputs
for comparison. From `state_policies/`:
```
copy outputs\rps_fraction.csv "old and new data comparison\old ReEDS input\rps_fraction0.csv"
copy outputs\ces_fraction.csv "old and new data comparison\old ReEDS input\ces_fraction0.csv"
copy outputs\hydrofrac_policy.csv "old and new data comparison\old ReEDS input\hydrofrac_policy0.csv"
```
5. Run the processing script:
```
python data_processing.py
```
6. Generate before/after comparison plots:
```
cd "old and new data comparison"
python generate_comparison_plots.py
```

This writes six PNGs to `old and new data comparison/plots/`:
`rps_all_comparison.png`, `rps_solar_comparison.png`,
`rps_wind_comparison.png`, `ces_fraction_comparison.png`,
`hydrofrac_RPS_All_comparison.png`, and
`hydrofrac_CES_comparison.png`. Attach these to the PR so reviewers can see
what changed.
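
The voluntary RPS projection touched in step 3 can be sketched as follows. The diff to `data_processing.py` shows only that the script filters the historical series to the last historical year, takes the smallest year-over-year change (`rps_series['rps_all'].diff().min()`), and records the last year. How it extends the series to 2050 is not shown, so the linear extension below, the function name `project_voluntary`, and the sample values are assumptions.

```python
def project_voluntary(history, last_historical_year, end_year=2050):
    """Extend a historical series using its smallest year-over-year change.

    Hypothetical sketch: `history` maps year -> voluntary RPS fraction.
    """
    # Mirror the hardcoded year cap: ignore anything after the last
    # historical year.
    capped = {y: v for y, v in history.items() if y <= last_historical_year}
    years = sorted(capped)

    # Smallest year-over-year change, the counterpart of `.diff().min()`.
    min_growth = min(capped[b] - capped[a] for a, b in zip(years, years[1:]))

    last_year = years[-1]
    last_value = capped[last_year]

    projected = dict(capped)
    for year in range(last_year + 1, end_year + 1):
        # Assumption: add the same increment every year after `last_year`.
        projected[year] = last_value + min_growth * (year - last_year)
    return projected


# Made-up historical values, for illustration only.
history = {2021: 0.040, 2022: 0.046, 2023: 0.049, 2024: 0.054}
projected = project_voluntary(history, last_historical_year=2024)
print(round(projected[2025], 6), round(projected[2050], 6))
```

In this sketch the smallest change can be negative, in which case the extension would decline over time. Forgetting to raise the year cap after adding a new historical year would simply drop that year from the historical series.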

# Output files

The `outputs/` folder is **generated by `data_processing.py` and is not tracked
in git** (see `.gitignore`). The canonical copies of the three files below live
in the ReEDS repository after the annual update PR is merged; the diagnostic
intermediates are only needed locally to regenerate the comparison plots.

Files that get copied into ReEDS (`inputs/state_policies/` in the ReEDS repo):

* `rps_fraction.csv`
* `ces_fraction.csv`
* `hydrofrac_policy.csv`

Diagnostic / intermediate files (not used directly by ReEDS) live in
`outputs/intermediate outputs/`:

* `rps_fraction_intermediate.csv` — RPS fractions before piecewise interpolation.
* `ces_fraction_intermediate.csv` — CES fractions before piecewise interpolation.
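
The CSVs above are written with the `_format_value` helper in `data_processing.py` passed as the `float_format` argument to `to_csv`. The stand-alone sketch below mirrors that helper to show what the formatted values look like. It swaps `pd.isna` for `math.isnan` so it runs without pandas; the name `format_value` and the sample values are assumptions for the example.

```python
import math


def format_value(x):
    """Stand-in for `_format_value`: 6 decimals, but exact 0 and 1 stay short."""
    # The script uses pd.isna here; math.isnan is a pandas-free substitute.
    if x is None or math.isnan(x):
        return ''
    if x == 0:
        return '0'
    if x == 1:
        return '1'
    return f'{x:.6f}'


samples = [0.0, 1.0, 0.25, 0.123456789, float('nan')]
formatted = [format_value(v) for v in samples]
print(formatted)  # ['0', '1', '0.250000', '0.123457', '']
```

Only values exactly equal to 0 or 1 are shortened; a value such as 0.9999999 is written as `1.000000`.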
55 changes: 34 additions & 21 deletions state_policies/data_processing.py
@@ -7,47 +7,59 @@

 print("...Starting data processing for RPS, CES and hydrofrac...")

+
+### Format numeric values written to CSV: 6 decimal places, except exactly 1 -> "1"
+### and exactly 0 -> "0". Used as the `float_format` argument to `DataFrame.to_csv`.
+def _format_value(x):
+    if pd.isna(x):
+        return ''
+    if x == 0:
+        return '0'
+    if x == 1:
+        return '1'
+    return f'{x:.6f}'
+
 ### ===========================================
 ### ===========Load Input Data=================
 ### ===========================================

-### Input `RPS data for NREL_June 2025.xlsx` file and convert it to a DataFrame
+### Input `RPS & CES Targets and Demand_June 2026.xlsx` file and convert it to a DataFrame
 ### If update the input file, please make sure the below table parameters are updated accordingly.
 ### This file is provided annually by Galen Barbose at LBNL
 ### ----------------------------------------------------------------------------

-filename = os.path.join("inputs", "RPS data for NREL_June 2025.xlsx")
+filename = os.path.join("inputs", "RPS & CES Targets and Demand_June 2026.xlsx")

-# Statewide Load sheet as sales data and RPS & CES Demand Projections sheet for RPS and CES data
-# are used for capculate `rps_fraction` and `ces_fraction`.
-Salessheetname = "Statewide Load"
-Salessheet_usecols = "B:BC"
-Salessheet_skiprows = 5
+# `Statewide Sales` sheet as sales data and `RPS & CES Demand (GWh)` sheet for RPS and CES data
+# are used to calculate `rps_fraction` and `ces_fraction`.
+Salessheetname = "Statewide Sales"
+Salessheet_usecols = "A:BB"
+Salessheet_skiprows = 19
 Salessheet_nrows = 52

-RPSsheetname = "RPS & CES Demand Projections"
+RPSsheetname = "RPS & CES Demand (GWh)"
 RPSsheet_usecols = "A:BB"
-RPSsheet_skiprows = 2
-RPSsheet_nrows = 97
+RPSsheet_skiprows = 28
+RPSsheet_nrows = 96

-# Hydro sheet is used for hydrofrac calculation.
+### Hydro / Non-RE Accounting input
 Hydrosheetname = "Non-RE Accounting"

 Hydrosheet_RPS_usecols = "A:E"
 Hydrosheet_RPS_skiprows = 2
-Hydrosheet_RPS_nrows = 33
+Hydrosheet_RPS_nrows = 32

 Hydrosheet_CES_usecols = "A:D"
 Hydrosheet_CES_skiprows = 39
 Hydrosheet_CES_nrows = 16

-### Input voluntary RPS data which is downloaded from NLR Green Power Data
+### Input voluntary RPS data which is downloaded from NLR Voluntary Power Procurement website
 ### If update the input file, please make sure the below table parameters are updated accordingly.
-### https://www.nlr.gov/analysis/green-power
+### https://www.nlr.gov/analysis/voluntary-power-procurement
 ### -----------------------------------------------------------------------------

 # These data are used to calculate voluntary RPS fraction and will be appended to `rps_fraction.csv`
-filename_voluntary = os.path.join("inputs", "nrel-green-power-data-v2023.xlsx")
+filename_voluntary = os.path.join("inputs", "nrel-green-power-data-v2024.xlsx")
 Voluntarysheetname = "Marketwide Estimates"
 Voluntarysheet_usecols = "A:C"
 Voluntarysheet_skiprows = 2
@@ -173,7 +185,7 @@ def calculate_solar_wind(RPS_reshaped, In_Retail_Sales, tech):
 RPStarget.columns = ['t', 'st', 'rps_all', 'rps_solar', 'rps_wind']

 # Append voluntary RPS data
-# Use 2010-2023 data as historical data
+# Use 2010-2024 data as historical data
 # and project future data until 2050 using the minimum absolute growth rate from historical data.
 voluntary_data = pd.read_excel(voluntary_file, sheet_name=Voluntarysheetname, usecols=Voluntarysheet_usecols, skiprows=Voluntarysheet_skiprows, nrows=Voluntarysheet_nrows)
 voluntary_data = voluntary_data.rename(columns={'Year': 'Year'})
@@ -187,7 +199,7 @@ def calculate_solar_wind(RPS_reshaped, In_Retail_Sales, tech):
 voluntary_data_historical = voluntary_data_historical[['t', 'st', 'rps_all']].assign(rps_solar=0.0, rps_wind=0.0)

 rps_series = voluntary_data_historical[['t', 'rps_all']].dropna()
-rps_series = rps_series[rps_series['t'] <= 2023].sort_values('t')
+rps_series = rps_series[rps_series['t'] <= 2024].sort_values('t')
 min_growth = rps_series['rps_all'].diff().min()

 last_year = rps_series['t'].max()
@@ -208,7 +220,7 @@ def calculate_solar_wind(RPS_reshaped, In_Retail_Sales, tech):
 final_rps = final_rps[final_rps['t'] > 2009].sort_values(by=['st', 't'])

 output_path = os.path.join("outputs", "intermediate outputs", "rps_fraction_intermediate.csv")
-final_rps.to_csv(output_path, index=False)
+final_rps.to_csv(output_path, index=False, float_format=_format_value)
 print(f"...Intermediate RPS data processed and saved to {output_path}")

 ### Function to calculate CES fractions
@@ -277,7 +289,7 @@ def calculate_ces_fraction(main_excel_file):

 # Save to CSV
 output_path = os.path.join("outputs", "intermediate outputs", "ces_fraction_intermediate.csv")
-out_df.to_csv(output_path, index=False)
+out_df.to_csv(output_path, index=False, float_format=_format_value)
 print(f"...Intermediate CES data processed and saved to {output_path}")

 ### Function to calculate hydro fractions in selected year
@@ -329,7 +341,7 @@ def calculate_hydrofrac(main_excel_file, gen_df, hierarchy_df):
 # Format and save
 outdf = df_final[["State", "hydrofrac_RPS", "hydrofrac_CES"]].rename(columns={"State": "st", "hydrofrac_RPS": "RPS_All", "hydrofrac_CES": "CES"})
 output_path = os.path.join("outputs", "hydrofrac_policy.csv")
-outdf.sort_values("st").round(9).to_csv(output_path, index=False)
+outdf.sort_values("st").round(9).to_csv(output_path, index=False, float_format=_format_value)
 print(f"...hydrofrac data generated and saved to {output_path}")


@@ -396,7 +408,7 @@ def piecewise_interpolate(series, base_year, tolerance):
 df_out = pd.merge(df_out, df_interp_long, on=[index_col, state_col], how='left')

 df_out = df_out.sort_values([state_col, index_col])
-df_out.to_csv(output_path, index=False)
+df_out.to_csv(output_path, index=False, float_format=_format_value)


 ### ===========================================
@@ -406,6 +418,7 @@ def piecewise_interpolate(series, base_year, tolerance):
 if __name__ == "__main__":

     os.makedirs("outputs", exist_ok=True)
+    os.makedirs(os.path.join("outputs", "intermediate outputs"), exist_ok=True)

     # --- Run Processing Functions ---

Binary file not shown.
Binary file not shown.