Retrieve Data

Numerous datasets used in PyPSA USA are large and are not stored on GitHub. Instead, the data is stored on Zenodo or supplier websites, and the workflow automatically downloads these datasets via the retrieve rules.

Note

If you receive the following error while running a retrieve rule on Linux:

FileNotFoundError: [Errno 2] No such file or directory: 'unzip'

Run the command sudo apt install zip
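A pre-flight check can catch this problem before a retrieve rule fails. The sketch below is not part of the workflow: the helper name missing_tools is hypothetical, and it relies only on the standard-library function shutil.which to look for unzip on the PATH.

```python
import shutil


def missing_tools(tools=("unzip",)):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing command-line tools:", ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

If the check reports unzip as missing, install it as described in the note above before running the retrieve rules.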

Rule retrieve_zenodo_databundles

Data used to create the base electrical network is pulled from Breakthrough Energy (~4.3GB). This includes geolocated data on substations, power lines, generators, electrical demand, and resource potentials.

DOI

Protected land area data for the USA is retrieved from Protected Planet via the PyPSA meets Earth data deposit (natura_global) (~100MB).

DOI

Bathymetry data from GEBCO and a cutout of USA Copernicus Global Land Service data are downloaded from a PyPSA USA Zenodo deposit (~2GB).

DOI

Rule retrieve_sector_databundle

Retrieves data for sector coupling.

DOI

Geographic Data

Geographic boundaries of United States counties are taken from the United States Census Bureau. Note that these follow 2020 boundaries to match census numbers.

URL

County level populations are taken from the United States Census Bureau. Filters applied:

  • Geography: All Counties within United States and Puerto Rico

  • Year: 2020

  • Surveys: Decennial Census, Demographic and Housing Characteristics

Sheet Name: Decennial Census - P1 | Total Population - 2020: DEC Demographic and Housing Characteristics

URL

County level urbanization rates are taken from the United States Census Bureau. Filters applied:

  • Geography: All Counties within United States and Puerto Rico

  • Year: 2020

  • Surveys: Decennial Census, Demographic and Housing Characteristics

Sheet Name: Decennial Census - H1 | Housing Units - 2020: DEC Demographic and Housing Characteristics

URL

Natural Gas Data

Natural Gas infrastructure includes:

  • State to State pipeline capacity

  • State level transmission pipeline volume

  • Natural gas processing facility locations

  • Natural gas processing facility locations (via EIA API)

  • Natural gas underground storage (via EIA API)

  • Natural Gas imports/exports by point of entry (via EIA API)

URL URL URL

Rule retrieve_gridemissions_data

Description

Historical electrical generation, demand, interchange, and emissions data are retrieved from GridEmissions. Data is downloaded at hourly temporal resolution and at the spatial resolution of balancing authority regions.

Outputs

  • data/GridEmissions/EIA_DMD_2018_2024.csv

  • data/eia/EIA_DMD_*.csv

Rule retrieve_nrel_efs_data

The Electrification Futures Study (EFS) is a series of publications from NREL that explores the impacts of electrification in all USA economic sectors. The EFS hourly load profiles are published as part of this study. These load profiles represent projected end-use electricity demand for various scenarios. Load profiles are provided for a subset of years (2018, 2020, 2024, 2030, 2040, 2050) and are aggregated to the state, sector, and select subsector level. See the EFS Load Profile Data Catalog for full details.
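Because the profiles exist only for the listed years, a model year such as 2035 falls between two published datasets. How the workflow treats such years is not described here. The sketch below shows one possible approach, linear interpolation between the two bracketing years, purely as an illustration; the function names and the example load values are hypothetical.

```python
from bisect import bisect_left

# Years for which EFS load profiles are published (from the text above).
EFS_YEARS = (2018, 2020, 2024, 2030, 2040, 2050)


def bracketing_years(year, available=EFS_YEARS):
    """Return the (lower, upper) available years that enclose `year`."""
    if not available[0] <= year <= available[-1]:
        raise ValueError(f"{year} is outside {available[0]}-{available[-1]}")
    idx = bisect_left(available, year)
    if available[idx] == year:
        return year, year
    return available[idx - 1], available[idx]


def interpolate_load(year, load_by_year):
    """Linearly interpolate a load value between the bracketing years.

    Assumption: `load_by_year` maps a year to a single load value; real
    EFS profiles are hourly series, so this would be applied per hour.
    """
    lower, upper = bracketing_years(year, tuple(sorted(load_by_year)))
    if lower == upper:
        return load_by_year[year]
    weight = (year - lower) / (upper - lower)
    return load_by_year[lower] + weight * (load_by_year[upper] - load_by_year[lower])


if __name__ == "__main__":
    # Made-up load values, for illustration only.
    print(bracketing_years(2035))
    print(interpolate_load(2035, {2030: 100.0, 2040: 200.0}))
```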

URL

Rule retrieve_cutout

Cutouts are spatio-temporal subsets of the USA weather data from the ERA5 dataset. They have been prepared by and are for use with the atlite tool. You can either generate them yourself using the build_cutouts rule or retrieve them directly from Zenodo through the rule retrieve_cutout.

DOI

Note

Only the 2019 interconnects based on ERA5 have been prepared and saved to Zenodo for download.

Rule retrieve_cost_data

This rule downloads economic assumptions from various sources.

The NREL Annual Technology Baseline provides economic parameters on capital costs, fixed operation costs, variable operating costs, fuel costs, technology specific discount rates, average capacity factors, and efficiencies.

URL

AWS

State level capital cost multipliers for supply side generators are pulled from the “Capital Cost and Performance Characteristic Estimates for Utility Scale Electric Power Generating Technologies” by the EIA. Note that these have been saved as CSVs and come with the repository download.

URL

State level historical monthly natural gas fuel prices are taken from the EIA. This includes separate prices for electrical power producers, industrial customers, commercial customers, and residential customers.

URL

State level historical coal fuel prices are taken from the EIA.

URL

The Annual Technology Baseline also provides data on the transportation sector, including fuel usage and capital costs.

URL

To populate any missing data, the PyPSA/technology-data project is used. Data from here is only used when no other sources can be found, as it is mostly European focused.

GitHub

Relevant Settings

enable:
    retrieve_cost_data:

costs:
    year:
    version:
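
How a script might read these settings can be sketched as follows. This is a hypothetical illustration, not the workflow's own code: only the key names come from the settings above, while the helper name, the placeholder values, and the choice to treat a missing or empty key as disabled are assumptions.

```python
def cost_retrieval_enabled(config: dict) -> bool:
    """Return True when enable.retrieve_cost_data is set to a truthy value."""
    # Assumption: a missing or empty `enable` section means "disabled".
    enable = config.get("enable") or {}
    return bool(enable.get("retrieve_cost_data", False))


# Placeholder values for illustration only; these are not project defaults.
example_config = {
    "enable": {"retrieve_cost_data": True},
    "costs": {"year": 2030, "version": "placeholder"},
}

if __name__ == "__main__":
    print(cost_retrieval_enabled(example_config))
```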

See also

Documentation of the configuration file config/config.yaml at :ref:costs_cf

Outputs

  • resources/costs.csv

Rule retrieve_caiso_data

Historical daily natural gas fuel prices are retrieved from CAISO’s Open Access Same-time Information System (OASIS). Data is collected on a daily basis for each Balancing Area and Fuel Region that had joined the Western Energy Imbalance Market (WEIM) during the time period designated by the fuel_year configuration setting.

CAISO

Relevant Settings

fuel_year:

Inputs

  • repo_data/wecc_fuelregions.xlsx: A list of fuel regions and their corresponding Balancing Authorities.

Outputs

  • data/fuel_prices.csv: A CSV file containing the daily average fuel prices for each Balancing Authority in the WEIM.
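
The relationship between the input and output above can be sketched as follows: fuel region prices are mapped to Balancing Authorities and averaged per day. This is a minimal illustration under stated assumptions, not the rule's actual implementation. The function name, the record layout (day, fuel region, price), and all region, authority, and price values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def daily_average_by_ba(prices, region_to_bas):
    """Average daily fuel prices per Balancing Authority.

    prices: iterable of (day, fuel_region, price) tuples (assumed layout).
    region_to_bas: dict mapping a fuel region to a list of Balancing
        Authorities, in the spirit of repo_data/wecc_fuelregions.xlsx.
    """
    grouped = defaultdict(list)
    for day, region, price in prices:
        # Regions without a mapping are skipped in this sketch.
        for ba in region_to_bas.get(region, []):
            grouped[(day, ba)].append(price)
    return {key: mean(values) for key, values in grouped.items()}


if __name__ == "__main__":
    # Made-up records and mapping, for illustration only.
    example_prices = [
        ("2019-01-01", "REGION_A", 3.0),
        ("2019-01-01", "REGION_B", 5.0),
        ("2019-01-02", "REGION_A", 4.0),
    ]
    example_mapping = {"REGION_A": ["BA_1"], "REGION_B": ["BA_1", "BA_2"]}
    for key, value in sorted(daily_average_by_ba(example_prices, example_mapping).items()):
        print(key, value)
```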