Retrieve Data
Many datasets used in PyPSA USA are large and are not stored on GitHub. Instead, data is stored on Zenodo or supplier websites, and the workflow automatically downloads these datasets via the retrieve rules.
Note
If you receive the following error while running a retrieve rule on Linux:
FileNotFoundError: [Errno 2] No such file or directory: 'unzip'
Run the command sudo apt install zip
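To confirm the dependency before launching the workflow, you can check whether an unzip executable is available on the PATH. The sketch below is only an illustration and is not part of PyPSA USA; the helper name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("unzip",)):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables were found.")
```

If unzip is reported as missing, install it as described in the note above and run the check again.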
Rule retrieve_zenodo_databundles
Data used to create the base electrical network is pulled from Breakthrough Energy (~4.3GB). This includes geolocated data on substations, power lines, generators, electrical demand, and resource potentials.
Protected land area data for the USA is retrieved from Protected Planet via the PyPSA Meets-Earth data deposit (natura_global) (~100MB).
Bathymetry data via GEBCO and a cutout of USA Copernicus Global Land Service data are downloaded from a PyPSA USA Zenodo deposit (~2GB).
Rule retrieve_sector_databundle
Retrieves data for sector coupling.
Geographic Data
Geographic boundaries of the United States counties are taken from the United States Census Bureau. These follow 2020 boundaries to match the census numbers.
County level populations are taken from the United States Census Bureau. Filters applied:
Geography: All Counties within United States and Puerto Rico
Year: 2020
Surveys: Decennial Census, Demographic and Housing Characteristics
Sheet Name: Decennial Census - P1 | Total Population - 2020: DEC Demographic and Housing Characteristics
County level urbanization rates are taken from the United States Census Bureau. Filters applied:
Geography: All Counties within United States and Puerto Rico
Year: 2020
Surveys: Decennial Census, Demographic and Housing Characteristics
Sheet Name: Decennial Census - H1 | Housing Units - 2020: DEC Demographic and Housing Characteristics
Natural Gas Data
Natural Gas infrastructure includes:
State to State pipeline capacity
State level transmission pipeline volume
Natural gas processing facility locations
Natural gas processing facility locations (via EIA API)
Natural gas underground storage (via EIA API)
Natural Gas imports/exports by point of entry (via EIA API)
Rule retrieve_gridemissions_data
Description
Historical electrical generation, demand, interchange, and emissions data are retrieved from GridEmissions. Data is downloaded at an hourly temporal resolution and at the spatial resolution of balancing authority regions.
Outputs
data/GridEmissions/EIA_DMD_2018_2024.csv
data/eia/EIA_DMD_*.csv
Rule retrieve_nrel_efs_data
The Electrification Futures Study (EFS) is a series of publications from NREL that explores the impacts of electrification in all USA economic sectors. The EFS hourly load profiles are part of this study. These load profiles represent projected end-use electricity demand for various scenarios. Load profiles are provided for a subset of years (2018, 2020, 2024, 2030, 2040, 2050) and are aggregated to the state, sector, and select subsector level. See the EFS Load Profile Data Catalog for full details.
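Because profiles exist only for the listed years, a model year that falls between them has to be mapped to an available one. How PyPSA USA handles this is not described here; the sketch below shows one simple option, picking the closest available year, using a hypothetical helper named nearest_efs_year.

```python
# Years for which EFS load profiles are provided (taken from the text above).
EFS_YEARS = (2018, 2020, 2024, 2030, 2040, 2050)


def nearest_efs_year(model_year, available=EFS_YEARS):
    """Return the available year closest to `model_year`.

    Ties are resolved in favour of the earlier year (an arbitrary choice
    made for this sketch).
    """
    return min(available, key=lambda year: (abs(year - model_year), year))


# 2030 and 2040 are equally close to 2035, so the earlier year is returned.
print(nearest_efs_year(2035))
```

Interpolating between the two neighbouring years would be an alternative; the nearest-year rule is used here only because it is the simplest to show.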
Rule retrieve_cutout
Cutouts are spatio-temporal subsets of the USA weather data from the ERA5 dataset. They have been prepared by and are for use with the atlite tool. You can either generate them yourself using the build_cutouts rule or retrieve them directly from Zenodo through the rule retrieve_cutout.
Note
Only the 2019 interconnects based on ERA5 have been prepared and saved to Zenodo for download.
Rule retrieve_cost_data
This rule downloads economic assumptions from various sources.
The NREL Annual Technology Baseline provides economic parameters on capital costs, fixed operation costs, variable operating costs, fuel costs, technology specific discount rates, average capacity factors, and efficiencies.
State level capital cost multipliers for supply side generators are pulled from the “Capital Cost and Performance Characteristic Estimates for Utility Scale Electric Power Generating Technologies” by the EIA. Note that these have been saved as CSVs and come with the repository download.
State level historical monthly natural gas fuel prices are taken from the EIA. This includes separate prices for electrical power producers, industrial customers, commercial customers, and residential customers.
State level historical coal fuel prices are taken from the EIA.
The Annual Technology Baseline also provides data on the transportation sector, including fuel usage and capital costs.
To populate any missing data, the PyPSA/technology-data project is used. Data from here is only used when no other sources can be found, as it is mostly European focused.
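The fallback behaviour described in the last item above can be pictured as filling gaps in a primary table from a secondary one without overwriting anything already present. This is a minimal sketch with made-up parameter names and values, not the workflow's actual code; the helper fill_missing is hypothetical.

```python
def fill_missing(primary, fallback):
    """Return a copy of `primary` with gaps filled from `fallback`.

    A value counts as missing when its key is absent from `primary`
    or its value is None. Existing primary values are never overwritten.
    """
    merged = dict(primary)
    for key, value in fallback.items():
        if merged.get(key) is None:
            merged[key] = value
    return merged


# Hypothetical values for illustration only.
primary_costs = {"capital_cost": 1200.0, "efficiency": None}
fallback_costs = {"capital_cost": 1500.0, "efficiency": 0.4, "lifetime": 30}

print(fill_missing(primary_costs, fallback_costs))
# {'capital_cost': 1200.0, 'efficiency': 0.4, 'lifetime': 30}
```

Here the primary capital cost is kept, while the missing efficiency and lifetime are taken from the fallback source.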
Relevant Settings
enable:
    retrieve_cost_data:
costs:
    year:
    version:
See also
Documentation of the configuration file config/config.yaml at :ref:costs_cf
Outputs
resources/costs.csv
Rule retrieve_caiso_data
Historical daily natural gas fuel prices are retrieved from CAISO’s Open Access
Same-time Information System (OASIS). Data is collected on a daily basis for
each Balancing Area and Fuel Region that had joined the Western Energy
Imbalance Market (WEIM) during the time period designated by the fuel_year
configuration setting.
Relevant Settings
fuel_year:
Inputs
repo_data/wecc_fuelregions.xlsx: A list of fuel regions and their corresponding Balancing Authorities.
Outputs
data/fuel_prices.csv: A CSV file containing the daily average fuel prices for each Balancing Authority in the WEIM.
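The output description implies that price records are aggregated to one daily average per Balancing Authority. A minimal sketch of such an aggregation follows; the record fields (date, balancing_authority, price) and the sample values are invented for illustration and do not reflect the real OASIS schema or the rule's actual implementation.

```python
from collections import defaultdict
from statistics import mean


def daily_average_prices(records):
    """Average `price` for each (date, balancing_authority) pair."""
    grouped = defaultdict(list)
    for record in records:
        key = (record["date"], record["balancing_authority"])
        grouped[key].append(record["price"])
    return {key: mean(prices) for key, prices in grouped.items()}


# Invented sample records; the real data layout may differ.
sample = [
    {"date": "2019-01-01", "balancing_authority": "BA_A", "price": 3.0},
    {"date": "2019-01-01", "balancing_authority": "BA_A", "price": 5.0},
    {"date": "2019-01-01", "balancing_authority": "BA_B", "price": 4.0},
]

averages = daily_average_prices(sample)
for (date, authority), price in sorted(averages.items()):
    print(date, authority, price)
```

Grouping on the (date, balancing_authority) pair keeps records from several fuel regions within the same Balancing Authority in a single average.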