Add retrieval function for ERA5 reanalysis data #1264

Open
wants to merge 44 commits into
base: main
Changes from 35 commits
8e8661c
Create era5.py
AdamRJensen Jun 16, 2021
25662a4
Update era5.py
AdamRJensen Jun 17, 2021
f6bc57e
Update era5.py
AdamRJensen Jun 17, 2021
57eecb5
Update era5.py
AdamRJensen Jul 27, 2021
c135481
Fix stickler
AdamRJensen Jul 27, 2021
f4e9242
Add xarray and cdsapi to requirements and setup.py
AdamRJensen Aug 8, 2021
af3067e
Remove grib support and update variable_map
AdamRJensen Aug 8, 2021
b2c400c
Fix stickler
AdamRJensen Aug 8, 2021
bb62d7f
Add era5 to api.rst, __init__, and whatsnewfile
AdamRJensen Aug 9, 2021
9f6b15f
Fix error in __init__.py
AdamRJensen Aug 9, 2021
9a22c04
Change parse to read in __init__.py
AdamRJensen Aug 9, 2021
21bedd1
Update get_era5 documentation
AdamRJensen Aug 9, 2021
6c76efd
Merge remote-tracking branch 'upstream/master' into era5
AdamRJensen Aug 9, 2021
3422ddf
Improve docs and use open_mfdataset in read_era5
AdamRJensen Aug 11, 2021
545bafa
Coverage for ERA5 incl. test file
AdamRJensen Aug 11, 2021
8ed6df5
Export CDSAPI_KEY in conda_linux.yml
AdamRJensen Aug 11, 2021
041732f
Fix stickler
AdamRJensen Aug 11, 2021
93bd4ec
Install xarray with pip in requirements-py36
AdamRJensen Aug 11, 2021
519b060
Include xarray in TEST_REQUIRE in setup.py
AdamRJensen Aug 11, 2021
b732c76
Add UID to era5 tests
AdamRJensen Aug 11, 2021
4c911ac
Add requires_xarray to conftest.py
AdamRJensen Aug 11, 2021
f1b83a2
Renaming get_era5 inputs
AdamRJensen Aug 11, 2021
7a13531
Updated CDSAPI_KEY usage
AdamRJensen Aug 11, 2021
87b6a50
Extend test coverage
AdamRJensen Aug 11, 2021
bff7299
Add dask as optional dependency
AdamRJensen Aug 16, 2021
5de52bc
Update documentation
AdamRJensen Aug 16, 2021
0721f15
More documentation updates
AdamRJensen Aug 16, 2021
2e90c67
Extend parsed metadata
AdamRJensen Aug 16, 2021
9f0a624
Coverage for output_format parameter
AdamRJensen Aug 16, 2021
7921666
Fix stickler
AdamRJensen Aug 16, 2021
03e4a5f
Update description of test_get_cams_bad_request
AdamRJensen Aug 16, 2021
5e1db86
Implement changes from review by kanderso-nrel
AdamRJensen Aug 16, 2021
dd443e4
Remove cds_client input parameter
AdamRJensen Aug 17, 2021
c25780e
Localize dataframe and add helper functions to pvlib.tools
AdamRJensen Aug 17, 2021
5a00ad5
Reformat imports of non-standard packages
AdamRJensen Aug 19, 2021
46f38c1
Set quit=True in cds_client and add depdencies in whatsnew
AdamRJensen Aug 20, 2021
cde8a4a
Rename variables before extracting metadata
AdamRJensen Aug 20, 2021
3f96eec
Change file_location to file_url and remove lat/lon offset
AdamRJensen Aug 27, 2021
6cd44b1
Merge branch 'master' into era5
AdamRJensen Aug 27, 2021
5db2fef
Remove has_tables from conftest.py
AdamRJensen Aug 27, 2021
f6443fc
Merge branch 'era5' of https://github.com/AdamRJensen/pvlib-python in…
AdamRJensen Aug 27, 2021
6534c66
Revert "Merge branch 'era5' of https://github.com/AdamRJensen/pvlib-p…
AdamRJensen Aug 27, 2021
3ae5327
Revert back to 3f96eec
AdamRJensen Aug 27, 2021
faa5a45
Merge remote-tracking branch 'upstream/master' into era5
AdamRJensen Sep 13, 2021
1 change: 1 addition & 0 deletions ci/azure/conda_linux.yml
@@ -40,6 +40,7 @@ jobs:
export NREL_API_KEY=$(nrelApiKey)
export BSRN_FTP_USERNAME=$(BSRN_FTP_USERNAME)
export BSRN_FTP_PASSWORD=$(BSRN_FTP_PASSWORD)
export CDSAPI_KEY=$(CDSAPI_KEY)
pytest pvlib --remote-data --junitxml=junit/test-results.xml --cov --cov-report=xml --cov-report=html
displayName: 'pytest'
- task: PublishTestResults@2
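
The `CDSAPI_KEY` variable exported in this hunk is presumably what the remote-data tests read. A minimal sketch of how a test helper could pick it up is given below. The helper name `get_cds_api_key` is hypothetical and not part of this PR; the "uid:key" check follows the key format described in the `get_era5` docstring.

```python
import os


def get_cds_api_key(env=None):
    """Return the CDS API key from the environment, or None if unusable.

    Hypothetical helper (not part of this PR). The key is expected in the
    'uid:key' format described in the get_era5 docstring.
    """
    env = os.environ if env is None else env
    key = env.get('CDSAPI_KEY', '').strip()
    # Split once on the first colon and require both halves to be non-empty
    uid, sep, secret = key.partition(':')
    if not sep or not uid or not secret:
        return None
    return key
```

A test could skip itself when the helper returns `None`, so that forks without the CI secret do not fail.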
3 changes: 3 additions & 0 deletions ci/requirements-py36.yml
@@ -3,8 +3,10 @@ channels:
- defaults
- conda-forge
dependencies:
- cdsapi
- coveralls
- cython
- dask
- ephem
- netcdf4
- nose
@@ -27,6 +29,7 @@ dependencies:
- shapely # pvfactors dependency
- siphon # conda-forge
- statsmodels
- xarray
- pip:
- dataclasses
- nrel-pysam>=2.0
3 changes: 3 additions & 0 deletions ci/requirements-py37.yml
@@ -3,8 +3,10 @@ channels:
- defaults
- conda-forge
dependencies:
- cdsapi
- coveralls
- cython
- dask
- ephem
- netcdf4
- nose
@@ -27,6 +29,7 @@ dependencies:
- shapely # pvfactors dependency
- siphon # conda-forge
- statsmodels
- xarray
- pip:
- nrel-pysam>=2.0
- pvfactors==1.4.1
3 changes: 3 additions & 0 deletions ci/requirements-py38.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,10 @@ channels:
- defaults
- conda-forge
dependencies:
- cdsapi
- coveralls
- cython
- dask
- ephem
- netcdf4
- nose
@@ -27,6 +29,7 @@ dependencies:
- shapely # pvfactors dependency
- siphon # conda-forge
- statsmodels
- xarray
- pip:
- nrel-pysam>=2.0
- pvfactors==1.4.1
3 changes: 3 additions & 0 deletions ci/requirements-py39.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,10 @@ channels:
- defaults
- conda-forge
dependencies:
- cdsapi
- coveralls
- cython
- dask
- ephem
# - netcdf4 # pulls in a different version of numpy with ImportError
- nose
@@ -27,6 +29,7 @@ dependencies:
- shapely # pvfactors dependency
# - siphon # conda-forge
- statsmodels
- xarray
- pip:
# - nrel-pysam>=2.0 # install error on windows
- pvfactors==1.4.1
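
The requirement files above add `cdsapi`, `dask`, and `xarray`, all of which the new module treats as optional. A small sketch for checking which of them are importable in a given environment follows. The function name `missing_optional_packages` is an illustration only and does not exist in pvlib.

```python
from importlib.util import find_spec

# Optional packages added to the CI requirement files in this PR
OPTIONAL_PACKAGES = ('cdsapi', 'dask', 'xarray')


def missing_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return the names in ``packages`` that cannot be imported."""
    # find_spec returns None for a top-level module that is not installed
    return [name for name in packages if find_spec(name) is None]
```

Calling `missing_optional_packages()` without arguments reports which of the three packages are absent, without importing any of them.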
2 changes: 2 additions & 0 deletions docs/sphinx/source/api.rst
@@ -497,6 +497,8 @@ of sources and file formats relevant to solar energy modeling.
iotools.get_cams
iotools.read_cams
iotools.parse_cams
iotools.get_era5
iotools.read_era5

A :py:class:`~pvlib.location.Location` object may be created from metadata
in some files.
4 changes: 4 additions & 0 deletions docs/sphinx/source/whatsnew/v0.9.0.rst
@@ -107,6 +107,10 @@ Deprecations

Enhancements
~~~~~~~~~~~~
* Add :func:`~pvlib.iotools.get_era5` and
  :func:`~pvlib.iotools.read_era5` for retrieving and reading
  ERA5 reanalysis netcdf files from the Climate Data Store (CDS).
  (:pull:`1264`)
* Added :func:`~pvlib.iotools.read_pvgis_hourly` and
:func:`~pvlib.iotools.get_pvgis_hourly` for reading and retrieving hourly
solar radiation data and PV power output from PVGIS. (:pull:`1186`,
Binary file added pvlib/data/era5_testfile.nc
Binary file not shown.
Binary file added pvlib/data/era5_testfile_1day.nc
Binary file not shown.
2 changes: 2 additions & 0 deletions pvlib/iotools/__init__.py
@@ -21,3 +21,5 @@
from pvlib.iotools.sodapro import get_cams # noqa: F401
from pvlib.iotools.sodapro import read_cams # noqa: F401
from pvlib.iotools.sodapro import parse_cams # noqa: F401
from pvlib.iotools.era5 import get_era5 # noqa: F401
from pvlib.iotools.era5 import read_era5 # noqa: F401
276 changes: 276 additions & 0 deletions pvlib/iotools/era5.py
@@ -0,0 +1,276 @@
"""Functions to retrieve and read ERA5 data from the CDS.
.. codeauthor:: Adam R. Jensen<[email protected]>
"""
# The functions only support single-level 2D data and not 3D / pressure-level
# data. Also, monthly datasets and grib files are not supported.

import requests
from pvlib.tools import (_extract_metadata_from_dataset,
                         _convert_C_to_K_in_dataset)

try:
    import cdsapi
except ImportError:
    cdsapi = None

try:
    import xarray as xr
except ImportError:
    xr = None

# The returned data uses shortNames, whereas the request requires variable
# names according to the CDS convention - passing shortNames results in an
# "Ambiguous" error being raised
ERA5_DEFAULT_VARIABLES = [
    '2m_temperature',  # t2m
    '10m_u_component_of_wind',  # u10
    '10m_v_component_of_wind',  # v10
    'surface_pressure',  # sp
    'mean_surface_downward_short_wave_radiation_flux',  # msdwswrf
    'mean_surface_downward_short_wave_radiation_flux_clear_sky',  # msdwswrfcs
    'mean_surface_direct_short_wave_radiation_flux',  # msdrswrf
    'mean_surface_direct_short_wave_radiation_flux_clear_sky',  # msdrswrfcs
]
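
The default request includes the wind components `u10` and `v10`, for which the variable map in this file defines no pvlib name. Downstream models usually need a scalar wind speed, so a conversion along the following lines may be useful. This is a sketch only: the function name `wind_from_components` is hypothetical, and the direction follows the usual meteorological convention (degrees clockwise from north, the direction the wind comes from), which this PR does not itself specify.

```python
import math


def wind_from_components(u10, v10):
    """Return (speed [m/s], direction [deg]) from eastward/northward components.

    Hypothetical helper, not part of this PR. Direction is the direction the
    wind blows from, measured clockwise from north.
    """
    speed = math.hypot(u10, v10)
    direction = math.degrees(math.atan2(-u10, -v10)) % 360
    return speed, direction
```

For instance, `u10 = 3` and `v10 = 4` give a speed of 5 m/s.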

ERA5_VARIABLE_MAP = {
    't2m': 'temp_air',
    'd2m': 'temp_dew',
    'sp': 'pressure',
    'msdwswrf': 'ghi',
    'msdwswrfcs': 'ghi_clear',
    'msdwlwrf': 'lwd',
    'msdwlwrfcs': 'lwd_clear',
    'msdrswrf': 'bhi',
    'msdrswrfcs': 'bhi_clear',
    'mtdwswrf': 'ghi_extra'}

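The map above lists more short names than a typical request returns, so `read_era5` only renames the keys that are actually present in the dataset. The filtering step can be sketched with plain dictionaries as follows. The function name `build_rename_map` and the three-entry excerpt of the map are illustrative assumptions, not code from this PR.

```python
# Excerpt of ERA5_VARIABLE_MAP, repeated here so the sketch is self-contained
VARIABLE_MAP_EXCERPT = {'t2m': 'temp_air', 'sp': 'pressure', 'msdwswrf': 'ghi'}


def build_rename_map(present, variable_map=VARIABLE_MAP_EXCERPT):
    """Return only the map entries whose short name occurs in ``present``."""
    present = set(present)
    return {k: v for k, v in variable_map.items() if k in present}
```

With `present = ['t2m', 'u10', 'v10']` only `t2m` is renamed, and `u10` and `v10` keep their short names.
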
ERA5_HOURS = [
    '00:00', '01:00', '02:00', '03:00', '04:00', '05:00', '06:00', '07:00',
    '08:00', '09:00', '10:00', '11:00', '12:00', '13:00', '14:00', '15:00',
    '16:00', '17:00', '18:00', '19:00', '20:00', '21:00', '22:00', '23:00']

CDSAPI_URL = 'https://cds.climate.copernicus.eu/api/v2'


def get_era5(latitude, longitude, start, end, api_key=None,
             variables=ERA5_DEFAULT_VARIABLES,
             dataset='reanalysis-era5-single-levels',
             product_type='reanalysis', grid=(0.25, 0.25), save_path=None,
             output_format=None, map_variables=True):
    """
    Retrieve ERA5 reanalysis data from the Climate Data Store (CDS).

    * Temporal coverage: 1979 to present (latency of ~5 days)
    * Temporal resolution: hourly
    * Spatial coverage: global
    * Spatial resolution: 0.25° by 0.25°

    An overview of ERA5 is given in [1]_ and [2]_. Data is retrieved using the
    CDSAPI [3]_.

    .. admonition:: Time reference

        ERA5 time stamps are in UTC and correspond to the end of the period
        (right labeled). E.g., the time stamp 12:00 for hourly data refers to
        the period from 11:00 to 12:00.

    .. admonition:: Usage notes

        To use this function the package CDSAPI [4]_ needs to be installed
        [3]_. The CDSAPI keywords are described in [5]_.

        Requested variables should be specified according to the naming
        convention used by the CDS. The returned data contains the short-name
        versions of the variables. See [2]_ for a list of variable names and
        units.

        Access to the CDS requires user registration, see [6]_. The obtained
        API key can either be passed directly to the function or be saved in a
        local file as described in [3]_.

        It is possible to check your
        `request status <https://cds.climate.copernicus.eu/cdsapp#!/yourrequests>`_
        and the `status of all queued requests <https://cds.climate.copernicus.eu/live/queue>`_.

    Parameters
    ----------
    latitude: float or list
        in decimal degrees, between -90 and 90, north is positive (ISO 19115).
        If latitude is a list, it should have the format [S, N] and
        latitudes within the range are selected according to the grid.
    longitude: float or list
        in decimal degrees, between -180 and 180, east is positive (ISO 19115).
        If longitude is a list, it should have the format [W, E] and
        longitudes within the range are selected according to the grid.
    start: datetime like
        First day of the requested period
    end: datetime like
        Last day of the requested period
    api_key: str, optional
        Personal API key for the CDS with the format "uid:key" e.g.
        '00000:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
    variables: list, default: ERA5_DEFAULT_VARIABLES
        List of variables to retrieve (according to CDS naming convention)
    dataset: str, default 'reanalysis-era5-single-levels'
        Name of the dataset to retrieve the variables from. Can be either
        'reanalysis-era5-single-levels' or 'reanalysis-era5-land'.
    product_type: str, {'reanalysis', 'ensemble_members', 'ensemble_mean', 'ensemble_spread'}, default: 'reanalysis'
        ERA5 product type
    grid: list or tuple, default: (0.25, 0.25)
        User specified grid resolution
    save_path: str or path-like, optional
        Filename of where to save data. Should have ".nc" extension.
    output_format: {'dataframe', 'dataset'}, optional
        Type of data object to return. Default is to return a pandas DataFrame
        if file only contains one location and otherwise return an xarray
        Dataset.
    map_variables: bool, default: True
        When true, renames columns of the DataFrame to pvlib variable names
        where applicable. See variable ERA5_VARIABLE_MAP.

    Notes
    -----
    The returned data includes the following fields by default:

    ========================  ======  =========================================
    Key, mapped key           Format  Description
    ========================  ======  =========================================
    *Mapped field names are returned when the map_variables argument is True*
    ---------------------------------------------------------------------------
    t2m, temp_air             float   Air temperature at 2 m above ground [K]
    u10                       float   Horizontal airspeed towards east at 10 m [m/s]
    v10                       float   Horizontal airspeed towards north at 10 m [m/s]
    sp, pressure              float   Atmospheric pressure at the ground [Pa]
    msdwswrf, ghi             float   Mean surface downward short-wave radiation flux [W/m^2]
    msdwswrfcs, ghi_clear     float   Mean surface downward short-wave radiation flux, clear sky [W/m^2]
    msdrswrf, bhi             float   Mean surface direct short-wave radiation flux [W/m^2]
    msdrswrfcs, bhi_clear     float   Mean surface direct short-wave radiation flux, clear sky [W/m^2]
    ========================  ======  =========================================

    Returns
    -------
    data: pandas.DataFrame or xarray.Dataset
        ERA5 time-series data, fields depend on the requested data. The
        returned object is either a pandas DataFrame or an xarray Dataset,
        depending on the output_format parameter.
    metadata: dict
        Metadata for the time-series.

    See Also
    --------
    pvlib.iotools.read_era5

    References
    ----------
    .. [1] `ERA5 hourly data on single levels from 1979 to present
       <https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview>`_
    .. [2] `ERA5 data documentation
       <https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation>`_
    .. [3] `How to use the CDS API
       <https://cds.climate.copernicus.eu/api-how-to>`_
    .. [4] `CDSAPI source code
       <https://github.com/ecmwf/cdsapi>`_
    .. [5] `Climate Data Store (CDS) API Keywords
       <https://confluence.ecmwf.int/display/CKB/Climate+Data+Store+%28CDS%29+API+Keywords>`_
    .. [6] `Climate Data Store user registration
       <https://cds.climate.copernicus.eu/user/register>`_
    """  # noqa: E501
    if cdsapi is None:
        raise ImportError('Retrieving ERA5 data requires cdsapi to be installed.')  # noqa: E501

    cds_client = cdsapi.Client(url=CDSAPI_URL, key=api_key, verify=1)

    # Area is selected by a box made by the four coordinates: [N, W, S, E]
    try:
        area = [latitude[1], longitude[0], latitude[0], longitude[1]]
    except TypeError:
        area = [latitude+0.005, longitude-0.005,
                latitude-0.005, longitude+0.005]

    params = {
        'product_type': product_type,
        'variable': variables,
        'date': start.strftime('%Y-%m-%d') + '/' + end.strftime('%Y-%m-%d'),
        'time': ERA5_HOURS,
        'grid': grid,
        'area': area,
        'format': 'netcdf'}

    # Retrieve remote path to the file
    file_location = cds_client.retrieve(dataset, params)

    # Load file into memory
    with requests.get(file_location.location) as res:

        # Save the file locally if save_path has been specified
        if save_path is not None:
            with open(save_path, 'wb') as f:
                f.write(res.content)

        return read_era5(res.content, map_variables=map_variables,
                         output_format=output_format)

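The request built in `get_era5` depends on two small conversions: the location is turned into an `[N, W, S, E]` box, and the start and end dates are joined into a single `start/end` string. The sketch below isolates both steps so they can be tried without a CDS account. The function names `build_area` and `build_date_range` are hypothetical, and the 0.005 degree padding simply mirrors the value in this revision of the diff.

```python
from datetime import date


def build_area(latitude, longitude, offset=0.005):
    """Return the CDS area box as [N, W, S, E]."""
    try:
        # Ranges given as [S, N] and [W, E]
        return [latitude[1], longitude[0], latitude[0], longitude[1]]
    except TypeError:
        # Scalars: pad the point by a small offset on each side
        return [latitude + offset, longitude - offset,
                latitude - offset, longitude + offset]


def build_date_range(start, end):
    """Return the 'YYYY-MM-DD/YYYY-MM-DD' string used for the date keyword."""
    return start.strftime('%Y-%m-%d') + '/' + end.strftime('%Y-%m-%d')
```

One consequence of the try/except approach is that mixing a list for one coordinate with a scalar for the other ends up in the scalar branch and raises a `TypeError` there, so both coordinates should be given in the same form.
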

def read_era5(filename, output_format=None, map_variables=True):
    """Read one or more ERA5 netcdf files.

    Parameters
    ----------
    filename: str or path-like or list
        Filename of a netcdf file containing ERA5 data or a list of filenames.
    output_format: {'dataframe', 'dataset'}, optional
        Type of data object to return. Default is to return a pandas DataFrame
        if file only contains one location and otherwise return an xarray
        Dataset.
    map_variables: bool, default: True
        When true, renames columns to pvlib variable names where applicable.
        See variable ERA5_VARIABLE_MAP.

    Returns
    -------
    data: pandas.DataFrame or xarray.Dataset
        ERA5 time-series data, fields depend on the requested data. The
        returned object is either a pandas DataFrame or an xarray Dataset,
        depending on the output_format parameter.
    metadata: dict
        Metadata for the time-series.

    See Also
    --------
    pvlib.iotools.get_era5

    References
    ----------
    .. [1] `ERA5 hourly data on single levels from 1979 to present
       <https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=overview>`_
    .. [2] `ERA5 data documentation
       <https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation>`_
    """
    if xr is None:
        raise ImportError('Reading ERA5 data requires xarray to be installed.')

    # open multiple-files (mf) requires dask
    if isinstance(filename, (list, tuple)):
        ds = xr.open_mfdataset(filename)
    else:
        ds = xr.open_dataset(filename)

    ds = _convert_C_to_K_in_dataset(ds)
    metadata = _extract_metadata_from_dataset(ds)

    if map_variables:
        # Renaming of xarray datasets throws an error if keys are missing
        ds = ds.rename_vars(
            {k: v for k, v in ERA5_VARIABLE_MAP.items() if k in list(ds)})

    single_location = (ds['latitude'].size == 1) and (
        ds['longitude'].size == 1)

    if (output_format == 'dataframe') or (
            (output_format is None) and single_location):
        data = ds.to_dataframe()
        # Localize the 'time' level of the index to UTC. Localizing by level
        # also works when the index holds several locations, where the time
        # values repeat and cannot be passed to set_levels.
        data = data.tz_localize('UTC', level='time')
        if single_location:
            data = data.droplevel(['latitude', 'longitude'])
        return data, metadata
    else:
        return ds, metadata
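
The index returned by `read_era5` is localized to UTC and, per the time reference note in the `get_era5` docstring, each stamp labels the end of its hour. When the data is combined with quantities evaluated at a single instant, it can help to recover the start or midpoint of the period. The sketch below does this with the standard library only; the function name `interval_bounds` is hypothetical, and the choice of the midpoint as the representative instant is an assumption, not something this PR prescribes.

```python
from datetime import datetime, timedelta, timezone


def interval_bounds(timestamp, period=timedelta(hours=1)):
    """Return (start, midpoint, end) of the period a right-labeled stamp covers."""
    start = timestamp - period
    return start, start + period / 2, timestamp


stamp = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
start, midpoint, end = interval_bounds(stamp)
```

Here the 12:00 stamp maps to the period from 11:00 to 12:00, matching the example in the docstring, with 11:30 as its midpoint.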