Commit 65c7b28

schlunma, valeriupredoi, bouweandela, and bettina-gier authored
Add support for native ERA5 data in GRIB format (#2178)
Co-authored-by: Valeriu Predoi <[email protected]>
Co-authored-by: Bouwe Andela <[email protected]>
Co-authored-by: Bettina Gier <[email protected]>
1 parent 4c36a0c commit 65c7b28

19 files changed (+1296, -197 lines changed)

doc/quickstart/configure.rst

Lines changed: 1 addition & 1 deletion
@@ -974,7 +974,7 @@ infrastructure. The following example illustrates the concept.
 .. _extra-facets-example-1:

 .. code-block:: yaml
-   :caption: Extra facet example file `native6-era5.yml`
+   :caption: Extra facet example file `native6-era5-example.yml`

    ERA5:
      Amon:

doc/quickstart/find_data.rst

Lines changed: 101 additions & 8 deletions
@@ -107,18 +107,27 @@ The following native reanalysis/observational datasets are supported under the
 To use these datasets, put the files containing the data in the directory that
 you have :ref:`configured <config_options>` for the ``rootpath`` of the
 ``native6`` project, in a subdirectory called
-``Tier{tier}/{dataset}/{version}/{frequency}/{short_name}``.
+``Tier{tier}/{dataset}/{version}/{frequency}/{short_name}`` (assuming you are
+using the ``default`` DRS for ``native6``).
 Replace the items in curly braces by the values used in the variable/dataset
 definition in the :ref:`recipe <recipe_overview>`.
-Below is a list of native reanalysis/observational datasets currently
-supported.

-.. _read_native_era5:
+.. _read_native_era5_nc:

-ERA5
-^^^^
+ERA5 (in netCDF format downloaded from the CDS)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ERA5 data can be downloaded from the Copernicus Climate Data Store (CDS) using
+the convenient tool `era5cli <https://era5cli.readthedocs.io>`__.
+For example for monthly data, place the files in the
+``/Tier3/ERA5/version/mon/pr`` subdirectory of your ``rootpath`` that you have
+configured for the ``native6`` project (assuming you are using the ``default``
+DRS for ``native6``).

-- Supported variables: ``cl``, ``clt``, ``evspsbl``, ``evspsblpot``, ``mrro``, ``pr``, ``prsn``, ``ps``, ``psl``, ``ptype``, ``rls``, ``rlds``, ``rsds``, ``rsdt``, ``rss``, ``uas``, ``vas``, ``tas``, ``tasmax``, ``tasmin``, ``tdps``, ``ts``, ``tsn`` (``E1hr``/``Amon``), ``orog`` (``fx``)
+- Supported variables: ``cl``, ``clt``, ``evspsbl``, ``evspsblpot``, ``mrro``,
+  ``pr``, ``prsn``, ``ps``, ``psl``, ``ptype``, ``rls``, ``rlds``, ``rsds``,
+  ``rsdt``, ``rss``, ``uas``, ``vas``, ``tas``, ``tasmax``, ``tasmin``,
+  ``tdps``, ``ts``, ``tsn`` (``E1hr``/``Amon``), ``orog`` (``fx``).
 - Tier: 3

 .. note:: According to the description of Evapotranspiration and potential Evapotranspiration on the Copernicus page
@@ -131,6 +140,85 @@ ERA5
    of both liquid and solid phases to vapor (from underlying surface and vegetation)."
    Therefore, the ERA5 (and ERA5-Land) CMORizer switches the signs of ``evspsbl`` and ``evspsblpot`` to be compatible with the CMOR standard used e.g. by the CMIP models.

+.. _read_native_era5_grib:
+
+ERA5 (in GRIB format available on DKRZ's Levante or downloaded from the CDS)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ERA5 data in monthly, daily, and hourly resolution is `available on Levante
+<https://docs.dkrz.de/doc/dataservices/finding_and_accessing_data/era_data/index.html#era-data>`__
+in its native GRIB format.
+
+.. note::
+   ERA5 data in its native GRIB format can also be downloaded from the
+   `Copernicus Climate Data Store (CDS)
+   <https://cds.climate.copernicus.eu/datasets>`__.
+   For example, hourly data on pressure levels is available `here
+   <https://cds.climate.copernicus.eu/datasets/reanalysis-era5-pressure-levels?tab=download>`__.
+   Reading self-downloaded ERA5 data in GRIB format is experimental and likely
+   requires additional setup from the user like setting up the proper directory
+   structure for the input files and/or creating a custom :ref:`DRS
+   <config_option_drs>`.
+
+To read these data with ESMValCore, use the :ref:`rootpath
+<config_option_rootpath>` ``/pool/data/ERA5`` with :ref:`DRS
+<config_option_drs>` ``DKRZ-ERA5-GRIB`` in your configuration, for example:
+
+.. code-block:: yaml
+
+   rootpath:
+     ...
+     native6:
+       /pool/data/ERA5: DKRZ-ERA5-GRIB
+     ...
+
+The `naming conventions
+<https://docs.dkrz.de/doc/dataservices/finding_and_accessing_data/era_data/index.html#file-and-directory-names>`__
+for input directories and files for native ERA5 data in GRIB format on Levante
+are
+
+* input directories: ``{family}/{level}/{type}/{tres}/{grib_id}``
+* input files: ``{family}{level}{typeid}_{tres}_*_{grib_id}.grb``
+
+All of these facets have reasonable defaults preconfigured in the corresponding
+:ref:`extra facets<extra_facets>` file, which is available here:
+:download:`native6-era5.yml
+</../esmvalcore/config/extra_facets/native6-era5.yml>`.
+If necessary, these facets can be overwritten in the recipe.
+
+Thus, example dataset entries could look like this:
+
+.. code-block:: yaml
+
+   datasets:
+     - {project: native6, dataset: ERA5, timerange: '2000/2001',
+        short_name: tas, mip: Amon}
+     - {project: native6, dataset: ERA5, timerange: '2000/2001',
+        short_name: cl, mip: Amon, tres: 1H, frequency: 1hr}
+     - {project: native6, dataset: ERA5, timerange: '2000/2001',
+        short_name: ta, mip: Amon, type: fc, typeid: '12'}
+
+The native ERA5 output in GRIB format is stored on a `reduced Gaussian grid
+<https://confluence.ecmwf.int/display/CKB/ERA5:+data+documentation#ERA5:datadocumentation-SpatialgridSpatialGrid>`__.
+By default, these data are regridded to a regular 0.25°x0.25° grid as
+`recommended by the ECMWF
+<https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference#heading-Interpolation>`__
+using bilinear interpolation.
+
+To disable this, you can use the facet ``automatic_regrid: false`` in the
+recipe:
+
+.. code-block:: yaml
+
+   datasets:
+     - {project: native6, dataset: ERA5, timerange: '2000/2001',
+        short_name: tas, mip: Amon, automatic_regrid: false}
+
+- Supported variables: ``albsn``, ``cl``, ``cli``, ``clt``, ``clw``, ``hur``,
+  ``hus``, ``o3``, ``prw``, ``ps``, ``psl``, ``rainmxrat27``, ``sftlf``,
+  ``snd``, ``snowmxrat27``, ``ta``, ``tas``, ``tdps``, ``toz``, ``ts``, ``ua``,
+  ``uas``, ``va``, ``vas``, ``wap``, ``zg``.
+
 .. _read_native_mswep:

 MSWEP
@@ -140,7 +228,10 @@ MSWEP
 - Supported frequencies: ``mon``, ``day``, ``3hr``.
 - Tier: 3

-For example for monthly data, place the files in the ``/Tier3/MSWEP/version/mon/pr`` subdirectory of your ``native6`` project location.
+For example for monthly data, place the files in the
+``/Tier3/MSWEP/version/mon/pr`` subdirectory of your ``rootpath`` that you have
+configured for the ``native6`` project (assuming you are using the ``default``
+DRS for ``native6``).

 .. note::
    For monthly data (``V220``), the data must be postfixed with the date, i.e. rename ``global_monthly_050deg.nc`` to ``global_monthly_050deg_197901-201710.nc``
@@ -642,6 +733,8 @@ first discuss the ``drs`` parameter: as we've seen in the previous section, the
 DRS as a standard is used for both file naming conventions and for directory
 structures.

+.. _config_option_drs:
+
 Explaining ``drs: CMIP5:`` or ``drs: CMIP6:``
 ---------------------------------------------
 Whereas ESMValCore will by default use the CMOR standard for file naming (please
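
Reviewer note: the directory and file naming conventions documented above for GRIB data on Levante can be illustrated by expanding the two templates with a set of facets. The sketch below is not ESMValCore code. The templates are quoted from the documentation in this diff, while the helper `expand`, the facet values (apart from `tres: 1H`, `type: fc`, and `typeid: '12'`, which appear in the recipe examples), and the sample file name are hypothetical.

```python
from fnmatch import fnmatch

# Templates quoted from the documentation added in this commit.
DIR_TEMPLATE = "{family}/{level}/{type}/{tres}/{grib_id}"
FILE_TEMPLATE = "{family}{level}{typeid}_{tres}_*_{grib_id}.grb"


def expand(template: str, facets: dict[str, str]) -> str:
    """Fill a DRS-style template with facet values (hypothetical helper)."""
    return template.format(**facets)


# Hypothetical facet values, for illustration only.
facets = {
    "family": "E5",
    "level": "sf",
    "type": "fc",
    "typeid": "12",
    "tres": "1H",
    "grib_id": "167",
}

directory = expand(DIR_TEMPLATE, facets)
pattern = expand(FILE_TEMPLATE, facets)

# The "*" in the file template acts as a glob wildcard.
# The file name below is made up to show how the pattern matches.
matches = fnmatch("E5sf12_1H_2000-01-01_167.grb", pattern)
```

With these assumed values, `directory` is `E5/sf/fc/1H/167` and `pattern` is `E5sf12_1H_*_167.grb`.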

esmvalcore/_provenance.py

Lines changed: 5 additions & 3 deletions
@@ -4,6 +4,7 @@
 import logging
 import os
 from functools import total_ordering
+from pathlib import Path

 from netCDF4 import Dataset
 from PIL import Image
@@ -209,9 +210,10 @@ def _initialize_entity(self):
         """Initialize the entity representing the file."""
         if self.attributes is None:
             self.attributes = {}
-            with Dataset(self.filename, "r") as dataset:
-                for attr in dataset.ncattrs():
-                    self.attributes[attr] = dataset.getncattr(attr)
+            if "nc" in Path(self.filename).suffix:
+                with Dataset(self.filename, "r") as dataset:
+                    for attr in dataset.ncattrs():
+                        self.attributes[attr] = dataset.getncattr(attr)

         attributes = {
             "attribute:" + str(k).replace(" ", "_"): str(v)
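
Reviewer note: the new guard in `_initialize_entity` only opens a file with `netCDF4.Dataset` when its suffix contains the substring `nc`. The standalone sketch below isolates that check so its behaviour on a few file names is easy to see. The function name `looks_like_netcdf` and the file names are hypothetical; only the expression `"nc" in Path(...).suffix` comes from the diff.

```python
from pathlib import Path


def looks_like_netcdf(filename: str) -> bool:
    """Mirror the suffix test used in the diff (hypothetical wrapper)."""
    return "nc" in Path(filename).suffix


# Made-up file names, for illustration only.
results = {
    name: looks_like_netcdf(name)
    for name in ["tas.nc", "tas.nc4", "era5.grb", "plot.png", "script.ncl"]
}
```

Because this is a substring test rather than an exact match, any suffix that merely contains `nc` (such as `.ncl` above) also passes, so non-netCDF files with such a suffix would still be handed to `Dataset`.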

esmvalcore/_recipe/recipe.py

Lines changed: 30 additions & 0 deletions
@@ -37,6 +37,7 @@
     PreprocessorFile,
 )
 from esmvalcore.preprocessor._area import _update_shapefile_path
+from esmvalcore.preprocessor._io import GRIB_FORMATS
 from esmvalcore.preprocessor._multimodel import _get_stat_identifier
 from esmvalcore.preprocessor._regrid import (
     _spec_to_latlonvals,
@@ -230,6 +231,34 @@ def _get_default_settings(dataset):
     return settings


+def _add_dataset_specific_settings(dataset: Dataset, settings: dict) -> None:
+    """Add dataset-specific settings."""
+    project = dataset.facets["project"]
+    dataset_name = dataset.facets["dataset"]
+    file_suffixes = [Path(file.name).suffix for file in dataset.files]
+
+    # Automatic regridding for native ERA5 data in GRIB format if regridding
+    # step is not already present (can be disabled with facet
+    # automatic_regrid=False)
+    if all(
+        [
+            project == "native6",
+            dataset_name == "ERA5",
+            any(grib_format in file_suffixes for grib_format in GRIB_FORMATS),
+            "regrid" not in settings,
+            dataset.facets.get("automatic_regrid", True),
+        ]
+    ):
+        # Settings recommended by ECMWF
+        # (https://confluence.ecmwf.int/display/CKB/ERA5%3A+What+is+the+spatial+reference#heading-Interpolation)
+        settings["regrid"] = {"target_grid": "0.25x0.25", "scheme": "linear"}
+        logger.debug(
+            "Automatically regrid native6 ERA5 data in GRIB format with the "
+            "settings %s",
+            settings["regrid"],
+        )
+
+
 def _exclude_dataset(settings, facets, step):
     """Exclude dataset from specific preprocessor step if requested."""
     exclude = {
@@ -546,6 +575,7 @@ def _get_preprocessor_products(
         _apply_preprocessor_profile(settings, profile)
         _update_multi_dataset_settings(dataset.facets, settings)
         _update_preproc_functions(settings, dataset, datasets, missing_vars)
+        _add_dataset_specific_settings(dataset, settings)
         check.preprocessor_supplementaries(dataset, settings)
         input_datasets = _get_input_datasets(dataset)
         missing = _check_input_files(input_datasets)
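
Reviewer note: the conditions under which `_add_dataset_specific_settings` injects a `regrid` step can be exercised without ESMValCore using plain dictionaries. The sketch below is a simplified stand-in, not the function from the diff: `maybe_add_regrid` is a hypothetical name, facets and file names are passed as plain Python objects, and the value of `GRIB_FORMATS` is an assumption because the diff does not show its definition. It also uses short-circuiting `and` where the diff builds a list for `all()`.

```python
from pathlib import Path

# Assumed value; the real tuple lives in esmvalcore.preprocessor._io.
GRIB_FORMATS = (".grib", ".grb")


def maybe_add_regrid(facets: dict, filenames: list[str], settings: dict) -> dict:
    """Add the default ERA5 GRIB regrid step when all conditions hold."""
    suffixes = [Path(name).suffix for name in filenames]
    if (
        facets.get("project") == "native6"
        and facets.get("dataset") == "ERA5"
        and any(fmt in suffixes for fmt in GRIB_FORMATS)
        and "regrid" not in settings  # never override a user-defined step
        and facets.get("automatic_regrid", True)  # opt-out facet
    ):
        settings["regrid"] = {"target_grid": "0.25x0.25", "scheme": "linear"}
    return settings


era5 = {"project": "native6", "dataset": "ERA5"}

# GRIB input, no regrid step yet: the default step is added.
added = maybe_add_regrid(era5, ["era5.grb"], {})

# Same input, but the recipe opts out via the facet.
opted_out = maybe_add_regrid({**era5, "automatic_regrid": False}, ["era5.grb"], {})

# A regrid step from the recipe is left untouched.
user_step = {"regrid": {"target_grid": "1x1", "scheme": "nearest"}}
kept = maybe_add_regrid(era5, ["era5.grb"], user_step)

# netCDF input: nothing is added.
netcdf = maybe_add_regrid(era5, ["era5.nc"], {})
```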

esmvalcore/cmor/_fixes/fix.py

Lines changed: 2 additions & 0 deletions
@@ -845,6 +845,8 @@ def _fix_time_bounds(self, cube: Cube, cube_coord: Coord) -> None:
         """Fix time bounds."""
         times = {"time", "time1", "time2", "time3"}
         key = times.intersection(self.vardef.coordinates)
+        if not key:  # cube has time, but CMOR variable does not
+            return
         cmor = self.vardef.coordinates[" ".join(key)]
         if cmor.must_have_bounds == "yes" and not cube_coord.has_bounds():
             cube_coord.bounds = get_time_bounds(cube_coord, self.frequency)
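
Reviewer note: the early return added to `_fix_time_bounds` covers the case where the cube has a time coordinate but the CMOR variable definition has none. Without it, the lookup key becomes the empty string and the dictionary access fails. The sketch below reproduces only that lookup with a plain dictionary standing in for `self.vardef.coordinates`; the function name `find_time_coordinate` and the sample dictionaries are hypothetical.

```python
TIME_NAMES = {"time", "time1", "time2", "time3"}


def find_time_coordinate(cmor_coordinates: dict, guard: bool = True):
    """Look up the CMOR time coordinate, optionally with the new guard."""
    key = TIME_NAMES.intersection(cmor_coordinates)
    if guard and not key:  # CMOR variable has no time coordinate
        return None
    return cmor_coordinates[" ".join(key)]


# Stand-ins for the coordinates of a CMOR variable definition.
with_time = {"time": "time-coordinate", "latitude": "lat", "longitude": "lon"}
without_time = {"latitude": "lat", "longitude": "lon"}

found = find_time_coordinate(with_time)
skipped = find_time_coordinate(without_time)

# Without the guard, " ".join(set()) is "" and the lookup raises KeyError.
try:
    find_time_coordinate(without_time, guard=False)
    raised = False
except KeyError:
    raised = True
```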
