Commit

Merge branch 'main' into fix_race_condition_cleanup
schlunma authored Mar 1, 2023
2 parents 73c6929 + 40e00e8 commit 2fcdf1f
Showing 26 changed files with 717 additions and 296 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/run-tests.yml
@@ -15,7 +15,7 @@ name: Test

# runs on a push on main and at the end of every day
on:
# triggering on push without branch name will run tests everytime
# triggering on push without branch name will run tests every time
# there is a push on any branch
# turn it on only if needed
push:
6 changes: 3 additions & 3 deletions doc/changelog.rst
@@ -232,7 +232,7 @@ Highlights

- The new preprocessor :func:`~esmvalcore.preprocessor.extract_location` can extract arbitrary locations on the Earth using the `geopy <https://pypi.org/project/geopy/>`__ package that connects to OpenStreetMap. For details, see :ref:`Extract location <extract_location>`.
- Time ranges can now be extracted using the `ISO 8601 format <https://en.wikipedia.org/wiki/ISO_8601>`_. In addition, wildcards are allowed, which makes the time selection much more flexible. For details, see :ref:`Recipe section: Datasets <Datasets>`.
- The new preprocessor :func:`~esmvalcore.preprocessor.ensemble_statistics` can calculate arbitrary statitics over all ensemble members of a simulation. In addition, the preprocessor :func:`~esmvalcore.preprocessor.multi_model_statistics` now accepts the keyword ``groupy``, which allows the calculation of multi-model statistics over arbitrary multi-model ensembles. For details, see :ref:`Ensemble statistics <ensemble statistics>` and :ref:`Multi-model statistics <multi-model statistics>`.
- The new preprocessor :func:`~esmvalcore.preprocessor.ensemble_statistics` can calculate arbitrary statistics over all ensemble members of a simulation. In addition, the preprocessor :func:`~esmvalcore.preprocessor.multi_model_statistics` now accepts the keyword ``groupy``, which allows the calculation of multi-model statistics over arbitrary multi-model ensembles. For details, see :ref:`Ensemble statistics <ensemble statistics>` and :ref:`Multi-model statistics <multi-model statistics>`.

This release includes

@@ -327,7 +327,7 @@ Automatic testing
- Switch to Mambaforge in Github Actions tests (`#1438 <https://github.com/ESMValGroup/ESMValCore/pull/1438>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Turn off conda lock file creation on any push on `main` branch from Github Action test (`#1489 <https://github.com/ESMValGroup/ESMValCore/pull/1489>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Add DRS path test for IPSLCM files (`#1490 <https://github.com/ESMValGroup/ESMValCore/pull/1490>`__) `Stéphane Sénési <https://github.com/senesis>`__
- Add a test module that runs tests of `iris` I/O everytime we notice serious bugs there (`#1510 <https://github.com/ESMValGroup/ESMValCore/pull/1510>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Add a test module that runs tests of `iris` I/O every time we notice serious bugs there (`#1510 <https://github.com/ESMValGroup/ESMValCore/pull/1510>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- [Github Actions] Trigger Github Actions tests (`run-tests.yml` workflow) from a comment in a PR (`#1520 <https://github.com/ESMValGroup/ESMValCore/pull/1520>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Update Linux condalock file (various pull requests) github-actions[bot]

@@ -617,7 +617,7 @@ Automatic testing
- Report coverage for tests that run on any pull request (`#994 <https://github.com/ESMValGroup/ESMValCore/pull/994>`__) `Bouwe Andela <https://github.com/bouweandela>`__
- Install ESMValTool sample data from PyPI (`#998 <https://github.com/ESMValGroup/ESMValCore/pull/998>`__) `Javier Vegas-Regidor <https://github.com/jvegasbsc>`__
- Fix tests for multi-processing with spawn method (i.e. macOSX with Python>3.8) (`#1003 <https://github.com/ESMValGroup/ESMValCore/pull/1003>`__) `Barbara Vreede <https://github.com/bvreede>`__
- Switch to running the Github Action test workflow every 3 hours in single thread mode to observe if Sementation Faults occur (`#1022 <https://github.com/ESMValGroup/ESMValCore/pull/1022>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Switch to running the Github Action test workflow every 3 hours in single thread mode to observe if Segmentation Faults occur (`#1022 <https://github.com/ESMValGroup/ESMValCore/pull/1022>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Revert to original Github Actions test workflow removing the 3-hourly test run with -n 1 (`#1025 <https://github.com/ESMValGroup/ESMValCore/pull/1025>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
- Avoid stale cache for multimodel statistics regression tests (`#1030 <https://github.com/ESMValGroup/ESMValCore/pull/1030>`__) `Bouwe Andela <https://github.com/bouweandela>`__
- Add newer Python versions in OSX to Github Actions (`#1035 <https://github.com/ESMValGroup/ESMValCore/pull/1035>`__) `Barbara Vreede <https://github.com/bvreede>`__
71 changes: 41 additions & 30 deletions doc/quickstart/configure.rst
@@ -53,38 +53,40 @@ omitted in the file.
# Includes log files and performance stats.
output_dir: ~/esmvaltool_output
# Directory for storing downloaded climate data
download_dir: ~/climate_data
# Disable automatic downloads --- [true]/false
# Disable the automatic download of missing CMIP3, CMIP5, CMIP6, CORDEX,
# and obs4MIPs data from ESGF by default. This is useful if you are working
# on a computer without an internet connection.
offline: true
# Search ESGF for files even when files are available locally --- true/[false]
# This option is useful to make sure you have the latest version of all files.
# Remember to set ``offline: false`` if this option is set to ``true``.
always_search_esgf: false
# Auxiliary data directory
# Used by some recipes to look for additional datasets.
auxiliary_data_dir: ~/auxiliary_data
# Automatic data download from ESGF --- [never]/when_missing/always
# Use automatic download of missing CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs
# data from ESGF. ``never`` disables this feature, which is useful if you are
# working on a computer without an internet connection, or if you have limited
# disk space. ``when_missing`` enables the automatic download for files that
# are not available locally. ``always`` will always check ESGF for the latest
# version of a file, and will only use local files if they correspond to that
# latest version.
search_esgf: never
# Directory for storing downloaded climate data
# Make sure to use a directory where you can store multiple GBs of data. Your
# home directory on a HPC is usually not suited for this purpose, so please
# change the default value in this case!
download_dir: ~/climate_data
# Rootpaths to the data from different projects
# This default setting will work if files have been downloaded by the
# ESMValTool via ``offline=False``. Lists are also possible. For
# site-specific entries, see the default ``config-user.yml`` file that can be
# installed with the command ``esmvaltool config get_config_user``. For each
# project, this can be either a single path or a list of paths. Comment out
# these when using a site-specific path.
# This default setting will work if files have been downloaded by ESMValTool
# via ``search_esgf``. Lists are also possible. For site-specific entries,
# see the default ``config-user.yml`` file that can be installed with the
# command ``esmvaltool config get_config_user``. For each project, this can
# be either a single path or a list of paths. Comment out these when using a
# site-specific path.
rootpath:
default: ~/climate_data
# Directory structure for input data --- [default]/ESGF/BADC/DKRZ/ETHZ/etc.
# This default setting will work if files have been downloaded by the
# ESMValTool via ``offline=False``. See ``config-developer.yml`` for
# definitions. Comment out/replace as per needed.
# This default setting will work if files have been downloaded by ESMValTool
# via ``search_esgf``. See ``config-developer.yml`` for definitions. Comment
# out/replace as per needed.
drs:
CMIP3: ESGF
CMIP5: ESGF
@@ -136,11 +138,19 @@ omitted in the file.
# ``config-developer.yml`` for an example. Set to ``null`` to use the default.
config_developer_file: null
The ``offline`` setting can be used to disable or enable automatic downloads from ESGF.
If ``offline`` is set to ``false``, the tool will automatically download
any CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs data that is required to run a recipe
but not available locally and store it in ``download_dir`` using the ``ESGF``
The ``search_esgf`` setting can be used to disable or enable automatic
downloads from ESGF.
If ``search_esgf`` is set to ``never``, the tool does not download any data
from the ESGF.
If ``search_esgf`` is set to ``when_missing``, the tool will download any CMIP3,
CMIP5, CMIP6, CORDEX, and obs4MIPs data that is required to run a recipe but
not available locally and store it in ``download_dir`` using the ``ESGF``
directory structure defined in the :ref:`config-developer`.
If ``search_esgf`` is set to ``always``, the tool will first check the ESGF for
the needed data, regardless of any local data availability; if the data found
on ESGF is newer than the local data (if any) or the user specifies a version
of the data that is available only from the ESGF, then that data will be
downloaded; otherwise, local data will be used.
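
The three modes described above can be summarized as a small decision function. This is a minimal sketch, not ESMValCore's actual implementation: the name ``choose_source`` and its arguments are hypothetical, versions are assumed to be strings such as ``v20190101`` that sort chronologically, and falling back to local data when nothing is found on ESGF is an assumption.

```python
from typing import Optional

VALID_MODES = ("never", "when_missing", "always")


def choose_source(search_esgf: str,
                  local_version: Optional[str],
                  esgf_version: Optional[str]) -> Optional[str]:
    """Return 'local', 'esgf', or None (no data) for a single dataset.

    Hypothetical helper, not part of ESMValCore. Versions are assumed to be
    strings like 'v20190101' whose lexicographic order is chronological.
    """
    if search_esgf not in VALID_MODES:
        raise ValueError(f"search_esgf must be one of {VALID_MODES}")

    if search_esgf == "never":
        # Never contact ESGF: only local data can be used.
        return "local" if local_version is not None else None

    if search_esgf == "when_missing":
        # Prefer local data; download only what is not available locally.
        if local_version is not None:
            return "local"
        return "esgf" if esgf_version is not None else None

    # 'always': check ESGF first, regardless of local availability.
    if esgf_version is None:
        # Assumption: nothing found on ESGF, so fall back to local data.
        return "local" if local_version is not None else None
    if local_version is None or esgf_version > local_version:
        return "esgf"
    return "local"
```

For example, ``choose_source("always", "v20190101", "v20200101")`` selects the newer ESGF copy, while the same call with ``"when_missing"`` keeps the local files.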

The ``auxiliary_data_dir`` setting is the path to place any required
additional auxiliary data files. This is necessary because certain
@@ -199,9 +209,10 @@ The ``esmvaltool run`` command can automatically download the files required
to run a recipe from ESGF for the projects CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs.
The downloaded files will be stored in the ``download_dir`` specified in the
:ref:`user configuration file`.
To enable automatic downloads from ESGF, set ``offline: false`` in
the :ref:`user configuration file` or provide the command line argument
``--offline=False`` when running the recipe.
To enable automatic downloads from ESGF, set ``search_esgf: when_missing`` or
``search_esgf: always`` in the :ref:`user configuration file`, or provide the
corresponding command line arguments ``--search_esgf=when_missing`` or
``--search_esgf=always`` when running the recipe.

.. note::

9 changes: 6 additions & 3 deletions doc/quickstart/find_data.rst
@@ -483,9 +483,12 @@ retrieval parameters is explained below.

Enabling automatic downloads from the ESGF
------------------------------------------
To enable automatic downloads from ESGF, set ``offline: false`` in
the :ref:`user configuration file` or provide the command line argument
``--offline=False`` when running the recipe.
To enable automatic downloads from ESGF, set ``search_esgf: when_missing`` (use
local files whenever possible) or ``search_esgf: always`` (always search ESGF
for latest version of files and only use local data if it is the latest
version) in the :ref:`user configuration file`, or provide the corresponding
command line arguments ``--search_esgf=when_missing`` or
``--search_esgf=always`` when running the recipe.
The files will be stored in the ``download_dir`` set in
the :ref:`user configuration file`.

14 changes: 11 additions & 3 deletions doc/quickstart/run.rst
@@ -60,12 +60,20 @@ It is also possible to explicitly change values from the config file using flags
esmvaltool run --argument_name argument_value recipe_example.yml
To automatically download the files required to run a recipe from ESGF, set
``offline`` to ``false`` in the :ref:`user configuration file`
or run the tool with the command
``search_esgf`` to ``when_missing`` (use local files whenever possible) or
``always`` (always search ESGF for latest version of files and only use local
data if it is the latest version) in the :ref:`user configuration file` or run
the tool with the corresponding commands

.. code:: bash
esmvaltool run --offline=False recipe_example.yml
esmvaltool run --search_esgf=when_missing recipe_example.yml
or

.. code:: bash
esmvaltool run --search_esgf=always recipe_example.yml
This feature is available for projects that are hosted on the ESGF, i.e.
CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs.
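
The way a command line flag such as ``--search_esgf`` takes precedence over the value in the configuration file can be pictured as follows. This is only a sketch, loosely following the ``if search_esgf is not None`` pattern in ``esmvalcore/_main.py``; ``apply_overrides`` is a hypothetical name and the configuration is modeled as a plain dictionary.

```python
from typing import Any, Dict


def apply_overrides(config: Dict[str, Any], **flags: Any) -> Dict[str, Any]:
    """Return a copy of ``config`` with every non-None flag applied.

    Hypothetical helper: a flag left at None means "not given on the
    command line", so the configuration file value is kept.
    """
    session = dict(config)
    for name, value in flags.items():
        if value is not None:
            session[name] = value
    return session


config = {"search_esgf": "never", "max_years": 10}
session = apply_overrides(config, search_esgf="when_missing", max_years=None)
print(session["search_esgf"])  # when_missing
print(session["max_years"])    # 10
```

The original ``config`` is left unchanged, so the same configuration can be reused for several runs with different flags.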
8 changes: 4 additions & 4 deletions doc/recipe/overview.rst
@@ -129,10 +129,10 @@ Reading facet values from file names is not yet supported.
See :ref:`CMOR-DRS` for more information on this kind of file organization.

When (some) files are available locally, the tool will not automatically look
for more files on ESGF. To populate a recipe with all available datasets from
ESGF, ``offline`` should be set to ``false`` and ``always_search_esgf`` should
be set to ``true`` in the
:ref:`user configuration file<user configuration file>`.
for more files on ESGF.
To populate a recipe with all available datasets from ESGF, ``search_esgf``
should be set to ``always`` in the :ref:`user configuration file<user
configuration file>`.

For more control over which datasets are selected, it is recommended to use
a Python script or `Jupyter notebook <https://jupyter.org/>`_ to compose
1 change: 1 addition & 0 deletions environment.yml
@@ -26,6 +26,7 @@ dependencies:
- nested-lookup
- netcdf4
- numpy
- packaging
- pandas
- pillow
- pip!=21.3
20 changes: 19 additions & 1 deletion esmvalcore/_main.py
@@ -329,6 +329,7 @@ def run(self,
max_years=None,
skip_nonexistent=None,
offline=None,
search_esgf=None,
diagnostics=None,
check_level=None,
**kwargs):
@@ -357,6 +358,19 @@ def run(self,
If True, the run will not fail if some datasets are not available.
offline: bool, optional
If True, the tool will not download missing data from ESGF.
.. deprecated:: 2.8.0
This option has been deprecated in ESMValCore version 2.8.0 and
is scheduled for removal in version 2.10.0. Please use the
options `search_esgf=never` (for `offline=True`) or
`search_esgf=when_missing` (for `offline=False`). These are
exact replacements.
search_esgf: str, optional
If `never`, disable automatic download of data from the ESGF. If
`when_missing`, enable the automatic download of files that are not
available locally. If `always`, always check ESGF for the latest
version of a file, and only use local files if they correspond to
that latest version.
diagnostics: list(str), optional
Only run the selected diagnostics from the recipe. To provide more
than one diagnostic to filter use the syntax 'diag1 diag2/script1'
@@ -384,12 +398,16 @@ def run(self,
session['max_years'] = max_years
if offline is not None:
session['offline'] = offline
if search_esgf is not None:
session['search_esgf'] = search_esgf
if skip_nonexistent is not None:
session['skip_nonexistent'] = skip_nonexistent
session['resume_from'] = parse_resume(resume_from, recipe)
session.update(kwargs)

self._run(recipe, session)
# Print warnings about deprecated configuration options again:
CFG.reload()

@staticmethod
def _create_session_dir(session):
@@ -421,7 +439,7 @@ def _run(self, recipe: Path, session) -> None:
console_log_level=session['log_level'])
self._log_header(session['config_file'], log_files)

if not session['offline']:
if session['search_esgf'] != 'never':
from .esgf._logon import logon
logon()
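
The docstring in this file states that ``search_esgf=never`` replaces ``offline=True`` and ``search_esgf=when_missing`` replaces ``offline=False``. A minimal sketch of that translation is shown below. It is not the code used by ESMValCore: ``resolve_search_esgf`` is a hypothetical helper, and letting an explicit ``search_esgf`` win over ``offline`` is an assumption.

```python
import warnings
from typing import Optional


def resolve_search_esgf(offline: Optional[bool] = None,
                        search_esgf: Optional[str] = None) -> str:
    """Translate the deprecated ``offline`` flag into a ``search_esgf`` value.

    Hypothetical helper, not part of ESMValCore.
    """
    if search_esgf is not None:
        # Assumption: the new option takes precedence when both are given.
        return search_esgf
    if offline is None:
        # Neither option given: use the documented default.
        return "never"
    warnings.warn(
        "`offline` is deprecated since version 2.8.0 and scheduled for "
        "removal in version 2.10.0; use `search_esgf` instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    # Mapping taken from the docstring: True -> never, False -> when_missing.
    return "never" if offline else "when_missing"
```

With this mapping, a check such as ``if not session['offline']`` and its replacement ``if session['search_esgf'] != 'never'`` agree for both values of the old flag.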

19 changes: 11 additions & 8 deletions esmvalcore/_recipe/recipe.py
@@ -946,21 +946,22 @@ def _set_use_legacy_supplementaries(self):
logger.info("Running with --use-legacy-supplementaries=True")
self.session['use_legacy_supplementaries'] = True

# Also set the global config because it is used to check if
# mismatching shapes should be ignored when attaching
# Also adapt the global config if necessary because it is used to check
# if mismatching shapes should be ignored when attaching
# supplementary variables in `esmvalcore.preprocessor.
# _supplementary_vars.add_supplementary_variables` to avoid having to
# introduce a new function argument that is immediately deprecated.
option = 'use_legacy_supplementaries'
CFG[option] = self.session[option]
session_use_legacy_supp = self.session['use_legacy_supplementaries']
if session_use_legacy_supp is not None:
CFG['use_legacy_supplementaries'] = session_use_legacy_supp

def _log_recipe_errors(self, exc):
"""Log a message with recipe errors."""
logger.error(exc.message)
for task in exc.failed_tasks:
logger.error(task.message)

if self.session['offline'] and any(
if self.session['search_esgf'] == 'never' and any(
isinstance(err, InputFilesNotFound)
for err in exc.failed_tasks):
logger.error(
@@ -972,8 +973,10 @@ def _log_recipe_errors(self, exc):
"configuration file %s", self.session['config_file'])
logger.error(
"To automatically download the required files to "
"`download_dir: %s`, set `offline: false` in %s or run the "
"recipe with the extra command line argument --offline=False",
"`download_dir: %s`, set `search_esgf: when_missing` or "
"`search_esgf: always` in %s, or run the recipe with the "
"extra command line argument --search_esgf=when_missing or "
"--search_esgf=always",
self.session['download_dir'],
self.session['config_file'],
)
@@ -1299,7 +1302,7 @@ def run(self):
filled_recipe = self.write_filled_recipe()

# Download required data
if not self.session['offline']:
if self.session['search_esgf'] != 'never':
esgf.download(self._download_files, self.session['download_dir'])

self.tasks.run(max_parallel_tasks=self.session['max_parallel_tasks'])