Skip to content

Using ESMVALTOOL_CONFIG_DIR will force the usage of new configuration system and ignore old configuration #2736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

schlunma
Copy link
Contributor

@schlunma schlunma commented May 20, 2025

Description

This PR makes it possible to force using our new configuration system via the environment variable ESMVALTOOL_CONFIG_DIR.

This is important feature for "encapsulated" ESMValTool runs that should work independently from the user's personal ESMValTool configuration. This is also relevant for the REF (see Climate-REF/climate-ref#315).


Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number pull requests:

@schlunma schlunma added this to the v2.13.0 milestone May 20, 2025
Copy link

codecov bot commented May 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.20%. Comparing base (0dc146c) to head (ff8c691).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2736   +/-   ##
=======================================
  Coverage   95.20%   95.20%           
=======================================
  Files         259      259           
  Lines       15211    15217    +6     
=======================================
+ Hits        14481    14488    +7     
+ Misses        730      729    -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@bouweandela
Copy link
Member

Instead of adding a new environmental variable, could we use the new config system exclusively if ESMVALTOOL_CONFIG_DIR is set?

@schlunma
Copy link
Contributor Author

Instead of adding a new environmental variable, could we use the new config system exclusively if ESMVALTOOL_CONFIG_DIR is set?

Done!

Copy link
Contributor

@ehogan ehogan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to update the documentation at https://docs.esmvaltool.org/projects/ESMValCore/en/latest/quickstart/configure.html#yaml-files to be very explicit that:

  • Using --config_dir does not override the configuration in the default location (~/.config/esmvaltool)
  • Configuration files in ~/.esmvaltool are also included

please? 😊

@schlunma
Copy link
Contributor Author

Using --config_dir does not override the configuration in the default location (~/.config/esmvaltool)

Would you have a suggestion for that? I think it already (kind of) says that ("ESMValCore searches for all YAML files within each of these directories [...]"):

These files can live in any of the following locations:

    - The directory specified via the --config_dir command line argument.
    - The user configuration directory: by default ~/.config/esmvaltool, but this can be changed with the ESMVALTOOL_CONFIG_DIR environment variable. If ~/.config/esmvaltool does not exist, this will be silently ignored.

ESMValCore searches for all YAML files within each of these directories [...].

Configuration files in ~/.esmvaltool are also included

Not all files in there are included, only ~/.esmvaltool/config-user.yml (if present). The only reason for this is backwards-compatibility; we will remove support for this in v2.14.0. You'll also get lots of deprecation warnings if you use that. So we certainly don't want to advertise this "feature". In my opinion, our documentation should not mention that, but we should rather rely on the warnings.

@ehogan
Copy link
Contributor

ehogan commented May 21, 2025

Using --config_dir does not override the configuration in the default location (~/.config/esmvaltool)

Would you have a suggestion for that?

When I was reading before, I didn't quite make it to the end of this sentence:

ESMValCore searches for all YAML files within each of these directories and merges them together [...]

Would it be possible to add this to the first sentence in the Specify configuration for esmvaltool command line tool section, please? For example:

When running recipes via the command line, configuration options can be specified via YAML files and command line arguments, which are merged together to create a single configuration object.

Also, I'm wondering if it would be clearer to moving the "but this can be changed with the ESMVALTOOL_CONFIG_DIR environment variable." when listing the location of the configuration files. Especially now that this option completely overrides all configuration rather than merges.

Also, the documentation states:

Preference follows the order in the list above (i.e., the directory specified via command line argument is preferred over the user configuration directory).

I don't believe this is true. In the RTW we had provided a configuration file via --config_dir, but the configuration in my home area took precedence.

Configuration files in ~/.esmvaltool are also included

Not all files in there are included, only ~/.esmvaltool/config-user.yml (if present). The only reason for this is backwards-compatibility; we will remove support for this in v2.14.0. You'll also get lots of deprecation warnings if you use that.

Ah, I can confirm that removing the ~/.esmvaltool/config-user.yml file does mean all the deprecation warnings disappear.

So we certainly don't want to advertise this "feature". In my opinion, our documentation should not mention that, but we should rather rely on the warnings.

I understand not wanting to advertise this, but I feel it may cause issues if people don't realise that the old configuration is merged with the new configuration (the RTW is now failing because of this!) 😭 How would you feel about adding another "Warning" admonition (or extending the current one) to explain this? It would be less of an advertisement if it's shown as a warning! 😊

@schlunma
Copy link
Contributor Author

schlunma commented May 22, 2025

I tried to improve this in 9885393, could you please have a look? 😊 Here is the rendered version.

Also, I'm wondering if it would be clearer to moving the "but this can be changed with the ESMVALTOOL_CONFIG_DIR environment variable." when listing the location of the configuration files. Especially now that this option completely overrides all configuration rather than merges.

No, this will NOT override all configuration. Setting this variable just changes the location where ESMValCore looks for the "user configuration". By default this is ~/.config/esmvaltool/. If ESMVALTOOL_CONFIG_DIR is set, ESMValCore changes this location to the value of the env variable, but will still use the exact same procedure to determine the final configuration object (i.e., it will merge the content of all YAML files into one object).

The behavior that changes with this PR is that the old configuration files are completely ignored if ESMVALTOOL_CONFIG_DIR is set. Before, this was not the case (see below).

I don't believe this is true. In the RTW we had provided a configuration file via --config_dir, but the configuration in my home area took precedence.

I would certainly hope this is true, otherwise our tests would not be set up correctly 😬 Could you please post your configuration files and the problem you're having? Sometimes the merge can be a bit counterintuitive. For example, I had an issue where I specified a CMIP6 rootpath in my user config file, and only a default rootpath in the config file given by --config-dir. Due to the nested update, both keys ended up in the final config, and the CMIP6 entry is always preferred over the default entry. That way the option in the user config file "took precedence" over the --config-dir entry.

I understand not wanting to advertise this, but I feel it may cause issues if people don't realise that the old configuration is merged with the new configuration (the RTW is now failing because of this!) 😭 How would you feel about adding another "Warning" admonition (or extending the current one) to explain this? It would be less of an advertisement if it's shown as a warning! 😊

If an old configuration file is found (i.e., either the file ~/.esmvaltool/config-user.yml exists or the command line argument --config-file is given), this file will be read just like before our configuration overhaul to be fully backwards-compatible. There is NO merging going on in this case. All potentially available new configuration files will be IGNORED.

With this PR, setting ESMVALTOOL_CONFIG_DIR disables this search for the old configuration and ESMValCore will always use the new system and ignore potentially available old configuration files.

@schlunma schlunma changed the title Allow forcing the usage of new configuration system with environment variable Using ESMVALTOOL_CONFIG_DIR will force the usage of new configuration system and ignore old configuration May 22, 2025
@schlunma schlunma force-pushed the ignore_old_config branch from 062853e to cae7539 Compare May 28, 2025 06:31
@schlunma schlunma force-pushed the ignore_old_config branch from 807a9e4 to cae7539 Compare May 28, 2025 08:54
@ehogan
Copy link
Contributor

ehogan commented May 30, 2025

Many thanks @schlunma 🎉

I tried to improve this in 9885393, could you please have a look? 😊 Here is the rendered version.

This is great, thank you! 😊

Also, I'm wondering if it would be clearer to moving the "but this can be changed with the ESMVALTOOL_CONFIG_DIR environment variable." when listing the location of the configuration files. Especially now that this option completely overrides all configuration rather than merges.

No, this will NOT override all configuration. Setting this variable just changes the location where ESMValCore looks for the "user configuration". By default this is ~/.config/esmvaltool/. If ESMVALTOOL_CONFIG_DIR is set, ESMValCore changes this location to the value of the env variable, but will still use the exact same procedure to determine the final configuration object (i.e., it will merge the content of all YAML files into one object).

Just to check my understanding, if I don't provide anything via the --config_dir command line argument, then set ESMVALTOOL_CONFIG_DIR, it will only look in this directory for configuration files (and merges any it finds in this directory). Correct? 🤞

The behavior that changes with this PR is that the old configuration files are completely ignored if ESMVALTOOL_CONFIG_DIR is set. Before, this was not the case (see below).

I don't believe this is true. In the RTW we had provided a configuration file via --config_dir, but the configuration in my home area took precedence.

I would certainly hope this is true, otherwise our tests would not be set up correctly 😬 Could you please post your configuration files and the problem you're having? Sometimes the merge can be a bit counterintuitive. For example, I had an issue where I specified a CMIP6 rootpath in my user config file, and only a default rootpath in the config file given by --config-dir. Due to the nested update, both keys ended up in the final config, and the CMIP6 entry is always preferred over the default entry. That way the option in the user config file "took precedence" over the --config-dir entry.

To clarify, when I said "the configuration in my home area took precedence" I meant "the configuration in the ~/.esmvaltool directory took precedence", and can be reproduced below. I'm wondering if you interpreted this as "the configuration in the ~/.config/esmvaltool directory took precedence", which is perhaps why there is confusion!

% ls ~/.esmvaltool
ls: cannot access '~/.esmvaltool': No such file or directory
% ls ~/.config/esmvaltool
ls: cannot access '~/.config/esmvaltool': No such file or directory
% mkdir ~/.esmvaltool
% cat > ~/.esmvaltool/config-user.yml << EOF
max_parallel_tasks: 2
EOF
% more ~/.esmvaltool/config-user.yml
max_parallel_tasks: 2
% mkdir test_config
% cat > test_config/config-user.yml << EOF
max_parallel_tasks: 4
EOF
% more test_config/config-user.yml
max_parallel_tasks: 4
% esmvaltool run recipe_radiation_budget.yml --config_dir=test_config
[...]
2025-05-30 10:41:54,653 UTC [1324506] INFO    Reading configuration files from:
/envs/esmvt-2.12.0/lib/python3.12/site-packages/esmvalcore/config/configurations/defaults (defaults)
~/.esmvaltool/config-user.yml (single configuration file [deprecated])
[...]
2025-05-30 10:41:57,071 UTC [1324506] INFO    Running tasks using at most 2 processes

I understand not wanting to advertise this, but I feel it may cause issues if people don't realise that the old configuration is merged with the new configuration (the RTW is now failing because of this!) 😭 How would you feel about adding another "Warning" admonition (or extending the current one) to explain this? It would be less of an advertisement if it's shown as a warning! 😊

If an old configuration file is found (i.e., either the file ~/.esmvaltool/config-user.yml exists or the command line argument --config-file is given), this file will be read just like before our configuration overhaul to be fully backwards-compatible. There is NO merging going on in this case. All potentially available new configuration files will be IGNORED.

OK, so if I'm finally catching up now, this is what my example above is demonstrating: if an ~/.esmvaltool/config-user.yml file exists, all new configuration options (e.g. --config_dir) will be ignored. If this is the case then I really do feel this should be documented! 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants