Moved extraction of reference datasets for (horizontal and vertical) regridding into preprocessor functions #1455

schlunma · 2022-02-02T13:27:31Z

Description

This PR moves the extraction of reference datasets used for horizontal and vertical regridding to the dedicated preprocessor functions. This allows us to specify reference datasets that need to be (automatically) downloaded first.

In addition, it adds fixes prior to the loading step of the reference dataset for horizontal regridding, similar to what is already done for the target levels for vertical regridding.

Closes #1454
Closes #56

Link to documentation:

Before you get started

☝ Create an issue to discuss what you are going to do

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

🧪 The new functionality is relevant and scientifically sound
🛠 This pull request has a descriptive title and labels
🛠 Code is written according to the code quality guidelines
🧪 and 🛠 Documentation is available
🛠 Unit tests have been added
🛠 Changes are backward compatible
🛠 Any changed dependencies have been added or removed correctly
🛠 The list of authors is up to date
🛠 All checks below this pull request were successful

To help with the number of pull requests:

🙏 We kindly ask you to review two other open pull requests in this repository

schlunma · 2022-02-02T13:28:23Z

@valeriupredoi @zklaus Before I adapt the tests here, can I briefly get your feedback on this implementation? Does it make sense in your opinion?

valeriupredoi · 2022-02-03T11:56:25Z

thanks a bunch @schlunma - looking at it now, sorry it escaped my wits yesterday!

valeriupredoi

draft request changes to a draft PR, not gonna be anal before it's actually RfR 😁

esmvalcore/preprocessor/_regrid.py

valeriupredoi · 2022-02-03T12:02:41Z

+    ----
+    If ``levels`` is a ``dict`` and it does not contain the key ``filename``,
+    it is automatically assumed that you specified the target grid with
+    ``start_longitude``, ``end_longitude``, etc. If a valid reference dataset


wait what? A grid is specified by an MxN specification, not by defining a box, am confusado here

ah nevermind, it's the _spec_to_latlonvals() stuff - can you add a pointer to that func here maybe, that confused me, prob gonna confuse somebody else too

valeriupredoi · 2022-02-03T12:21:28Z

so here's the thing - as I understand it, you are now passing all the basic ingredients to the regrid preprocessor, rather than have some bits done first in recipe and then others (like the reference level extraction) be done via the regridder preprocessor - I like it! I like that stuff gets done by the module that needs to do it, and not outside it. What happens if there's an issue with the reference dataset, other than a CMOR issue, that might not even be picked up if the user chooses to run with all CMOR shields down? Will that surface only when the regridder is called? That might be after a lot of stuff has already been computed

schlunma · 2022-02-03T12:36:43Z

> What happens if there's an issue with the reference dataset, other than a CMOR issue, that might not even be picked up if the user chooses to run with all CMOR shields down? Will that surface only when the regridder is called? That might be after a lot of stuff has already been computed

It will surface once the reference dataset has entered the preprocessor chain, so probably a little bit earlier than the actual regridding. I agree that this is not ideal because it also increases the computation times by duplicating the loading for every dataset, but I don't see another way right now.

Something that might be possible is to add an additional step between the check for data availability (and the downloading) and the actual start of the preprocessing, but this is something I can't do for v2.5.

I'm not sure how to proceed. Should I try to get this in or leave it to v2.6 and come up with a cleaner solution?
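One way the duplicated loading mentioned above could be reduced, at least within a single process, is to memoize the load per file. This is only a sketch under stated assumptions: `load_reference` and `regrid_all` are hypothetical names, the returned dict is a placeholder for real data, and an in-process cache like this would not be shared between parallel worker processes.

```python
from functools import lru_cache

LOAD_CALLS = []  # records every real load, for demonstration only


@lru_cache(maxsize=None)
def load_reference(filename):
    """Hypothetical loader: pretend to read the reference grid from disk."""
    LOAD_CALLS.append(filename)
    return {"filename": filename, "grid": "placeholder"}


def regrid_all(datasets, reference_file):
    """Pair every dataset with the (cached) reference it is regridded to."""
    results = []
    for name in datasets:
        # Only the first call per filename actually runs the loader body.
        reference = load_reference(reference_file)
        results.append((name, reference["filename"]))
    return results


results = regrid_all(["model_a", "model_b", "model_c"], "reference.nc")
print(len(results), len(LOAD_CALLS))  # 3 1
```

Note that the cached object is shared between callers, so a real implementation would have to make sure nobody modifies it in place.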

valeriupredoi · 2022-02-03T12:58:33Z

thanks for the note, Manu! The question is what's the immediate benefit of this vs the cost it introduces and vs a longer mulling over and possible implementation of a download me/check me/use me function for datasets before the preprocessor starts? I'd also want to hear what @zklaus and @bouweandela think about this 🍺

schlunma · 2022-02-03T13:04:53Z

The short-term benefit is that the bug mentioned in #1454 is fixed.

But after giving this some more thought, I think this should be implemented properly, e.g., by adding a function _load_reference_datasets() between the download and run here:

ESMValCore/esmvalcore/_recipe.py

Lines 1752 to 1763 in 3a3e5f5

    
    def run(self):
        """Run all tasks in the recipe."""
        self.write_filled_recipe()
        if not self.tasks:
            raise RecipeError('No tasks to run!')
        # Download required data
        if not self._cfg['offline']:
            esgf.download(self._download_files, self._cfg['download_dir'])
        self.tasks.run(max_parallel_tasks=self._cfg['max_parallel_tasks'])
        self.write_html_summary()

Maybe I can address this for v2.5, but if not we can do a bugfix or even postpone this to v2.6, I don't think it's that urgent.
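A rough sketch of what such a step could look like, purely as an illustration of the idea: the class below is a toy stand-in, not the real recipe class, and `reference_files`, `loader`, and the body of `_load_reference_datasets()` are assumptions made for this example. The point is only that a broken reference dataset raises before any task has started.

```python
class RecipeError(Exception):
    """Stand-in for the recipe error type used in the snippet above."""


class RecipeSketch:
    """Toy recipe that checks reference datasets before running tasks."""

    def __init__(self, tasks, reference_files, loader):
        self.tasks = tasks                      # callables standing in for tasks
        self.reference_files = reference_files  # files used as regridding targets
        self.loader = loader                    # hypothetical load-and-check function
        self.references = {}

    def _load_reference_datasets(self):
        """Load every reference dataset once; fail before any task starts."""
        for filename in self.reference_files:
            try:
                self.references[filename] = self.loader(filename)
            except Exception as exc:
                raise RecipeError(
                    f"Reference dataset {filename!r} could not be loaded: {exc}"
                ) from exc

    def run(self):
        """Run all tasks, but only after the references have been loaded."""
        if not self.tasks:
            raise RecipeError("No tasks to run!")
        # (the download step from the original run() would happen here)
        self._load_reference_datasets()
        return [task(self.references) for task in self.tasks]


def broken_loader(filename):
    """Simulate a reference file that cannot be read."""
    raise OSError("file is corrupt")


started = []
recipe = RecipeSketch(
    tasks=[lambda refs: started.append("task")],
    reference_files=["reference.nc"],
    loader=broken_loader,
)
try:
    recipe.run()
except RecipeError as err:
    print(f"Stopped early: {err}")
print(started)  # [] because no task was started
```

The loaded references are kept on the recipe object here so that tasks could reuse them, which would also avoid loading the same file once per dataset.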

valeriupredoi · 2022-02-03T13:21:16Z

thanks, Manu! I am of the personal opinion that this should not be hurried up, but rather done in a take it easy, test proper manner - up to you and the others, am not gonna block progress 🇨🇳 😁

schlunma · 2022-05-19T09:01:06Z

I cannot finish this PR in time for v2.6, moving this to v2.7.

bouweandela · 2022-06-20T14:43:59Z

This would also be addressed (differently) in #1609.

schlunma added 2 commits February 2, 2022 12:34

Moved extraction of reference dataset for regridding into preprocesso…

354fc2c

…r functions

Cleaned _regrid.py

e0f4c72

schlunma added the bug Something isn't working label Feb 2, 2022

schlunma added this to the v2.5.0 milestone Feb 2, 2022

schlunma requested review from zklaus and valeriupredoi February 2, 2022 13:27

schlunma self-assigned this Feb 2, 2022

valeriupredoi requested changes Feb 3, 2022

Update esmvalcore/preprocessor/_regrid.py

a780acc

Co-authored-by: Valeriu Predoi <[email protected]>

schlunma modified the milestones: v2.5.0, v2.6.0 Feb 4, 2022

Merge remote-tracking branch 'origin/main' into fix_target_levels_bef…

db9092f

…ore_using

schlunma modified the milestones: v2.6.0, v2.7.0 May 19, 2022

schlunma closed this Aug 30, 2022

schlunma deleted the fix_target_levels_before_using branch March 1, 2023 16:28
