`dataset.from_files` always returns empty #1896
Comments
Thanks for reporting the issue! What are the names and paths of the files you have on your computer and what settings are you using in config-user.yml?
CFG
Local file:
The facets of the local files are read from the subdirectories in which the files are stored relative to the rootpath. In your case, this fails because the data on your computer is not stored in one subdirectory per facet. I'll add a note about this to the documentation and think a bit about how to improve the way the duplicate sets of facets are removed from the list. Maybe we could try to keep only those sets that have the largest number of facets?
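To illustrate why a flat directory layout yields no facets, here is a minimal sketch. The function `facets_from_path`, the template string, and the file paths are all hypothetical and are not the ESMValCore implementation; the sketch only assumes that each directory level below the rootpath maps to one facet name.

```python
from pathlib import PurePosixPath


def facets_from_path(rootpath, file, dir_template):
    """Map directory names below rootpath onto the facet names in dir_template.

    Hypothetical helper for illustration only, not part of ESMValCore.
    """
    # "{project}/{dataset}/{exp}" -> ["project", "dataset", "exp"]
    facet_names = [part.strip("{}") for part in dir_template.split("/") if part]
    # Directories between the rootpath and the file itself.
    relative_dirs = PurePosixPath(file).relative_to(rootpath).parent.parts
    if len(relative_dirs) != len(facet_names):
        # The layout does not match the template, so no facets can be read.
        return {}
    return dict(zip(facet_names, relative_dirs))


template = "{project}/{dataset}/{exp}"
nested = facets_from_path(
    "/data", "/data/CMIP5/MPI-ESM-LR/historical/tas_Amon.nc", template
)
flat = facets_from_path("/data", "/data/tas_Amon.nc", template)
```

Under these assumptions, `nested` contains one entry per directory level, while `flat` is an empty dict because the file sits directly in the rootpath.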
@Peter9192 This should work better now, let me know if you still encounter problems.
@bouweandela thanks, I noticed #1609 and #1924. I'm currently trying it out.
Describe the bug
I tried out the new dataset facet search functionality following the example in https://github.com/ESMValGroup/ESMValCore/blob/main/notebooks/discovering-data.ipynb. However, I never seem to get any results. I tried with the same query, and also with CMIP5 instead of CMIP6.
Upon further investigation it looks like the problem, for CMIP5 at least, lies in the filtering out of identical facetsets. In my case, it finds 2 local files on my laptop, for which the line referenced here (ESMValCore/esmvalcore/dataset.py, line 111 in 69a284d) returns an empty dict. There seem to be some files on ESGF that do not have a complete facetset either.
Therefore, the `same` checker (ESMValCore/esmvalcore/dataset.py, lines 98 to 100 in 69a284d) will always see the empty (and otherwise the incomplete) set as a subset of every other set. This results in all files being filtered out.
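The underlying set behaviour can be reproduced in isolation. The sketch below is a simplified stand-in, not the code at the referenced lines: the symmetric form of `same` and the `drop_matching` loop are assumptions made for illustration, and the facet values are made up. It shows that an empty set is a subset of every set, so a single empty facet set can cause every entry to be treated as a duplicate.

```python
def same(facets_a, facets_b):
    """Assumed symmetric check: one set of facets is contained in the other."""
    return facets_a.issubset(facets_b) or facets_b.issubset(facets_a)


def drop_matching(facet_sets):
    """Keep only the sets that are not 'same' as any other set in the list.

    Hypothetical filter loop; the real logic in dataset.py may differ.
    """
    return [
        facets
        for i, facets in enumerate(facet_sets)
        if not any(
            same(facets, other)
            for j, other in enumerate(facet_sets)
            if i != j
        )
    ]


complete_a = frozenset({("dataset", "A"), ("exp", "historical")})
complete_b = frozenset({("dataset", "B"), ("exp", "historical")})
empty = frozenset()  # a file for which no facets could be determined

without_empty = drop_matching([complete_a, complete_b])
with_empty = drop_matching([complete_a, complete_b, empty])
```

Here `without_empty` keeps both complete sets, while `with_empty` is an empty list: the empty set matches everything, and everything matches the empty set.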
Changing the `same` function from above to `facets_a.issubset(facets_b)` fixes the issue for me, but now it also returns incomplete facetsets which still have wildcards in them. Can we somehow require that the `facetset` must be complete?
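One possible shape for such a completeness requirement is sketched below. The helper `is_complete`, the choice of required facet names, and the use of `*` as the wildcard marker are assumptions for illustration, not existing ESMValCore API.

```python
def is_complete(facets, required):
    """Return True if every required facet has a concrete, non-wildcard value.

    Hypothetical helper; 'required' is whatever list of facet names the
    caller considers mandatory.
    """
    return all(
        key in facets and "*" not in str(facets[key])
        for key in required
    )


required = ("dataset", "exp")
candidates = [
    {"dataset": "A", "exp": "historical"},  # complete
    {"dataset": "*", "exp": "historical"},  # still contains a wildcard
    {},                                     # no facets found at all
]
complete_only = [facets for facets in candidates if is_complete(facets, required)]
```

Filtering on a check like this before removing duplicates would keep empty and wildcard-only facetsets out of the comparison entirely.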