Merge reload_boutdataset() functionality into open_boutdataset() #137
johnomotani merged 5 commits into master from
Hello @johnomotani! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-09-10 12:24:52 UTC
Global coordinates are useful for later re-loading (using combine="by_coords") Datasets which were split up, so use add_geometry() to include a coordinate for each dimension even if geometry=None so no physical coordinates are needed.
Use combine='by_coords' option to xr.open_mfdataset() so reload_boutdataset() can load datasets that were saved split (in any dimension) into several files. pre_squashed option to reload_boutdataset() is no longer needed, so is removed.
Remove reload_boutdataset() function, putting its functionality into open_boutdataset. Also removes pre_squashed argument - datasets saved with xBOUT are automatically detected now.
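To illustrate why a global coordinate on every dimension makes split files recombinable, here is a minimal plain-Python sketch of coordinate-based recombination. It is not xarray's implementation of combine='by_coords', and the function name `combine_by_coord` and the variable names `x` and `n` are made up for the example.

```python
# Plain-Python sketch of coordinate-based recombination.
# This is NOT xarray's implementation; combine_by_coord, "x" and "n" are
# hypothetical names used only for illustration.

def combine_by_coord(pieces):
    """Concatenate 1D pieces in order of their first global coordinate value."""
    # Because every piece stores global (not per-file) coordinate values,
    # sorting on them recovers the original order of the pieces.
    ordered = sorted(pieces, key=lambda piece: piece["x"][0])
    combined = {"x": [], "n": []}
    for piece in ordered:
        combined["x"].extend(piece["x"])
        combined["n"].extend(piece["n"])
    return combined


# Three pieces of one dataset, listed out of order as files on disk might be.
pieces = [
    {"x": [4, 5], "n": [0.4, 0.5]},
    {"x": [0, 1], "n": [0.0, 0.1]},
    {"x": [2, 3], "n": [0.2, 0.3]},
]

combined = combine_by_coord(pieces)
```

If each piece stored only a local index starting at 0, the sort key would be identical for every piece and the original order could not be recovered, which is the motivation for adding the coordinates even when geometry=None.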
f255b4a to e83f3a0
@TomNicholas about backward compatibility - the test for a file that was saved by xBOUT relies on metadata being saved in the way implemented in #124, which was only merged in May, so I guess you probably have data files that don't have
@johnomotani I was actually just testing this!
I don't think I do - I tried it on my big 3D run that I resaved and it seems to work... This necessitates corresponding changes in xstorm, which I was just trying to make, though.
TomNicholas left a comment
Thanks for this @johnomotani
I like the idea, I probably should have just used reload_boutdataset originally but this is a better solution.
Only thing I would say is that I don't think there should be a filetype_fake return type, because code should never behave differently depending on whether or not it thinks it is being tested, otherwise the tests aren't testing the real code!
I agree, but the trouble is the test suite was timing out on Travis (taking more than 50 minutes). Having the ability to create a test dataset without writing to disk and reading from disk, when testing non-I/O related features, gets this down to ~34 minutes. Pragmatically then we have a choice between testing fewer options, or introducing the complication of a second code path to avoid having to write to and read from disk... I prefer the second one (and pushed it in #132), but as I write this I realise that there probably needs to be a test that the from-disk and 'faked' versions do exactly the same thing... I'll make another PR for that.
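A check that the from-disk and in-memory paths agree could look roughly like the sketch below. It uses JSON files and plain dicts as stand-ins for the real files and Datasets, and every function name in it (`build`, `open_from_disk`, `open_from_memory`, `check_equivalence`) is hypothetical rather than part of xBOUT.

```python
import json
import os
import tempfile

# Sketch only: JSON files and dicts stand in for real output files and
# Datasets. All function names here are hypothetical, not xBOUT's API.


def build(records):
    """Shared post-processing that both code paths go through."""
    return {"n": [r["n"] for r in records], "metadata": {"pieces": len(records)}}


def open_from_disk(paths):
    """Slow path: read every piece from a file, then build the result."""
    records = []
    for path in paths:
        with open(path) as f:
            records.append(json.load(f))
    return build(records)


def open_from_memory(records):
    """Fast path used by non-I/O tests: skip the disk round trip."""
    return build(records)


def check_equivalence(records):
    """Write the records to disk and compare the two code paths."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i, record in enumerate(records):
            path = os.path.join(tmp, f"piece{i}.json")
            with open(path, "w") as f:
                json.dump(record, f)
            paths.append(path)
        return open_from_disk(paths) == open_from_memory(records)


records = [{"n": 1.0}, {"n": 2.0}]
equivalent = check_equivalence(records)
```

Running one such comparison per dataset layout keeps the slow disk round trip to a handful of tests while the rest of the suite uses the in-memory path.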
...also maybe 'fake' was a bad choice of name. I was thinking of the ability to create a BoutDataset from a list of xr.Dataset as 'just for testing' but in principle it's a feature, e.g. you could run a simulation with
TomNicholas left a comment
I didn't realise there was a practical problem with tests timing out. I'm quite surprised that they take that long... But we can deal with that later - this general approach should still be merged I think.
Yes, when I added the
Codecov Report
@@            Coverage Diff             @@
##           master     #137      +/-   ##
==========================================
- Coverage   77.88%   77.62%   -0.26%
==========================================
  Files          14       14
  Lines        2139     2150      +11
  Branches      480      486       +6
==========================================
+ Hits         1666     1669       +3
- Misses        304      308       +4
- Partials      169      173       +4
Removes reload_boutdataset(). Instead updates open_boutdataset() to automatically detect and reload datasets saved from xBOUT, including those saved with separate_vars=True. The pre_squashed argument is no longer needed so is removed.
Replaces #136.
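As a rough illustration of the automatic detection, the sketch below dispatches on whether a file's attributes carry a marker left by a previous save. The marker key `"metadata"` and both function names are assumptions for the example; the actual check in xBOUT depends on how #124 stores the metadata.

```python
# Sketch of marker-based detection. The "metadata" key and the function
# names are assumptions for illustration, not xBOUT's actual check.


def saved_by_xbout(attrs):
    """Return True if the file attributes carry the assumed save marker."""
    return "metadata" in attrs


def choose_open_path(attrs):
    """Pick the code path a single open function would take."""
    if saved_by_xbout(attrs):
        # Previously saved by the library: reload it, no user flag needed.
        return "reload"
    # Raw simulation output: go through the normal opening path.
    return "open_raw"


raw_attrs = {"title": "raw simulation output"}
resaved_attrs = {"title": "resaved", "metadata": "{...}"}

paths = [choose_open_path(raw_attrs), choose_open_path(resaved_attrs)]
```

With a check like this the caller never has to say how the file was produced, which is what makes an argument such as pre_squashed redundant.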
@TomNicholas maybe overkill for your issue, but I think this solution is nicer! Does it fix the problem for you? If the broken backward compatibility with pre_squashed is going to be much effort for you to fix (the re-opened Datasets will be different now, e.g. will include a metadata attribute), then we could keep pre_squashed with the old behaviour on a deprecation cycle.
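If the deprecation cycle were wanted, one common pattern is to keep accepting the old argument while emitting a DeprecationWarning. The sketch below uses only the standard `warnings` module; `open_sketch` and its return value are hypothetical stand-ins, not the real open_boutdataset() signature.

```python
import warnings

# Sketch of a deprecation cycle for a keyword argument. open_sketch and its
# return value are hypothetical stand-ins, not the real function.


def open_sketch(path, pre_squashed=None):
    """Accept the old argument for now, but warn that it will go away."""
    if pre_squashed is not None:
        warnings.warn(
            "pre_squashed is deprecated; saved datasets are detected "
            "automatically",
            DeprecationWarning,
            stacklevel=2,
        )
    # Old behaviour is preserved while the argument is still accepted.
    return {"path": path, "pre_squashed": bool(pre_squashed)}


# Usage: capture the warning so the example runs quietly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = open_sketch("data.nc", pre_squashed=True)
```

Defaulting to None rather than False lets the function tell "argument not passed" apart from an explicit pre_squashed=False, so only callers still using the old argument see the warning.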