Merge reload_boutdataset() functionality into open_boutdataset() by johnomotani · Pull Request #137 · boutproject/xBOUT

johnomotani · 2020-09-09T21:29:37Z

Removes reload_boutdataset(). Instead updates open_boutdataset() to automatically detect and reload datasets saved from xBOUT, including those saved with separate_vars=True. pre_squashed argument is no longer needed so is removed.

Replaces #136.

@TomNicholas maybe overkill for your issue, but I think this solution is nicer! Does it fix the problem for you? If the broken backward compatibility with pre_squashed is going to be much effort for you to fix (the re-opened Datasets will be different now, e.g. will include a metadata attribute), then we could keep pre_squashed with the old behaviour on a deprecation cycle.

pep8speaks · 2020-09-09T21:29:46Z

Hello @johnomotani! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-09-10 12:24:52 UTC

Global coordinates are useful for later re-loading (using combine="by_coords") Datasets which were split up, so use add_geometry() to include a coordinate for each dimension even if geometry=None so no physical coordinates are needed.

Use combine='by_coords' option to xr.open_mfdataset() so reload_boutdataset() can load datasets that were saved split (in any dimension) into several files. pre_squashed option to reload_boutdataset() is no longer needed, so is removed.

Remove reload_boutdataset() function, putting its functionality into open_boutdataset. Also removes pre_squashed argument - datasets saved with xBOUT are automatically detected now.

johnomotani · 2020-09-10T10:40:46Z

@TomNicholas about backward compatibility - the test for a file that was saved by xBOUT relies on metadata being saved in the way implemented in #124, which was only merged in May, so I guess you probably have data files that don't have metadata:keep_yboundaries in the attributes? So I'll put pre_squashed back in for backward compatibility.

TomNicholas · 2020-09-10T10:50:42Z

@johnomotani I was actually just testing this!

the test for a file that was saved by xBOUT relies on metadata being saved in the way implemented in #124, which was only merged in May, so I guess you probably have data files that don't have metadata:keep_yboundaries in the attributes?

I don't think I do - I tried it on my big 3D run that I resaved and it seems to work...

This necessitates corresponding changes in xstorm which I was just trying to do though

TomNicholas

Thanks for this @johnomotani

I like the idea, I probably should have just used reload_boudataset originally but this is a better solution.

Only thing I would say is that I don't think there should be a filetype_fake return type, because code should never behave differently depending on whether or not it thinks it is being tested, otherwise the tests aren't testing the real code!

johnomotani · 2020-09-10T11:58:22Z

Only thing I would say is that I don't think there should be a filetype_fake return type, because code should never behave differently depending on whether or not it thinks it is being tested, otherwise the tests aren't testing the real code!

I agree, but the trouble is the test suite was timing out on Travis (taking more than 50 minutes). Having the ability to create a test dataset without writing to disk and reading from disk, when testing non-I/O related features, gets this down to ~34 minutes. Pragmatically then we have a choice between testing fewer options, or introducing the complication of a second code path to avoid necessarily writing-to/reading-from disk... I prefer the second one (and pushed it in in #132), but as I write this I realise that there probably needs to be a test that the from-disk and 'faked' versions do exactly the same thing... I'll make another PR for that.

johnomotani · 2020-09-10T12:02:26Z

...also maybe 'fake' was a bad choice of name. I was thinking of the ability to create a BoutDataset from a list of xr.Dataset as 'just for testing' but in principle it's a feature, e.g. you could run a simulation with boutcore, stick the outputs into one or several Datasets (might need more features added in boutcore to gather arrays to a single process or something, so this might not actually be possible right now...) and then use open_boutdataset() to convert them into a BoutDataset. Honestly though, it's just useful for speeding the tests up!

TomNicholas

I didn't realise there was a practical problem with tests timing out. I'm quite surprised that they take that long... But we can deal with that later - this general approach should still be merged I think.

johnomotani · 2020-09-10T13:11:26Z

Yes, when I added the Region functionality, I added a lot of tests for different cases, and when they all run the test suite gets pretty long! I've added more tests of the new open-from-list-of-Datasets approach in #139.

codecov-commenter · 2020-09-10T13:33:49Z

Codecov Report

Merging #137 into master will decrease coverage by 0.25%.
The diff coverage is 75.55%.

@@            Coverage Diff             @@
##           master     #137      +/-   ##
==========================================
- Coverage   77.88%   77.62%   -0.26%     
==========================================
  Files          14       14              
  Lines        2139     2150      +11     
  Branches      480      486       +6     
==========================================
+ Hits         1666     1669       +3     
- Misses        304      308       +4     
- Partials      169      173       +4

Impacted Files	Coverage Δ
xbout/geometries.py	`78.87% <73.33%> (+0.30%)`	⬆️
xbout/load.py	`79.40% <75.34%> (-1.83%)`	⬇️
xbout/__init__.py	`100.00% <100.00%> (ø)`
xbout/utils.py	`80.42% <100.00%> (-0.07%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec95524...59bf07e. Read the comment docs.

johnomotani added the enhancement New feature or request label Sep 9, 2020

johnomotani requested a review from TomNicholas September 9, 2020 21:29

johnomotani added 4 commits September 9, 2020 22:33

Add trivial coordinates even when geometry=None

315a78d

Global coordinates are useful for later re-loading (using combine="by_coords") Datasets which were split up, so use add_geometry() to include a coordinate for each dimension even if geometry=None so no physical coordinates are needed.

Put reload functionality into open_boutdataset()

7fde555

Remove reload_boutdataset() function, putting its functionality into open_boutdataset. Also removes pre_squashed argument - datasets saved with xBOUT are automatically detected now.

Test reloading dataset saved with separate_vars=True

e83f3a0

johnomotani force-pushed the merge-reload-into-open branch from f255b4a to e83f3a0 Compare September 9, 2020 21:34

Fix grid load tests

59bf07e

TomNicholas requested changes Sep 10, 2020

View reviewed changes

TomNicholas approved these changes Sep 10, 2020

View reviewed changes

johnomotani merged commit c90b790 into master Sep 10, 2020

johnomotani deleted the merge-reload-into-open branch September 10, 2020 13:40

johnomotani mentioned this pull request Sep 10, 2020

Bugfix: Metadata doesn't exist if pre-squashed #136

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge reload_boutdataset() functionality into open_boutdataset()#137

Merge reload_boutdataset() functionality into open_boutdataset()#137
johnomotani merged 5 commits intomasterfrom
merge-reload-into-open

johnomotani commented Sep 9, 2020

Uh oh!

pep8speaks commented Sep 9, 2020 •

edited

Loading

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

TomNicholas commented Sep 10, 2020

Uh oh!

TomNicholas left a comment

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

TomNicholas left a comment

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

codecov-commenter commented Sep 10, 2020 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

johnomotani commented Sep 9, 2020

Uh oh!

pep8speaks commented Sep 9, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comment last updated at 2020-09-10 12:24:52 UTC

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

TomNicholas commented Sep 10, 2020

Uh oh!

TomNicholas left a comment

Choose a reason for hiding this comment

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

TomNicholas left a comment

Choose a reason for hiding this comment

Uh oh!

johnomotani commented Sep 10, 2020

Uh oh!

codecov-commenter commented Sep 10, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pep8speaks commented Sep 9, 2020 •

edited

Loading

codecov-commenter commented Sep 10, 2020 •

edited

Loading