Autobatch #543
Conversation
Thanks! I have some minor comments and think we can merge after those are addressed.
xnmt/train/regimens.py (outdated)

self.checkpoint_and_save(save_fct)
if self.should_stop_training(): break

def checkpoint_and_save(self, save_fct):
Maybe we can avoid code duplication by making this a subclass of SimpleTrainingRegimen instead of TrainingRegimen, and removing checkpoint_and_save and update?
Done. There is a little hack to call backward at the right time though, with a comment. Not sure if it is the best solution.
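For readers following along, here is a rough, self-contained sketch of the deferred-backward pattern being discussed. It is an illustration under assumptions, not xnmt's actual code: SimpleTrainingRegimenStub, AutobatchRegimenSketch, training_step and compute_loss are invented names standing in for the real regimen interface, and only the DyNet calls (dy.esum, forward, backward, trainer.update, dy.renew_cg) are real API.

```python
import dynet as dy

class SimpleTrainingRegimenStub:
    """Illustrative stand-in for xnmt's SimpleTrainingRegimen (checkpointing,
    logging, etc. omitted); NOT the real base class."""
    def __init__(self, trainer, update_every):
        self.trainer = trainer
        self.update_every = update_every

class AutobatchRegimenSketch(SimpleTrainingRegimenStub):
    """Collects per-sentence loss expressions and defers backward() until
    `update_every` losses have accumulated, so DyNet's autobatcher can batch
    the whole forward/backward computation."""
    def __init__(self, trainer, update_every):
        super().__init__(trainer, update_every)
        self.pending_losses = []

    def training_step(self, compute_loss, sentence):
        # compute_loss(sentence) only builds a loss expression on the current
        # computation graph; no forward/backward happens yet.
        self.pending_losses.append(compute_loss(sentence))
        if len(self.pending_losses) >= self.update_every:
            total = dy.esum(self.pending_losses)
            total.forward()        # one forward pass over the "virtual batch"
            total.backward()       # the deferred backward call discussed above
            self.trainer.update()
            self.pending_losses = []
            dy.renew_cg()          # start a fresh graph for the next batch
```

Deferring backward like this matters because DyNet's autobatcher can only group operations that coexist in the computation graph before the forward pass runs.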
@@ -0,0 +1,56 @@
# Similar to the standard setup, but with big data to compare
It would improve clarity to put all experiments in one file with a defaults experiment, and then several experiments that only change the relevant points. But this is not a well-documented feature, and it's also ok for me to just do this myself after merging this PR.
Thanks! I've started to work on the changes but might need a few more days. A few questions:

Hmm, I'm a bit of a newbie with GitHub reviews... Should I just click on "Resolve conversation" on the points I addressed? Do I ask for another review once I've addressed all the requested changes?
Sure, but you can also just write a message when you're done and I'll take another look; either way is fine. I think checking that the batch size is set through […]

Regarding the profiling, that would also be interesting information! We could probably also merge without that, but if you're planning on doing some profiling we could wait for that. In this case, you can prepend [WIP] to the title of the PR to communicate that this is not ready to merge.
Ok, I think it is ready for another review now. The only thing that should be missing from the previous review is unifying the YAML files into a single one: I'm not sure exactly how to do that. As for the profiling, I'm ok with this being merged without profiling since it might be useful as is. I can always do another pull request with an improved version later.
Sure, that sounds good! The code looks good to me, and I'll send a PR to simplify the config later.
Addressing #536
This pull request implements a new training regimen that leverages DyNet autobatching. It's very similar to SimpleTrainingRegimen but accumulates losses sequentially before computing the forward and backward passes. batch_size is expected to be fixed at 1, while update_every will encode the "actual batch size". This request also adds a few YAML files to benchmark standard batching vs. autobatching.