fix to beam search stopping criteria #572

Merged
merged 1 commit into master on May 23, 2019

Conversation

armatthews (Contributor)

This patch fixes beam search's stopping criterion. When searching with, e.g., a beam size of five, beam search should return the five best-scoring hypotheses. Previously, it was returning the five shortest hypotheses. Thankfully, with modest beam sizes the end-of-sentence marker does not usually enter the beam until it is at least somewhat appropriate, so the effect is shockingly small.

The new termination strategy is to wait until the score of the best active hypothesis is lower than that of the Nth-best complete hypothesis (where N = beam size). Assuming each subsequent word contributes a negative score, this criterion guarantees that no active hypothesis can ever become better than the N complete hypotheses already found.
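For concreteness, here is a minimal sketch of that termination check (not the actual xnmt code; `should_stop`, `active_scores`, and `completed_scores` are illustrative names), assuming hypothesis scores are sums of word log-probabilities and therefore non-increasing as hypotheses grow:

```python
def should_stop(active_scores, completed_scores, beam_size):
    """Return True once beam search can safely terminate.

    active_scores: log-prob scores of hypotheses still being expanded
    completed_scores: log-prob scores of hypotheses that have emitted </s>
    beam_size: N, the number of hypotheses to return
    """
    if len(completed_scores) < beam_size:
        return False  # not enough complete hypotheses yet
    nth_best_complete = sorted(completed_scores, reverse=True)[beam_size - 1]
    best_active = max(active_scores) if active_scores else float("-inf")
    # Each additional word adds a negative log-prob, so once the best active
    # hypothesis already scores below the Nth-best complete one, no active
    # hypothesis can ever overtake the N complete hypotheses found so far.
    return best_active < nth_best_complete
```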

To verify the effectiveness of this patch, I ran some experiments with a Chinese-English system trained on TED data and decoded the dev set with a beam size of 5, with the following results:
Before: 17.45 BLEU 50.5|23.3|12.6|7.2 (brev=0.967)
After: 17.50 BLEU 50.2|23.1|12.5|7.1 (brev=0.976)
206 of 4558 sentences (4.5%) have different translations. The length ratio went from 96.79% to 97.66%.
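(For reference, the reported brevity penalties are consistent with the length ratios under the standard BLEU brevity penalty, exp(1 - 1/length_ratio) for ratios below 1; a quick check:)

```python
import math

# Standard BLEU brevity penalty for hypothesis/reference length ratios < 1.
for ratio in (0.9679, 0.9766):
    print(f"length ratio {ratio:.2%} -> brev = {math.exp(1 - 1 / ratio):.3f}")
# length ratio 96.79% -> brev = 0.967
# length ratio 97.66% -> brev = 0.976
```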

One might also think that beam size interacts with this: perhaps if the beam size were larger, the end-of-sentence marker would show up earlier in the search and hamper the results. To check, I re-ran with a beam size of 50:
Before: 17.69 BLEU 50.9|23.7|12.9|7.4 (brev=0.958)
After: 17.70 BLEU 50.2|23.1|12.5|7.1 (brev=0.976)
This time, 299 of 4558 sentences (6.6%) have different translations. The length ratio went from 95.87% to 97.69%.

Surprisingly, the bigger beam size does not show larger gains. Nonetheless, this fix seems to yield small improvements and makes the length ratio more stable across different beam sizes.

neubig merged commit ca4eac4 into master on May 23, 2019
msperber added a commit to msperber/xnmt that referenced this pull request Dec 9, 2019
fix typo

fix cudnn lstm

better cudnn lstm padding check

fix tb-reporting LR for dynet optimizers

fix longtensor device

cudnn lstm: move seq_lengths to device

fix to beam search stopping criteria (neulab#572)

torch.no_grad() for LossEvalTask

no_grad() for inference code

update doc string

unit tests for cudnn lstm (passing even though training behavior seems buggy)

comment for cudnn lstm

save memory by freeing training data

fix a unit test

initial resource code

fix type annot

implement ResourceFile synta

resolve ResourceFile when loading saved models

made resource naming and _remove_data_dir() compatible

more convenient message for existing log files

support recent pyyaml

new 'pretend' settings

standard example: revert back no epochs

fix error when trying to subsample more sentences than are in the training set

fix previous fix

cudnn lstm: use total_length option

attempted cudnn lstm fix

removed unused code in cudnn lstm

fix missing train=True events in multi task training

attempt transplosed plot fix

fix code indentation in unicode tokenizer

OOVStatisticsReporter: don't crash in case of empty hypo

SkipOutOfMemory for simple training regimen (pytorch only)

cleaned up manual tests; fix grad logging

fix missing desc string in WER/CER scores
msperber added a commit to msperber/xnmt that referenced this pull request Dec 9, 2019
worked on optimizers, model saving+reverting

tested model loading

param init working

handling device

run dynet backend w/o torch installed

WIP:  toward using torch backend w/o dynet installed

introduced decorators for backends

WIP: toward working w/o dynet installed

finished separating out dynet and pytorch code

settings and command line arguments

remove dynet_profiling flag

building API doc works

fixed some unit test problems

make tensorboard optional, it's causing some interference with unit tests

run unit tests in either dynet or torch mode

bugfix: reload_example

skip unsupported test_beam_search + better error message

WIP: bug fixes + skip unit tests unsupported by torch backend

all unit tests running or skipped if unsupported by backend

merge dynet/torch classifier unit tests to use same config file

backend-agnostic LM running

update .gitignore

seq_labeler works independently of backend

torch/GPU fixes

fix loss function

add missing call to optimizer.step()

init forget gate params to 1

flexible loop-based lstm functional

init forget gate biases to 1

fix bug when no mask is set

fix mask device

fix LSTMCell device

fixed case of multiple layers for flexible lstm

implemented variational dropout for LSTMs

fixed device for dropout masks

remove unused code

wiring together of uni LSTMs works regardless of backend

seq2seq standard example working

fix torch MLP attender on GPU

NoBridge works with torchh backend

missing embedder features; fix multi-layer bilstm

torch version of DenseWordEmbedder

GPU fix for dense embedder

small bugfix

small fix

another small fix

another try

speech example working

fix reporting

ensembling working

runnable self-attention torch version

attempt at fixing longtensor device

attempt at fixing longtensor device

fix self-attention lineaar transforms device

fix layer-norm device

another device fix

doc update

fixes to kftt recipe

fix broadcasting issue

workaround for speech features for very short audios

remove unused code

label smoothing w/ pytorch backend

fix linear bridge and multilayer rnn decoder for torch backend

resolve deprecation warning

added amsgrad

minor cleanup

refactor transforms

fix to lazy expression sequence

fixed downsampling for TransformSeqTransducer

made param initialization more convenient

introduce BaseParamCollection to reduce code duplication

pytorch version of batchnorm

mini cleanup

CNN and transposed sequence tensors

hide InitializableModuleList(nn.ModuleList) from dynet backend

fix previous fix

fixed typo

fixed None check

MaxPoolCNNLayer: pooling optional

remove unused files

h5/npz reader refactored and support delta features

fix masking for subsampling MaxPoolCNNLayer

fix reverting tranposed torch tensors

implemented DotAttenderTorch

adam and sgd support all pytorch-implemented features, including weight decay

fix unit tests

add some unit tests supported by torch backend by now

WIP: fixed more unit tests

fix for torch 0.4.1

more 0.4.1 fixes

fix label smoothing

fix unit test

all unit tests passing

less verbose data loading

uncomment tensorboard logging

remove unused commandline_args

move train loss tracker

fix loss tracking when losses are averaged across minibatches

fix tensorboard step counter

minor doc fix

implemented skip_noisy

consistency rename for layer norm

fix feat stacking for older numpy version

separate out clip_grads and rescale_grads

fix major bug: pytorch gradients were not reset properly

fix sentpiece output proc

clean up comments

fix typo

small code simplification

small code simplification

clean up import

fix for same batch multitask regimen

set pytorch seed

fix numpy resize issue

fix typo

bug check for cudnn lstm

fix cnn device

expr seq gpu fix

fix typo

remove import

allow minor upgrade of pyyaml

anomaly detection

remove some comments

fix loss tracker when using multiple losses

fix batched L2 norm computation for fix_norm option

safer expression sequence arguments checks

update tensorboard writer to support histograms

fix LazyNumpyExpressionSequenceDynet with transposed tensors

print torch computation graph

fix print_cg_torch

gitignore visualized computation graphs

update gitignore

fixed feedback loss for batch size > 1

fix reporting of sentence losses

tensorboard visualize embeddings

TensorboardCustomWriter coding style

fix add_scalars

check/delete both .log and .log.tb

fix skip_noisy when parts of the params have not received gradients

fix loss tracker behavior for non-accumulative mode: accumulate minibatches since last report instead of reporting only most recent minibatch at time of report

calc_context tensor dimensions consistent between dynet and torch backends

tensorboard-log gradient norm

fix commit that made dy/torch dimension consistent

fix UniLSTMSeqTransducer, both torch and dynet implementations had bugs

fix dropout mask batch size for per-timestep rnn unfolding

fix embedder with numpy initializer

safer check for TB logger being ready

WIP: traceable tensor methods

first version of trace working

small bugfix to trace

fix dim() for ReversedExpressionSequenceTorch

include decoder state and final transducer state

numpy initializer for torch backend

for consistent behavior: dynet's numpy initializer checks dimensions of input array

fix torch's lstm forget gate initialization

fix to LazyNumpyExpressionSequenceTorch

turn off tracing by default

trying file reorg

move tiny model

fix reload test

added manual test (WIP)

remove test data from examples data dir

implemented InitializerSequence

fix switch of H/C when using lstm as decoder

manual training unit test running

added two-layer manual test

refactor InitializerSequence to use __getitem__

update bi-lstm's handling of sequence initializers

add manual test w/ bi-lstms

expanded manual tests

disable sparse dynet updates

better error msg for mismatching init arrays

manual gradients unit test

introduce ManualTestingBaseClass

seq2seq grad check

more work on unit tests; singled out failing tests for seq2seq training with more than one step

working on manual tests

updated lstm params match with some tricks

effectively disable the redundant lstm bias_hh

all manual tests passing now

WIP: manual full LAS test

work on full las manual test

mlp attender supports manual init

better error msg

pyramidal lstm supports param_init, bias_init

intermediate las model: passing the trained weights manual check

added fix_norm to test

fix (minor?) bug with label smoothing

added label smoothing to manual test

more work on manual tests

produced a test failing with SGD as well (not only Adam)

attender fix?

manual classifier tests: refactor + better precision

working on basic seq2seq test

basic s2s tests refactored

worked up to failing mlp att test

cleaned up mlp attender and tests

finished unit test refactoring round

add basic sec2sec grad test + clean up some

manual tests fairly complete and passing

WIP: load dynet weights into pytorch tensors

loading dynet models into pytorch backend works

cleaner solution for ignoring redundant lstm bias

remove reference to outdated backward hook

grad rescaling unit test

fix lattice attender: incorrect var name

simplified / unified grad clip configuration

fix tensorboardx version

document tensor tools

fix type annotation

fix variational recurrent dropout

consistent use of sent_len()

replace usages of dim() by more readable semantic accessors

TB: always log grads, + log LR

fix typo

fix cudnn lstm

better cudnn lstm padding check

fix tb-reporting LR for dynet optimizers

fix longtensor device

cudnn lstm: move seq_lengths to device

fix to beam search stopping criteria (neulab#572)
neubig pushed a commit that referenced this pull request Feb 8, 2020