Lattice-to-sequence #547

Merged · 43 commits · Nov 19, 2018

Conversation

msperber
Contributor

This is a port of the code from my lattice-to-sequence paper (https://arxiv.org/abs/1704.00559). It includes the following components:

  • preproc.LatticeFromPlfExtractor: creates node-labeled lattices with properly normalized lattice scores from given lattices in .plf format.
  • sent.Lattice / sent.LatticeNode: data structure
  • input_readers.LatticeReader: can read lattices from the output of the extractor
  • LatticeLSTMTransducer: the lattice LSTM
  • attenders.LatticeBiasedMlpAttender: attention that is biased via the lattice confidence scores
  • A unit test with toy data
  • Note: the lattice LSTM currently does not consider lattice scores (unlike the attention), and the attention bias has no peakiness coefficient, since both yielded only minor gains in the paper.
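To make the data structure concrete, here is a minimal, hypothetical sketch of a node-labeled lattice along the lines of what sent.Lattice / sent.LatticeNode represent (the names and fields below are illustrative, not the actual xnmt classes): each node carries a word and a normalized confidence score, edges encode the DAG, and a plain sentence is just the degenerate lattice where node i connects only to node i+1.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LatticeNode:
    """One node of a node-labeled lattice (illustrative sketch)."""
    word: str
    log_prob: float  # normalized lattice confidence score, log p <= 0
    successors: List[int] = field(default_factory=list)

@dataclass
class Lattice:
    """Nodes are assumed to be listed in topological order."""
    nodes: List[LatticeNode]

    def add_edge(self, src: int, dst: int) -> None:
        self.nodes[src].successors.append(dst)

    def predecessors(self, idx: int) -> List[int]:
        # Inverse of the successor lists; fine for small toy lattices.
        return [i for i, n in enumerate(self.nodes) if idx in n.successors]

# A linear sentence is the special case where node i links only to i+1:
sent = Lattice([LatticeNode(w, 0.0) for w in ["a", "b", "c"]])
for i in range(len(sent.nodes) - 1):
    sent.add_edge(i, i + 1)
```

A lattice LSTM then generalizes the usual recurrence by combining the hidden states of all predecessor nodes (rather than the single previous time step) when computing each node's state.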

There is also a bit of refactoring to make preproc.py more YAML-like, which will not need a review.
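The lattice-biased attention mentioned above (attenders.LatticeBiasedMlpAttender) can be sketched as follows: the per-node lattice log-confidences are added to the attention logits before the softmax, so low-confidence lattice nodes receive proportionally less attention mass. This is an illustrative stand-alone sketch under that assumption, not the actual xnmt implementation (which, per the note above, omits the paper's peakiness coefficient):

```python
import math
from typing import List

def lattice_biased_attention(scores: List[float],
                             log_confidences: List[float]) -> List[float]:
    """Bias attention scores with per-node lattice log-probabilities,
    then normalize with a numerically stable softmax (sketch)."""
    biased = [s + c for s, c in zip(scores, log_confidences)]
    m = max(biased)
    exps = [math.exp(b - m) for b in biased]
    z = sum(exps)
    return [e / z for e in exps]

# Two nodes with equal attention logits but unequal lattice confidence:
# the attention mass ends up proportional to the confidences.
weights = lattice_biased_attention([1.0, 1.0],
                                   [math.log(0.9), math.log(0.1)])
```

With equal logits, the bias makes the attention weights exactly proportional to the node confidences (0.9 vs. 0.1 here); with unequal logits, it interpolates between model attention and lattice confidence in log space.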

@msperber msperber requested a review from neubig November 19, 2018 10:24
@neubig
Contributor

neubig commented Nov 19, 2018

Hi, this basically looks good! I just had a quick clarification that I couldn't resolve immediately by looking at the code, which is how the lattice annotation interacts with batching, etc. Previously we had a somewhat sub-optimal implementation that basically assumed a sentence was a list of word IDs, and downstream operations such as batching were predicated on this. Has this been fixed in this commit, or previously?

I'm mostly asking because @armatthews @cindyxinyiwang and I were also discussing implementing the Eriguchi et al. tree-to-sequence model, and would run into the same problem.

@msperber
Contributor Author

I think this issue should be solved by the introduction of the Batch and Sentence classes, which replace the nested lists of unclear semantics from before. Is that what you were referring to? This PR doesn't change anything about that; it assumes the batch size is set to 1, and once #543 is merged it should be easy to speed things up via auto-batching.

@neubig
Contributor

neubig commented Nov 19, 2018

OK maybe it does, thanks! Anyway, I browsed this briefly and think it's OK to merge, although I haven't checked carefully.

@msperber
Contributor Author

OK, thanks for taking a look!

@msperber msperber merged commit 8ee8fd5 into master Nov 19, 2018