Releases: nyu-mll/jiant
v1.2.1
v1.2.0
Highlighted changes:
- Add support for RoBERTa, XLM, and GPT-2 via `pytorch_transformers` 1.2 (a minimal loading sketch follows this list).
- Add support for pip installation (and moved the body of `main.py` and the `config` directory to accommodate that change).
- Fix a bug that produced invalid micro/macro average scores during validation.
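As a rough illustration of the first item, here is a minimal sketch of loading one of the newly supported encoders directly through `pytorch_transformers`. This is not jiant's internal code; the model identifier and calls follow that library's public API, and XLM or GPT-2 can be loaded the same way through their respective classes.

```python
import torch
from pytorch_transformers import RobertaModel, RobertaTokenizer

# Load a pretrained RoBERTa encoder and its tokenizer (illustrative only).
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
model.eval()

# Tokenize a sentence and run it through the encoder.
tokens = tokenizer.tokenize("jiant now supports RoBERTa.")
ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    hidden_states = model(ids)[0]  # (batch, seq_len, hidden_size)
print(hidden_states.shape)
```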
Minor changes:
- Refactor the old GPT (v1) implementation to use `pytorch_transformers`.
- Make the code that adds git status information to logs more robust.
- Minor cleanup to data loading and to MNLI data handling logic.
- Fix a short-lived bug invalidating hypothesis-only MNLI results.
- Restore (partial) support for sequence-to-sequence tasks, though with no fully supported demonstration tasks in place yet.
Dependency changes:
- Updated the `pytorch_transformers` requirement to 1.2.0.
- Updated the NLTK requirement to 3.4.5 to avoid a potential security issue.
v1.1.0
We expect another release within a week or two that will add support for RoBERTa (see #890), but this is a quick intermediate release now that XLNet support is stable/working.
Highlighted changes:
- Full support for XLNet and the whole-word-masking variants of BERT.
- Many small improvements to Google Cloud Platform/Kubernetes/Docker support.
- Add a small but handy option to automatically delete checkpoints when a job finishes.
- `max_vals` is now used when computing warmup time with optimizers that use warmup.
- New `auto` option for `tokenizer` chooses an appropriate tokenizer for any given input module.
- Some internal changes to how `<SOS>`/`<EOS>`/`[SEP]`/`[CLS]` tokens are handled during task preprocessing. This will require small changes to custom task code along the lines of what is seen in #845 (an illustrative sketch follows this list).
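For the last item, a hypothetical and heavily simplified helper showing the kind of model-aware boundary-token handling involved; the function name and branching here are illustrative assumptions, not jiant's actual API.

```python
def add_boundary_tokens(tokens, input_module):
    # Hypothetical helper, not jiant's API: BERT-style encoders expect
    # [CLS] ... [SEP], while jiant's older GPT/LSTM setups used <SOS> ... <EOS>.
    # The real per-model logic in jiant's preprocessing is more involved.
    if input_module.startswith("bert"):
        return ["[CLS]"] + tokens + ["[SEP]"]
    return ["<SOS>"] + tokens + ["<EOS>"]

print(add_boundary_tokens(["a", "nice", "movie"], "bert-base-uncased"))
print(add_boundary_tokens(["a", "nice", "movie"], "gpt2"))
```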
Dependency changes:
- AllenNLP 0.8.4 now required
- pytorch_transformers 1.0 now required when using BERT or XLNet.
Warnings:
- Upgrading to 1.1 will break existing checkpoints for BERT-based models.
v1.0.1
Bug fixes:
- Addresses an issue that prevented temporary checkpoints from being deleted.
v1.0.0
The first stable release of `jiant`.
Highlighted changes:
- Support for the SuperGLUE v2.0 set of tasks, including all the baselines discussed in the SuperGLUE paper.
- A simpler and more standard code structure.
- Cleaner, more-readable logs.
- Simplified logic for checkpointing and evaluation, with fewer differences between pretraining and target task training.
- Fewer deprecated/unsupported modules.
- Many small bug fixes and improvements to errors and warnings.
Dependency changes:
- Upgrade to AllenNLP 0.8.4, which adds the option to use the GitHub development version of pytorch-pretrained-bert, and with it, the whole-word-masking variants of BERT.
Warnings:
- Upgrading from 0.9 to 1.0 will break most older model checkpoints and cached preprocessed data.
"Can You Tell Me How to Get Past Sesame Street?" code
This release contains code to recreate part of the experiments from the paper "Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling". For the remaining experiments, see this branch.
v0.9.1
v0.9.0
The initial work-in-progress release coinciding with the launch of SuperGLUE.
Highlights:
We currently support two-phase training (pretraining followed by target task training) using various shared encoders (see the conceptual sketch after this list), including:
- BERT
- OpenAI GPT
- Plain Transformer
- Ordered Neurons (ON-LSTM) Grammar Induction Model
- PRPN Grammar Induction Model
We also have support for SuperGLUE baselines, sentence encoder probing experiments, and STILTS-style training.
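Below is a conceptual sketch of what two-phase training with a shared encoder means, using toy PyTorch modules and random data; this is not jiant's implementation, only an illustration of the pattern.

```python
import torch
import torch.nn as nn

# Toy shared encoder and two task-specific heads (placeholders, not jiant code).
shared_encoder = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
pretrain_head = nn.Linear(32, 3)  # e.g. a 3-way pretraining task
target_head = nn.Linear(32, 2)    # e.g. a 2-way target task

def train_phase(head, steps=5):
    # Train the shared encoder together with the given task head on random data.
    params = list(shared_encoder.parameters()) + list(head.parameters())
    opt = torch.optim.Adam(params, lr=1e-3)
    for _ in range(steps):
        x = torch.randn(8, 10, 16)                     # toy input batch
        y = torch.randint(0, head.out_features, (8,))  # toy labels
        _, h = shared_encoder(x)
        loss = nn.functional.cross_entropy(head(h[-1]), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

train_phase(pretrain_head)  # phase 1: pretraining task
train_phase(target_head)    # phase 2: target task, encoder weights carried over
```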
Examples
Example configurations can be found at https://github.com/nyu-mll/jiant/tree/master/config/examples
"ELMo's Friends" paper experiment code
r1 s2s decoder update (make more params active; add projection layer) (#…