README.md (+3, -3)
@@ -6,7 +6,7 @@ This implementation comprises **a script to load in the PyTorch model the weight
-The model classes and loading script are located in [model_py.py](model_pytorch.py).
+The model classes and loading script are located in [model_pytorch.py](model_pytorch.py).
 
 The names of the modules in the PyTorch model follow the names of the Variable in the TensorFlow implementation. This implementation tries to follow the original code as closely as possible to minimize the discrepancies.
@@ -15,7 +15,7 @@ This implementation thus also comprises a modified Adam optimization algorithm a
 - scheduled learning rate as [commonly used for Transformers](http://nlp.seas.harvard.edu/2018/04/03/attention.html#optimizer).
 
 ## Requirements
-To use the model it-self by importing [model_py.py](model_pytorch.py), you just need:
+To use the model it-self by importing [model_pytorch.py](model_pytorch.py), you just need:
 - PyTorch (version >=0.4)
 
 To run the classifier training script in [train.py](train.py) you will need in addition:
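
For context on the `## Requirements` hunk above, here is a minimal sketch of what importing the model from [model_pytorch.py](model_pytorch.py) looks like after the rename. Only `TransformerModel(args)` and `load_openai_pretrained_model(model)` are confirmed by the next hunk; the `DEFAULT_CONFIG` object and the exact import line are assumptions about the loading script.

```python
# Minimal sketch, assuming model_pytorch.py exposes these three names.
# Only TransformerModel(args) and load_openai_pretrained_model(model) are
# visible in the diff below; DEFAULT_CONFIG is an assumed hyperparameter container.
from model_pytorch import TransformerModel, load_openai_pretrained_model, DEFAULT_CONFIG

args = DEFAULT_CONFIG                # assumed: default hyperparameters for the pretrained model
model = TransformerModel(args)       # builds the Transformer stack
load_openai_pretrained_model(model)  # loads OpenAI's pre-trained weights into the PyTorch modules
```

Per the requirements hunk, this much should only need PyTorch (version >=0.4).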
@@ -37,7 +37,7 @@ model = TransformerModel(args)
 load_openai_pretrained_model(model)
 ```
 
-This model generates Transformer's hidden states. You can use the `LMHead` class in [model.py](model.py) to add a decoder tied with the weights of the encoder and get a full language model. You can also use the `ClfHead` class in [model.py](model.py) to add a classifier on top of the transformer and get a classifier as described in OpenAI's publication. (see an example of both in the `__main__` function of [train.py](train.py))
+This model generates Transformer's hidden states. You can use the `LMHead` class in [model_pytorch.py](model_pytorch.py) to add a decoder tied with the weights of the encoder and get a full language model. You can also use the `ClfHead` class in [model_pytorch.py](model_pytorch.py) to add a classifier on top of the transformer and get a classifier as described in OpenAI's publication. (see an example of both in the `__main__` function of [train.py](train.py))
 
 To use the positional encoder of the transformer, you should encode your dataset using the `encode_dataset()` function of [utils.py](utils.py). Please refer to the beginning of the `__main__` function in [train.py](train.py) to see how to properly define the vocabulary and encode your dataset.
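
To illustrate the paragraph changed in the last hunk, a hedged sketch of attaching the two heads follows. The constructor signatures (`LMHead(model, args)`, `ClfHead(clf_token, args, n_class)`) and the `clf_token`/`n_class` values are assumptions, not confirmed by this diff; the `__main__` function of [train.py](train.py) remains the authoritative example.

```python
# Hedged sketch: adding heads on top of the Transformer's hidden states.
# Constructor signatures below are assumptions; see train.py's __main__
# for the repository's actual usage.
from model_pytorch import (TransformerModel, LMHead, ClfHead,
                           load_openai_pretrained_model, DEFAULT_CONFIG)

args = DEFAULT_CONFIG
model = TransformerModel(args)
load_openai_pretrained_model(model)

# Language modeling: a decoder tied with the weights of the encoder.
lm_head = LMHead(model, args)  # assumed signature

# Classification, as in OpenAI's publication: classify from the hidden state
# at a special [classify] token appended to each input.
n_class = 2        # hypothetical: a binary task
clf_token = 40480  # hypothetical: vocabulary index of the special [classify] token
clf_head = ClfHead(clf_token, args, n_class)  # assumed signature
```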
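
Likewise for the closing paragraph, a hedged sketch of the dataset-encoding step. The `TextEncoder` class, the vocabulary file paths, and the exact `encode_dataset()` signature are assumptions drawn from how [utils.py](utils.py) is used elsewhere in the repo; the beginning of `__main__` in [train.py](train.py) shows how to properly define the vocabulary and encode a dataset.

```python
# Hedged sketch: BPE-encoding text splits before feeding them to the model.
# Module names, file paths, and the encode_dataset() signature are assumptions;
# the start of train.py's __main__ is the reference.
from text_utils import TextEncoder  # assumed: the BPE encoder shipped with the repo
from utils import encode_dataset

# assumed default vocabulary files
text_encoder = TextEncoder("model/encoder_bpe_40000.json", "model/vocab_40000.bpe")

train_texts = ["a first training sentence", "a second one"]
val_texts = ["a validation sentence"]

# assumed: each positional argument is one split (a tuple of text fields), and
# the function returns the same structure with token-id lists instead of strings
(train_ids,), (val_ids,) = encode_dataset((train_texts,), (val_texts,),
                                          encoder=text_encoder)
```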