
Commit c207595 (parent 1d4c041)

Proof-reading


README.md (2 additions, 2 deletions)
````diff
@@ -78,8 +78,8 @@ def calculate_log_probabilities(model: PreTrainedModel, tokenizer: Tokenizer, in
 ```
 
 Explanation:
-- we drop the logits for the last token, because it corresponds to the probability of the next token (which we don't have)
-- we compute the softmax over the last dimension (vocab size), to obtain probability distribution over all tokens
+- we drop the logits for the last token, because they correspond to the probability of the next token (we have no use for it because we are not generating text)
+- we compute the softmax over the last dimension (vocab size), to obtain the probability distribution over all tokens
 - we drop the first token because it is a start-of-sequence token
 - `log_probs[0, range(log_probs.shape[1]), tokens]` indexes into log_probs such as to extract
   - at position 0 (probability distribution for the first token after the start-of-sequence token) - the logprob value corresponding to the actual first token
````
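For context, here is a minimal, self-contained sketch of the indexing these bullets describe. It is not the repository's code: the checkpoint name (`gpt2`), the prompt, and the explicit BOS handling are illustrative assumptions made so the example runs on its own.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint; the README's actual model may differ.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Hello world"
input_ids = tokenizer(text, return_tensors="pt").input_ids  # shape: (1, seq_len)

# GPT-2's tokenizer does not prepend a start-of-sequence token, so we add
# one manually to mirror the BOS assumption in the bullets above.
bos = torch.tensor([[tokenizer.bos_token_id]])
input_ids = torch.cat([bos, input_ids], dim=1)  # shape: (1, seq_len + 1)

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len + 1, vocab_size)

# Drop the logits for the last token: they predict the token *after* the
# sequence, which we are not generating.
logits = logits[:, :-1, :]

# Log-softmax over the vocabulary dimension gives, at each position, the
# log-probability distribution over all tokens.
log_probs = torch.log_softmax(logits, dim=-1)

# Drop the first input token (the start-of-sequence token) so that
# position i of log_probs lines up with the i-th actual token.
tokens = input_ids[0, 1:]

# Extract, at each position, the log-probability of the token that
# actually occurred there.
token_log_probs = log_probs[0, range(log_probs.shape[1]), tokens]
print(token_log_probs)  # one log-probability per actual token
```

The fancy-indexing expression pairs each position index from `range(log_probs.shape[1])` with the corresponding entry of `tokens`, which is exactly the "position 0 gets the logprob of the actual first token" behavior the last bullet describes.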
