
Commit

Accuracy fix for llama3.1-70B in eager/torch.compile mode (#1746)
Co-authored-by: Vivek Goel <[email protected]>
ckvermaAI and vivekgoe authored Feb 7, 2025
1 parent 3d7b2fa · commit a0d14d2
Showing 1 changed file with 2 additions and 1 deletion.
optimum/habana/transformers/models/llama/modeling_llama.py (3 changes: 2 additions & 1 deletion)
@@ -136,7 +136,8 @@ def __init__(
 
     def _set_cos_sin_cache(self, seq_len, device, dtype):
         self.max_seq_len_cached = seq_len
-        t = torch.arange(self.max_seq_len_cached, device=device, dtype=self.inv_freq.dtype)
+        # Use torch.int32 to avoid loss due to low precision with BF16 (refer to SW-215204)
+        t = torch.arange(self.max_seq_len_cached, device=device, dtype=torch.int32)
 
         freqs = torch.outer(t, self.inv_freq)
         # Different from paper, but it uses a different permutation in order to obtain the same calculation
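For context, a minimal sketch (not part of the commit) of the precision loss this change works around: bfloat16 carries only about 8 bits of significand, so torch.arange in bfloat16 cannot represent every integer above 256, and neighboring position indices collapse onto the same value for long sequences. The seq_len below is an illustrative value, not taken from the commit.

import torch

# bfloat16 has ~8 bits of significand precision, so integers above 256
# cannot all be represented exactly and position indices start to collide.
seq_len = 8192  # illustrative value, not from the commit

t_bf16 = torch.arange(seq_len, dtype=torch.bfloat16)
t_int32 = torch.arange(seq_len, dtype=torch.int32)

# Positions whose bfloat16 value no longer equals the exact integer index.
wrong = (t_bf16.float() != t_int32.float()).sum().item()
print(f"{wrong} of {seq_len} positions are inexact in bfloat16")

# The number of distinct representable positions collapses well below
# seq_len, so different positions share identical rotary embeddings.
print(f"distinct bf16 positions: {t_bf16.float().unique().numel()}")

Computing t in torch.int32 keeps every index exact; standard PyTorch type promotion in torch.outer then yields a floating-point result against self.inv_freq when the cos/sin cache is built.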
