Pull requests: NVIDIA/TransformerEngine
Pull requests list
Add efficient Cross-Entropy by cuda kernel to accelerate training speed and reduce cross-entropy memory usage during training.
#995 opened Jul 8, 2024 by cb521 (1 task)
[pre-commit.ci] pre-commit suggestions
#979 opened Jul 2, 2024 by pre-commit-ci (bot), draft, label: wontfix (This will not be worked on)
[Pytorch] Implement fp32 accumulation for attention with context parallel in both forward and backward pass.
#821 opened Apr 28, 2024 by Yuxin-CV